Video per-frame integrity check - verifying losslessness

peter_b
Chatterbox
Posts: 383
Joined: Tue Nov 12, 2013 2:05 am

Post by peter_b »

[PROBLEM]
Sometimes you want to verify the integrity of your video material when dealing with lossless codecs.
The only way to be sure that not a single bit of visual information is lost is to decode each frame into an uncompressed image and compare a checksum of that image.

[SOLUTION]
Using ffmpeg:

ffmpeg -i video.avi -f framemd5 video.avi.framemd5
NOTE:
The suffix ".framemd5" is chosen arbitrarily to somewhat identify framemd5 files. The resulting framemd5-file is actually a plain text file.

ffmpeg also offers additional checksum abilities, like:
framecrc
md5
crc
NOTE:
The frame-checksum feature of ffmpeg also generates checksums for the audio "frames". As they are not necessarily sample-wise aligned the same way in a remuxed/transcoded file, it might be a good idea to disable the audio stream for image-checksum comparisons:

ffmpeg -i video.avi -an -f framemd5 video.avi.framemd5
(The switch "-an" means "audio: no")
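
The checksum formats listed above can all be tried through one small wrapper. This is only a sketch: the function name "video_checksum" is made up for illustration, and it assumes ffmpeg is on the PATH. Using "-" as output name sends the checksum text to stdout instead of a file.

```shell
#!/bin/sh
# Hypothetical helper (not part of ffmpeg): print video-only checksums
# for FILE using one of the checksum muxers mentioned above.
# Usage: video_checksum FORMAT FILE
video_checksum() {
    fmt=$1
    file=$2
    case "$fmt" in
        framemd5|framecrc|md5|crc) ;;   # per-frame or whole-stream checksums
        *) echo "unsupported format: $fmt" >&2; return 2 ;;
    esac
    # -an drops the audio stream; "-" writes the result to stdout
    ffmpeg -v error -i "$file" -an -f "$fmt" -
}
```

For example, "video_checksum framecrc video.avi > video.avi.framecrc" would write per-frame CRCs for the video stream only.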

Using mplayer:

mplayer -vo md5sum:outfile=video.avi.framemd5 video.avi
(thanks to Diego's post on the ffmpeg-devel mailing list)
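Warning: By default, mplayer seems to convert the colorspace to YV12 (4:2:0), so for sources in another pixel format the checksums are calculated on converted images, not on the original ones.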
Last edited by peter_b on Tue Dec 20, 2016 7:24 pm, edited 4 times in total.

Post by peter_b »

UPDATE:
As the alignment of audio data within each audio frame can change during remuxing, framemd5 cannot be applied to audio without preparation.

Using the audio filter "asetnsamples" helps packing a fixed amount of samples into one frame, therefore making the MD5 sum of each audio frame comparable, too:

ffmpeg -i video.avi -filter_complex "asetnsamples=n=96000" -f framemd5 video.avi.framemd5 
The "asetnsamples=n=96000" makes sure that each audio frame holds 96000 samples per channel (the size column, here 384000, is the byte count), therefore the framemd5 line for such an audio frame looks as follows:
0, 96000, 96000, 96000, 384000, 856c852b100fa835447df9dd48a71fe5
Thanks to Dave Rice, who suggested this on the ffmpeg-user mailing list (August 7th, 2012)
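
To see how the numbers in that line relate, here is a small sketch that splits it into its columns (stream index, dts, pts, duration, size, hash, as named in the framemd5 header) and does the arithmetic. The variable names are only illustrative. 384000 bytes for 96000 samples means 4 bytes per sample across all channels, which would fit 16-bit stereo; that channel layout is an assumption, as it is not stated above.

```shell
#!/bin/sh
line='0, 96000, 96000, 96000, 384000, 856c852b100fa835447df9dd48a71fe5'

# Split on commas and the spaces around them into the six framemd5 columns.
IFS=', ' read -r stream dts pts duration size hash <<EOF
$line
EOF

# size is in bytes; duration equals the number of samples per frame here,
# assuming the stream time base is 1/sample_rate.
bytes_per_sample_frame=$((size / duration))

echo "samples per frame:      $duration"
echo "bytes per frame:        $size"
echo "bytes per sample frame: $bytes_per_sample_frame"
```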

Post by peter_b »

If you're trying to compare remuxed files, you might have to preprocess the framemd5 text file.
It contains columns separated by commas, so we can use "cut" to extract the checksum column:

$ grep -v '^#' video.mkv.framemd5 | cut -d "," -f 6 > video.mkv.framemd5-only
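
Building on that, the comparison itself can be scripted. The following is a minimal sketch, not a finished tool: the helper names "hashes_only" and "same_hashes" are made up here, and it assumes both files list their frames in the same order. Header lines (starting with "#") are skipped, since they can differ between containers.

```shell
#!/bin/sh
# Hypothetical helper: print only the checksum column of a framemd5 file,
# skipping the "#" header lines and stripping the padding spaces.
hashes_only() {
    grep -v '^#' "$1" | cut -d ',' -f 6 | tr -d ' '
}

# Hypothetical helper: succeed if both framemd5 files list the same
# checksums in the same order, regardless of timestamps and headers.
same_hashes() {
    a=$(mktemp) && b=$(mktemp) || return 2
    hashes_only "$1" > "$a"
    hashes_only "$2" > "$b"
    if cmp -s "$a" "$b"; then rc=0; else rc=1; fi
    rm -f "$a" "$b"
    return $rc
}
```

Usage could look like this: same_hashes video.avi.framemd5 video.mkv.framemd5 && echo "frames identical"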

Generate A+V fixity framemd5 in one command/step

Post by peter_b »

Based on some inspiration and input from Keenan J. Troll, here's a recipe that does 4 things in one:
  1. Rewrap (without transcoding) to MKV
  2. Adjust (tech-)metadata here too. This is optional, but very handy. The example here shows setting interlacing flags.
  3. Generate framemd5 for video source
  4. Generate framemd5 for audio source

$ ffmpeg -i bars.mov -map 0 -c copy -flags +ildct+ilme -field_order bb rewrap.mkv \
-an -f framemd5 bars.video.framemd5 \
-vn -c:a pcm_s24le -af "asetnsamples=n=48000" -f framemd5 bars.audio.framemd5
(Note: When the source file contains a timecode track, "-map 0" has to be removed in order to rewrap to Matroska. The start timecode field will still be set in that case.)

Option 2 for audio hashing

Post by peter_b »

Keenan J. Troll also suggested another interesting method for hashing audio: as a whole, without grouping samples into frames, which gives just one hash for the entire audio stream.
His original command included video container structure data in the output.
I've removed this by setting the output format to raw samples (s24le in this example) and explicitly selecting the audio stream (stream index 1 in this file):

$ ffmpeg -i ../bars.mov -map 0:1 -vn -f s24le - | md5sum
Adapt the "-map" parameter to your audio track situation accordingly. For a file with one video and one audio track (which is most common), "-map 0:1" will be fine; "-map 0:a:0" selects the first audio stream regardless of its index.
Also, if you have 16 bit audio, you might want to use "s16le" instead.
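
Choosing the raw format from the bit depth can be wrapped in a small function. This is a sketch under assumptions: the helper names "pcm_format_for_bits" and "audio_md5" are invented for illustration, only signed little-endian PCM is covered, and "-map 0:a:0" (first audio stream) is used instead of a fixed stream index.

```shell
#!/bin/sh
# Hypothetical helper: map an audio bit depth to the matching
# signed little-endian raw PCM format name.
pcm_format_for_bits() {
    case "$1" in
        16) echo s16le ;;
        24) echo s24le ;;
        32) echo s32le ;;
        *)  echo "unsupported bit depth: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: hash the decoded samples of the first audio stream.
# Usage: audio_md5 FILE BITS
audio_md5() {
    fmt=$(pcm_format_for_bits "$2") || return 1
    ffmpeg -v error -i "$1" -map 0:a:0 -vn -f "$fmt" - | md5sum
}
```

For instance, "audio_md5 ../bars.mov 24" should run the same kind of command as the s24le example above.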

Here's a list of available audio types (See FFmpeg documentation on Audio Types for more information on this):

$ ffmpeg -formats | grep PCM
DE alaw PCM A-law
DE f32be PCM 32-bit floating-point big-endian
DE f32le PCM 32-bit floating-point little-endian
DE f64be PCM 64-bit floating-point big-endian
DE f64le PCM 64-bit floating-point little-endian
DE mulaw PCM mu-law
DE s16be PCM signed 16-bit big-endian
DE s16le PCM signed 16-bit little-endian
DE s24be PCM signed 24-bit big-endian
DE s24le PCM signed 24-bit little-endian
DE s32be PCM signed 32-bit big-endian
DE s32le PCM signed 32-bit little-endian
DE s8 PCM signed 8-bit
DE u16be PCM unsigned 16-bit big-endian
DE u16le PCM unsigned 16-bit little-endian
DE u24be PCM unsigned 24-bit big-endian
DE u24le PCM unsigned 24-bit little-endian
DE u32be PCM unsigned 32-bit big-endian
DE u32le PCM unsigned 32-bit little-endian
DE u8 PCM unsigned 8-bit
Thanks Keenan for this idea! :D

UPDATE: use ffmpeg "-hash"

Post by peter_b »

UPDATE:
After asking for input on the ffmpeg-user mailing list, I was told that ffmpeg already has a built-in content hash muxer. I only knew about frame-hashing until then :shock:
It's actually pretty cool!

Quote Moritz Barsnick:
$ ffmpeg -i input -map 0:a -c:a copy -f hash -
or
$ ffmpeg -i input -map 0:a -c:a copy -hash md5 -f hash -
if you prefer MD5.
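
The hash muxer prints a single line of the form "MD5=..." (or "SHA256=..." by default), which makes comparing two files easy to script. The following is only a sketch with made-up helper names ("hash_value", "audio_packet_md5"). Note that with "-c:a copy" the encoded audio packets are hashed, as far as I understand it, so this suits remuxed files rather than transcoded ones.

```shell
#!/bin/sh
# Hypothetical helper: strip the "ALGO=" prefix from a hash muxer line,
# e.g. "MD5=0123..." becomes "0123...".
hash_value() {
    printf '%s\n' "${1#*=}"
}

# Hypothetical helper: print the MD5 over the audio packets of FILE
# (stream copy, as in the quoted command).
audio_packet_md5() {
    hash_value "$(ffmpeg -v error -i "$1" -map 0:a -c:a copy -hash md5 -f hash -)"
}
```

Two files could then be compared like this: [ "$(audio_packet_md5 bars.mov)" = "$(audio_packet_md5 rewrap.mkv)" ] && echo "audio identical"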