Video per-frame integrity check - verifying losslessness

peter_b · Post by **peter_b** » Thu Jan 06, 2011 1:10 am

[PROBLEM]
Sometimes you want to verify the integrity of your video material when dealing with lossless codecs.
The only way to be sure that not a single bit of visual information is lost is to decode each frame into an uncompressed image and verify that image.

[SOLUTION]
Using ffmpeg:

Code:

ffmpeg -i video.avi -f framemd5 video.avi.framemd5

NOTE:
The suffix ".framemd5" is chosen arbitrarily to somewhat identify framemd5 files. The resulting framemd5-file is actually a plain text file.

ffmpeg also offers other checksum output formats, such as:

framecrc (one CRC per frame)
md5 (a single MD5 sum for the whole output)
crc (a single CRC for the whole output)
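
All of these are output formats selected with "-f", so they can be swapped into the same command line. A minimal sketch, assuming a POSIX shell; the function name "video_checksum" is made up for this example, and "-" sends the report to stdout instead of a file:

```shell
# Print a checksum report for the video of a file (audio disabled).
# $1: ffmpeg output format (framemd5, framecrc, md5 or crc)
# $2: input file
video_checksum() {
    ffmpeg -v error -i "$2" -an -f "$1" -
}
```

For example, "video_checksum md5 video.avi" should print one "MD5=..." line for the whole stream, while "video_checksum framecrc video.avi" prints one line per frame.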

NOTE:
The frame-checksum feature of ffmpeg also generates checksums for the audio "frames". As these do not necessarily contain the same samples in a remuxed/transcoded file, it is a good idea to disable the audio stream for image-checksum comparisons:

Code:

ffmpeg -i video.avi -an -f framemd5 video.avi.framemd5

(The switch "-an" means "audio: no")

Using mplayer:

Code:

mplayer -vo md5sum -o video.avi.framemd5 video.avi

(thanks to Diego's post on the ffmpeg-devel mailing list)
Warning: By default, mplayer seems to convert the colorspace to YV12 4:2:0

peter_b · Post by **peter_b** » Mon Aug 13, 2012 4:55 pm

UPDATE:
As the alignment of audio data within each audio frame can change during remuxing, framemd5 cannot be applied to audio without preparation.

Using the audio filter "asetnsamples" helps to pack a fixed number of samples into each frame, which makes the MD5 sum of each audio frame comparable, too:

Code:

ffmpeg -i video.avi -filter_complex "asetnsamples=n=96000" -f framemd5 video.avi.framemd5

The "asetnsamples=n=96000" filter makes sure that each audio frame holds 96000 samples per channel (samples, not bytes). The framemd5 line for such an audio frame therefore looks as follows (96000 samples, 384000 bytes):

0, 96000, 96000, 96000, 384000, 856c852b100fa835447df9dd48a71fe5
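
The columns of such a line are stream index, dts, pts, duration, size in bytes, and checksum. The relation between sample count and size can be checked with a bit of arithmetic. A small sketch, assuming 16-bit stereo PCM (2 channels, 2 bytes per sample), which is a guess that happens to match the numbers above; both function names are made up for this example:

```shell
# Expected byte size of one frame of decoded PCM audio.
# $1: samples per frame, $2: channels, $3: bytes per sample
pcm_frame_bytes() {
    echo $(( $1 * $2 * $3 ))
}

# Extract the size column (5th field) from a framemd5 line.
framemd5_size() {
    printf '%s\n' "$1" | cut -d ',' -f 5 | tr -d ' '
}

line='0, 96000, 96000, 96000, 384000, 856c852b100fa835447df9dd48a71fe5'
framemd5_size "$line"        # prints 384000
pcm_frame_bytes 96000 2 2    # prints 384000
```

If the two numbers differ for your material, the audio probably has a different channel count or bit depth than assumed here.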

Thanks to Dave Rice, who suggested this on the ffmpeg-user mailing list (August 7th, 2012)

peter_b · Post by **peter_b** » Sat Mar 09, 2013 11:14 pm

If you're trying to compare remuxed files, you might have to preprocess the framemd5 text file.
It contains columns separated by commas, so we can use "cut" to extract the checksum column:

Code:

$ cut -d "," -f 6 video.mkv.framemd5 > video.mkv.framemd5-only
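
framemd5 files start with header lines beginning with "#". Some of them contain no comma, and "cut" passes such lines through unchanged, so they can still differ between two files. A sketch that drops the header first and then compares only the checksum columns; the function names are made up, and it assumes both files list the same single stream with the frames in the same order:

```shell
# Print only the checksum column of a framemd5 file, without header lines.
framemd5_hashes() {
    grep -v '^#' "$1" | cut -d ',' -f 6 | tr -d ' '
}

# Succeed if two framemd5 files list the same checksums in the same order.
same_hashes() {
    framemd5_hashes "$1" > "$1.hashes"
    framemd5_hashes "$2" > "$2.hashes"
    cmp -s "$1.hashes" "$2.hashes"
}
```

Usage could look like this: same_hashes video.avi.framemd5 video.mkv.framemd5 && echo "frames match"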

peter_b · Post by **peter_b** » Wed Jun 19, 2019 4:26 pm

Based on some inspiration and input from Keenan J. Troll, here's a recipe that does 4 things in one:

Rewrap (without transcoding) to MKV
Adjust (tech-)metadata here too. This is optional, but very handy. The example here shows setting interlacing flags.
Generate framemd5 for video source
Generate framemd5 for audio source

Code:

$ ffmpeg -i bars.mov -map 0 -c copy -flags +ildct+ilme -field_order bb rewrap.mkv \
-an -f framemd5 bars.video.framemd5 \
-vn -c:a pcm_s24le -af "asetnsamples=n=48000" -f framemd5 bars.audio.framemd5

(Note: When the source file contains a timecode track, "-map 0" has to be removed in order to rewrap to Matroska. The start timecode field will still be set in that case, though.)

peter_b · Post by **peter_b** » Wed Jun 19, 2019 4:34 pm

Keenan J. Troll also suggested another interesting method for hashing audio: as one piece, without grouping samples together, producing just one hash for the whole audio stream.
His original command included video container structure data in the output.
I've removed this by setting the output format to raw samples (s24le in this example), and explicitly selecting the audio stream (stream 0:1 here, which is the 1st audio track when the video is stream 0:0):

Code:

$ ffmpeg -i ../bars.mov -map 0:1 -vn -f s24le - | md5sum

Adapt the "-map" parameter to the audio tracks in your file. For a file with one video and one audio track (which is most common), "-map 0:1" will be fine.
Also, if you have 16 bit audio, you might want to use "s16le" instead.
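
The two adjustments above can be wrapped into a small helper. A sketch, assuming "md5sum" is available and using the stream specifier "0:a:0" (first audio stream of the first input) instead of a fixed stream index; the function name "audio_md5" is made up for this example:

```shell
# Print one MD5 sum over all decoded samples of the first audio stream.
# $1: input file
# $2: raw sample format, e.g. s16le or s24le (defaults to s24le)
audio_md5() {
    ffmpeg -v error -i "$1" -map 0:a:0 -f "${2:-s24le}" - | md5sum | cut -d ' ' -f 1
}
```

For example, "audio_md5 bars.mov s16le" hashes the audio as 16 bit samples. The same format has to be used for both files being compared, otherwise the hashes will differ even for identical audio.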

Here's a list of available audio types (See FFmpeg documentation on Audio Types for more information on this):

Code:

$ ffmpeg -formats | grep PCM

DE alaw PCM A-law
DE f32be PCM 32-bit floating-point big-endian
DE f32le PCM 32-bit floating-point little-endian
DE f64be PCM 64-bit floating-point big-endian
DE f64le PCM 64-bit floating-point little-endian
DE mulaw PCM mu-law
DE s16be PCM signed 16-bit big-endian
DE s16le PCM signed 16-bit little-endian
DE s24be PCM signed 24-bit big-endian
DE s24le PCM signed 24-bit little-endian
DE s32be PCM signed 32-bit big-endian
DE s32le PCM signed 32-bit little-endian
DE s8 PCM signed 8-bit
DE u16be PCM unsigned 16-bit big-endian
DE u16le PCM unsigned 16-bit little-endian
DE u24be PCM unsigned 24-bit big-endian
DE u24le PCM unsigned 24-bit little-endian
DE u32be PCM unsigned 32-bit big-endian
DE u32le PCM unsigned 32-bit little-endian
DE u8 PCM unsigned 8-bit

Thanks Keenan for this idea!

peter_b · Post by **peter_b** » Sat Jul 27, 2019 12:17 am

UPDATE:
After asking for input on the ffmpeg-user mailing list, I was told that ffmpeg already has a built-in content hash muxer. I only knew about frame-hashing until then.

It's actually pretty cool!

Quote Moritz Barsnick:

$ ffmpeg -i input -map 0:a -c:a copy -f hash -
or
$ ffmpeg -i input -map 0:a -c:a copy -hash md5 -f hash -
if you prefer MD5.
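
Note that with "-c:a copy" the hash covers the stored audio packets as they are. To compare files whose audio codec differs, the stream can be decoded to a common PCM format first. A sketch under that assumption; "pcm_s24le" and the function name "decoded_audio_hash" are example choices, not part of Moritz's suggestion, and the PCM format should be at least as wide as the source audio:

```shell
# Print a single hash over the decoded audio of a file, e.g. "MD5=...".
# $1: input file
# $2: hash algorithm for the hash muxer (defaults to md5)
decoded_audio_hash() {
    ffmpeg -v error -i "$1" -map 0:a -c:a pcm_s24le -hash "${2:-md5}" -f hash -
}
```

Calling "decoded_audio_hash input sha256" switches to SHA-256, which is also what the hash muxer uses when no "-hash" option is given.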
