Embed metadata into AVI header, using ffmpeg

It is rarely used, unfortunately, but even the AVI container offers a way to embed metadata directly in the file.

I'm using the tool 'FFmpeg' to do this. According to the FFmpeg documentation, the syntax is quite straightforward:

-metadata key=value
You need to pass the full '-metadata' argument for each key/value pair.
Here's an example:

ffmpeg -i input.avi -vcodec copy -acodec copy -metadata title="The Title" -metadata encoded_by="Hans Stahl" output.avi
== You *must* copy the whole file:
Unfortunately, the current FFmpeg implementation cannot simply update the header data of an existing AVI file. It must read/write the whole file at least once, in order to embed the metadata.
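Since the whole file has to be rewritten anyway, one way to get something that feels like an in-place update is to mux into a temporary file and swap it in afterwards. This is only a minimal sketch: the function name `retag_avi` and the `.retag.tmp` suffix are made up for illustration, and it assumes there is enough free disk space for a second copy of the file.

```shell
# retag_avi INPUT [extra ffmpeg options, e.g. -metadata key=value ...]
# Hypothetical helper: writes a tagged copy of INPUT to a temporary file
# and replaces INPUT only if ffmpeg finished successfully.
retag_avi() {
    retag_input=$1
    shift
    retag_tmp="${retag_input}.retag.tmp"
    # -c copy copies all streams without re-encoding.
    # -f avi is given explicitly because the temporary name does not end in .avi.
    if ffmpeg -nostdin -y -i "$retag_input" -c copy "$@" -f avi "$retag_tmp"; then
        mv -- "$retag_tmp" "$retag_input"
    else
        # Leave the original untouched and clean up the partial output.
        rm -f -- "$retag_tmp"
        return 1
    fi
}
```

Usage would look like this: retag_avi movie.avi -metadata title="The Title" -metadata encoded_by="Hans Stahl"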

== Available metadata fields?
According to FFmpeg's AVI-format source code, the AVI muxer honors the following metadata keys and maps them to these FourCCs in the file header:

const AVMetadataConv ff_avi_metadata_conv[] = {
  { "IART", "artist"    },
  { "ICMT", "comment"   },
  { "ICOP", "copyright" },
  { "ICRD", "date"      },
  { "IGNR", "genre"     },
  { "ILNG", "language"  },
  { "INAM", "title"     },
  { "IPRD", "album"     },
  { "IPRT", "track"     },
  { "ISFT", "encoder"   },
  { "ITCH", "encoded_by"},
  { "strn", "title"     },
  { 0 },
};

(Note: The list in the FFmpeg Metadata article in the Multimedia Wiki is outdated and does not contain all tags.)
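To check which of these keys actually ended up in a file, ffprobe (shipped with FFmpeg) can print the container-level tags. The wrapper name `show_tags` below is just an illustration; the ffprobe options are the standard ones for selecting the format tags and should print one line per tag found.

```shell
# show_tags FILE
# Prints the container-level metadata tags that ffprobe finds in FILE.
show_tags() {
    ffprobe -v error -show_entries format_tags \
            -of default=noprint_wrappers=1 "$1"
}
```

For example: show_tags output.avi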

The above list only contains the metadata fields mapped to a human readable alias. If I read FFmpeg's AVI-format source code correctly, it can handle more fields:

const char ff_avi_tags[][5] = {
  "IARL", "IART", "ICMS", "ICMT", "ICOP", "ICRD", "ICRP", "IDIM", "IDPI",
  "IENG", "IGNR", "IKEY", "ILGT", "ILNG", "IMED", "INAM", "IPLT", "IPRD",
  "IPRT", "ISBJ", "ISFT", "ISHP", "ISRC", "ISRF", "ITCH",
  {0}
};

I couldn't find an official specification of those FourCC attribute names, but thanks to the source code of a "RIFF" handling library, here's a list that looks meaningful:

     'Archival Location' =>  'IARL', 
     'Artist'            =>  'IART',
     'Author'            =>  'IART', 
     'Commissioned'      =>  'ICMS', 
     'Comment'           =>  'ICMT',
     'Description'       =>  'ICMT', 
     'Copyright'         =>  'ICOP', 
     'Date Created'      =>  'ICRD', 
     'Cropped'           =>  'ICRP', 
     'Dimensions'        =>  'IDIM', 
     'Dots Per Inch'     =>  'IDPI', 
     'Engineer'          =>  'IENG', 
     'Genre'             =>  'IGNR', 
     'Keywords'          =>  'IKEY', 
     'Lightness'         =>  'ILGT', 
     'Medium'            =>  'IMED', 
     'Title'             =>  'INAM',
     'Name'              =>  'INAM', 
     'Number of Colors'  =>  'IPLT', 
     'Product'           =>  'IPRD', 
     'Subject'           =>  'ISBJ', 
     'Software'          =>  'ISFT', 
     'Encoding Application' =>  'ISFT', 
     'Sharpness'         =>  'ISHP', 
     'Source'            =>  'ISRC', 
     'Source Form'       =>  'ISRF', 
     'Technician'        =>  'ITCH' 
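
If FFmpeg really does handle the additional FourCCs from ff_avi_tags, it should be possible to pass one of them directly as the metadata key. I have not verified this, so treat the following as a sketch under that assumption. The function name `tag_avi_fourcc` is made up; the list of accepted tags is simply the one from ff_avi_tags above.

```shell
# tag_avi_fourcc INPUT OUTPUT FOURCC VALUE
# Hypothetical helper: copies INPUT to OUTPUT and sets one raw INFO tag.
# Assumption: the AVI muxer writes keys that match an entry in ff_avi_tags.
tag_avi_fourcc() {
    case $3 in
        IARL|IART|ICMS|ICMT|ICOP|ICRD|ICRP|IDIM|IDPI) ;;
        IENG|IGNR|IKEY|ILGT|ILNG|IMED|INAM|IPLT|IPRD) ;;
        IPRT|ISBJ|ISFT|ISHP|ISRC|ISRF|ITCH) ;;
        *)
            # Reject anything that is not in the ff_avi_tags list.
            echo "unknown INFO tag: $3" >&2
            return 2
            ;;
    esac
    ffmpeg -nostdin -i "$1" -c copy -metadata "$3=$4" "$2"
}
```

For example: tag_avi_fourcc input.avi output.avi IENG "Hans Stahl"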

Additional references:
Finally, I've found an official-looking reference for the metadata abbreviation tags (FourCC attribute names):

On page 90 (PDF page 96) of the Exif 2.2 specification, Table 29 "INFO List Chunks" lists 23 so-called "Channel ID" names, each with a description:

== JEITA CP-3451:
The INFO list chunks currently defined are given in Table 29. These pre-registered chunks are stored as ASCII text strings terminated by NULL (the final byte is '00.H').
== Table 29 INFO List Chunks:
Channel ID / Description
IARL / Archival Location. Indicates where the subject of the file is archived.
IART / Artist. Lists the artist of the original subject of the file.
ICMS / Commissioned. Lists the name of the person or organization that commissioned the subject of the file.
ICMT / Comments. Provides general comments about the file or the subject of the file.
ICOP / Copyright. Records the copyright information for the file.
ICRD / Creation date. Indicates the date the subject of the file was created.
ICRP / Cropped. Indicates whether an image has been cropped.
IDIM / Dimensions. Specifies the size of the original subject of the file.
IDPI / Dots Per Inch. Stores the dots per inch (DPI) setting of the digitizer used to produce the file.
IENG / Engineer. Stores the name of the engineer who worked on the file.
IGNR / Genre. Describes the genre of the original work.
IKEY / Keywords. Provides a list of keywords that refer to the file or subject of the file.
ILGT / Lightness. Describes the changes in lightness settings on the digitizer required to produce the file.
IMED / Medium. Describes the original subject of the file, such as, "computer image," "drawing," "lithograph," and so forth.
INAM / Name. Stores the title of the subject of the file.
IPLT / Palette Setting. Specifies the number of colors requested when digitizing an image.
IPRD / Product. Specifies the name of the title the file was originally intended for, such as "Encyclopedia of Pacific Northwest Geography."
ISBJ / Subject. Describes the file contents, such as "Aerial view of Seattle."
ISFT / Software. Identifies the name of the software package used to create the file.
ISHP / Sharpness. Identifies the changes in sharpness for the digitizer required to produce the file.
ISRC / Source. Identifies the name of the person or organization who supplied the original subject of the file.
ISRF / Source Form. Identifies the original form of the material that was digitized, such as "slide," "paper," "map," and so forth.
ITCH / Technician. Identifies the technician who digitized the subject file.

== EXIF metadata in WAV:

I just figured out that these EXIF metadata tags are compatible between AVIs and WAVs:
http://www.sno.phy.queensu.ca/~phil/exi ... s/WAV.html

Extracting the audio of an AVI that contains EXIF metadata into a WAV file with ffmpeg preserves the metadata in the newly created file.
For example, you can use tools like "MediaInfo" to display the contents of the EXIF tags.
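
The extraction itself can be sketched like this. The function name `avi_audio_to_wav` is made up for illustration, and the audio codec is left to whatever ffmpeg picks by default for WAV output, so the audio may get re-encoded to PCM.

```shell
# avi_audio_to_wav INPUT.avi OUTPUT.wav
# Hypothetical helper: writes only the audio of INPUT into a WAV file.
avi_audio_to_wav() {
    # -vn drops the video stream; no audio codec is forced here.
    ffmpeg -nostdin -i "$1" -vn "$2"
}
```

For example: avi_audio_to_wav output.avi audio.wav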
