The multimedia evidence community has been buzzing for the last couple of years about how useful FFmpeg and Libav can be for dealing with proprietary video formats. Both tools are extremely useful in several aspects of a forensic DME workflow. That said, whether it’s FFmpeg, Libav or another third-party tool, there are limitations and causes for concern when using them to process proprietary video file formats.
Proprietary formats are...proprietary
Seems like a no-brainer, right? But to truly understand what this means, one must first understand what a file format is: in essence, it is the structure and layout of the data within the file. Standardized file formats are usually maintained by a standards body, are defined in intricate detail, and are published, so that any geek who wants to learn how, where, and what types of data can be stored inside that file type can do so relatively easily. Proprietary formats are the exact opposite.
A proprietary format is not published, and may only be truly known in intricate detail by the person or company who created it. On top of that, many manufacturers use various means of obfuscating the data. Even the geekiest of geeks may never be able to reverse engineer the format thoroughly enough to say, with absolute certainty, that they are able to extract all of the data AND metadata from that proprietary format.
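To make the difference concrete, here is a minimal Python sketch that checks the first bytes of a file against the published signatures of a few standardized containers (RIFF/AVI, ISO base media such as MP4, and Matroska/WebM). The function names and the short signature table are my own illustration, not part of any tool. Treat the result as a hint only: a match does not prove the rest of the file follows the standard, and no match does not prove the file is proprietary (some older QuickTime files, for example, do not begin with an `ftyp` box).

```python
# Published signatures for a few standardized containers: (name, offset, magic bytes).
KNOWN_SIGNATURES = [
    ("AVI (RIFF)", 0, b"RIFF"),
    ("MP4/MOV (ISO BMFF)", 4, b"ftyp"),
    ("Matroska/WebM (EBML)", 0, b"\x1a\x45\xdf\xa3"),
]


def sniff_container(header: bytes):
    """Return the name of a recognized standard container, or None if unknown."""
    for name, offset, magic in KNOWN_SIGNATURES:
        if header[offset:offset + len(magic)] == magic:
            # RIFF is shared by several formats; AVI carries "AVI " at offset 8.
            if name.startswith("AVI") and header[8:12] != b"AVI ":
                continue
            return name
    return None


def sniff_file(path: str):
    """Read the first 16 bytes of a file and try to recognize its container."""
    with open(path, "rb") as fh:
        return sniff_container(fh.read(16))
```

A `None` result here simply means "not one of the three signatures in this table", which is often the first clue that you are looking at a proprietary wrapper.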
Often the posts I’ve read and the discussions I’ve heard regarding proprietary formats and FFmpeg are preceded by a brief comment to the effect of “Most proprietary video file formats contain standard multimedia streams.” Experienced DME technicians and analysts can tell you, though, that trusting a DCCTV manufacturer to adhere to standards, even when they claim to, is risky business.
Re-wrapping and transcoding
When we use FFmpeg, Libav or any other tool to re-wrap or transcode a proprietary multimedia file, the tool keeps only the streams it recognizes and leaves the rest of the proprietary container behind. So what are we missing? Metadata for sure, but could we be overlooking other data or even entire streams of data that may be important to the proper playback and interpretation of the data? Yes, absolutely.
What about when we use one of these tools to transcode streams pulled from proprietary containers? Could we miss entire frames or new slice data within a frame? If you’ve ever used FFmpeg or Libav and you’ve seen yellow warning text or red error text in the command line window as it cooks through your file, you know the answer to that question is yes.
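Rather than relying on spotting colored text as it scrolls by, the warnings and errors can be captured and counted. Below is a minimal Python sketch, assuming `ffmpeg` is on the PATH; the function names are hypothetical. It uses FFmpeg's `-loglevel level+warning` setting, which prefixes each message with its level (for example `[warning]` or `[error]`), and `-c copy` to re-wrap without re-encoding.

```python
import subprocess


def build_rewrap_command(src: str, dst: str) -> list:
    """Build a stream-copy (re-wrap) command with level-tagged log output."""
    return [
        "ffmpeg", "-hide_banner", "-nostdin",
        "-n",                          # never overwrite an existing output file
        "-loglevel", "level+warning",  # show warnings and worse, tagged with their level
        "-i", src,
        "-c", "copy",                  # copy streams as-is, no transcoding
        dst,
    ]


def summarize_log(stderr_text: str) -> dict:
    """Count level-tagged warning, error and fatal lines in FFmpeg's stderr."""
    counts = {"warning": 0, "error": 0, "fatal": 0}
    for line in stderr_text.splitlines():
        for level in counts:
            if f"[{level}]" in line:
                counts[level] += 1
                break
    return counts


def rewrap_and_report(src: str, dst: str):
    """Run the re-wrap and return (exit code, counts, full log) for your notes."""
    proc = subprocess.run(
        build_rewrap_command(src, dst),
        capture_output=True, text=True, errors="replace",
    )
    return proc.returncode, summarize_log(proc.stderr), proc.stderr
```

Keeping the full log text alongside the counts matters: a count tells you something went wrong, but only the messages themselves tell you which frames or packets were affected. A clean log is not proof that nothing was missed, either, since data the tool never recognized produces no message at all.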
FFmpeg is NOT the Holy Grail
Those who know me know full well I’m a huge fan of both of these open source multimedia projects. It is imperative, however, that DME professionals understand the capabilities of the tools they use, as well as the limitations of the media they're analyzing, among other variables.
I can’t tell you how many times I’ve worked with proprietary viewers that display their own data incorrectly or incompletely. My recommendation is to always validate your tools for the task at hand. If you’re considering using FFmpeg, Libav or another third-party tool to process proprietary file formats, your results should, where feasible, be compared to the proprietary application that was intended to present that data. Just a word of caution, my friends. 😎
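One simple check along those lines is to compare frame counts. The sketch below is only an illustration under a few assumptions: `ffprobe` is on the PATH, the file has a video stream that FFmpeg can decode, and you have separately noted the number of frames the proprietary player presents for the same footage (the `player_frames` value is something you supply by hand). The function names are hypothetical.

```python
import subprocess


def build_count_command(path: str) -> list:
    """ffprobe command that decodes the first video stream and counts frames."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-count_frames",
        "-show_entries", "stream=nb_read_frames",
        "-of", "csv=p=0",
        path,
    ]


def parse_frame_count(ffprobe_stdout: str) -> int:
    """Take the first non-empty output line and read it as an integer."""
    for line in ffprobe_stdout.splitlines():
        value = line.strip().strip(",")
        if value:
            return int(value)
    raise ValueError("ffprobe returned no frame count")


def count_decoded_frames(path: str) -> int:
    """Run ffprobe and return the number of frames it was able to read."""
    proc = subprocess.run(
        build_count_command(path), capture_output=True, text=True, check=True
    )
    return parse_frame_count(proc.stdout)


def compare_counts(ffmpeg_frames: int, player_frames: int) -> str:
    """Describe how the FFmpeg count differs from the proprietary player's."""
    diff = ffmpeg_frames - player_frames
    if diff == 0:
        return "match"
    side = "FFmpeg" if diff > 0 else "player"
    return f"mismatch: {side} shows {abs(diff)} more frame(s)"
```

A matching count is a good sign but not a full validation: it says nothing about timing, image content, or metadata, so it should sit alongside a visual comparison against the proprietary player rather than replace it.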