<< Newer Article #234 Older >>

DirectShow, part 2

A while back, when the initial push to start adding laserdisc support to MAME was happening, I discovered that there was something missing in all existing laserdisc captures. If you know anything about the NTSC broadcast standard, you know that a video signal consists of 525 scanlines spread across a 30Hz (actually 29.97Hz) interval. Of those 525 scanlines, 480 are considered to be "visible" scanlines, while the remaining 45 constitute the vertical blanking interval (VBI).

At first, the VBI part of the scan just contained a couple of special signals for synchronization, but over time, people started adding data to them. The most famous example is closed captioning, which is usually encoded as binary data on line 21 of each video field (a single 30Hz frame is sent as two 60Hz fields, interlaced).

Getting back to DirectShow, the way this is handled for video capturing is that the video capture component has a VBI output, which is the raw VBI data, usually sampled at a high rate (2x or 3x the video sampling rate) for improved accuracy. As with everything in DirectShow, there are filters that you can attech the VBI output to, which will decode the line 21 data into binary data, and then further filters that will render the captions as overlays on top of the video stream.

The thing I discovered about laserdiscs, though, is that they encoded several very important bits of metadata on some of the VBI lines. Specifically, line 12 has what is known as a "white flag", line 16 contains some control bits, and lines 17-18 contain information about the current frame number and chapter. Even more importantly, all of this metadata is crucial to the way laserdiscs are controlled and operated in video games.

The problem is, how do you capture that data? Well, you could route the raw VBI data to a file and then attempt to sync it up with your video capture. But an easier way to deal with it would be to simply take the VBI data, reconstitute it as video signal, and render it above the visible video signal, to essentially reproduce a capture of the entire video signal — all 525 lines of it.

Sadly, such a filter didn't exist (or at least I was unable to find such a thing). But thankfully, the DirectShow architecture makes it a fairly simple matter to write your own, which turned into a little side project that I have been working on. Although not quite ready for prime time (synchronization is still a little tricky), the basic filter is now up and running, and we have some actual demo captures with the laserdisc VBI data intact:

(In the current version, I extract the laserdisc metadata and overlay it on the video as well, to see if the information matches up. The frame number that the laserdisc player decoded is in large text at the top, and the information my filter has decoded is displayed on the bottom, with field 1 info on the left and field 2 info on the right.)

In the end, I'll post the source and binary versions of this filter so that it will be possible to get consistent laserdisc captures complete with metadata (on Windows only, though similar techniques can probably be applied to other platforms). And yes, with this problem solved, they should be starting to happen in the coming months.