Articles posted August 2008
About Laserdiscs, part 3
As I mentioned previously, laserdiscs directly encode and reproduce the broadcast signal format (either NTSC or PAL). So in order to understand how the information about frame numbers and other details is encoded, it is important to know the basics of the underlying signal format. In this case, I'll talk about NTSC specifically.
An NTSC frame consists of 525 scanlines of data, broadcast 29.97 times per second. As I mentioned before, each frame consists of two fields, which makes each field 262.5 scanlines (yes, it is true that there is an extra half-scanline). Of these 262.5 scanlines, only 240 are normally visible. This is because your classic CRT-based TV set required some time to move the electron beam from the bottom-right of the screen back to the upper-left in preparation for the next field, and 22.5 scanlines' worth of time was decreed to be an appropriate timeframe for this to happen.
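As a quick sanity check on these numbers, here is a small Python sketch of the arithmetic. The constant names are made up for illustration, and the line duration is simply derived from the rounded 29.97 figure above rather than taken from any exact spec value.

```python
# Figures taken from the text; names are illustrative only.
LINES_PER_FRAME = 525
FRAMES_PER_SECOND = 29.97          # rounded value used in the article
VISIBLE_LINES_PER_FIELD = 240

# Two fields per frame, so each field gets half the scanlines.
lines_per_field = LINES_PER_FRAME / 2
# Whatever is not visible is the vertical blanking interval.
vbi_lines_per_field = lines_per_field - VISIBLE_LINES_PER_FIELD

# Time spent on one scanline, and on the whole VBI of one field.
line_duration_us = 1_000_000 / (LINES_PER_FRAME * FRAMES_PER_SECOND)
vbi_duration_ms = vbi_lines_per_field * line_duration_us / 1000

print(lines_per_field)       # 262.5
print(vbi_lines_per_field)   # 22.5
print(line_duration_us, vbi_duration_ms)
```

Under these assumptions each scanline lasts roughly 63.6 microseconds, so the beam gets a bit more than 1.4 milliseconds per field to get back to the top of the screen.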
These extra 22.5 scanlines each field are known as the Vertical Blanking Interval, or VBI. Traditionally, there was not much of interest transmitted within the VBI area of the field, apart from a few basic control signals that were used for calibrating the picture and synchronizing the timing. Over time, however, people came up with reasons to add useful data to the VBI area. One of the most well-known of these is Closed Captioning, which is encoded on line 21 of each field (the top 22 lines are the VBI, and the following 240 are the visible lines).
Laserdiscs took this a step further, and defined special encodings of their own. Line 11 of each field contains what is known as the "white flag", which is a simple binary indicator of whether the current field is the first field of a frame. Laserdisc players look for this white flag to know how to do a still frame that doesn't consist of a split between two film frames. Since it takes two fields to make one frame, and since many film-based laserdiscs are encoded in a 3:2 cadence, it is not straightforward to ensure this without some additional information.
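Here is a rough Python sketch of how the flag could be used to pick clean still frames. It assumes the fields have already been captured and each one reduced to a boolean saying whether its line 11 was white. The function name and the list-of-booleans representation are hypothetical, and field parity (which a real player also has to respect) is ignored.

```python
def still_frame_pairs(white_flags):
    """Return (first, second) field index pairs that start on a white flag.

    white_flags[i] is True when field i carries the white flag, i.e. it is
    the first field of a frame. Pairing that field with the one after it
    avoids mixing two different film frames in one still.
    """
    pairs = []
    # The last field has no successor, so it cannot start a pair.
    for i, flag in enumerate(white_flags[:-1]):
        if flag:
            pairs.append((i, i + 1))
    return pairs


# Fields for two film frames in a 3:2 cadence: 3 fields, then 2 fields.
flags = [True, False, False, True, False]
print(still_frame_pairs(flags))  # [(0, 1), (3, 4)]
```

Notice that fields 2 and 3 never end up in the same still: field 2 is the last field of one film frame and field 3 is the first field of the next.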
Even more interesting than the white flag are the "Philips codes" that are encoded on lines 16, 17, and 18 (note that all the laserdisc encodings are on lines different from Closed Captioning, so that laserdiscs could be encoded with Closed Captioning data as well). The Philips codes are special 24-bit digital values that are encoded in the VBI area of each field, and which are used to describe the field. In general, there are up to two Philips codes encoded on three lines, with line 17 generally being a redundant copy of line 18 for added resilience against dropouts or other transmission errors.
So how do you encode a 24-bit binary number into a single scanline of an analog video signal? The answer is to use an encoding technique known as Manchester codes. If you treat bright white as a '1' and black as a '0', then this encoding uses two such levels, one after the other, to encode each bit of the code. The reason for using two is to make corrupted data detectable, and also to establish a clock for the codes. This works because Manchester codes encode a '0' bit as 10 and a '1' bit as 01, so the pairs 00 and 11 never appear within a valid bit. Because of this, you are guaranteed that in the middle of each bit there is a transition from black to white or vice-versa, and this can be used to establish the clock. Once you figure out the clock, then the direction of each transition tells you what the actual value of the bit is.
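To make the scheme concrete, here is a minimal Python sketch of the encode and decode steps. It assumes the scanline has already been sampled and thresholded into a list of half-bit cells (1 for white, 0 for black), which means the clock recovery described above is not shown. The function names and the most-significant-bit-first ordering are assumptions made for the example, not something taken from the laserdisc spec.

```python
def manchester_encode(value, bits=24):
    """Encode an integer as half-bit cells: 0 -> (1, 0), 1 -> (0, 1).

    Bits are emitted most significant first (an assumption of this sketch).
    """
    cells = []
    for i in range(bits - 1, -1, -1):
        bit = (value >> i) & 1
        cells.extend((0, 1) if bit else (1, 0))
    return cells


def manchester_decode(cells):
    """Decode half-bit cells back into an integer.

    Raises ValueError if a pair has no mid-bit transition (00 or 11),
    which is how corrupted data shows up.
    """
    if len(cells) % 2:
        raise ValueError("expected an even number of half-bit cells")
    value = 0
    for i in range(0, len(cells), 2):
        pair = (cells[i], cells[i + 1])
        if pair == (0, 1):      # black to white: a '1' bit
            bit = 1
        elif pair == (1, 0):    # white to black: a '0' bit
            bit = 0
        else:
            raise ValueError(f"no mid-bit transition at bit {i // 2}")
        value = (value << 1) | bit
    return value


cells = manchester_encode(0xF12345)
print(len(cells))                      # 48 half-bit cells for 24 bits
print(hex(manchester_decode(cells)))   # 0xf12345
```

A dropout that flattens part of the line to all black or all white produces a 00 or 11 pair, so the decoder can reject the code instead of returning garbage.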
Below is a picture of the VBI region for a CAV laserdisc. The top few rows are an ugly green because those lines of VBI data were not provided by the capture card. The remaining lines show what the white flag and Manchester-encoded Philips codes look like:

Once the Philips codes are extracted, they can be treated as binary data and evaluated. Unfortunately, it seems as though most of the information about what the various Philips codes mean has been kept relatively secret. However, the most important codes are understood:
$88FFFF | Lead-in code | Indicates the field is located before the official start of the disc
$80EEEE | Lead-out code | Indicates the field is located after the official end of the disc
$FXXXXX | Frame code | Specifies the 5-digit frame number (XXXXX) in BCD format
$8XXDDD | Chapter code | Specifies the 2-digit chapter number (XX) in BCD format
$82CFFF | Stop code | Indicates the player should pause at the current field
So as the player reads data from the disc, it also needs to detect the white flag and Philips codes, and act on them as necessary. By capturing the VBI data along with the normal active video, we are able to preserve this information, and write a laserdisc simulator that operates off of the same information that the original players did.
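As an illustration of what "acting on them" might look like in a simulator, here is a minimal Python sketch that classifies a decoded 24-bit value using only the five patterns in the table above. The function names and the (kind, number) return format are hypothetical, every X is treated as a plain BCD digit exactly as the table describes, and any code that does not match is simply reported as unknown.

```python
def bcd_to_int(value, digits):
    """Convert packed BCD nibbles to an int, or None if a nibble is not 0-9."""
    result = 0
    for shift in range((digits - 1) * 4, -1, -4):
        nibble = (value >> shift) & 0xF
        if nibble > 9:
            return None
        result = result * 10 + nibble
    return result


def classify_philips_code(code):
    """Return a (kind, number) tuple for a 24-bit Philips code."""
    # Fixed codes first, so they are never mistaken for a pattern match.
    if code == 0x88FFFF:
        return ("lead-in", None)
    if code == 0x80EEEE:
        return ("lead-out", None)
    if code == 0x82CFFF:
        return ("stop", None)
    # $FXXXXX: top nibble F, followed by five BCD digits.
    if (code & 0xF00000) == 0xF00000:
        frame = bcd_to_int(code & 0x0FFFFF, 5)
        if frame is not None:
            return ("frame", frame)
    # $8XXDDD: top nibble 8, two BCD digits, then the literal DDD.
    if (code & 0xF00FFF) == 0x800DDD:
        chapter = bcd_to_int((code >> 12) & 0xFF, 2)
        if chapter is not None:
            return ("chapter", chapter)
    return ("unknown", None)


print(classify_philips_code(0xF12345))  # ('frame', 12345)
print(classify_philips_code(0x812DDD))  # ('chapter', 12)
print(classify_philips_code(0x88FFFF))  # ('lead-in', None)
```

Returning "unknown" rather than guessing seems like the safer choice here, given how little of the full code list is documented.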
About Laserdiscs, part 2
All known laserdisc-based videogames use CAV (constant angular velocity) laserdiscs, primarily because these discs could be manipulated in advantageous ways. Players of the era could seek to specific frames, play at different speeds, still frame, play backwards, and perform all sorts of other operations that were not possible with CLV discs.
The reason this was possible is that on a CAV disc, each track corresponds exactly to one frame of video. There are approximately 54,000 tracks on one side of a CAV laserdisc, giving 54,000 frames of video per side. Now, believe it or not, laserdiscs are actually analog devices, meaning that the signal that is stored on them is analog, and is described exactly by the video transmission data standard of its intended country (NTSC or PAL, depending on where you live). Thus, once the data is read from the laserdisc, it is basically identical to the signal that you would receive over the airwaves for watching regular broadcast TV.
Let's step back a moment and think about how laserdiscs of movies work. A movie is traditionally filmed at 24 frames per second. In contrast, the NTSC broadcast standard (which describes how video is transmitted in North America) describes the TV signal as running at 30 frames per second. Furthermore, each frame is built up of two "fields", which are drawn one after another at a slight offset. This is known as interlacing.
Since there are two fields each frame, and the frames run at 30 per second, the result is that the individual fields are transmitted at twice that rate, or 60 times per second (in reality, it is 29.97 frames/second and 59.94 fields/second, but we'll stick to round numbers).
So the question becomes, how do you reproduce a film, running at 24fps, within the NTSC standard, which runs at 30fps (or more precisely, 60 fields/second)? If you transferred one frame of film to one frame (two fields) of NTSC video, the result would play back at 30fps, or 25% too quickly. If you duplicated each film frame onto two frames (four fields) of NTSC video, the result would play back at 15fps, or 37.5% too slowly.
The solution that engineers came up with many years ago was to realize that 60 / 24 = 2.5. That is, when you look at it in terms of fields instead of frames, each film frame should cover exactly 2.5 fields of NTSC video. However, since you can't split the video midway within a field, they chose instead to alternate between using 3 fields per film frame and 2 fields per film frame. So the first film frame will be duplicated onto the first 3 fields, while the second film frame will be duplicated onto the next 2 fields, and so on. This is known as 3:2 cadence.
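The cadence is easy to see in a few lines of Python. This is only a sketch: the function name is made up, film frames are numbered from 0, and it assumes the sequence starts on a 3-field frame and that NTSC frames are formed by pairing fields from the very first one.

```python
def pulldown_fields(film_frames, pattern=(3, 2)):
    """Return, for each successive video field, the film frame it shows."""
    fields = []
    for i in range(film_frames):
        # Alternate between 3 fields and 2 fields per film frame.
        fields.extend([i] * pattern[i % len(pattern)])
    return fields


fields = pulldown_fields(4)
print(fields)  # [0, 0, 0, 1, 1, 2, 2, 2, 3, 3]

# Pair consecutive fields into NTSC frames.
ntsc_frames = [tuple(fields[i:i + 2]) for i in range(0, len(fields), 2)]
print(ntsc_frames)  # [(0, 0), (0, 1), (1, 2), (2, 2), (3, 3)]

# One second of film fills exactly one second of video.
print(len(pulldown_fields(24)))  # 60
```

Four film frames become five NTSC frames, and two of those five, (0, 1) and (1, 2), mix fields from two different film frames.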
The important thing to realize out of all of this is that there was not always a 1:1 relationship between film frames and NTSC frames (though there often was, for example, when the laserdiscs contained video that was originally produced for broadcast TV). That is, if you wanted to view frame #1000 of the actual film, you could not necessarily just seek to NTSC frame #1000 and end up where you wanted. Instead you actually had to go figure out which NTSC frame corresponded to film frame #1000.
When the laserdisc was designed, this discrepancy was actually taken into account. The designers realized that being able to seek to a given NTSC frame was actually not nearly as useful as being able to seek to a particular film frame, and so they devised a way to encode information about the frame numbers on the laserdisc itself.
Which brings us back around to games. For years, laserdisc emulation has relied on "frame files" and "conversion equations" to determine how to map the film frames (which is what all the seek commands target) to NTSC frames (which is what you get when you capture video). But the information about which NTSC frame corresponds to which film frame was present all along in the video signal itself; we were just not aware of it, nor aware of how to capture it.
In the next article, I'll talk about how this information is encoded.
About Laserdiscs, part 1
Laserdiscs come in two varieties: CAV (constant angular velocity) and CLV (constant linear velocity).
As its initials imply, a CAV laserdisc spins at the same speed no matter where on the disc it is being read from. Since laserdiscs play back at a fixed data rate, this implies that data is packed more tightly near the center of the disc than it is toward the outer edges.
If you remember your basic geometry, the circumference of a circle is directly proportional to the radius (C = K × r). Let's say the disc spins just fast enough so that one rotation holds one video frame. So if you encode one frame's worth of data at r=5, you have to pack it into a linear distance of 5K. If you encode the same amount of data at r=8, you have much more room (8K) to store it in.
In contrast, a CLV laserdisc packs its data at the same density regardless of its location on the disc. This means that in order to read it, a laserdisc player must adjust the rotation speed depending on how far away from the center of the disc it is reading. Compact discs are CLV devices, and you have probably noticed the disc speed changing as you seek back and forth.
Back to the geometry: on a CLV disc, data is stored at a constant density (R) per unit of circumference, so one rotation holds R × C units of data. Since the circumference is proportional to the radius, the amount of data per rotation is proportional to the radius as well (R × (K × r) = RK × r). This means that at r=5, one rotation of the disc stores 5RK units of data, and at r=8, it stores 8RK units of data.
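Here is the same comparison as a small Python sketch, using K = 2π and the toy radii 5 through 8 from the examples above. The function names are invented for illustration, and the radii are not real laserdisc dimensions, so only the shape of the result matters.

```python
import math

K = 2 * math.pi  # circumference = K * r


def cav_units_per_track(r_inner, density):
    """CAV: every track holds only what the innermost track can hold."""
    return density * K * r_inner


def clv_units_per_track(r, density):
    """CLV: each track is filled at full density for its own radius."""
    return density * K * r


R = 1.0                # data units per unit of circumference
radii = [5, 6, 7, 8]   # one toy track at each radius

cav_total = sum(cav_units_per_track(radii[0], R) for _ in radii)
clv_total = sum(clv_units_per_track(r, R) for r in radii)
ratio = clv_total / cav_total

print(cav_total / K, clv_total / K, ratio)
```

With these toy numbers the CAV disc holds 20RK units and the CLV disc holds 26RK units, about 1.3 times as much. The wider the gap between the innermost and outermost tracks, the bigger the CLV advantage becomes.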
It's pretty clear that CLV discs pack the data more efficiently than CAV discs. This is because with CLV discs, you can pack the data at the maximum density the player is capable of reading, consistently across the entire disc. With CAV discs, by contrast, the data can be efficiently packed at the inner part of the disc, but as you move farther away from the center, it gets less and less densely packed. In reality, CAV discs maxed out at about 30 minutes (54,000 frames) of playback per side, while CLV discs generally got about 60 minutes per side.
Given these facts, why on earth would you ever create a CAV disc, when a CLV disc allows you to pack the data more efficiently? Well, because the simplicity of finding information on a CAV disc enabled many special features: still frames, reverse play, slow motion, and — most importantly for laserdisc-based video games — direct access to any frame on the disc by index.
With a CAV disc, each video frame corresponds to exactly one rotation of the disc. Thus, if you want to advance 100 frames ahead, you simply move the laser outward by 100 tracks and read the data there. On the other hand, with a CLV disc, advancing 100 frames involves knowing how fast the data is coming and computing where to seek and how much to adjust the rotation speed to find the target, a much more complicated maneuver.
When it comes to movies, most laserdiscs were produced as CLV, to minimize the number of times you need to flip or change discs. A few special edition and high-end versions of movies were released as CAV, enabling access to nice still frames and other effects.
When it comes to videogames, however, it's all CAV. In the next article, I'll ignore CLV discs entirely, and talk about tracks, frames, and VBI data on CAV discs.