Articles posted July 2007
New Computer
I recently upgraded to a new laptop, a Dell Latitude D830 (Santa Rosa chipset) running a 2.4GHz Core 2 Duo with 4GB of RAM and a new DX10 video card, the nVidia Quadro NVS 140M. I went ahead and installed Vista 64-bit on the machine so that I could take full advantage of the 4GB of RAM. Since I do most of my MAME development on this machine, compilation speeds for building MAME are of utmost importance.
Below is a summary of complete build times for MAME using both my previous laptop (Dell Inspiron 8600 running a 2.0GHz Pentium-M with 2GB RAM) and the new one. I tested both debug and optimized builds using both Visual Studio 2005 and gcc.
Metric | Old System | New System | Improvement |
---|---|---|---|
CPU Cores | 1 | 2 | +100% |
RAM | 2GB | 4GB | +100% |
Build with VS2005, DEBUG=1 SYMBOLS=1 | 9:38 | 4:15 | -56% |
Build with VS2005, PM=1 | 11:27 | 4:42 | -59% |
Build with VS2005, PM=1 MAXOPT=1 | 20:48 | 10:51 | -48% |
Build with gcc, DEBUG=1 SYMBOLS=1 | 10:20 | 5:02 | -51% |
Build with gcc, PM=1 | 17:05 | 6:49 | -60% |
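For anyone who wants to check the arithmetic, the percentages in the build-time rows are the change in wall-clock time relative to the old system. Here is a minimal Python sketch of that calculation; the helper names `to_seconds` and `percent_change` are made up for illustration and are not part of any build tooling.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string such as '9:38' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

def percent_change(old, new):
    """Change in build time from old to new, as a rounded percentage."""
    old_s = to_seconds(old)
    new_s = to_seconds(new)
    return round((new_s - old_s) / old_s * 100)

# VS2005 debug build from the table: 9:38 on the old system, 4:15 on the new one
print(percent_change("9:38", "4:15"))  # -56
```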
Since the new machine is a dual-core system, I was curious to find out if running a make with -j3 was better than running with -j2. I had heard that it is best to run make with a -j parameter equal to the number of CPUs you have available, plus 1 to help take advantage of extra CPU power while waiting for I/O to complete on other threads. Using the Visual Studio 2005 SP1 build as a test case, it turns out that -j3 does win slightly over -j2, producing a complete MAME build in 4:15 (-j3) versus 4:25 (-j2). Interestingly, when I tried this on my old single-core machine, it was slightly slower to use -j2 versus -j1.
I was also curious to see how much slower it was to do an optimized build using the Visual Studio compilers. A basic optimized build is not that much slower than an unoptimized build (4:42 versus 4:15). However, turning on link-time code generation -- which reduces the final output EXE size by a large amount and provides a marginal additional speedup -- more than doubles the build time to 10:51.
Comparing Visual Studio builds to gcc builds, it seems that the difference is not as pronounced as the last time I measured it (5:02 versus 4:15 for the debug build). This is partly due to the extra overhead of Visual Studio 2005, which is noticeably slower than Visual Studio .NET 2003, which I haven't tried installing under Vista. In the end, it doesn't much matter since debugging with Visual Studio is a huge win over trying to debug with gdb, so I'd still use it a lot even if it were slower!
Concert Review: Rush
It's hard to believe I used to not like these guys. Back when I was a kid, my best friends were all aspiring rock musicians (I was too for a while). And if you were an aspiring rock musician in the late 70's/early 80's, you just had to be into Rush. I think I was mostly just trying to be different, but I resisted all attempts by my friends to get me hooked on the band. Then I went to college and met this very nice girl — who was totally into Rush. Well, geez, once a girl is involved everything changes. (And seriously, how many girls are really into Rush?)
I caved.
I spent most of my college years getting acquainted with their back catalog, listening to them while I shelved books at the Crerar Science Library. I think I ended up being a bigger fan than even the girl who finally turned me onto them (turns out I married her).
So, with that background out of the way, Vera and I travelled down to White River Amphitheater to check out Rush on their latest tour in support of the Snakes & Arrows album. It was quite the ordeal to get down to the venue on a Friday night, heading through some of the worst congestion in the area. Adding to the fun was the weather, which had turned rainy and nasty — thank god we didn't have lawn seats. So by the time we finally got there, three hours after I left to pick up Vera from work, we were really hoping it was going to be worth all the effort.
Thankfully, if there's one thing you can count on, it is an awesome show from Rush. They opened on a decidedly crowd-pleasing note with "Limelight", followed by "Digital Man" and the first surprise of the evening, "Entre Nous". With a catalog as deep as these guys have, it is always a great pleasure to see them not just play the same greatest hits each tour. This time around was especially cool because they ended up playing fully 2/3 of Permanent Waves, which is definitely one of my favorite Rush albums. Another surprise from the first half of the concert was hearing them play "Circumstances" from Hemispheres, another classic I hadn't seen them play live before.
The high point of the first half of the show (or was it the second half?) was definitely "Subdivisions". I can't get enough of that one. Watching it live you are able to really see each band member contribute to the overall song (especially since they had three giant screens above the stage that would often split their focus one for each band member so you could watch them play). This time, I was particularly struck by just how exquisite the drumming is in that song. It is so fundamental to the pacing and urgency of the music and yet it really is quite intricate. I still get chills just thinking about it.
Another high point of the first set was "The Main Monkey Business", the new instrumental from Snakes & Arrows that gave everyone in the band a chance in the spotlight. Finally, they closed the first half with "Dreamline" off of Roll the Bones (another personal favorite) and left for a 25-minute intermission.
I noted to Vera during the intermission that they had so far only played a couple of tracks from their new album. They must have heard me mentioning it, because when they picked up again after the intermission, they immediately launched into 5 straight songs off of that album. I was particularly struck with how concert-friendly they all were. They all had great hooks and kept the crowd interested, even though I suspect that a lot of people weren't yet that familiar with them. In particular, "Far Cry" was a great opener for the second half, and "Spindrift" rocked.
After all that new material, the band decided to trot out a couple of classics that I don't think they've played as much recently. First, we got an incredibly intense rendition of "Natural Science" from Permanent Waves, which was the closest we got last night to any of their monumental multi-part songs of the 70's (yes, there was no "2112" or "Xanadu" in evidence). And then we were treated to a nice pyrotechnic opening to "Witch Hunt" from Moving Pictures, which was an unexpected but welcome surprise.
Neil's drum solo was next. Having heard many of the same bits and pieces over the years in concert and on live albums, I was expecting it to be pretty uneventful. Thankfully, I was wrong. It was pretty much a completely new set and further elevated my opinion of his awesome abilities. When you watch him (or any of the guys in the band) play, it's hard to believe they're all 50+! After the drum solo, Alex got a chance to solo by playing his new piece "Hope" from Snakes & Arrows on acoustic 12-string.
Heading into the final stretch, we got "Distant Early Warning" from Grace Under Pressure with an awesome light show. And then you knew that "Tom Sawyer" was coming on to finish things up, but we got an unexpected and hilarious opening to that song thanks to the kids from South Park. (I suspect that must have come about due to Geddy and Alex playing "O Canada" for the South Park movie.) For an encore, we got "One Little Victory" from Vapor Trails, followed by another surprise, "A Passage to Bangkok" from 2112, and then "YYZ" finished things off.
In the end, it was over 3 hours of music. The band was having a great time playing together, and sounded incredible. The crowd was really into it. Sure there were some songs I would have loved to see them play (one guy on the bus back just wouldn't shut up about them not playing "Working Man"), but at the same time I can't think of anything from the show I would have wanted them to leave out. You can tell these guys are complete pros who know how to put on a great show.
Oh, and since I complained about poor merchandise in my last concert review, let me state that these guys got my money. In fact, there were so many cool designs it was hard to pick just one. I could tell by watching some of the other people that I was not alone in this opinion. :)
Overall, a truly awesome experience.
The Universal Platform, Part 2
In the previous article, I described how the Galaxian video hardware was designed around the concepts of a tilemap and sprites. This article goes into more detail about how the hardware renders the tilemap, and where column scrolling fits in.
To recap: a tilemap is a two-dimensional array of memory that describes how the video system displays the screen. Each "tile" in a tilemap is a fixed size (traditionally 8x8 pixels) and so to cover a 256x256 pixel screen, you need an array of 32x32 tiles.
Now, on the Galaxian video hardware, the visible area of the screen is actually smaller than 256x256 — some of the top and bottom pixels are "blanked" to reduce the overall screen height to 224 pixels. This doesn't affect the underlying tilemap, which is still 32x32 tiles. But it does mean that some of those tiles are not visible.
Now think about how the video signal is transmitted to the monitor. First, keep in mind that the video signal is generated one row at a time from top to bottom, left to right, in order. This means that in order to make the tilemap visible on the screen, the video hardware must, at each pixel location, look up which tile is specified by the tilemap RAM corresponding to that pixel (this is known as the tile index). Once it knows which tile index to display, it then must look up the actual tile graphics data in the tile ROMs, and output the appropriate pixel from the 8x8 tile graphics.
This sounds like a lot of work, and it is. In fact, it's too much work to really do all of that on each pixel. So the hardware was optimized to be able to do it in realtime, by making some fundamental assumptions.
On Galaxian hardware, each pixel of each tile can be one of four colors, requiring two bits of ROM data. The data for graphics can be stored in many different ways, but on Galaxian, they opted for a logical arrangement where each bit is stored in a separate ROM, and the bits are ordered in the same left-to-right, top-to-bottom order that the screen is rendered. The advantage of storing the graphics this way is that each row of tile graphics (remember, the tiles are 8x8 pixels) is exactly 8 bits wide, or one byte. And you can read one byte from a ROM all at once.
Thus, on each scanline, the Galaxian tilemap hardware gets away with only looking up the tile index from the 32x32 tilemap RAM once every 8 pixels. It then uses that tile index to look up the appropriate row of tile graphics data once every 8 pixels, reading an entire row of data from the tile ROMs in one operation. Then, over the course of the next 8 pixels, it spools the data out one bit at a time from an internal data store. While it is doing that, it is also taking the time to figure out which tile is coming next so that it can immediately start outputting that tile once the current tile is finished.
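To make the fetch-once-then-shift idea concrete, here is a rough software sketch in Python. It is not how MAME or the real circuit does it; it only mirrors the steps described above. Several details are assumptions made for illustration: the names `render_scanline`, `plane0` and `plane1`, the layout of 8 consecutive bytes per tile in each ROM, the leftmost pixel living in the most significant bit, and `plane1` supplying the high bit of the color.

```python
TILE_SIZE = 8        # tiles are 8x8 pixels
TILES_PER_ROW = 32   # 32 tiles cover a 256-pixel-wide scanline

def render_scanline(y, tilemap, plane0, plane1):
    """Return 256 two-bit pixel values for scanline y.

    tilemap is a 32x32 list of tile indices (the tilemap RAM).
    plane0 and plane1 stand in for the two tile ROMs, one bit plane each.
    Assumed layout: 8 bytes per tile, one byte per tile row.
    """
    pixels = []
    tile_row = y // TILE_SIZE   # which row of the tilemap
    gfx_row = y % TILE_SIZE     # which row inside each tile
    for col in range(TILES_PER_ROW):
        tile_index = tilemap[tile_row][col]      # one RAM read per 8 pixels
        addr = tile_index * TILE_SIZE + gfx_row
        b0 = plane0[addr]                        # one byte from each ROM
        b1 = plane1[addr]                        # holds a whole tile row
        # Spool the 8 pixels out one bit at a time (assumed MSB first).
        for bit in range(7, -1, -1):
            lo = (b0 >> bit) & 1
            hi = (b1 >> bit) & 1
            pixels.append((hi << 1) | lo)
    return pixels
```

For a full 256-pixel scanline this does 32 tilemap reads and 32 reads from each ROM, rather than 256 of each.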
So, we are down to only needing to read tilemap RAM once every 8 pixels as we work our way across a given scanline. Further, we only need to look up the tile graphics once every 8 pixels as well. In order to do this, we compute which entry in tilemap RAM to look up by dividing the X and Y coordinates by 8 and rounding down, so that we fetch the tilemap entry at (X/8, Y/8). Once we read the tile index from RAM, we then need to look up the tile graphics from the appropriate row in the tile ROMs. The row number is just the remainder from dividing Y by 8. For example, if we are at pixel location (48, 17), then we would fetch the tilemap entry at (48/8, 17/8), or (6, 2). And we would fetch tile graphics from row number (remainder(17/8)) = 1 of the tile.
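The lookup arithmetic above can be written out in a few lines of Python. This is only a sketch of the calculation; the function name `tile_lookup` is made up for illustration.

```python
TILE_SIZE = 8  # tiles are 8x8 pixels

def tile_lookup(x, y):
    """Return (tilemap column, tilemap row, row within the tile) for pixel (x, y)."""
    tile_col = x // TILE_SIZE   # divide by 8 and round down
    tile_row = y // TILE_SIZE
    gfx_row = y % TILE_SIZE     # remainder picks the row of tile graphics
    return tile_col, tile_row, gfx_row

# The example from the text: pixel (48, 17)
print(tile_lookup(48, 17))  # (6, 2, 1)
```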
Because of the way this works, the hardware designers realized that it was very easy to allow you to specify a number (let's call it the vertical scroll value) to be added to the value of Y before looking up the tile index and tile graphics. Doing this adds just a little bit of hardware, but gives you the ability to control the vertical scrolling of the tilemap. Take the example above again, but this time, let's add a vertical scroll value of 1 to all the Y coordinates. We are still at pixel location (48, 17) on the screen, but we will add 1 to 17 and use Y=18 in our calculations. So we look up the tilemap entry at (48/8, 18/8), or (6, 2) yet again — same as last time. But when we compute the remainder of 18/8, we get 2 instead of 1, meaning that we will display row 2 of the tile. This has effectively produced a scrolling effect of shifting that tile upwards by 1 pixel.
The first question you might ask is, what happens when you have Y=255 and you add 1? You will end up with Y=256 and then where will you access the data for the tilemap? Without going into the finer details, the answer is that the value of 256 "wraps" around back to 0. Thus, if you slowly increase the vertical scroll value by 1 each frame, the screen will scroll upwards one pixel each frame and you will eventually see what used to be at the top of the screen appear at the bottom. This is due to the wrapping effect, where values of 256 or more have 256 subtracted from them. Because of this wrapping, it is often true that you don't want to have the entire tilemap visible, because once you scroll, you will immediately see the top appear at the bottom. This is why it is nice to have the extra non-visible tiles in the blanked out region of the screen.
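Here is the same lookup with a vertical scroll value and the wraparound added, again as a Python sketch. The name `scrolled_lookup` is hypothetical, and masking with `0xFF` is simply one way to express "256 wraps back to 0" in software, assuming the sum is kept to 8 bits.

```python
TILE_SIZE = 8  # tiles are 8x8 pixels

def scrolled_lookup(x, y, vscroll):
    """Tile lookup for pixel (x, y) with a vertical scroll value applied."""
    eff_y = (y + vscroll) & 0xFF   # keep 8 bits, so 256 wraps around to 0
    return x // TILE_SIZE, eff_y // TILE_SIZE, eff_y % TILE_SIZE

print(scrolled_lookup(48, 17, 1))   # (6, 2, 2): same tile, one row further down
print(scrolled_lookup(48, 255, 1))  # (6, 0, 0): wrapped back to the top of the tilemap
```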
Now, having a vertical scroll value for the whole tilemap is pretty nifty, but on Galaxian they went a step further, and allowed you to specify a different vertical scroll value for each group of 8 pixels across. The reason this works well is clear if you look once again at what the video hardware is doing. It has to look up a new tilemap index and new tile graphics for each 8-pixel group as it scans horizontally across a scanline. Since it has to do these computations each time anyway, it doesn't add much complexity to pick a different vertical scroll value for each group, and it gives you a lot of flexibility. This is called "column scroll" because each column of the tilemap can have its own independent scroll value.
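Column scroll can be sketched the same way: instead of one scroll value, there is one per tilemap column. In this Python sketch the name `scanline_lookups` and the plain 32-entry list of scroll values are assumptions for illustration; where the real hardware keeps those values is not covered here.

```python
TILE_SIZE = 8        # tiles are 8x8 pixels
TILES_PER_ROW = 32   # 32 columns across a 256-pixel scanline

def scanline_lookups(y, column_scroll):
    """For scanline y, return one (column, tilemap row, tile graphics row) per column.

    column_scroll holds 32 vertical scroll values, one per 8-pixel column.
    """
    lookups = []
    for col in range(TILES_PER_ROW):
        eff_y = (y + column_scroll[col]) & 0xFF   # per-column scroll, with wrap
        lookups.append((col, eff_y // TILE_SIZE, eff_y % TILE_SIZE))
    return lookups

# Scroll only column 6 by one pixel and leave the others alone.
scroll = [0] * TILES_PER_ROW
scroll[6] = 1
row = scanline_lookups(17, scroll)
print(row[5], row[6])  # (5, 2, 1) (6, 2, 2)
```

Each column pays for the same two lookups it needed anyway; the only extra work is choosing which scroll value to add.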
However, even column scroll on a tilemap doesn't really provide enough flexibility for a complete game. So, in part 3 of this article, I'll dive into the sprites.