NES Nametable Compression

I was working on developing some NES games. One of the things I needed was the compression of “nametables”, especially for static title screens that have many unique tiles. To that end I researched how efficient several compression formats were.

Gathering of test data

a NESdev forum user by the name of CartCollector collected a set of tile data called the NESdev Corpus so that others can test out the size and speed efficiency of various tile compression methods. A couple years after that, Tepples posts the quite efficient PB53 codec. So I too had gathered a set of data to test for nametable compression. The set consists of 64 title screens usually with relatively large graphics. I also chose screens constructed not only with background tiles but sprites as well. In the future I'll have to look into how to efficiently store sprite positioning.

Previous nametable compression formats

For NES nametables, PackBits is a commonly chosen algorithm. It's a good choice that performs well because nametables often contains many runs, sometimes even large ones spanning multiple rows of 32 bytes.

If you were to look at the nametable data from NES picture drawing applications and generators, you might notice that there's multiple sequences that count upward by 1. This happens because the generator simply assigns the next available index for each unique tile. Several new schemes were developed that added a command suited for those sequences. File #08 is a good example of a picture that has these types of incrementing runs.

But even with that technique we have hard to encode pictures with long repeating patterns like with file #09. Joel Yliluoma, while working on his translation project, developed a compression scheme that handles repeating runs of 2 bytes well. It was then adapted for Bregalad's Compress Tools project.

Another nametable compression format

I could try out LZSS based algorithms, but the NES is a memory constrained system, and I'm looking for simple formats. Efficiency isn't only about the compressed data size, it's also about the size and speed of decompression code. So considering all of this I decided to design a nametable compression format of my own. I gave this format a uncreative code name of "NTD" short for “Nametable format 'D'”.

000nnnnn ...
32-N literal bytes.
End stream.
00111110 xx yy
Set PPU_ADDR to yyxx *
001nnnnn xx yy
For 32-N times, emit alternately X and Y.
toggle BG transperency. (initially opaque) *
010nnnnn xx
32-N run of X.
01111111 xx
Set BG to X. (initially BG = 0)
011nnnnn xx
32-N incrementing run starting at X.
128-N Run of BG.

*: Not present in lite variant.

Nametable format 'D'

In NTD I assigned most of the code points to runs of a background index, that way I retain the full benefits of PackBits but won't waste bytes by repeating the same index. Next I evenly split the remaining code points to the 4 different run types. The reason for negated numbers is so that 0 can never be a valid number. Also in assembly the run length can be retrieved with a simple ora #$e0, as long as the count is incremented to 0. Finally I got creative with re-purposing the codes points that would have had the same effect as other codes point. The assembled size of the lite decoder variant comes out to 110 bytes.

Some statistics

To get some statistics from the various compression formats, I wrote encoders for each in python. 2 other compression formats (pb53 and LZMPi) are included as well. I wanted to also see how much of a effect sorting the tiles according to appearance in nametables would have. For LZMPi I used the LZMPi encoder that Joel Yliluoma programed.

unsorted /w attr65536310463103129602252072484324632217951999118793
sorted /w attr65536300562998428589237662342722588209721904718256
Totals for all 64 nametables.

looking the “unsorted with attributes” row, the new NTD compression saves on average about 69% or 710 bytes per nametable. That's compared to PackBits which saves about 55% or 562 bytes per nametable. PB53 is not really meant for nametable data, but the NESdev wiki page says “it works fairly well on real tile data and OK on nametable data.”

I anticipate finding additional ways to reduce space, especially for data files #03 and #41. The result would likely be a very different format with things being packed into bits like in LZMPi.