In this post I’m going to describe two ways to find the pixel address of a screen location. Both solutions address the same problem: given a pixel Y address (0..191) in register B and a character X address (0..31) in register C, calculate the screen address that represents those coordinates and return it in HL. It’s assumed the subroutine will trash all registers.
The two approaches to solving this problem are calculating the address programmatically and using a lookup table. Once we’ve covered the implementations, we’ll talk about the relative performance and the trade-offs in storage, time and complexity.
Calculating a screen address
The Spectrum’s screen memory starts at #4000, so the most significant three bits of our address will always be 010. The 5 least significant bits will always be the X (column) address. The 8 bits in between (bits 5 to 12) represent the pixel Y, but not in the way you might imagine.
Reading the address from most to least significant bit, the layout is 0 1 0 Y7 Y6 Y2 Y1 Y0 in the upper byte and Y5 Y4 Y3 X4 X3 X2 X1 X0 in the lower byte. The lowest three bits of the Y address (Y0, Y1 and Y2) have been picked up and dropped into the middle of the other Y bits. This is part of the reason why the Spectrum screen address calculation is a strange beast.
However, putting the lowest three bits of the screen Y coordinate into the lowest three bits of the upper byte of the address is why adding #100 to the address moves down one pixel row within a character cell.
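In modern terms, the bit layout can be sketched as a small Python function. This is only a reference model for checking results, not Spectrum code, and the name `screen_address` is made up for illustration.

```python
def screen_address(y, x):
    """Display address for pixel row y (0..191) and character column x (0..31)."""
    high = 0x40 | ((y & 0xC0) >> 3) | (y & 0x07)   # 0 1 0 Y7 Y6 Y2 Y1 Y0
    low = ((y & 0x38) << 2) | (x & 0x1F)           # Y5 Y4 Y3 X4 X3 X2 X1 X0
    return (high << 8) | low

print(hex(screen_address(0, 0)))   # 0x4000
print(hex(screen_address(1, 0)))   # 0x4100, one pixel row down is +#100
print(hex(screen_address(8, 0)))   # 0x4020, next character row is +#20
```

Note that the +#100 step only holds inside a character cell. Going from pixel row 7 to pixel row 8 crosses into the next cell, which is why the third result is #4020 rather than #4800.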
The subroutine to calculate the address from the coordinates as set out above is:
|Instruction||T-States||Bytes||Comment|
|ld a,b||4||1||; work on the upper byte of the address|
|and %00000111||7||2||; a = Y2 Y1 Y0|
|or %01000000||7||2||; first three bits are always 010|
|ld h,a||4||1||; store in h|
|ld a,b||4||1||; get bits Y7, Y6|
|rra||4||1||; move them into place: three rotates take bits 7-6 to bits 4-3|
|rra||4||1|
|rra||4||1|
|and %00011000||7||2||; mask off|
|or h||4||1||; a = 0 1 0 Y7 Y6 Y2 Y1 Y0|
|ld h,a||4||1||; calculation of h is now complete|
|ld a,b||4||1||; get y|
|rla||4||1||; two rotates take Y5 Y4 Y3 from bits 5-3 to bits 7-5|
|rla||4||1|
|and %11100000||7||2||; a = Y5 Y4 Y3 0 0 0 0 0|
|ld l,a||4||1||; store in l|
|ld a,c||4||1||; get x|
|and %00011111||7||2||; a = X4 X3 X2 X1 X0|
|or l||4||1||; a = Y5 Y4 Y3 X4 X3 X2 X1 X0|
|ld l,a||4||1||; calculation of l is complete|
|ret||10||1||; return with the address in hl|
For a total of 105 T-States in 26 bytes of memory.
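If you’d like to convince yourself that the rotates and masks land the bits in the right place, here is a rough Python model of the routine. It assumes Y7 Y6 are rotated right three places, Y5 Y4 Y3 are rotated left two places, and X is fetched from C before the final mask. The helper names `rra`, `rla` and `calc_address` are illustrative only.

```python
def rra(a, carry):
    """Rotate an 8-bit value right through carry, like the Z80 RRA instruction."""
    return ((carry << 7) | (a >> 1)) & 0xFF, a & 1

def rla(a, carry):
    """Rotate an 8-bit value left through carry, like the Z80 RLA instruction."""
    return ((a << 1) | carry) & 0xFF, (a >> 7) & 1

def calc_address(b, c):
    """Model of the routine: b = pixel Y (0..191), c = character X (0..31)."""
    h = (b & 0b00000111) | 0b01000000        # 0 1 0 0 0 Y2 Y1 Y0
    a, carry = b, 0                          # the OR above leaves carry clear
    for _ in range(3):                       # Y7 Y6: bits 7-6 -> bits 4-3
        a, carry = rra(a, carry)
    h |= a & 0b00011000                      # 0 1 0 Y7 Y6 Y2 Y1 Y0
    a, carry = b, 0                          # AND/OR clear carry again
    for _ in range(2):                       # Y5 Y4 Y3: bits 5-3 -> bits 7-5
        a, carry = rla(a, carry)
    l = (a & 0b11100000) | (c & 0b00011111)  # Y5 Y4 Y3 X4 X3 X2 X1 X0
    return (h << 8) | l

print(hex(calc_address(191, 31)))  # 0x57ff
```

The rotates go through the carry flag, so stray bits do get pulled into the accumulator, but each mask throws them away before they can reach H or L.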
Looking up a screen address
Instead of calculating the screen address every time we need it, a better alternative may be pre-calculating the results and placing them in a lookup table.
In current programming terms, we store the address of the first pixel in each screen row in an array (let’s call it screen_map). We then calculate the address as screen_map[y*2] + x. The multiplier of 2 is there because the table is indexed in bytes and each address is a two-byte word.
I remember writing a BASIC program to print the hex addresses for the first pixel of each screen row on the Sinclair printer, then spinning up my assembler (from tape) and entering the values by hand.
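These days the table is easier to generate than to type. Here is a minimal Python sketch that builds the 192 row addresses and prints them as `.defw` lines. The eight-words-per-line layout and the names `build_screen_map` and `lookup_address` are my own assumptions, not the contents of the original gist.

```python
def build_screen_map():
    """Address of the first byte of each of the 192 pixel rows."""
    table = []
    for y in range(192):
        high = 0x40 | ((y & 0xC0) >> 3) | (y & 0x07)   # 0 1 0 Y7 Y6 Y2 Y1 Y0
        low = (y & 0x38) << 2                          # Y5 Y4 Y3 0 0 0 0 0
        table.append((high << 8) | low)
    return table

screen_map = build_screen_map()

def lookup_address(y, x):
    # A Python list holds whole words, so the index is y rather than y * 2
    return screen_map[y] + x

# Emit the table as .defw lines, eight words per line
for i in range(0, 192, 8):
    words = ", ".join(f"#{addr:04X}" for addr in screen_map[i:i + 8])
    print(f"    .defw {words}")
```

The first line of output is `.defw #4000, #4100, #4200, #4300, #4400, #4500, #4600, #4700`, which matches the start of the table used in the listing.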
The code to perform our address translation (remember B is the Y coordinate and C the character X coordinate) is:
|Instruction||T-States||Bytes||Comment|
|ld h, 0||7||2|
|ld l, b||4||1||; hl = Y|
|add hl, hl||11||1||; hl = Y * 2|
|ld de, screen_map||10||3||; de = screen_map|
|add hl, de||11||1||; hl = screen_map + (Y * 2)|
|ld a, (hl)||7||1||; implements ld hl, (hl)|
|inc hl||6||1||; step to the high byte of the table entry|
|ld h, (hl)||7||1|
|ld l, a||4||1||; hl = address of first pixel from screen_map|
|ld d, 0||7||2|
|ld e, c||4||1||; de = X|
|add hl, de||11||1||; add the char X offset|
|ret||10||1||; return screen_map[Y*2] + X|
|screen_map: .defw #4000, #4100, #4200, #4300, #4400, #4500, #4600, #4700, #4020, #4120, #4220, #4320, ... ; 192 words in total, one per pixel row|
That’s 99 T-States and 401 bytes of memory (17 bytes of code and 384 bytes for the lookup table).
I’m not a mean spirited guy. If you want to play along at home here’s a link to a gist that contains the code and more importantly the lookup table!
Space time trade off
So which one of these approaches is the best? The answer is, as usual, it depends. Let’s compare the results:
|Approach||Lines of code||T-States||Total memory|
|Calculated||21||105||26 bytes|
|Lookup table||13||99||401 bytes|
The calculated approach is slower (about 6%) and more complex (a 61% longer method), but really efficient in memory (roughly 15 times less memory!). So for space-constrained applications like ROMs, calculation is the approach to take.
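For anyone who wants to check the arithmetic, here is a quick sketch that uses only the T-state and byte counts quoted earlier in the post. The per-pass saving at the end is an assumption on my part: it supposes one call per screen byte (192 rows by 32 columns), which may not be how the race below was coded.

```python
calc_tstates, calc_bytes = 105, 26
lookup_tstates, lookup_bytes = 99, 17 + 384   # code plus lookup table

slowdown = (calc_tstates / lookup_tstates - 1) * 100
memory_ratio = lookup_bytes / calc_bytes

print(f"calculated is {slowdown:.1f}% slower")         # 6.1% slower
print(f"lookup uses {memory_ratio:.1f}x the memory")   # 15.4x the memory

# Assumed workload: one address lookup per screen byte
calls_per_pass = 192 * 32
saved_per_pass = (calc_tstates - lookup_tstates) * calls_per_pass
print(f"{saved_per_pass} T-states saved per full-screen pass")  # 36864
```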
However, for games, well, speed is king. Even small margins make a difference, especially in crucial areas like screen rendering. So it’d be a rare Spectrum game that didn’t use techniques like this.
As a simple example of the difference in timing between these two approaches, I coded a race. On the left, the contender is calculated addresses; on the right, lookup tables. Each function is tested by filling the screen with pixels many times, alternating the border colour after each iteration.
You can see that by the end of the test, the lookup table function was in the lead by about a second. Winner, winner, chicken dinner :-)