“halt-synchronization” is a Game Boy (and Game Boy Color) assembly programming trick. I have seen it referenced in many places, but rarely explained thoroughly. So here it is.
Let’s say your game needs to copy some data to the Game Boy Video-RAM (maybe some tiles, or background maps, palettes, OAM data, whatever). But the Video-RAM is locked during rendering, and unlocks only at specific times.
So the code needs to wait for the Video-RAM to be unlocked. A simple approach is to loop until the relevant hardware register signals that the VRAM is accessible:
.wait
ld a, [rSTAT] ; read the LCD status register
and a, STATF_BUSY ; check if the "VRAM busy" bit is set
jp nz, .wait ; if set, jump to .wait and try again
; otherwise the VRAM is unlocked, and we can now copy the data
This is a busy-loop: it waits until the hardware is in the correct state. This is not ideal, as it uses CPU time and energy, and prevents us from executing other code in the meantime. But this is not the main issue.
The problem is that timing issues may cause the code to miss a few cycles. For instance, what if the console unlocks VRAM access during the bit comparison:
.wait
ld a, [rSTAT] ; Read the LCD status register; it returns "busy".
and a, STATF_BUSY ; Here `rSTAT` switches to "not busy" –
; but we're still comparing with the previous value.
jp nz, .wait ; We think VRAM is still locked, so jump to .wait and try again.
; We'll start copying data only on the next loop :(
Due to the timing at which VRAM was unlocked, the code may have to loop again, and miss a few cycles. It might not be an issue – but in cases where we need really tight timing (for instance when racing the beam), these cycles can be crucial to copy all the data we need.
In particular, when copying data between scanlines, we only have a short 51-cycle interval before VRAM becomes locked again. Those cycles are precious, and we can’t afford to miss them.
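To put rough numbers on this, here is the same kind of wait loop annotated with cycle counts. This is a sketch, not a measurement. It swaps in the shorter ldh and jr forms (an assumption, the loop above uses ld and jp), counts machine cycles, and assumes the register is sampled on the last cycle of the ldh read.

```asm
; Sketch only: same logic as the loop above, written with `ldh` and `jr`.
; Cycle counts are machine cycles (M-cycles).
.wait:
    ldh a, [rSTAT]      ; 3 cycles (assumed: the register is read on the last one)
    and a, STATF_BUSY   ; 2 cycles
    jr nz, .wait        ; 3 cycles if taken, 2 if not taken
    ; Best case:  VRAM unlocks just before the read.
    ;             We get here after `and` + `jr` not taken = 2 + 2 = 4 cycles.
    ; Worst case: VRAM unlocks just after a read that returned "busy".
    ;             `and` + `jr` taken + `ldh` + `and` + `jr` not taken
    ;             = 2 + 3 + 3 + 2 + 2 = 12 cycles.
```

So, depending on where in the loop the unlock happens, somewhere between roughly 4 and 12 cycles of the window are gone before the copy even starts, and the exact number changes from one scanline to the next.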
This problem is well known, so much so that there is an entire gbdev.io article on the timing of LYC STAT handlers, with a complete description of the issue and several ways to work around it, depending on the use case.
But here is another solution, also well-known, but not mentioned in the article above: “halt-synchronization”.
halt-synchronization
The Game Boy has a halt instruction, which pauses the CPU until a hardware interrupt is triggered.
And hardware interrupts are configurable: we can disable some of them, so that they don’t fire, and don’t affect the halt instruction.
So the trick is:
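1. Configure the interrupts so that only the one signaling that VRAM is unlocked can fire.
2. Disable the interrupt handlers.
3. Execute the halt instruction.
The CPU will then pause, and resume only when the next interrupt fires – which will signal that the VRAM is now unlocked.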
In code, it looks like this:
; Configure the STAT interrupt to fire when the VRAM is unlocked
ld a, STATF_MODE00
ldh [rSTAT], a
; Enable the STAT interrupt (and disable all others)
ld a, IEF_STAT
ldh [rIE], a
; Disable the interrupt handlers (so that our code is executed right after `halt`)
di
; Now pause the CPU until the STAT Mode 0 interrupt is triggered
halt
; Code execution resumes, as soon as VRAM is unlocked, without any delays.
; We can now copy the data.
; Cleanup once the data is copied.
; 1. Manually mark the STAT Mode 0 interrupt as serviced
; (as the interrupts handlers were disabled, it wasn't done automatically by the CPU)
ld hl, rIF
res IEB_STAT, [hl]
; 2. Re-enable interrupt handlers
ei
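The sequence above overwrites rIE without restoring it, and assumes no STAT request is already pending when halt executes. A slightly more defensive variant might look like the sketch below. It assumes the usual hardware.inc names, that the copy code leaves the stack balanced, and that nothing else needs the previous rSTAT value (it is still overwritten here).

```asm
    ; Sketch only, under the assumptions stated above.
    di                      ; handlers off first, so nothing runs while we reconfigure

    ldh a, [rIE]
    push af                 ; remember which interrupts were enabled

    ld a, STATF_MODE00      ; STAT interrupt source: Mode 0
    ldh [rSTAT], a
    ld a, IEF_STAT          ; only the STAT interrupt can end the halt
    ldh [rIE], a

    ld hl, rIF
    res IEB_STAT, [hl]      ; drop any stale STAT request left over from earlier

    halt                    ; sleep until the next Mode 0 STAT request
    nop                     ; padding, in case the halt bug triggers anyway

    ; ... copy the data here ...

    ld hl, rIF
    res IEB_STAT, [hl]      ; mark the STAT request as serviced
    pop af
    ldh [rIE], a            ; restore the previous interrupt selection
    ei                      ; re-enable interrupt handlers
```

Clearing the stale request before halt matters because, with handlers disabled, an already-pending enabled interrupt makes halt return immediately instead of waiting for the next unlock.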
With this technique, the copy routine can execute right when the VRAM becomes available, and the code will always use all available cycles.
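As an illustration of what the copy step could look like, here is a minimal unrolled copy that fits in the 51-cycle window mentioned earlier. The buffer name wCopyBuffer and the destination address are hypothetical. The sketch also assumes the source buffer does not cross a 256-byte boundary, and that the interrupt setup shown above has already been done.

```asm
    ; Sketch only. Set up the pointers *before* halting,
    ; so that no cycle of the window is spent on it.
    ld hl, $9800            ; destination in VRAM (example address: first tilemap)
    ld de, wCopyBuffer      ; hypothetical source buffer in WRAM

    halt                    ; wake up when VRAM unlocks
    nop                     ; padding for the halt bug

    REPT 8                  ; unrolled: no loop counter, no jump
        ld a, [de]          ; 2 cycles: read one byte from the buffer
        ld [hli], a         ; 2 cycles: write it to VRAM, advance hl
        inc e               ; 1 cycle: advance the source (low byte only)
    ENDR
    ; 8 bytes * 5 cycles = 40 cycles in total.
```

The actual length of the window depends on what the PPU had to draw on that scanline (sprites and scrolling shorten it), so the number of bytes per line is something to tune and test rather than take from this sketch.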
Of course there are several caveats and things to be taken care of:
- […] ei or reti instructions).
- If an interrupt is already pending when the halt instruction executes, the infamous Game Boy halt bug may cause the next instruction to be executed twice. In that case, following the halt with a nop instruction will fix the issue (at the expense of 1 cycle).
- If you rely on other interrupts firing while halting (like a timer), you will need a strategy to catch up or ignore the missed interrupt.
halt-synchronization is a clever optimization technique when a handful of cycles can make the difference. It is also a powerful way to shoot yourself in the foot, and can interact in tortuous ways with the rest of your game code.
Its uses should be limited to times when the VRAM unlock window is very small (like the Hblank period), and probably to technical demos.
halt-synchronization has been used in countless demos – but I specifically had to research it while making the pico8-boot demo: https://github.com/kemenaran/pico8-boot