With some recent interest in undocumented opcodes and microcode on modern Intel chips, I decided to do a proper writeup on what I found out while researching and playing around with the venerable 80286 ("Beige Unlock"?).
The documentation for the 286's LOADALL instruction - which Intel only made available under NDA back in the day - briefly mentions how it is used during automated testing of every produced chip (and is thus guaranteed to work). But its other purpose was kept secret: to support In-Circuit Emulation (ICE).
An ICE is a very expensive device that plugs into the CPU socket and "emulates" the chip while providing debugging functionality. This is not at all like the kind of software emulation familiar today, or even using a modern microcontroller to emulate 30+ year old hardware: it needs to run at the same speed and interact with external hardware in exactly the same way as the chip it replaces, using technology available at the time when the 286 was still in production.
Not-so-shockingly, the way they did it was to use an actual 286 chip to "emulate" itself, with some extra pins to allow the debugging hardware to monitor it and take control. This debug interface uses the 5 pins left unused on the 286 package. The only public description of these comes from a patent.
Putting the pieces together:
ICE20 (output, pin 55/56)
Output the length in bytes of the current instruction, with each half of the bits multiplexed on even/odd cycles (every instruction takes at least two to execute). This allows the external breakpoint logic to keep track of the instruction queue.
ICEBP# (input, pin 58)
Signals a breakpoint. The CPU will save its state and enter a mode in which debugging code can run from its own isolated address space. The LOADALL instruction will restore the saved state and exit from ICE mode.
ICES0# (output, pin 2/3)
These are the control lines for ICE bus cycles, with the normal S1/S0 idle. The address and data lines used are the same, but the ICE hardware will disconnect them from the main bus.
On most 286 chips you can find, these pins are not bonded to the pads on the die. For some reason, Intel was very concerned about not making this functionality available for anyone else to use. In a 1984 amendment to its second-source agreement with AMD, Intel required AMD not to make it available either!
Then ICE mode evolved into SMM, and AMD got sued when they produced their own 486 chips which exposed this now documented feature.
Opcode 0F 04
LOADALL became one of the most well-known "secret" opcodes. But there was another one right next to it that remained mysterious. In some old textfiles, you can find the claim that 0F04 is "likely to be an alias for LOADALL" (0F05), and also that F1 is either an alias for LOCK, or the ICEBP instruction as on later chips.
This theory was based on the observation of how the 8086 handled invalid opcodes - it would interpret them as some other instruction, because Intel made some bits in the instruction decoder "don't care"s; this might have been the easiest way to make the CPU behave predictably if it encountered one of these opcodes, and there wasn't enough space in microcode ROM to cause an exception instead. And F1 on that chip was an alias for the LOCK prefix (F0).
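To make the "don't care" idea concrete, here is a toy sketch in Python. The table and the function name are invented for illustration, and only the F0/F1 pair comes from the text above; the real decoder is hardware logic, not a lookup table.

```python
# Toy model of opcode decoding with "don't care" bits (illustrative only).
# Each entry is (mask, value, name): an opcode matches when the bits
# selected by mask equal value. Bits cleared in mask are ignored.
PATTERNS = [
    (0xFE, 0xF0, "LOCK"),  # bit 0 is ignored, so F0 and F1 both match
    (0xFF, 0x90, "NOP"),   # contrast entry for this toy table: all bits checked
]

def decode(opcode):
    """Return the name of the first matching pattern, or None."""
    for mask, value, name in PATTERNS:
        if opcode & mask == value:
            return name
    return None

print(decode(0xF0))  # LOCK
print(decode(0xF1))  # LOCK (alias through the ignored bit)
print(decode(0xF2))  # None: not covered by this toy table
```

A decoder built this way never needs an "invalid opcode" path: every byte lands on some pattern as long as the masks are loose enough, which is the behaviour described for the 8086.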
However, next came the 80186 (rarely found in PCs), which already had an invalid opcode exception, and did trigger it on opcode F1! So, if the 286 doesn't do that, it must have become a valid-but-undocumented opcode. And it doesn't cause interrupt 1 either, so not ICEBP?
And 0F04 definitely does something different than LOADALL. There are plenty of working examples (including production code) using the LOADALL instruction, and substituting opcode 0F04 in them will always have the same effect: it causes the processor to lock up until it is reset.
This may look like a case similar to the Pentium F00F bug, but 0F04 is a privileged instruction that can only be executed in ring 0 or real mode. And being right next to the also undocumented LOADALL, it might be related. Perhaps it does the opposite and saves the entire CPU state, except for some reason it doesn't work?
Why it hangs
Now remember that LOADALL not only restores the processor's state, but also exits from ICE mode. So if its counterpart instruction were to enter ICE mode, what would happen? After completing the instruction, the CPU would start to fetch code from the separate ICE address space, using the two extra bus control pins that aren't connected to anything (and likely can't be since they aren't exposed). It would simply wait forever for a response.
Regaining control is possible thanks to the keyboard controller, which has a command to pulse the CPU reset line. It is also notoriously slow, which works in our favour here: after sending the reset command, the CPU still has time to do something before it gets reset.
I was quite hopeful when I first tried this out... but nothing got written to memory.
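As a rough idea of what such a test looks like at the byte level, here is a sketch in Python that hand-assembles a minimal stub. This is not the original test code: the function name is hypothetical, and the only hardware facts assumed are the standard AT keyboard controller command port (64h) and its "pulse reset line" command (FEh).

```python
# Hypothetical test stub, hand-assembled as raw bytes (not the original code).
KBC_COMMAND_PORT = 0x64  # keyboard controller command port on AT-class machines
KBC_PULSE_RESET = 0xFE   # controller command that pulses the CPU reset line

def build_stub():
    """Queue a delayed reset, then execute the opcode under test."""
    code = bytearray()
    code += bytes([0xB0, KBC_PULSE_RESET])   # mov al, 0FEh
    code += bytes([0xE6, KBC_COMMAND_PORT])  # out 64h, al
    code += bytes([0x0F, 0x04])              # undocumented opcode under test
    return bytes(code)

print(build_stub().hex(" "))  # b0 fe e6 64 0f 04
```

A real test also has to arrange for the machine to hand control back to the test program after the reset instead of running a full POST, and has to compare memory before and after; both are left out here.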
Thinking more about it, it makes sense that the CPU state would also be saved to ICE memory. LOADALL can of course be used outside of ICE mode, and will then load the state from the normal address space, but if it always did that, it would defeat the purpose of ICE mode being isolated from the code that is being debugged.
So, if that opcode does in fact enter ICE mode, it would do so before saving the state, and would thus be unusable on a normal chip. Sad!
By your powers combined...
But what about the other undocumented opcode F1? It doesn't seem to do anything, but acts like a prefix to whatever instruction follows it. So I got an idea: could it have an effect on 0F04?
Well, it still hangs - but it does indeed dump the CPU state to memory first, including internal registers in the 10 "unused"
I first posted about this on the VCFed.org forums in 2019, and someone there pointed out that there is one Intel document - about emulating the 286 LOADALL opcode using the different one provided on the 386 - that mentions the existence of a "STOREALL" instruction, but not its opcode. So apparently that is its official name.
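For looking at such a state dump, a small parser helps. The sketch below is in Python and assumes the dump uses the same 102-byte layout as the commonly published 286 LOADALL table at physical address 800h; that STOREALL writes exactly this layout is an assumption here, and all names in the code are made up for illustration. The published table has ten words marked as unused, which the parser returns separately.

```python
import struct

LOADALL_SIZE = 0x66  # 102 bytes; LOADALL reads them from physical 0x800

# 16-bit fields: (offset within the table, name)
WORD_FIELDS = [
    (0x06, "MSW"), (0x16, "TR"), (0x18, "FLAGS"), (0x1A, "IP"),
    (0x1C, "LDTR"), (0x1E, "DS"), (0x20, "SS"), (0x22, "CS"), (0x24, "ES"),
    (0x26, "DI"), (0x28, "SI"), (0x2A, "BP"), (0x2C, "SP"),
    (0x2E, "BX"), (0x30, "DX"), (0x32, "CX"), (0x34, "AX"),
]
# 6-byte descriptor caches: 24-bit base, access byte, 16-bit limit
DESC_FIELDS = [
    (0x36, "ES"), (0x3C, "CS"), (0x42, "SS"), (0x48, "DS"),
    (0x4E, "GDT"), (0x54, "LDT"), (0x5A, "IDT"), (0x60, "TSS"),
]
# Words documented as unused: 3 at the start, 7 after MSW
UNUSED_OFFSETS = [0x00, 0x02, 0x04] + list(range(0x08, 0x16, 2))

def parse_dump(buf):
    """Split a 102-byte table into registers, descriptor caches, unused words."""
    if len(buf) != LOADALL_SIZE:
        raise ValueError("expected exactly 102 bytes")
    words = {name: struct.unpack_from("<H", buf, off)[0]
             for off, name in WORD_FIELDS}
    descs = {}
    for off, name in DESC_FIELDS:
        base_lo, base_hi, access, limit = struct.unpack_from("<HBBH", buf, off)
        descs[name] = {"base": base_lo | (base_hi << 16),
                       "access": access, "limit": limit}
    unused = [struct.unpack_from("<H", buf, off)[0] for off in UNUSED_OFFSETS]
    return words, descs, unused

# Usage with made-up sample values:
dump = bytearray(LOADALL_SIZE)
struct.pack_into("<H", dump, 0x22, 0xF000)                         # CS selector
struct.pack_into("<HBBH", dump, 0x3C, 0x0000, 0xFF, 0x9B, 0xFFFF)  # CS cache
words, descs, unused = parse_dump(bytes(dump))
print(hex(words["CS"]), hex(descs["CS"]["base"]), len(unused))
```

Keeping the unused words as a plain list makes it easy to diff them between runs and see which ones change.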
There is another document which describes F1 as a no-op prefix, but does not give it a name (it also brazenly lies about opcode D6 doing nothing).
What F1 appears to be is the equivalent of UMOV on later processors, a way to access user memory from ICE mode. Maybe call it
Outside of ICE mode, it does nothing - with the single exception of STOREALL, and that is likely an unintended effect that Intel wasn't even aware of.
Sometimes when running my test code, it didn't work and froze up so badly that the CPU didn't respond to reset anymore. On the machine I first tried it on, this happened randomly, and sometimes it also wrote to both memory and I/O space, turning on the speaker or changing the timer tick frequency.
On another machine I had available, it always locked up when the prefix was present (with just 0F04 the reset did work, but of course nothing would be written to memory).
I finally got it to work reliably on both by disabling DRAM refresh during the critical time. My guess is that since the ICE hardware would normally disconnect the CPU from the main bus, the CPU doesn't bother to release the bus itself if anything else requests access while it is saving the state. Maybe I was lucky it didn't literally Halt And Catch Fire.
How were these opcodes intended to be used?
I couldn't find any code for Intel's ICE product, but very recently I stumbled on a reference to the HP 64000 series in an old AMD 286 manual, and found a CD image on bitsavers.org containing firmware for the different chips it can be used with. The files are in a custom binary format with records somewhat similar to Intel Hex. To make it more confusing, the monitor code which would be run in ICE mode is embedded inside of the host firmware (non-x86 from the looks of it) using a slightly different format.
That took a few hours to make sense of :)
Disassembling the code confirmed that ICE mode is entered at F000:FFF0, the same as after a normal CPU reset. The F1 prefix, as expected, is used to access user memory. STOREALL, on the other hand, is used for a surprising purpose: as part of the sequence to exit from ICE mode! The reason is that it is the only instruction capable of exiting protected mode, acting very much like a reset.
Why would the monitor code need to exit from protected mode? Well, it has to be able to inspect and modify user memory anywhere, not just in the first megabyte (-ish) accessible from real mode. Since it can't use LOADALL to load arbitrary segment bases (as that instruction also exits ICE mode), it has to run in protected mode.
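The "(-ish)" comes from how real-mode addresses are formed: segment times 16 plus offset. A quick worked example in Python (the function name is just for illustration, and it assumes address line 20 is not masked off by the surrounding hardware):

```python
def real_mode_address(segment, offset):
    """Physical address formed in real mode: segment * 16 + offset."""
    return (segment << 4) + offset

top = real_mode_address(0xFFFF, 0xFFFF)
print(hex(top))             # 0x10ffef, the highest address reachable
print(top - 0x100000 + 1)   # 65520 bytes reachable above the 1 MiB mark
print(hex((1 << 24) - 1))   # 0xffffff, top of the 286's 24-bit address space
```

So real mode reaches a little under 64 KiB past the first megabyte, far short of the full 16 MiB a debugger needs to see.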
So, the "return to user code" subroutine first checks if the user code was also in protected mode. In that case it can return using LOADALL immediately. Otherwise, it executes STOREALL in order to re-enter the monitor in real mode, setting a flag that tells the entry code to do the return instead of entering protected mode.
The state saving of STOREALL is not at all desired for this, and makes it necessary to copy the actual user state to another buffer first. "Braindead chip" indeed.
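The exit sequence described above can be modelled as a tiny state machine. This Python sketch only mirrors the control flow as described in the text; the class, flag, and method names are invented, and it is not a transcription of the HP monitor code.

```python
class MonitorModel:
    """Toy model of the monitor's exit path (all names hypothetical)."""

    def __init__(self):
        self.return_pending = False   # flag checked by the entry code
        self.saved_user_state = None  # copy made before STOREALL clobbers the dump
        self.trace = []               # record of the steps taken

    def entry(self):
        # Entry point, reached in real mode after a breakpoint or STOREALL.
        if self.return_pending:
            self.return_pending = False
            self.trace.append("LOADALL from real mode")
        else:
            self.trace.append("enter protected mode and run monitor")

    def return_to_user(self, user_state):
        if user_state["protected_mode"]:
            # User code was in protected mode too: return directly.
            self.trace.append("LOADALL from protected mode")
            return
        # STOREALL overwrites the state area, so keep a copy first.
        self.saved_user_state = dict(user_state)
        self.return_pending = True
        self.trace.append("STOREALL")
        self.entry()  # STOREALL re-enters the monitor in real mode

m = MonitorModel()
m.return_to_user({"protected_mode": False})
print(m.trace)  # ['STOREALL', 'LOADALL from real mode']
```

The extra copy in `saved_user_state` is the cost mentioned above: STOREALL is used only for its mode-exit side effect, and its state dump gets in the way.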