I’ve been playing around with running Windows on top of a VM and it’s got some strange-looking opcodes I can’t figure out.
Windows 98 has opcodes that have redundant prefixes. For those of you that don’t muck around in assembly, x86 has prefixes to do things like access the 16-bit version of an instruction instead of the 32-bit version of the instruction. It’s legal to tack on the same prefix multiple times (and every processor I’ve tried even does the same thing if you add contradictory prefixes). I doubt any Win98 era compiler generated redundant prefixes, so those prefixes must have been added on purpose. But why? It’s clearly not for alignment reasons in Windows, and it even mucks up the alignment in a few places. Even if alignment were an issue, adding redundant prefixes would have been a bad way to go about it at the time: decoding extra prefixes was expensive prior to Athlon and Banias on AMD and Intel processors, respectively.
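For anyone who wants to look for the same thing, here is a minimal sketch of how the leading prefix bytes of an instruction can be peeled off and checked for repeats. The prefix byte values are the standard x86 legacy prefixes; the function names (`split_prefixes`, `redundant_prefixes`) are made up for illustration, and the code assumes you already know where an instruction starts, which in practice needs a real disassembler.

```python
# Standard x86 legacy prefix bytes, mapped to short labels.
LEGACY_PREFIXES = {
    0xF0: "lock", 0xF2: "repne", 0xF3: "rep",
    0x2E: "cs", 0x36: "ss", 0x3E: "ds", 0x26: "es", 0x64: "fs", 0x65: "gs",
    0x66: "opsize",    # operand-size override (16-bit vs 32-bit operand)
    0x67: "addrsize",  # address-size override
}


def split_prefixes(code):
    """Split the bytes of one instruction into (prefix labels, remaining bytes).

    Hypothetical helper: assumes code[0] is the first byte of an instruction.
    """
    i = 0
    names = []
    while i < len(code) and code[i] in LEGACY_PREFIXES:
        names.append(LEGACY_PREFIXES[code[i]])
        i += 1
    return names, bytes(code[i:])


def redundant_prefixes(code):
    """Return the labels of prefixes that appear more than once, in order."""
    names, _ = split_prefixes(code)
    seen = set()
    repeated = []
    for name in names:
        if name in seen and name not in repeated:
            repeated.append(name)
        seen.add(name)
    return repeated


# In 32-bit code, 66 B8 34 12 is "mov ax, 0x1234": one prefix, nothing redundant.
print(redundant_prefixes(bytes([0x66, 0xB8, 0x34, 0x12])))
# 66 66 90 is a NOP with the operand-size prefix tacked on twice.
print(redundant_prefixes(bytes([0x66, 0x66, 0x90])))
```

The second call reports the operand-size prefix as repeated, which is the kind of pattern described above.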
It’s at least possible that an assembly coder would add a new unsupported prefix manually, and then forget to remove the extra prefix when the assembler is updated, but Vista has me completely baffled: it issues 0x0F0D opcodes. That’s a NOP on my Conroe, but it’s a 3DNow! prefetch opcode on older AMD processors, and it causes an undefined opcode exception on Intel processors up to and including Prescott. As in the previous case, this doesn’t seem to be for alignment purposes. AMD supports Intel’s prefetch opcodes, so it’s not because you need to issue both AMD specific and Intel specific prefetch opcodes. I can see why you’d want NOPs in a debug build (since they let you place breakpoints in places you might not otherwise be able to) but surely the Vista release build isn’t a debug build.
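A rough way to hunt for these in a dump of code bytes is sketched below. It assumes the AMD encoding, where the opcode is 0F 0D followed by a ModRM byte whose reg field selects the variant (/0 for PREFETCH, /1 for PREFETCHW). The function name `find_0f0d` is hypothetical, and a plain byte scan like this knows nothing about instruction boundaries, so it will also flag 0F 0D pairs that happen to sit inside immediates or other instructions.

```python
def find_0f0d(code):
    """Return (offset, modrm_reg_field) for every 0F 0D pair followed by a byte.

    Naive sketch: scans raw bytes without decoding instruction boundaries,
    so false positives are possible.
    """
    hits = []
    for i in range(len(code) - 2):
        if code[i] == 0x0F and code[i + 1] == 0x0D:
            modrm = code[i + 2]
            hits.append((i, (modrm >> 3) & 0x7))  # bits 5:3 of ModRM
    return hits


# 90 | 0F 0D 08 | 90: a NOP, then 0F 0D with ModRM 0x08 (reg field 1), then a NOP.
print(find_0f0d(bytes([0x90, 0x0F, 0x0D, 0x08, 0x90])))
```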
Even if you want a NOP, why use 0x0F0D? 0x90 works fine, and won’t cause problems on older processors. If you want a longer NOP, you can always add prefixes (which are fast now). Is it only used in the Core2 codepath? I tried spoofing the CPUID to match Prescott, and I still saw this mysterious new NOP, but I was only spoofing family, model, and stepping. Maybe Windows looks at the feature flags to determine which CPU is really being used, in which case it would have still gone down the Core2 codepath. Even if I spoofed those, who knows what else MS uses to identify a CPU? I’m tempted to buy an old Prescott system off eBay just to see if Vista takes the same codepath, causing the CPU to throw undefined opcode exceptions all the time.
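For reference, family, model, and stepping all come out of a single register: EAX after executing CPUID with leaf 1. The feature flags come back separately in ECX and EDX, so spoofing only the signature leaves them untouched. Below is a small sketch of the decoding, following the documented bit layout; the function name `decode_signature` is made up, and the two sample signatures are values of the kind reported for Conroe and Prescott parts, included as assumptions rather than readings from my own machines.

```python
def decode_signature(eax):
    """Decode the CPUID leaf 1 EAX value into (family, model, stepping)."""
    stepping = eax & 0xF
    model = (eax >> 4) & 0xF
    family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended fields only count for certain base family values.
    if family == 0xF:
        family += ext_family
    if ((eax >> 8) & 0xF) in (0x6, 0xF):
        model += ext_model << 4
    return family, model, stepping


# Assumed sample signatures (illustrative, not measured here).
print(decode_signature(0x000006F6))  # family 6 part, e.g. a Conroe-style signature
print(decode_signature(0x00000F41))  # family 15 part, e.g. a Prescott-style signature
```

Anything that keys off the feature bits rather than this tuple would not be fooled by changing the signature alone.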
Considering all of the effort that MS goes through to improve Windows performance, they must know they’re doing this. It’s unlikely that it’s an accidental last minute change.
UPDATE: After experimenting with some cache tests, we can see that 0x0F0D actually causes prefetches on Conroe (and newer) processors, even though the current version of the Intel Architecture Manual claims that the opcode doesn’t do anything. What other 3DNow! opcodes has Intel adopted?
UPDATE 2: Raymond Chen gives one reason you might want odd prefixes on things: to work around Intel CPU bugs. It turns out that work-around was pulled before the OS was released, but perhaps there’s a similar explanation for the odd opcodes I was seeing.