Anti-disassembly on ARM (IDA, specifically)

ARM has a reputation for being somewhat prettier and less quirky than x86/x86-64, but that doesn't mean we can't have some fun, right?

So I was looking through the ARM arch manual and ran into this instruction:
[Figures: ARM PLD instruction, parts 1 and 2]

Since this is an instruction that does (almost) nothing to begin with, I thought I'd try pointing it at something addressable and seeing how IDA interprets it.

__asm__ volatile (".byte 0x00, 0xF0, 0xDF, 0xF5\n");  

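With PC as the base register, this PLD targets an address relative to the instruction following the next instruction (in ARM state, PC reads as the current instruction's address plus 8). The low 12 bits of the word are the offset, so a first byte of 0x02 lands 2 bytes into that instruction; the bytes as written above encode an offset of 0.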
[Figure: ARM anti-disassembly, simple case]
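
To make the address arithmetic concrete, here is a minimal sketch of how the target of a PC-relative PLD can be computed from its encoding. The helper name `pld_pc_target` is hypothetical, and the sketch assumes ARM (not Thumb) state and the immediate form, where bits 11..0 hold the offset and bit 23 selects add or subtract.

```c
#include <stdint.h>

/* Hypothetical helper: where does an ARM-state "PLD [PC, #+/-imm12]" point?
 * insn_addr is the address of the PLD itself, word is its 32-bit encoding. */
static uint32_t pld_pc_target(uint32_t insn_addr, uint32_t word)
{
    uint32_t base  = (insn_addr + 8u) & ~3u; /* PC reads as insn + 8, word-aligned */
    uint32_t imm12 = word & 0xFFFu;          /* bits 11..0: byte offset */
    unsigned add   = (word >> 23) & 1u;      /* bit 23 (U): 1 = add, 0 = subtract */

    return add ? base + imm12 : base - imm12;
}
```

For a PLD placed at 0x8000, the word 0xF5DFF000 resolves to 0x8008 (the start of the instruction after the next one), while 0xF5DFF002 resolves to 0x800A, which is 2 bytes into it.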

But there's more!

So, let's say you're paranoid about messing around with the cache.

According to the specification, feeding PLD an invalid address won't cause a data abort, and it won't cause any other exception either, whether the address is valid or not. Where a real load would raise an exception, PLD falls back to acting as a NOP. IDA, however, doesn't know that.
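
The hint-only behavior can be poked at from C without hand-assembled bytes. This is a rough sketch that assumes GCC or Clang, whose `__builtin_prefetch` is documented as not faulting on an invalid address; on ARM targets with prefetch support it is typically lowered to PLD, though that depends on the compiler and target flags. The function name and the bogus address are made up for illustration.

```c
#include <stdint.h>

/* Returns 1 if execution continues past a prefetch of a bogus address. */
int prefetch_bogus_address(void)
{
    /* Hypothetical address that is almost certainly unmapped in user space. */
    const void *bogus = (const void *)(uintptr_t)0x10;

    __builtin_prefetch(bogus); /* GCC/Clang builtin: a hint, not a load */
    return 1;
}
```

Dereferencing the same pointer would crash the process; the prefetch just carries on.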

In my previous example, even though an instruction was malformed (instructions are 4 bytes each under my flavor of ARM), IDA picks right back up on a valid instruction.

If you feed PLD malformed data, though, IDA is completely unable to parse the (kinda sorta) valid instruction, and it freaks out!

    __asm__ volatile(
        ".byte 0x02, 0xDF, 0xDF, 0xF5\n"
        "NOP\n"
        "NOP\n"
        "NOP\n"
        "NOP\n"
        "NOP\n"
        "NOP\n"
        "NOP\n"
        "NOP\n"
        "NOP\n"
        "NOP\n"
        "NOP\n"
        "NOP\n"
        "NOP\n"
        "NOP\n"
        "NOP\n"
        "NOP\n"
    );

I added NOPs for dramatic effect, to show that IDA does not resume disassembly (at least not for the function that the instruction is present in).

[Figure: ARM anti-disassembly, advanced case]

Ta-da!

What's also cool is that this is just one combination of invalid bytes that still forms a PLD instruction the processor can ingest. There are all sorts of other combinations that cause the same thing to happen.

August 31st update:

After a bit of research after work I was able to figure out exactly why IDA doesn't consider this a valid instruction, and why the CPU executes it normally.

Consider these:
[Figures: ARM anti-disassembly explained, parts 1 and 2]

By writing a simple program, I discovered that my byte fiddling with the PLD instruction corrupted the 4-bit field after "addr_mode" (among other things).
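
A field splitter along these lines is enough to see the corruption. This is a sketch rather than the original program: it assumes the ARM-state immediate form of PLD, treats "addr_mode" as the low 12 bits, and uses shifts and masks instead of C bitfields because bitfield layout is implementation-defined. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Assumed layout of an ARM-state PLD (immediate form). */
struct pld_fields {
    unsigned imm12; /* bits 11..0:  offset ("addr_mode" here, by assumption) */
    unsigned sbo;   /* bits 15..12: the 4-bit field, all ones in a clean PLD */
    unsigned rn;    /* bits 19..16: base register (15 = PC) */
    unsigned u;     /* bit 23:      1 = add offset, 0 = subtract */
};

static struct pld_fields pld_split(uint32_t word)
{
    struct pld_fields f;
    f.imm12 = word & 0xFFFu;
    f.sbo   = (word >> 12) & 0xFu;
    f.rn    = (word >> 16) & 0xFu;
    f.u     = (word >> 23) & 1u;
    return f;
}

/* Nonzero when the 4-bit field still holds its expected all-ones value. */
static int pld_sbo_intact(uint32_t word)
{
    return pld_split(word).sbo == 0xFu;
}
```

The first byte sequence in this post, 0xF5DFF000 as a little-endian word, has the field intact. The second, 0xF5DFDF02, splits into a field value of 0xD and an offset of 0xF02.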

Creating an instruction using this example program and slapping the results into GCC gave me the desired effect. I can now (using the gist above) create many variants of this instruction.

You can fiddle with the other instruction data (such as addr_mode itself, etc.) to create variants of this instruction that accomplish the same task.
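
Going the other way, a small encoder can churn out variants ready to paste into an `__asm__` block. Again, this is only a sketch under the same assumed field layout (ARM state, immediate form). `pld_encode` and `pld_byte_directive` are hypothetical names, not the program from the gist.

```c
#include <stdint.h>
#include <stdio.h>

/* Build a PLD word: fixed bits 0xF5500000, plus U, Rn, the 4-bit field
 * in bits 15..12, and the 12-bit offset. A clean PLD has field == 0xF. */
static uint32_t pld_encode(unsigned u, unsigned rn, unsigned field, unsigned imm12)
{
    return 0xF5500000u
         | ((uint32_t)(u & 1u) << 23)
         | ((uint32_t)(rn & 0xFu) << 16)
         | ((uint32_t)(field & 0xFu) << 12)
         | (uint32_t)(imm12 & 0xFFFu);
}

/* Format the word as a little-endian ".byte" directive for inline asm. */
static int pld_byte_directive(uint32_t word, char *out, size_t n)
{
    return snprintf(out, n, ".byte 0x%02X, 0x%02X, 0x%02X, 0x%02X",
                    (unsigned)(word & 0xFFu),
                    (unsigned)((word >> 8) & 0xFFu),
                    (unsigned)((word >> 16) & 0xFFu),
                    (unsigned)((word >> 24) & 0xFFu));
}
```

Calling `pld_encode(1, 15, 0xD, 0xF02)` reproduces the word behind the second example above. Looping `field` over 0x0 to 0xE gives the other corrupted values of that 4-bit field; how IDA treats each one is something to check by hand.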

Andrew Artz
