Welcome to the second part of my multi-part series on virtualization-based obfuscators. If you didn't read the first post where I analyzed Tigress, I recommend reading that first as it introduces many concepts that are important for understanding this write-up. There's also a great Terminology section that defines a couple of frequently-used terms. Here's a link to the post if you want it. I've also published all of the IDA databases, binaries, and diagrams from this write-up here.
VMProtect is a commercial obfuscator for software protection and is widely considered to be one of the best. While VMProtect does offer a trial version, it applies much simpler obfuscations that are different from the commercial version, so I ended up choosing to reverse the full version of VMProtect. In this write-up, I will be analyzing a simple "Hello World" binary virtualized with VMProtect v3.5.0.1274 (the latest version at the time of writing).
Even though VMProtect supports all kinds of Windows, Linux, and macOS binaries (and even .NET/C# applications), I decided to just make a simple C "Hello World" Windows application to ease analysis. Inside the VMProtect software, I selected hello_world.exe as my target and started to mess around with the settings. It is required to select the individual functions that you want VMProtect to obfuscate, so I went ahead and added helloWorld. I also made sure to only enable virtualization for the binary as I didn't want to deal with mutation. After this, I turned off all of the extra protection such as import protection and packing in the options menu. These extra protections are not the main focus of this write-up and would only hinder our analysis of the virtual machine.
I put the string "Main Function" inside the source code to be able to quickly locate the main function. By looking at the IDA string view, it took about 3 seconds to find it. Taking a peek at the assembly, it looks fairly normal; I renamed some of the functions to make it easier to read. The "Main Function" string is printed by printf, and then the helloWorld function is called. I knew I needed to check in there for any signs of obfuscation.
One of the most obnoxious parts about reversing VMProtect is the number of static obfuscations that they apply to the binary. Even with mutation disabled, VMProtect still implements dead store code, opaque branching, jump obfuscation, code duplication, and more to protect the internals of their virtual machine. While some of these are fairly difficult to eliminate without drastic measures, dead store code can be manually replaced with nop's and hidden in IDA. I created an IDA Plugin called NOPnHIDE for this exact purpose. For the rest of the analysis, I will be using this plugin to eliminate dead store code.
Another annoying static obfuscation that is much harder to remove is control flow obfuscation. The most obvious form of this is how VMProtect separates small blocks of code into even smaller blocks of code connected by jmp instructions (opaque branching). This seriously clutters the IDA graph view and makes it much more difficult to analyze. Also, while the general architecture of VMProtect is incredibly hard to analyze in IDA (simply due to the number of jumps it incorporates), they also use push+ret jump obfuscations to break IDA's control flow graphs. While I won't talk about either of these obfuscations in-depth, just know that they do hinder analysis and are fairly difficult to remove.
Looking back at the de-obfuscated code, the first thing to occur after entering the VM is all of the registers being pushed onto the stack. This is done so that the registers can be restored to their original states after the VM exits.
After the registers are pushed, the value 0x7FF695820000 is moved into rcx and pushed onto the stack. It seems to be some sort of base address that is used to compute jumps with RVAs. Keep this value in mind as it will have a use on the stack later. A pointer to the virtualized bytecode is also moved into rsi, which is now our instruction pointer for the VM. The pointer is then decrypted by a series of subtraction, negation, addition, and another negation. This transformation sequence is randomized by VMProtect for each different VM. Moving along, the value 0x7FF695820000 (in rcx) and then 0x100000000 are added to the decrypted pointer. While these seem like random constants at the moment, they do give us the decrypted pointer to the bytecode, so something is working.
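To make the pointer decryption concrete, here is a minimal Python sketch of the sequence described above. The two addends 0x7FF695820000 and 0x100000000 come from this binary; SUB_CONST and ADD_CONST are made-up placeholders for the per-VM constants, and treating every step as 64-bit arithmetic is my assumption for illustration.

```python
MASK64 = (1 << 64) - 1

IMAGE_BASE = 0x7FF695820000   # value moved into rcx in this binary
HIGH_ADJUST = 0x100000000     # second constant added after decryption
SUB_CONST = 0x1234ABCD        # hypothetical: stands in for the real sub operand
ADD_CONST = 0x0BADF00D        # hypothetical: stands in for the real add operand


def decrypt_bytecode_pointer(enc):
    """Apply sub, neg, add, neg, then add the two base constants."""
    v = (enc - SUB_CONST) & MASK64
    v = (-v) & MASK64
    v = (v + ADD_CONST) & MASK64
    v = (-v) & MASK64
    v = (v + IMAGE_BASE) & MASK64
    v = (v + HIGH_ADJUST) & MASK64
    return v


def encrypt_bytecode_pointer(ptr):
    """Inverse of the above: undo each step in reverse order."""
    v = (ptr - HIGH_ADJUST) & MASK64
    v = (v - IMAGE_BASE) & MASK64
    v = (-v) & MASK64
    v = (v - ADD_CONST) & MASK64
    v = (-v) & MASK64
    v = (v + SUB_CONST) & MASK64
    return v
```

One thing the sketch makes visible: with these four operations, the sub/neg/add/neg chain algebraically collapses to a single subtraction of SUB_CONST + ADD_CONST, so the extra steps only add noise for the reader of the disassembly.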
Now that the pointer to the bytecode has been decrypted, the current location of the stack is moved into r10. After this, 0x180 bytes are allocated on the stack for the VM. About 0x40 of these bytes are for the virtual stack and the other 0x140 bytes are for the virtual register space. The register r10 is now our virtual stack pointer since it points to the location underneath the allocated bytes and rsp is now the pointer to the top of the virtual register space. I will describe these structures in greater detail later. Finally, an and operation is performed on rsp which aligns the address to a 16-byte boundary. These operations also show that VMProtect chooses a specific physical register to store pointers to VM context data.
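The frame setup above boils down to three register updates, sketched here in Python. The 0x180 size is from this binary; the exact alignment mask is my assumption (clearing the low four bits is the usual way to get a 16-byte boundary), and the function name is only illustrative.

```python
MASK64 = (1 << 64) - 1
VM_FRAME_SIZE = 0x180  # virtual stack plus virtual register space, per the text


def setup_vm_frame(rsp):
    """Return (r10, rsp) after the VM reserves its context on the native stack."""
    r10 = rsp                              # r10 becomes the virtual stack pointer
    rsp = (rsp - VM_FRAME_SIZE) & MASK64   # reserve 0x180 bytes below it
    rsp &= ~0xF & MASK64                   # assumed mask: align down to 16 bytes
    return r10, rsp
```

For example, starting from rsp = 0x7FFE1008, r10 keeps 0x7FFE1008 while rsp ends up at 0x7FFE0E80, the top of the virtual register space.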
Now that the VM has successfully initialized the virtual stack and register space, the current value of rsi gets moved into r9 and some seemingly random calculations happen to it. While it may not seem apparent at the moment, this register now stores the self-modifying encryption key. The numerical value that is stored here has no significance, as VMProtect could use any value it wanted as the initial decryption key, but it seems like they decided to use some numbers that were already on hand. We will see how this encryption key gets used in a moment.
To understand the next section of code, I will quickly explain how a significant portion of the VMProtect 3 architecture works. The most prominent difference between VMProtect and other virtual machines is the fact that it does not use opcodes or a handler table. Instead, it uses an offset that is decoded with the self-modifying decryption key and added to the address of the current handler. This may be slightly hard to visualize, so I'll show the next code segment to hopefully clear things up.
This routine is the Fetch, Decrypt, Jump routine. I will refer to this routine as FDJ from now on. It is incredibly important and can be found in every single handler in the VM, so pay attention. It starts by moving the address of the current handler into rdi. This address will be used in a while, so just remember that it's there. After this, the bytecode pointer is... reduced by four? I was initially confused by this, but after some investigation, it seems that this VM interprets the bytecode backward in memory. I also found that this is randomized by VMProtect, so other VMs it generates may store it normally, but this VM specifically stores it backward. Either way, the bytecode pointer is now pointing to a 4-byte value included with every instruction: the handler jump offset. This value is stored into ecx, which marks the start of the decryption sequence.
Now, the self-modifying decryption key stored in r9 is xor'd with the encrypted handler jump offset in ecx. After this, four random transformations are applied to the handler jump offset. In this case, it is negated, rotated left by one, incremented by one, and then negated again, but these transformations seem to be randomized for each handler. After all of this, the self-modifying encryption key itself (stored in r9) is xor'd again with the final decrypted handler jump offset. This is why it is self-modifying: after it finishes decrypting a value, it xor's the decrypted value with itself. While this seems incredibly convoluted, this self-modifying encryption key serves multiple purposes. Starting with the most obvious, it makes reversing the VM incredibly confusing. While this is the ultimate goal of VMProtect, this specific encryption routine had me stumped for a while. Other than that, it also protects the VM bytecode from being modified or hooked in any way as the self-modifying encryption key will be thrown out of order if any instructions are added, modified, or removed. This design from VMProtect is impressive and adds another layer of protection to an already incredibly complex VM.
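The rolling key is easier to see in code. The sketch below models the four transformations named above (neg, rol 1, inc, neg) on 32-bit values, plus the inverse an obfuscator would need to produce the encrypted stream. Modelling the key as 32 bits and the helper names are my assumptions; only the order of operations is taken from the disassembly.

```python
MASK32 = 0xFFFFFFFF


def rol32(v, n):
    n %= 32
    return ((v << n) | (v >> (32 - n))) & MASK32


def ror32(v, n):
    n %= 32
    return ((v >> n) | (v << (32 - n))) & MASK32


def decrypt_offset(enc, key):
    """xor with key, neg, rol 1, inc, neg; then fold the result into the key."""
    v = (enc ^ key) & MASK32
    v = (-v) & MASK32
    v = rol32(v, 1)
    v = (v + 1) & MASK32
    v = (-v) & MASK32
    return v, (key ^ v) & MASK32


def encrypt_offset(plain, key):
    """Inverse transformations in reverse order (what a protector would do)."""
    v = (-plain) & MASK32
    v = (v - 1) & MASK32
    v = ror32(v, 1)
    v = (-v) & MASK32
    return (v ^ key) & MASK32, (key ^ plain) & MASK32


def encrypt_stream(offsets, key):
    out = []
    for plain in offsets:
        enc, key = encrypt_offset(plain, key)
        out.append(enc)
    return out


def decrypt_stream(encs, key):
    out = []
    for enc in encs:
        plain, key = decrypt_offset(enc, key)
        out.append(plain)
    return out
```

Flipping a single bit in one encrypted entry changes that entry's decrypted value, which changes the key, which in turn corrupts the entry that follows. That is the tamper-resistance property described above, in miniature.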
After the handler jump offset is decrypted, it is added to rdi. From the start of the routine, we know that rdi stores the base address of the previous handler. After the handler jump offset is added to the previous handler address, we now have the address of the next handler, which is jumped to immediately. This routine is repeated after every instruction handler until the VM exits.
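Putting the whole FDJ routine together, a single dispatch step can be sketched as follows. The bytecode is a plain bytes object read backward, vip plays the role of rsi, and handler_addr plays the role of rdi. Treating the decrypted offset as a signed 32-bit value that is sign-extended before the add is my assumption, as is the 32-bit key.

```python
import struct

MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1


def rol32(v, n):
    n %= 32
    return ((v << n) | (v >> (32 - n))) & MASK32


def fdj_step(bytecode, vip, key, handler_addr):
    """One Fetch, Decrypt, Jump step. Returns (vip, key, next_handler)."""
    # Fetch: this VM walks the bytecode backward, so step back 4 bytes first.
    vip -= 4
    (enc,) = struct.unpack_from("<I", bytecode, vip)

    # Decrypt: xor with the rolling key, then neg, rol 1, inc, neg.
    v = (enc ^ key) & MASK32
    v = (-v) & MASK32
    v = rol32(v, 1)
    v = (v + 1) & MASK32
    v = (-v) & MASK32

    # The key absorbs the decrypted value (the self-modifying part).
    key = (key ^ v) & MASK32

    # Jump: offset is relative to the current handler (assumed signed).
    signed = v - (1 << 32) if v & 0x80000000 else v
    next_handler = (handler_addr + signed) & MASK64
    return vip, key, next_handler
```

Because each handler address is derived from the previous one, there is no table to dump: recovering the handler sequence means replaying this step from the VM entry with the correct initial key.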
Now that we understand the FDJ routine, we can start analyzing some of the actual instruction handlers. The handler pictured above is the very first handler that the virtualized helloWorld function executes after initializing. It starts by moving the value that r10 points to into rbp. If you remember from earlier, r10 is the virtual stack pointer register, meaning this instruction is reading from the virtual stack. Afterward, r10 is increased by 8, which moves the stack pointer past the value that was just read. Together, these instructions pop a value off the virtual stack. After this, rsi is decremented by 1 and a byte is moved into eax. This is the operand for this instruction. Directly afterward, we can see a very familiar sequence of instructions. First, al (the low byte of eax) is xor'd with r9. Since r9 is the self-modifying encryption key, we can assume that this is a decryption sequence for the operands. The xor, not, neg, and rol instructions afterward are the transformations I talked about earlier. After decrypting the operand, r9 is xor'd with al, which again shows the self-modifying nature of the encryption key. As you can see, this encryption key is used in more places than just the handler jump offset decryption; it also decrypts the operands for the instructions.
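The part of the handler described so far can be sketched like this. It only covers the pop, the operand fetch, and the operand decryption; what the handler then does with the two values is left out. XOR_CONST and ROL_COUNT are hypothetical placeholders for the immediates in the real handler, and the function names are mine.

```python
import struct

XOR_CONST = 0x5A   # hypothetical: stands in for the handler's xor immediate
ROL_COUNT = 3      # hypothetical: stands in for the handler's rol count


def rol8(v, n):
    n %= 8
    return ((v << n) | (v >> (8 - n))) & 0xFF


def decrypt_operand(enc_byte, key):
    """xor with the key's low byte, then xor, not, neg, rol; update the key."""
    v = (enc_byte ^ key) & 0xFF
    v ^= XOR_CONST
    v = ~v & 0xFF
    v = (-v) & 0xFF
    v = rol8(v, ROL_COUNT)
    return v, key ^ v   # only the key's low byte changes


def handler_prologue(memory, r10, bytecode, rsi, key):
    """Pop 8 bytes from the virtual stack, then fetch and decrypt one operand."""
    (popped,) = struct.unpack_from("<Q", memory, r10)  # mov rbp, [r10]
    r10 += 8                                           # pop: move past the value
    rsi -= 1                                           # bytecode is read backward
    operand, key = decrypt_operand(bytecode[rsi], key)
    return popped, r10, rsi, operand, key
```

Note that the operand goes through the same rolling key as the handler jump offsets, so operands and offsets have to be decrypted in execution order to keep the key in sync.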