Yesterday, MalwareTech posted two shellcode challenges. I spent some time going through the first challenge and wrote a walkthrough of it here. I recommend reading it first, as some of the code is similar. This write-up focuses on the second challenge using IDA Pro and Python.
Here is part 2 which follows on from shellcode1 (and turns up the heat a little). https://t.co/n26hPHyiaS
— MalwareTech (@MalwareTechBlog) May 24, 2018
Just like the first challenge, the ZIP archive shipped with a README and the binary. The README explains that the challenge should be solved statically—no debuggers—and that running the binary will output the MD5 hash of the flag.
Jumping into IDA, the start function looks similar to challenge one. Because I know the binary prints an MD5 hash via a message box, I scan near the bottom and see the decrypted flag ends up in var_28 before it is displayed.
Moving back to the top, the first thing the binary does is fill var_28 with junk data—even though we know it will eventually hold the decoded flag.
Next it allocates heap space and pushes four items into the structure: LoadLibrary, GetProcAddress, var_28, and the integer 36.
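To keep the four values straight while reading the shellcode, it can help to model the heap block as a structure. This is only a sketch: the field names are my own, the addresses are placeholders, and I am assuming a 32-bit layout in which the items sit in the order listed above.

```python
import ctypes


class ShellcodeArgs(ctypes.Structure):
    """Hypothetical 32-bit view of the heap block handed to the shellcode."""

    _fields_ = [
        ("load_library", ctypes.c_uint32),      # address of LoadLibrary
        ("get_proc_address", ctypes.c_uint32),  # address of GetProcAddress
        ("flag_buffer", ctypes.c_uint32),       # address of var_28
        ("length", ctypes.c_uint32),            # the integer 36
    ]


# Placeholder addresses, only here to make the layout visible.
args = ShellcodeArgs(0x11111111, 0x22222222, 0x33333333, 36)

# Print each field with its offset inside the block.
for name, _ in ShellcodeArgs._fields_:
    field = getattr(ShellcodeArgs, name)
    print(f"+0x{field.offset:02X}  {name}")
```

With offsets like these written down, reads from the block inside the shellcode are easier to label in IDA.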
With the heap primed, the binary allocates executable memory, copies in a shellcode blob, and jumps to it—exactly like challenge one.
Stepping into the shellcode, things get more interesting. The blob is much larger and makes up the majority of the program's behavior. After breaking it apart, it falls into three sections: dynamic imports, file operations, and the decoder.
The dynamic import section lives at the top. It moves characters one by one into memory to form strings—a common trick to hide them from basic static analysis. Decoding them reveals msvcrt.dll, kernel32.dll, fopen, fread, fseek, fclose, GetModuleFileNameA, and the mode string rb. (In IDA, highlight each byte and press R to convert to ASCII.)
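If pressing R on every byte gets tedious, the same decoding can be done in a few lines of Python. The helper below is a minimal sketch with a name I made up; the byte lists are just the ASCII encodings of two of the strings above, typed the way they would appear as one-byte immediates.

```python
def decode_stack_string(byte_values):
    """Join one-byte immediates into a string, stopping at the first NUL."""
    chars = []
    for value in byte_values:
        if value == 0:  # NUL terminator marks the end of the C string
            break
        chars.append(chr(value))
    return "".join(chars)


# Immediates as they would be moved into consecutive stack slots.
fopen_bytes = [0x66, 0x6F, 0x70, 0x65, 0x6E, 0x00]
mode_bytes = [0x72, 0x62, 0x00]

print(decode_stack_string(fopen_bytes))  # fopen
print(decode_stack_string(mode_bytes))   # rb
```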
Once the strings are in place, the code moves LoadLibrary into ebp-4 and GetProcAddress into ebp-44. It repeatedly calls them to resolve each API needed for later file operations.
With imports resolved, the second section handles file operations. It retrieves the full path of the current binary (GetModuleFileNameA), opens that file using the rb mode string (fopen), seeks to offset 0x4E (decimal 78) (fseek), reads the next 38 bytes into a buffer (fread), and then closes the file (fclose).
To see what lands in the buffer, I opened the binary in a hex editor, navigated to offset 78, and copied the following 38 bytes.
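The same bytes can be pulled out with Python instead of a hex editor. This is a rough sketch that mirrors the fopen/fseek/fread/fclose sequence, assuming the seek is measured from the start of the file. The function names are mine, and the path passed in is whatever the challenge binary is called on your machine.

```python
def read_key_bytes(path, offset=0x4E, size=38):
    """Return `size` bytes starting at `offset` of the file at `path`."""
    with open(path, "rb") as handle:  # fopen(path, "rb"); closed on exit like fclose
        handle.seek(offset)           # fseek to 0x4E from the start of the file
        return handle.read(size)      # fread of the next 38 bytes


def as_python_list(data):
    """Format bytes as a list literal that can be pasted into another script."""
    return "[" + ", ".join(f"0x{byte:02X}" for byte in data) + "]"


# Usage (the path is a placeholder):
# print(as_python_list(read_key_bytes("path/to/challenge.exe")))
```

The formatted output drops straight into a Python list, which saves retyping 38 bytes by hand.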
Finally, the decoder loop iterates over the buffer, XORing each byte with the corresponding byte from var_28—the location that ultimately holds the plaintext flag.
At this point I know the encoded flag bytes and the XOR key. The remaining step is to recreate the loop in Python to reveal the flag.
flag_bytes = [0x00]  # replace with encoded flag bytes
key_bytes = [0x00]  # replace with buffer bytes

# XOR each encoded byte with the key byte at the same index.
result = ""
for index in range(len(flag_bytes)):
    decoded = key_bytes[index] ^ flag_bytes[index]
    result += chr(decoded)

print(result)
After copying the bytes from var_28 and the buffer into the script, running it prints the flag.
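One quick way to convince yourself the decoder logic is right without touching the real data: XOR is its own inverse, so encoding a string with a key and then applying the same key again returns the original. The flag and key below are made-up placeholders, not values from the challenge, and xor_bytes is a helper name of my own.

```python
def xor_bytes(data, key):
    """XOR two byte sequences pairwise, stopping at the shorter one."""
    return bytes(d ^ k for d, k in zip(data, key))


# Made-up values purely for the round trip; not the challenge data.
placeholder_flag = b"FLAG{placeholder}"
placeholder_key = bytes(range(1, len(placeholder_flag) + 1))

encoded = xor_bytes(placeholder_flag, placeholder_key)
decoded = xor_bytes(encoded, placeholder_key)

print(decoded.decode("ascii"))
```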