SMC Static Analysis in Practice: Repairing Junk Instructions and Recovering XOR Self-Modifying Code

This article focuses on static recovery techniques for SMC (self-modifying code) samples. The core approach is to bypass deep anti-debugging by repairing junk instructions, analyzing a two-layer XOR decryption chain, deriving the runtime key, restoring the real functions, and auditing the underlying algorithm. It addresses a common pain point in reverse engineering: how to recover code when dynamic debugging is not feasible. Keywords: SMC, IDA, XOR decryption.

Technical Specifications Snapshot

| Parameter | Details |
| --- | --- |
| Technical Topic | SMC Static Analysis / Reverse Engineering / CTF Reversing |
| Target Sample | 2018 Wangding Cup Round 3 SimpleSMC |
| Core Tools | IDA Pro, Hex-Rays, IDAPython |
| Target Architecture | x64 |
| Primary Protections | Junk instructions, control-flow pollution, self-modifying code, deeply hidden anti-debugging |
| Core Algorithm | Two-stage XOR decryption |
| Key Dependencies | Cross-reference analysis, patching, disassembly reanalysis |
| Source Platform | Curated from a CNBlogs article |

This challenge is a classic SMC static recovery case

At its core, SMC modifies its own code segment at runtime. The conventional approach is dynamic debugging: let the program finish decrypting itself, then dump the plaintext code.

However, when the sample contains deeply hidden anti-debugging logic, dynamic analysis often becomes unstable and may crash immediately at runtime. In that situation, a more reliable strategy is to statically identify who modifies the code, how it is modified, and which region is being changed.

Entry analysis should begin with the main logic and initialization path

The sample entry point start does not reveal much by itself, but the tail call to sub_4010E0 is critical. This function links the main logic and the initialization routine, making it the right starting point for control-flow analysis.

After following into sub_400D45, you can see an obvious main execution branch associated with loc_400BAC and loc_400AA6. Both regions look abnormal in IDA, which strongly suggests that they are either being modified at runtime or polluted with junk instructions.

# Pseudocode: confirm the entry chain first, then lock onto abnormal code regions
entry = "start"
dispatcher = "sub_4010E0"      # Core dispatcher function
main_logic = "sub_400D45"      # Main logic entry
suspicious = ["loc_400BAC", "loc_400AA6"]  # Abnormal code regions

# Core idea: first find who calls whom, then find who modifies whom

This step narrows an opaque entry point into a small set of verifiable abnormal code regions.

Repairing junk instructions is the first step in recovering SMC

Inside loc_400BAC, IDA highlights the region in red and shows conflicting jumps, such as jle and jnz both targeting loc_400BC8+1. Structures like this usually do not belong to normal business logic. They are typically crafted to disrupt linear disassembly.

The original analysis points out that you do not need to delete the entire block aggressively. Instead, you only need to patch the incorrect branch region precisely: change a specific location into a short jump EB 0B, then apply NOP to the polluted bytes in the middle while preserving the real data that follows. This prevents the instruction pointer from drifting into the wrong region.
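The patch described above can be modeled on a plain byte buffer. This is only a minimal sketch, not the original author's script: the helper name patch_short_jump, the offsets, and the sample bytes are hypothetical, and it assumes every byte skipped by the jump is junk. Inside IDA, the same writes would go through its patching facilities instead of a Python buffer.

```python
NOP = 0x90        # x86 single-byte no-op
JMP_SHORT = 0xEB  # x86 short jump opcode (EB rel8)

def patch_short_jump(code: bytes, jmp_off: int, landing_off: int) -> bytes:
    """Return a copy of code with a short jump at jmp_off that lands on
    landing_off, and NOPs over the skipped bytes in between.
    Bytes at and after landing_off are left untouched."""
    rel = landing_off - (jmp_off + 2)  # rel8 is relative to the next instruction
    if not 0 <= rel <= 0x7F:
        raise ValueError("landing site out of range for a forward short jump")
    patched = bytearray(code)
    patched[jmp_off] = JMP_SHORT
    patched[jmp_off + 1] = rel
    for off in range(jmp_off + 2, landing_off):
        patched[off] = NOP  # assumption: everything skipped is junk
    return bytes(patched)

# Hypothetical 16-byte region: 2 branch bytes, 11 junk bytes, 3 real bytes.
sample = bytes([0x7E, 0x03]) + bytes([0xCC] * 11) + bytes([0x48, 0x89, 0xE5])
fixed = patch_short_jump(sample, jmp_off=0, landing_off=13)
print(fixed.hex(" "))
```

With a landing site 13 bytes after the jump, the displacement works out to 0x0B, which matches the EB 0B encoding mentioned above. The three trailing bytes stay exactly as they were, which is the point of patching narrowly.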

The repaired loc_400BAC reveals an XOR decryption function

After reanalysis, loc_400BAC becomes a very clear loop: it processes loc_400AA6 byte by byte until it encounters 0xC3.

Its essential logic is as follows: each byte in the target code region is XORed with a 7-byte repeating key. This means loc_400BAC is not part of the business logic. It is a runtime decryptor.

for (i = 0; target[i] != 0xC3; ++i) {
    target[i] ^= key[i % 7];  // XOR with the repeating 7-byte key
}

This routine performs runtime XOR decryption on loc_400AA6, and it stops when it encounters the ret opcode byte 0xC3.
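For offline experiments, the same loop can be modeled in Python. This is a sketch under assumptions: the function name xor_until_ret, the demo key, and the demo buffer are made up, and the stop test is applied to the byte as currently stored (before the XOR), mirroring the snippet above. Whether the real routine tests before or after the XOR should be confirmed in the repaired disassembly.

```python
def xor_until_ret(target: bytearray, key: bytes) -> int:
    """XOR target in place with a repeating key and stop at the first
    byte equal to 0xC3. Returns the number of bytes that were changed."""
    i = 0
    # The bounds check is an addition for safety; the original loop has none.
    while i < len(target) and target[i] != 0xC3:
        target[i] ^= key[i % len(key)]
        i += 1
    return i

demo_key = b"ABCDEFG"                                 # hypothetical 7-byte key
buf = bytearray([0x10, 0x20, 0x30, 0xC3, 0x40])       # hypothetical code bytes
changed = xor_until_ret(buf, demo_key)
print(changed, buf.hex(" "))
```

The byte after the 0xC3 terminator is left alone, so the routine never needs an explicit length for the region it decrypts.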

loc_400AA6 is actually decrypted twice

If you continue examining the cross-references to loc_400AA6, you will find that sub_400C48 also modifies this code region. It is polluted with junk instructions as well, and IDA even shows an abnormal JUMPOUT, which usually indicates forged control flow.

The repair strategy is the same as before: preserve the real landing-site data, and patch only the branch logic and polluted region. After reanalysis, you can see that the core of sub_400C48 is another XOR pass over loc_400AA6, this time using the byte stream at 0x41E1B0.

Figure: Entry point and main dispatcher relationship, showing how the sample flows from start into the initialization and main dispatcher functions.

Figure: Junk instructions and abnormal branches in loc_400BAC, showing IDA's red warning region and conditional jumps that target the same offset-plus-one location.

The initialization phase and the main logic phase form a two-layer XOR chain

The full execution flow can now be summarized as follows: after program startup, .init_array first runs sub_400C48, which performs the first XOR pass on loc_400AA6; then the main logic reads user input, and loc_400BAC performs the second XOR pass; only after that does the program actually call loc_400AA6.

From this, we can derive the final formula:

final_code[i] = file_code[i] ^ stream_41E1B0[i] ^ key[i % 7]

This formula is crucial because it unifies the encrypted code visible in the file, the initialization byte stream, and the runtime key into a single static recovery problem.
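The reason the two passes collapse into one formula is that XOR is commutative and associative, so the order of the stream pass and the key pass does not matter. The following sketch checks this on synthetic data; the helper names and the random bytes are illustrative assumptions, and the 0xC3 stop condition is deliberately ignored to keep the focus on the algebra.

```python
import random

def xor_stream(data: bytes, stream: bytes) -> bytes:
    """XOR data with an equally long byte stream (models the sub_400C48 pass)."""
    return bytes(a ^ b for a, b in zip(data, stream))

def xor_repeat(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key (models the loc_400BAC pass)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

rng = random.Random(0)  # fixed seed so the run is reproducible
n = 32
file_code = bytes(rng.randrange(256) for _ in range(n))  # synthetic stand-in
stream = bytes(rng.randrange(256) for _ in range(n))     # synthetic stand-in
key = bytes(rng.randrange(256) for _ in range(7))        # synthetic 7-byte key

stream_then_key = xor_repeat(xor_stream(file_code, stream), key)
key_then_stream = xor_stream(xor_repeat(file_code, key), stream)
one_pass = bytes(file_code[i] ^ stream[i] ^ key[i % 7] for i in range(n))

print(stream_then_key == key_then_stream == one_pass)
```

Because all three results agree, the static analyst can apply both layers in a single pass over the bytes stored in the file.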

Deriving the 7-byte key is the key breakthrough in static recovery

If the real code satisfies the formula above, then we can rewrite it as:

key[i % 7] = file_code[i] ^ stream_41E1B0[i] ^ real_code[i]

The problem then becomes: how do we guess the first 7 bytes of real_code? On x64, a common function prologue is push rbp; mov rbp, rsp; sub rsp, xx, whose machine code is often close to 55 48 89 E5 48 83 EC.

If you substitute this known function prologue into the first 7 bytes, you can directly derive the key. The original analysis obtains: F1@gChe.

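# IDAPython approach: derive the key from a guessed function prologue
import idc  # run inside IDA; idc.get_bytes reads bytes from the database

file_code = idc.get_bytes(0x400AA6, 7)   # first 7 encrypted bytes of the target
stream = idc.get_bytes(0x41E1B0, 7)      # first 7 bytes of the initialization stream
real_head = bytes([0x55, 0x48, 0x89, 0xE5, 0x48, 0x83, 0xEC])  # guessed prologue

key = bytearray(7)
for i in range(7):
    key[i] = file_code[i] ^ stream[i] ^ real_head[i]  # Derive the key

print(key.decode(errors="ignore"))  # Expected result: F1@gChe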

This script quickly derives the 7-byte repeating XOR key by assuming a known function prologue.
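The known-plaintext step can also be rehearsed outside IDA on synthetic data. In the sketch below, the function body, the stream, and the key b"DemoKey" are all invented for illustration and have nothing to do with the real sample; only the prologue bytes come from the text above. It encrypts a small fake function with both layers, then recovers the key from the first 7 bytes alone.

```python
# push rbp; mov rbp, rsp; sub rsp, imm8 (the imm8 itself is not included)
PROLOGUE = bytes([0x55, 0x48, 0x89, 0xE5, 0x48, 0x83, 0xEC])

def derive_key(file_code: bytes, stream: bytes, known_head: bytes) -> bytes:
    """key[i] = file_code[i] ^ stream[i] ^ real_code[i] for the known bytes."""
    return bytes(f ^ s ^ p for f, s, p in zip(file_code, stream, known_head))

def decrypt(file_code: bytes, stream: bytes, key: bytes) -> bytes:
    """Apply both XOR layers in one pass."""
    return bytes(f ^ s ^ key[i % len(key)]
                 for i, (f, s) in enumerate(zip(file_code, stream)))

# Made-up plaintext: prologue, imm8, two moves, leave, ret.
real_code = PROLOGUE + bytes([0x20, 0x89, 0x7D, 0xEC, 0x8B, 0x45, 0xEC, 0xC9, 0xC3])
secret_key = b"DemoKey"                                   # hypothetical key
stream = bytes((i * 37 + 11) & 0xFF for i in range(len(real_code)))
file_code = decrypt(real_code, stream, secret_key)        # XOR is its own inverse

recovered = derive_key(file_code[:7], stream[:7], PROLOGUE)
print(recovered, decrypt(file_code, stream, recovered) == real_code)
```

On a real sample there is no ground truth to compare against, so plausibility checks take its place: a key made of printable characters and a decrypted region that disassembles cleanly and ends in ret both suggest the prologue guess was right.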

Once the code is restored, the main logic becomes much simpler

After obtaining the key, you can decrypt loc_400AA6 in bulk and restore the real function body. The recovered function is no longer mysterious: it mainly processes the input data, calls sub_4009AE(a1, 32LL, 64LL) for further transformation, and then compares the result against the expected output.

However, sub_4009AE is still disguised by junk instructions. The handling method remains the same as before: patch the polluted control flow, reanalyze the function, and you will find that it is an outer loop that calls sub_400A18 for the inner transformation, eventually forming the full verification algorithm.

The methodology matters more than the final flag

The most valuable part of this challenge is not the final flag. It is the transferable analysis framework it provides:

  1. Identify abnormal jumps and IDA red-highlighted regions first.
  2. Patch the junk instructions and force reanalysis.
  3. Use cross-references to confirm who modifies the target code region.
  4. Express the multi-stage decryption process as a unified formula.
  5. Derive the key using function prologues or other known-plaintext features.
  6. Restore the real algorithm and verification logic last.

# Unified decryption model: restore the target code region in bulk
# file_code and stream hold code_len bytes read from 0x400AA6 and 0x41E1B0
real_code = bytearray(code_len)
for i in range(code_len):
    real_code[i] = file_code[i] ^ stream[i] ^ key[i % 7]  # Two-layer XOR recovery

This model captures the most important static recovery process in the entire challenge.

FAQ

Q1: Why not prioritize dynamic debugging for this challenge?

A: Because the sample contains deeper anti-debugging logic or crash-triggering execution paths, dynamic execution may not reliably reach the point where decryption has completed. Static analysis demands stronger auditing skills, but it delivers more controllable and reproducible results.

Q2: Why can you guess the function prologue to derive the key?

A: Because x64 binaries often use stable byte patterns for function prologues, especially in unoptimized builds that keep the frame pointer. For a repeating XOR key, correctly predicting as many leading bytes as the key is long (7 here) is enough to recover the whole key.

Q3: What is the most common mistake when analyzing SMC?

A: The most common mistake is excessive NOP patching that accidentally deletes real data, which completely distorts the later control flow. When repairing junk instructions, patch only the polluted region and preserve the original bytes after the real branch landing site.

Summary: This article uses a representative CTF sample to systematically break down the static analysis workflow for self-modifying code (SMC): identify junk instructions, repair corrupted control flow, locate the two XOR decryption stages, derive the 7-byte key, and restore the real code with IDAPython. The end result is a complete analysis of the main logic and verification algorithm without relying on dynamic debugging.