AES Side-Channel Fault Injection on STM32: Recovering an AES-128 Key with Voltage Glitching and DFA

This article breaks down an AES fault-injection experiment built on an STM32 Badge. By combining a PB11 trigger signal, PowerShorter voltage glitching, and DFA (Differential Fault Analysis), it recovers a 128-bit AES master key. It addresses the two core challenges: locating the fault window and filtering valid ciphertext pairs. Keywords: fault injection, DFA, STM32.

The technical specification snapshot is straightforward

| Parameter | Description |
| --- | --- |
| Target platform | STM32F303 Badge development board |
| Communication | USB serial, 38400 baud |
| Trigger signal | PB11, high level indicates AES computation start |
| Attack method | Voltage fault injection + DFA (Differential Fault Analysis) |
| Fault device | PowerShorter |
| Supporting tools | Oscilloscope, faultviz, pyserial |
| Core dependencies | power_shorter, osrtoolkit.dfa, numpy, serial |
| Source status | Blog post, no star count provided |

This experiment targets an AES round-9 fault attack

The original goal of the experiment is clear: use USB for both power delivery and serial communication, send the AES command to the device, and make the chip perform one AES encryption. During encryption, PB11 outputs a trigger pulse. An external device uses that pulse to inject a voltage glitch at a precise moment and induce a faulty ciphertext.

The core idea is not to read the key directly. Instead, the workflow collects pairs of correct ciphertexts and faulty ciphertexts at scale, then uses DFA mathematics to recover the final-round subkey and ultimately derive the full 128-bit master key.
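The last step, going from the final-round subkey back to the master key, works because the AES-128 key schedule is invertible. The following is a minimal sketch of that inversion, not the code used by osrtoolkit.dfa; the function name master_key_from_round10 is chosen here for illustration, and the S-box is generated from its definition rather than pasted as a table.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def build_sbox():
    """Build the AES S-box: field inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = 1
        for _ in range(254):          # x^254 is the inverse of x (0 maps to 0)
            inv = gf_mul(inv, x)
        s = inv
        for shift in range(1, 5):     # XOR of the four 8-bit left rotations
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()
RCON = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]


def master_key_from_round10(k10: bytes) -> bytes:
    """Run the AES-128 key schedule backwards from the round-10 subkey."""
    if len(k10) != 16:
        raise ValueError("round key must be 16 bytes")
    w = [None] * 44
    for j in range(4):
        w[40 + j] = list(k10[4 * j:4 * j + 4])
    for i in range(43, 3, -1):
        t = list(w[i - 1])
        if i % 4 == 0:
            t = t[1:] + t[:1]                 # RotWord
            t = [SBOX[b] for b in t]          # SubWord
            t[0] ^= RCON[i // 4 - 1]          # round constant
        # Forward rule is w[i] = w[i-4] ^ t, so w[i-4] = w[i] ^ t
        w[i - 4] = [a ^ b for a, b in zip(w[i], t)]
    return bytes(b for word in w[:4] for b in word)


if __name__ == "__main__":
    k10 = bytes.fromhex("d014f9a8c9ee2589e13f0cc8b6630ca6")
    print(master_key_from_round10(k10).hex())
```

The round key in the usage lines is the last subkey of the FIPS-197 key expansion example, so the printed value can be compared against the example cipher key from that document.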

AI Visual Insight: The image shows the experiment entry information printed by the board over USB serial. It indicates that the device exposes the interactive challenge interface only after stable power is present, which means the attack depends on maintaining both adequate power quality and reliable serial access.

AI Visual Insight: The image highlights the board’s key peripherals and pin definitions, especially PB11 as the AES trigger line and PB0 as the final result indicator. This “internal computation + external trigger” structure is well suited for timing-aligned fault injection.

The minimum attack chain can be summarized as follows

PC sends AES command -> PB11 goes high -> PowerShorter triggers a delayed glitch -> MCU outputs faulty ciphertext -> DFA recovers the key

This sequence defines the full attack loop: trigger, inject, capture, filter, and analyze. Every stage matters.

Hardware wiring determines whether faults can be reproduced reliably

The logic wiring diagram centers on four critical lines: the 3.3V injection line, the NRST reset line, the common GND reference, and the PB11 trigger line. These lines are responsible for creating the glitch, recovering after a crash, maintaining a shared voltage reference, and aligning the AES timing window.

AI Visual Insight: The diagram shows the complete attack wiring between the PowerShorter and the target board: E1+ connects to 3.3V to create a momentary short-circuit glitch, RELAY connects to NRST for remote reset after lockups, E1-Tri connects to PB11 for precise triggering, and the oscilloscope probes the sampling point to observe the power waveform during AES execution.

The value of this wiring approach is that it turns “random poking” into event-driven injection. PB11 provides a logical trigger first, and then PowerShorter generates a repeatable glitch pulse based on glitch_delay and glitch_pulse.

The oscilloscope helps locate the round-9 fault window

Once the physical setup is complete, you should not start blind glitching immediately. First, use the oscilloscope to observe the power waveform after the PB11 trigger and estimate the approximate position of AES round 9 on the time axis. The original experiment concludes that the target window sits at roughly 2.3 ms after the trigger.

AI Visual Insight: The photo shows the real bench setup, including the target board, wiring, PowerShorter, and measurement instruments. It makes clear that this experiment is not purely software analysis, but a classic embedded hardware attack chain that must manage power, communication, reset, and trigger signal integrity together.

AI Visual Insight: The waveform plot shows the power trace after the PB11 trigger. Based on it, the attacker narrows the glitch delay parameter to approximately 229500 to 232500, concentrating the fault injection near AES round 9 and improving the hit rate for useful faults.
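The two figures in this section (about 2.3 ms on the oscilloscope, about 230000 in glitch_delay) line up if one delay unit is 10 ns. That unit is an inference from these two numbers, not something taken from PowerShorter documentation, so treat TICK_NS below as an assumption to verify on your own setup. A small conversion helper makes the mapping explicit:

```python
TICK_NS = 10  # ASSUMPTION: inferred from 2.3 ms ~ 230000 units; verify for your device


def delay_to_ms(ticks: int, tick_ns: int = TICK_NS) -> float:
    """Convert a glitch_delay value to milliseconds after the trigger edge."""
    return ticks * tick_ns / 1e6


def ms_to_delay(ms: float, tick_ns: int = TICK_NS) -> int:
    """Convert an oscilloscope reading in milliseconds to a glitch_delay value."""
    return round(ms * 1e6 / tick_ns)


if __name__ == "__main__":
    for ticks in (229500, 232500):
        print(ticks, "->", delay_to_ms(ticks), "ms")
    print("2.3 ms ->", ms_to_delay(2.3), "ticks")
```

Under the same assumption, the sweep range covers roughly 2.295 ms to 2.325 ms after PB11 goes high.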

import random, time, serial
from power_shorter import *
from osrtoolkit.dfa import aes as dfaaes

# Initialize the fault device and serial port
ps = PowerShorter('COM4')
ser = serial.Serial('COM19', 38400, timeout=0.5)

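# Capture one correct ciphertext as the DFA baseline (no glitch is armed here)
ser.reset_input_buffer()
ser.write(b'AES\n')  # Trigger the target to execute AES
raw = ser.read(32)   # 32 hex characters = 16 ciphertext bytes
if len(raw) != 32:
    raise RuntimeError('baseline read timed out or was truncated')
c_correct = bytes.fromhex(raw.decode())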

This code captures the baseline ciphertext and provides the reference for all subsequent fault comparisons.

Round 9 is the best balance point for AES-DFA

AES-128 has 10 rounds, and the final round does not include MixColumns. Because of that structural difference, the fault propagation characteristics around round 9 are ideal for DFA.

If the fault lands after the round-9 MixColumns, so that it is effectively a round-10 fault, it corrupts only one ciphertext byte, which gives constraints that are too weak. If it lands much earlier, the difference passes through two or more MixColumns layers, spreads across all 16 bytes, and the resulting equations become harder to solve. The region just before the round-9 MixColumns is the sweet spot: a single-byte fault is spread by that MixColumns into the four bytes of one column, and because round 10 has no MixColumns, ShiftRows only relocates those four bytes instead of expanding the difference further.
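The one-byte-to-four-bytes spread comes from the linearity of MixColumns: a difference d in the first byte of a column becomes the difference (2d, d, d, 3d) in GF(2^8), regardless of the actual state values. The sketch below implements a single-column MixColumns to show that relation; the helper names are illustrative.

```python
def xtime(a: int) -> int:
    """Multiply a byte by 2 in GF(2^8) modulo the AES polynomial."""
    a <<= 1
    return (a ^ 0x11B) if a & 0x100 else a


def mix_column(col):
    """Apply the AES MixColumns matrix to one 4-byte column."""
    a0, a1, a2, a3 = col
    m2 = xtime
    m3 = lambda x: xtime(x) ^ x
    return [
        m2(a0) ^ m3(a1) ^ a2 ^ a3,
        a0 ^ m2(a1) ^ m3(a2) ^ a3,
        a0 ^ a1 ^ m2(a2) ^ m3(a3),
        m3(a0) ^ a1 ^ a2 ^ m2(a3),
    ]


def output_difference(col, d):
    """XOR difference after MixColumns when byte 0 of the column is faulted by d."""
    faulty = [col[0] ^ d] + list(col[1:])
    return [x ^ y for x, y in zip(mix_column(col), mix_column(faulty))]


if __name__ == "__main__":
    d = 0x5A
    # The difference depends only on d, not on the column contents
    print(output_difference([0xDB, 0x13, 0x53, 0x45], d))
    print(mix_column([d, 0, 0, 0]))
```

Both printed lists are identical. A fault in another row of the column yields a rotated coefficient pattern, which is why a DFA solver has to consider four possible patterns per column.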

The four-byte difference pattern is especially valuable

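The pattern traces back to one column of the AES state before the final ShiftRows. In the ciphertext, the four differing bytes therefore sit at fixed positions such as 0, 7, 10, and 13 (0-indexed, column-major), and one fault pair constrains the four bytes of the final-round subkey at those positions. When you combine multiple fault pairs from different columns, the candidate space shrinks quickly. In practice, a small number of valid samples per column is usually enough to pin down the key uniquely; the collection loop in this article gathers 24 pairs, which leaves a comfortable margin.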
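To make the "candidate space shrinks quickly" claim concrete, here is a minimal sketch of the per-column filtering step that this style of DFA relies on. It is not the osrtoolkit.dfa implementation; column_key_candidates and its inputs are names assumed for illustration. For every fault value d and every possible fault row, it keeps the key bytes for which undoing the last SubBytes reproduces the expected MixColumns difference.

```python
from itertools import product


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def build_sbox():
    """Build the AES S-box from the field inverse and the affine map."""
    sbox = []
    for x in range(256):
        inv = 1
        for _ in range(254):
            inv = gf_mul(inv, x)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()
INV_SBOX = [0] * 256
for i, v in enumerate(SBOX):
    INV_SBOX[v] = i

# Columns of the MixColumns matrix: one pattern per possible fault row
MC_COLUMNS = [(2, 1, 1, 3), (3, 2, 1, 1), (1, 3, 2, 1), (1, 1, 3, 2)]


def column_key_candidates(c_ok, c_bad):
    """Return candidate 4-byte subkey tuples for one faulted column.

    c_ok / c_bad hold the four ciphertext bytes of that column in row order
    (i.e. already mapped back through the final ShiftRows).
    """
    tables = []
    for r in range(4):
        t = {}
        for k in range(256):
            delta = INV_SBOX[c_ok[r] ^ k] ^ INV_SBOX[c_bad[r] ^ k]
            t.setdefault(delta, []).append(k)
        tables.append(t)
    cands = set()
    for coeffs in MC_COLUMNS:
        for d in range(1, 256):
            options = [tables[r].get(gf_mul(coeffs[r], d), []) for r in range(4)]
            if all(options):
                cands.update(product(*options))
    return cands
```

Intersecting the candidate sets returned for two independent fault pairs on the same column typically leaves very few survivors, which is the reason a handful of clean pairs per column goes a long way.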

def restart():
    ps.relay(RELAY.RELAY2, 1)  # Raise the relay and reset the target
    time.sleep(0.5)
    ps.relay(RELAY.RELAY2, 0)  # Release the relay and resume operation
    time.sleep(0.5)

fault_list_pair = []

def attack_once():
    glitch_pulse = random.randint(14, 18)
    glitch_delay = random.randint(229500, 232500)

    # Configure the glitch engine to emit a short pulse after a delay
    ps.engine_cfg(Engine.E1, [(0, glitch_delay), (1, glitch_pulse), (0, 1)])
    ps.arm(Engine.E1)

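    ser.reset_input_buffer()
    ser.write(b'AES\n')  # Send the command and wait for the glitch trigger
    line = ser.read(32)  # 32 hex characters = 16 ciphertext bytes
    if len(line) != 32:
        # A timeout or short read usually means the glitch crashed the target
        raise RuntimeError('no complete ciphertext received')
    c_fault = bytes.fromhex(line.decode())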

    # Keep only ciphertext pairs identified as round-9 faults
    if dfaaes.DfaAESSubFunctions.check_fault_pair(c_correct, c_fault) == 9:
        fault_list_pair.append((c_correct.hex(), c_fault.hex()))

This code automates injection, fault classification, and valid sample collection.

The filtering criteria for valid fault samples must be structured

The single most important condition in the experiment is check_fault_pair(c_correct, c_fault) == 9. This is not a simple byte-count comparison. It infers the fault round from the ciphertext differential structure.

Only samples whose differential pattern matches “a single-point fault in round 9 propagating into four bytes” are marked as valid. This prevents invalid glitches, faults that occur too early, faults that occur too late, and communication errors from contaminating the DFA solving stage.
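As a rough illustration of what a structural filter looks at, the sketch below checks only the positions of the differing ciphertext bytes against the four ShiftRows-induced patterns. This is a simplified stand-in written for this article, not the implementation of check_fault_pair; a real classifier may apply additional checks, and classify_pair is a hypothetical name.

```python
# Ciphertext byte positions (0-indexed, column-major) that differ when a
# single-byte fault hits before the round-9 MixColumns. The value is the
# index of the state column (before the final ShiftRows) that was affected.
ROUND9_PATTERNS = {
    frozenset({0, 13, 10, 7}): 0,
    frozenset({4, 1, 14, 11}): 1,
    frozenset({8, 5, 2, 15}): 2,
    frozenset({12, 9, 6, 3}): 3,
}


def classify_pair(c_ok: bytes, c_bad: bytes):
    """Return the affected column index, or None if the pattern does not fit."""
    if len(c_ok) != 16 or len(c_bad) != 16:
        return None  # truncated or malformed read
    diff = frozenset(i for i in range(16) if c_ok[i] != c_bad[i])
    return ROUND9_PATTERNS.get(diff)


if __name__ == "__main__":
    good = bytes(16)
    bad = bytearray(good)
    for pos in (0, 7, 10, 13):
        bad[pos] ^= 0xA5
    print(classify_pair(good, bytes(bad)))   # column index for this pattern
    print(classify_pair(good, good))         # no difference at all
```

A positional check like this rejects unfaulted outputs, single-byte (too late) faults, and fully scrambled (too early) outputs, but it is only a necessary condition: it does not verify that the four differences are consistent with a MixColumns relation.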

Clean samples matter at the engineering level

The cleaner the successful sample set, the more stable the DFA solver becomes. Compared with blindly collecting large numbers of faulty ciphertexts, precise sample filtering significantly reduces noise in the later solving phase and improves the success rate of master-key recovery.

# Keep trying until enough valid samples have been collected
while len(fault_list_pair) < 24:
    try:
        attack_once()
    except Exception:
        restart()  # Reset automatically if the target misbehaves

# Recover the master key from the valid fault pairs
solver = dfaaes.DfaAES(fault_list_pair)
master_key = solver.get_master_key()  # Derive the AES-128 master key
print(master_key)

This code closes the loop from sample collection to full master-key recovery.

The experimental results show that DFA is practical against embedded AES

The original experiment accumulated 24 valid fault pairs and then successfully recovered the AES master key, ultimately reconstructing the final flag: flag{3e0d16a0a614c45b423628799b22f3de}. This confirms that the injection window, sample quality, and DFA solving pipeline all worked as intended.

AI Visual Insight: The image shows the result panel from batch fault injection. Different delay and pulse combinations produced filterable valid samples, which demonstrates that the experiment did not succeed through a single lucky hit but through parameter exploration that gradually converged on a stable fault window.
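Parameter exploration of this kind is easier to reason about if every attempt is logged and summarized. The helper below is a minimal sketch, assuming you record one (glitch_delay, glitch_pulse, is_round9_hit) tuple per attempt; the function name and the record layout are illustrative, and the sample records in the usage lines are made up to show the output shape, not measured data.

```python
from collections import Counter


def hit_rate_by_delay(log, bucket=500):
    """Summarize round-9 hit rate per glitch_delay bucket.

    log: iterable of (glitch_delay, glitch_pulse, is_round9_hit) tuples.
    Returns {bucket_start: hits / attempts}, ordered by bucket_start.
    """
    tries, hits = Counter(), Counter()
    for delay, _pulse, hit in log:
        start = delay // bucket * bucket
        tries[start] += 1
        hits[start] += bool(hit)
    return {start: hits[start] / tries[start] for start in sorted(tries)}


if __name__ == "__main__":
    # Made-up records, for illustration only
    sample_log = [
        (229600, 15, False),
        (229700, 16, True),
        (231000, 14, True),
    ]
    for start, rate in hit_rate_by_delay(sample_log).items():
        print(start, rate)
```

Buckets with a consistently higher rate are natural candidates for narrowing the random.randint range used for glitch_delay.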

AI Visual Insight: The image shows the output of the DFA solving stage, indicating that the system used multiple correct/faulty ciphertext pairs to recover the final-round subkey and then derive the master key.

AI Visual Insight: The image shows that after the recovered key was sent back to the target in the required format, the device entered the success state and lit PB0. This verifies that the recovered result is not only mathematically valid but also confirmed end to end on the target system.

This case shows that fault injection is a hardware-software problem

The AES algorithm itself is not weakened by this attack; the implementation is. Once a device exposes a stable trigger signal, a controllable power path, and a repeatable execution environment, an attacker can turn transient implementation faults into a key-recovery problem.

For real products, it is not enough to implement AES correctly. You should also consider clock monitoring, power-integrity checks, random delays, fault sensors, and redundant result verification as defense mechanisms.
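Redundant result verification is the simplest of these to sketch: compute the ciphertext twice and release it only if both runs agree, so a single transient fault never produces a usable faulty ciphertext. The example below is an illustration in Python rather than firmware code; encrypt is a hypothetical callable standing in for the device's AES routine, and FaultDetected is a name invented here.

```python
import hmac


class FaultDetected(Exception):
    """Raised when two runs of the same encryption disagree."""


def encrypt_checked(encrypt, key: bytes, block: bytes) -> bytes:
    """Encrypt twice and return the result only if both runs match.

    encrypt: hypothetical callable (key, block) -> ciphertext bytes.
    """
    first = encrypt(key, block)
    second = encrypt(key, block)
    if not hmac.compare_digest(first, second):
        # Suppress the output entirely: DFA needs the faulty ciphertext
        raise FaultDetected("ciphertext mismatch; output suppressed")
    return first


if __name__ == "__main__":
    # Stand-in "cipher" so the sketch runs without a crypto library
    def toy_encrypt(key, block):
        return bytes(k ^ b for k, b in zip(key, block))

    print(encrypt_checked(toy_encrypt, bytes(16), bytes(range(16))).hex())
```

This check raises the bar rather than closing the door: an attacker who can inject the same fault into both runs can still get past it, which is why it is usually combined with random delays and hardware glitch detection.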

FAQ answers the key implementation questions

1. Why is a trigger signal like PB11 necessary?

Without a trigger signal, glitching becomes blind and the probability of hitting a specific AES round is extremely low. PB11 provides a hard timing reference for “computation start,” which allows glitch_delay to map consistently to a specific round.

2. Why are only a few dozen valid samples often enough?

Because a round-9 fault creates a highly constrained four-byte differential structure. Each valid sample quickly narrows the candidate space for multiple key bytes, so the attack does not require massive amounts of data.

3. How can this type of attack be mitigated?

You can defend at three layers: in hardware, detect abnormal voltage and clock behavior; in firmware, add random jitter and duplicate-computation checks; at the protocol level, limit repeated triggering of sensitive operations and reduce exposure of debugging interfaces.

The core summary captures the practical lessons

This article reconstructs an AES fault-injection experiment based on an STM32 development board, focusing on three critical elements: voltage fault injection, PB11-based triggering, and DFA-based key recovery. It explains the wiring scheme, timing localization, injection script, and round-9 fault classification logic, and it closes with the experimental outcome and a developer-oriented FAQ.