Menno Markus

Part 2: Hardware Breakpoints

May 02, 2026 • 9 min read

The code for this part can be found here. Feeling lost? Consider starting at part 1.

In the previous post we set up a basic debugger structure. In this post we’ll finally get to the exciting part: setting a breakpoint! There are two types of breakpoints: hardware breakpoints and software breakpoints. This part focuses on hardware breakpoints; we’ll discuss what software breakpoints even are in the next post.

A hardware breakpoint, as the name implies, is implemented by the CPU hardware. The details differ per architecture, but on x86 this is controlled by the debug registers, of which there are 8. Registers DR0 through DR3 specify the address of a breakpoint. We’ll ignore DR4 and DR5 as they are generally unused. Register DR6 is the debug status register and DR7 the debug control register. We’ll explain further how each of them works below, but the Wikipedia page is a good reference if you need to know more beyond this post.

Manipulating registers

You probably know that a CPU executes instructions that operate on registers. Each thread in our program has a register context associated with it, holding the current values of whatever the thread was executing. Even the location of the instruction currently being executed is stored here, in the instruction pointer register (RIP). This allows the OS to suspend execution by saving the context, then later resume running by restoring all the register values. If we take a peek at the x86_64 CONTEXT structure we can spot our debug registers among its fields.

pub const CONTEXT = struct {
    // ...
    EFlags: DWORD, // 32 bit register.
    //...
    Dr0: DWORD64, // 64 bit registers.
    Dr1: DWORD64,
    Dr2: DWORD64,
    Dr3: DWORD64,
    Dr6: DWORD64,
    Dr7: DWORD64,
    //...
    Rip: DWORD64,
    //...
};

If we want to set the debug registers, we’ll need a way to modify this context. Windows provides two functions for this: GetThreadContext and SetThreadContext. These functions take a thread handle and return/apply the CONTEXT struct respectively. They are privileged functions though; not just anyone can call them. We need a thread handle with the right access rights.

You may recall from the previous part that DEBUG_EVENT gave us a thread id [1]. We can use this to open a thread handle and ask for the get/set thread context permissions. Whenever you open a handle, Windows requires you to call CloseHandle when you are done with it.

const thread_access = win.THREAD_GET_CONTEXT | win.THREAD_SET_CONTEXT;
const thread_handle = win.OpenThread(thread_access, win.FALSE, thread_id);
_ = win.CloseHandle(thread_handle);

Given the thread handle we ask for the thread context. Notice how we first set the ContextFlags field to CONTEXT_ALL. Windows reads this field to know which values of the struct we are asking for. It might skip certain fields depending on the flags set.

var thread_ctx = std.mem.zeroes(win.CONTEXT);
thread_ctx.ContextFlags = win.CONTEXT_ALL; // Get and set (nearly) all of the thread context.
if (win.GetThreadContext(thread_handle, &thread_ctx) == win.FALSE) {
    return error.GetThreadContextFailed;
}
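To make the role of ContextFlags more concrete, here is a small standalone sketch. The constants are redeclared with the values the Windows SDK header winnt.h gives them on x86-64, so the sketch compiles on its own; that the `win` bindings used in this post expose identical values is an assumption, and `requests` is a hypothetical helper. It shows that CONTEXT_ALL covers the debug register group, while the commonly used CONTEXT_FULL does not.

```zig
const std = @import("std");

// Values as defined in winnt.h for x86-64, redeclared so the sketch is self-contained.
const CONTEXT_AMD64: u32 = 0x0010_0000;
const CONTEXT_CONTROL: u32 = CONTEXT_AMD64 | 0x01; // Rip, Rsp, EFlags, SegCs, SegSs.
const CONTEXT_INTEGER: u32 = CONTEXT_AMD64 | 0x02; // General purpose registers.
const CONTEXT_SEGMENTS: u32 = CONTEXT_AMD64 | 0x04; // SegDs, SegEs, SegFs, SegGs.
const CONTEXT_FLOATING_POINT: u32 = CONTEXT_AMD64 | 0x08;
const CONTEXT_DEBUG_REGISTERS: u32 = CONTEXT_AMD64 | 0x10; // Dr0-Dr3, Dr6, Dr7.

const CONTEXT_FULL: u32 = CONTEXT_CONTROL | CONTEXT_INTEGER | CONTEXT_FLOATING_POINT;
const CONTEXT_ALL: u32 = CONTEXT_FULL | CONTEXT_SEGMENTS | CONTEXT_DEBUG_REGISTERS;

/// Hypothetical helper: true when `flags` requests every bit of `group`.
fn requests(flags: u32, group: u32) bool {
    return (flags & group) == group;
}

pub fn main() void {
    std.debug.print("CONTEXT_FULL = 0x{X}, debug registers requested: {}\n", .{
        CONTEXT_FULL,
        requests(CONTEXT_FULL, CONTEXT_DEBUG_REGISTERS),
    });
    std.debug.print("CONTEXT_ALL  = 0x{X}, debug registers requested: {}\n", .{
        CONTEXT_ALL,
        requests(CONTEXT_ALL, CONTEXT_DEBUG_REGISTERS),
    });
}
```

If only the debug registers and EFlags are needed, CONTEXT_CONTROL | CONTEXT_DEBUG_REGISTERS should be enough, but CONTEXT_ALL keeps things simple.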

We’ll be manipulating the context a lot, so let’s create some wrapper functions. Repeatedly opening a handle isn’t exactly great; a real debugger should probably store it off to the side. But we’re far from a real debugger.

pub fn getThreadContext(thread_id: win.DWORD) !struct { win.HANDLE, win.CONTEXT } {
    const thread_access = win.THREAD_GET_CONTEXT | win.THREAD_SET_CONTEXT;
    const thread_handle = win.OpenThread(thread_access, win.FALSE, thread_id);

    var thread_ctx = std.mem.zeroes(win.CONTEXT);
    thread_ctx.ContextFlags = win.CONTEXT_ALL; // Get and set (nearly) all of the thread context.
    if (win.GetThreadContext(thread_handle, &thread_ctx) == win.FALSE) {
        _ = win.CloseHandle(thread_handle);
        return error.GetThreadContextFailed;
    }

    return .{ thread_handle, thread_ctx };
}

pub fn setThreadContext(thread_handle: win.HANDLE, thread_ctx: win.CONTEXT) !void {
    const success: win.WINBOOL = win.SetThreadContext(thread_handle, &thread_ctx);
    _ = win.CloseHandle(thread_handle);

    if (success == win.FALSE) {
        return error.SetThreadContextFailed;
    }
}

Setting a breakpoint

To set an instruction breakpoint we must first pick one of DR0 through DR3 and fill in the instruction address we want to break on. Easy enough! But we must also enable that register through the DR7 control register. This register is a bitset with fields for each debug register. To visualise its layout, let’s make use of a Zig feature to specify the bit fields of a 64-bit integer:

pub const Dr7 = packed struct(u64) {
    local_enable_0: u1,
    global_enable_0: u1,
    local_enable_1: u1,
    global_enable_1: u1,
    local_enable_2: u1,
    global_enable_2: u1,
    local_enable_3: u1,
    global_enable_3: u1,

    unused0: u8,

    trigger_condition_0: u2,
    trigger_size_0: u2,
    trigger_condition_1: u2,
    trigger_size_1: u2,
    trigger_condition_2: u2,
    trigger_size_2: u2,
    trigger_condition_3: u2,
    trigger_size_3: u2,

    unused1: u32,
};

As you can see, there are three distinct pieces of information here for each debug register: a 1-bit enable flag, a 2-bit condition field and a 2-bit size field. The condition and size reveal a powerful feature of hardware breakpoints. They can break not only on execution, but also on data reads and writes!

Condition value 0 is for execution, and when it is set the size must also be 0 to indicate 1 byte. Condition value 1 is for data writes only and 3 is for data reads and writes. Size can then be set to 0 through 3, indicating 1 byte, 2 bytes, 8 bytes or 4 bytes respectively [2]. We’ll only implement instruction breakpoints here, leaving data breakpoints as an exercise for the reader.
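The encodings above can be written down as two small enums. This is only a sketch: `Condition`, `Size` and `sizeCode` are hypothetical names that are not part of the code for this post. Condition value 2, which the text skips, is reserved for I/O port breakpoints and is listed only for completeness.

```zig
const std = @import("std");

/// Values for the 2-bit condition fields of DR7.
const Condition = enum(u2) {
    execute = 0, // Instruction execution; the size must be `one_byte`.
    write = 1, // Data writes only.
    io_read_write = 2, // I/O port access; not useful for a user-mode debugger.
    read_write = 3, // Data reads and writes.
};

/// Values for the 2-bit size fields of DR7. Note that 8 bytes comes before 4.
const Size = enum(u2) {
    one_byte = 0,
    two_bytes = 1,
    eight_bytes = 2,
    four_bytes = 3,
};

/// Hypothetical helper: map a watch width in bytes to its size code,
/// or null when the hardware cannot watch that width.
fn sizeCode(byte_count: usize) ?Size {
    return switch (byte_count) {
        1 => Size.one_byte,
        2 => Size.two_bytes,
        4 => Size.four_bytes,
        8 => Size.eight_bytes,
        else => null,
    };
}

pub fn main() void {
    const widths = [_]usize{ 1, 2, 3, 4, 8 };
    for (widths) |width| {
        if (sizeCode(width)) |code| {
            std.debug.print("{d} byte(s) -> size code {d}\n", .{ width, @intFromEnum(code) });
        } else {
            std.debug.print("{d} byte(s) -> not supported\n", .{width});
        }
    }
}
```

Since both enums are backed by u2, they could replace the raw u2 fields in a packed struct like Dr7 without changing its layout.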

Recall that in the previous part we left off at implementing a setBreakpoint function. Let’s fill it in now! It’s worth noting that we only set our breakpoint on a single thread. A real debugger would apply it across all threads, but we ignore multithreading.

pub fn setBreakpoint(thread_id: win.DWORD, break_at_address: usize) !void {
    const thread_handle, var thread_ctx = try getThreadContext(thread_id);

    thread_ctx.Dr0 = break_at_address;

    var debug_control: Dr7 = @bitCast(thread_ctx.Dr7);
    debug_control.local_enable_0 = 1;                   // Enable.
    debug_control.trigger_condition_0 = 0;              // Break on instruction execution.
    debug_control.trigger_size_0 = 0;                   // 1 byte.
    thread_ctx.Dr7 = @bitCast(debug_control);

    try setThreadContext(thread_handle, thread_ctx);
    log.info("Successfully set breakpoint at address 0x{X}", .{break_at_address});
}

Handling a breakpoint hit

When the CPU hits a hardware breakpoint it raises an exception, similar to our custom exception in the previous part. The exception raised is called EXCEPTION_SINGLE_STEP. This might seem like a strange name but will make more sense in the next part. You can probably already imagine what else it might be used for… Let’s add a case to catch this exception.

if (debug_event.dwDebugEventCode == win.EXCEPTION_DEBUG_EVENT) {
    const exception_code = debug_event.u.Exception.ExceptionRecord.ExceptionCode;

    // Received the signal the debugee is ready.
    if (exception_code == 0xE0000001) {
        // ...
        try setBreakpoint(thread_id, break_at_address);
    }

    // A hardware breakpoint has been hit.
    if (exception_code == win.EXCEPTION_SINGLE_STEP) {
        log.info("Breakpoint hit! Continue?", .{});
        _ = try stdin.takeDelimiter('\n') orelse unreachable;

        try handleBreakpoint(thread_id);
    }
}

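So far we skipped over an important detail though. Exceptions can be classified as one of 3 categories: faults, traps and aborts. A fault is reported before the offending instruction executes, a trap is reported after the instruction has completed, and an abort signals a severe error after which execution cannot reliably continue.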

Hardware instruction breakpoints cause a fault exception, stopping before the instruction executes. With our breakpoint still set, resuming execution would immediately break again! So you might imagine we’ll have to perform a complicated dance involving temporarily removing our breakpoint… and believe me, we’ll get there… in another part! x86_64 actually has a feature for this: the resume flag. This 1-bit flag lives in the EFlags register and disables instruction breakpoints for 1 instruction, then clears itself.

We can change the EFlags register the same way as before. We’ll also check the debug status register DR6 to confirm it was our breakpoint, DR0, that got hit.

pub fn handleBreakpoint(thread_id: win.DWORD) !void {
    const thread_handle, var thread_ctx = try getThreadContext(thread_id);

    const debug_status_dr0 = thread_ctx.Dr6 & 0x1;
    std.debug.assert(debug_status_dr0 == 1); // Check DR0 was hit by reading debug status bit 0.

    const resume_flag_bit = 0x00010000;
    thread_ctx.EFlags |= resume_flag_bit; // Disable breakpoints for 1 instruction.

    try setThreadContext(thread_handle, thread_ctx);
    log.info("Resumed execution.", .{});
}
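The assert above only looks at bit 0 of DR6. As a sketch of how the rest of the status register could be decoded, here is a packed struct in the same style as Dr7. The `Dr6` type and the `hitSlot` helper are hypothetical names; the bit positions follow the x86 debug status register layout, where bits 0 through 3 report which of DR0 through DR3 triggered and bit 14 reports a single step.

```zig
const std = @import("std");

pub const Dr6 = packed struct(u64) {
    hit_0: u1, // Bit 0: the breakpoint in DR0 triggered.
    hit_1: u1, // Bit 1: DR1.
    hit_2: u1, // Bit 2: DR2.
    hit_3: u1, // Bit 3: DR3.

    unused0: u9, // Bits 4-12.

    debug_register_access: u1, // Bit 13.
    single_step: u1, // Bit 14.
    task_switch: u1, // Bit 15.

    unused1: u48, // Bits 16-63.
};

/// Hypothetical helper: index of the first debug register reported as hit, if any.
fn hitSlot(raw_dr6: u64) ?u2 {
    const status: Dr6 = @bitCast(raw_dr6);
    if (status.hit_0 == 1) return 0;
    if (status.hit_1 == 1) return 1;
    if (status.hit_2 == 1) return 2;
    if (status.hit_3 == 1) return 3;
    return null;
}

pub fn main() void {
    // Example raw values; a debugger would pass thread_ctx.Dr6 instead.
    const samples = [_]u64{ 0x1, 0x4, 0x4000 };
    for (samples) |raw| {
        if (hitSlot(raw)) |slot| {
            std.debug.print("DR6 = 0x{X}: breakpoint in DR{d} was hit\n", .{ raw, slot });
        } else {
            std.debug.print("DR6 = 0x{X}: no breakpoint slot reported\n", .{raw});
        }
    }
}
```

Returning an optional instead of asserting lets the caller decide what to do when the exception came from something other than one of the four breakpoint slots.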

Result

Let’s set a breakpoint on the start of the doWork() function. Try it out, and… we got a hit! Continuing through, we can see the function executes, incrementing counter, before entering our function in a loop again. Perfect!

info(debugee): Address of doWork() == 0x7FF72A1B1720
info(debugee): Address of counter == 0x7FF72A279600
info(debugee): Waiting for debugger...
info(debugger): Set hardware breakpoint at address?
0x7FF72A1B1720
info(debugger): Successfully set breakpoint at address 0x7FF72A1B1720
info(debugee): Starting work!
info(debugger): Breakpoint hit! Continue?
y
info(debugger): Resumed execution.
info(debugee): 0
info(debugger): Breakpoint hit! Continue?
y
info(debugger): Resumed execution.
info(debugee): 1
info(debugger): Breakpoint hit! Continue?

It should also be trivial to implement a data breakpoint watching for writes to counter instead.
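As a starting point, here is a sketch of the DR7 value such a watchpoint would need. It assumes counter is a 4-byte integer (the post does not state its type) and that the watchpoint goes into slot 1, next to the instruction breakpoint in slot 0. `dataBreakpointBits` is a hypothetical helper that uses plain shifts instead of the packed struct to stay short.

```zig
const std = @import("std");

/// Hypothetical helper: the DR7 bits that enable a breakpoint in `slot` (0-3).
/// `condition`: 1 = data writes, 3 = data reads and writes.
/// `size_code`: 0, 1, 2, 3 = 1, 2, 8, 4 bytes.
fn dataBreakpointBits(slot: u2, condition: u2, size_code: u2) u64 {
    // Local enable bits sit at bits 0, 2, 4 and 6.
    const local_enable = @as(u64, 1) << (@as(u6, slot) * 2);
    // Each slot owns 4 bits starting at bit 16: 2 for the condition, then 2 for the size.
    const field_shift: u6 = 16 + @as(u6, slot) * 4;
    const condition_bits = @as(u64, condition) << field_shift;
    const size_bits = @as(u64, size_code) << (field_shift + 2);
    return local_enable | condition_bits | size_bits;
}

pub fn main() void {
    const existing_dr7: u64 = 0x1; // Instruction breakpoint in DR0, as set by setBreakpoint.
    // Slot 1, break on writes (1), watch 4 bytes (size code 3).
    const new_dr7 = existing_dr7 | dataBreakpointBits(1, 1, 3);
    std.debug.print("DR7 = 0x{X}\n", .{new_dr7}); // Prints "DR7 = 0xD00005".
}
```

In a debugger, the address of counter would go into Dr1 and the combined value into Dr7 before calling setThreadContext. The watched address has to be aligned to the watched size.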

There are, however, quite a few limitations with hardware breakpoints. For one, we can only set a maximum of 4! If you ever got a debugger message complaining you can’t set more than 4 data breakpoints, you might start to see why. But most debuggers can set more than 4 instruction breakpoints. What’s going on? This is where software breakpoints come into play. We’ll take a look at these in the next part.

Footnotes

  1. When we called CreateProcessA the PROCESS_INFORMATION struct actually also returned a thread handle. It might be tempting to try to use it. But remember it might not have the right permissions for what we need.

  2. This ordering may seem strange, but 8-byte breakpoints did not exist on 32-bit processors, so that encoding was added later.
