The Joys of Linux Kernel ROP Gadget Scanning
Linux Kernel ROP gadget scanning is one of those things that seems easy in theory – just run ROPgadget --binary vmlinux on it!
In practice, however, anyone who has used that method has likely had to sift through a large number of false positives and missed some gadgets due to false negatives.
This is a result of a few quirks of Linux kernel images, some of which make solving the false positive/negative problems a bit difficult.
I want to use this post to describe some of the complexity behind static ROP gadget scanning in modern Linux kernel images and discuss how I handle these issues in my fork of ropr called kropr.
The Executable Section Problem
Let’s start with probably the most well-known problem leading to false positives: the fact that generic ROP gadget scanners do not account for some sections of the kernel being executable only at boot time.
Here are all the executable regions in an Ubuntu kernel image (output from readelf):
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 1] .text PROGBITS ffffffff81000000 00001000
0000000001600000 0000000000000000 AX 0 0 4096
[21] .init.text PROGBITS ffffffff838ae000 02850000
00000000000c8725 0000000000000000 AX 0 0 16
[22] .altinstr_aux PROGBITS ffffffff83976725 02918725
00000000000032b2 0000000000000000 AX 0 0 1
[29] .altinstr_replacement PROGBITS ffffffff83d4728a 02ce9286
0000000000008dcd 0000000000000000 AX 0 0 1
[31] .exit.text PROGBITS ffffffff83d50090 02cf2090
00000000000046a5 0000000000000000 AX 0 0 16
This vmlinux contains five executable sections, all of which would be viable locations to find ROP gadgets in a normal binary. For the Linux kernel, however, this is not the case.
We can see this by booting the kernel and looking at the output from the gdb-pt-dump utility in gdb, which dumps the page tables along with their permissions/attributes:
gef> pt
Address : Length Permissions Region
...
0xffffffff81000000 : 0x1600000 | W:0 X:1 S:1 UC:0 WB:1 G:1 | kernel
0xffffffff82600000 : 0xda0000 | W:0 X:0 S:1 UC:0 WB:1 G:1 | kernel
0xffffffff833a0000 : 0x9c5000 | W:1 X:0 S:1 UC:0 WB:1 G:1 | kernel
0xffffffff83d65000 : 0x1000 | W:0 X:0 S:1 UC:0 WB:1 G:1 | kernel
0xffffffff83d66000 : 0x69a000 | W:1 X:0 S:1 UC:0 WB:1 G:1 | kernel
...
The only executable region here matches what we saw in the readelf output previously for the .text section. As such, the .text section is the only one we should care about when scanning for gadgets.
In kropr, I address this source of false positives by just filtering for the .text section when parsing the kernel image.
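For illustration, here is roughly what that filtering looks like as a minimal sketch using the goblin crate, with hypothetical names (kropr inherits ropr’s actual ELF handling, so this is not the real implementation):

use goblin::elf::Elf;

/// Locate the kernel's .text section, ignoring boot-time-only executable
/// sections like .init.text and .altinstr_replacement.
fn find_text_section(bytes: &[u8]) -> Option<(usize, usize, u64)> {
    let elf = Elf::parse(bytes).ok()?;
    elf.section_headers.iter().find_map(|sh| {
        match elf.shdr_strtab.get_at(sh.sh_name) {
            // Only .text survives past boot with execute permissions.
            Some(".text") => Some((sh.sh_offset as usize, sh.sh_size as usize, sh.sh_addr)),
            _ => None,
        }
    })
}

The returned file offset and size bound the only byte range worth feeding to the gadget scanner, and the virtual address lets gadget addresses be printed as they will appear at runtime.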
The Thunk Problem
In response to speculative execution vulnerabilities, Linux had to do some strange things to control-flow instructions to mitigate particular attacks. One of these measures was to turn all returns and all indirect calls/jumps into calls/jumps to thunks.
Here is an example of what this actually looks like:
gef> disas free_pipe_info
Dump of assembler code for function free_pipe_info:
0xffffffff814fc470 <+0>: nop DWORD PTR [rax+rax*1+0x0]
0xffffffff814fc475 <+5>: push rbp
0xffffffff814fc476 <+6>: mov rbp,rsp
0xffffffff814fc479 <+9>: push r12
0xffffffff814fc47b <+11>: push rbx
...
0xffffffff814fc4df <+111>: mov rax,QWORD PTR [rax+0x8]
0xffffffff814fc4e3 <+115>: call 0xffffffff8222fcc0 <__x86_indirect_thunk_rax>
...
0xffffffff814fc52a <+186>: pop rbx
0xffffffff814fc52b <+187>: pop r12
0xffffffff814fc52d <+189>: pop rbp
0xffffffff814fc52e <+190>: xor eax,eax
0xffffffff814fc530 <+192>: xor edx,edx
0xffffffff814fc532 <+194>: xor esi,esi
0xffffffff814fc534 <+196>: xor edi,edi
0xffffffff814fc536 <+198>: jmp 0xffffffff82230460 <__x86_return_thunk>
In the above code, where you would expect to see an indirect call we instead see a call to __x86_indirect_thunk_rax, and where you would expect to see a ret instruction at the end of the function we instead see a jump to __x86_return_thunk.
These thunks are actually due to mitigations against two different microarchitectural vulnerabilities. One of these is Spectre V2, which can be mitigated via retpolines; this is the mitigation that adds __x86_indirect_thunk_<register> calls to the code in place of the expected call <register> instructions. The other vulnerability is Retbleed, which can be mitigated via jmp2ret (more details can be found in the Retbleed paper); this is the mitigation that adds __x86_return_thunk jumps in place of return instructions.
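For reference, the body of a retpoline thunk roughly looks like the classic retpoline construction below (the exact body varies with kernel version and configuration, so treat this as a sketch):

__x86_indirect_thunk_rax:
    call 2f                     ; push the address of the speculation trap as the return address
1:  pause                       ; speculatively-executed returns land here and spin harmlessly
    lfence
    jmp 1b
2:  mov QWORD PTR [rsp], rax    ; overwrite the pushed return address with the real target
    ret                         ; architecturally 'returns' to the address in rax

The CPU’s return predictor speculates into the pause/lfence loop rather than into attacker-controlled branch targets, which is what makes the indirect branch safe on affected CPUs.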
So, how do these thunks affect ROP gadget scanning? Well, they actually cause some pretty major problems…
False Negatives From Thunks
Here is an example of some output from kropr, ropr, and ROPgadget:
┌──(jmill@ubun)-[~]
└─$ kropr --patch-rets=false --patch-retpolines=false ./ubuntu-vmlinux | grep 0xffffffff8191f11c
0xffffffff8191f11c: pop rdi; jmp 0xffffffff82230460 <__x86_return_thunk>;
==> Found 175774 gadgets in 1.579 seconds
┌──(jmill@ubun)-[~]
└─$ ropr ./ubuntu-vmlinux | grep 0xffffffff8191f11c
==> Found 456762 gadgets in 2.583 seconds
┌──(jmill@ubun)-[~]
└─$ ROPgadget --binary ./ubuntu-vmlinux | grep 0xffffffff8191f11c
0xffffffff8191f11c : pop rdi ; jmp 0xffffffff82230460
(ignore the kropr flags for now, we’ll get to those later)
You can see there is a pop rdi; ret; gadget that ropr is entirely unable to find because it does not account for thunked returns.
On the other hand, ROPgadget is actually able to find it, but its output makes it unclear that this is actually a ROP gadget rather than a JOP (Jump Oriented Programming) gadget.
So, this is an instance of a false negative in the case of ropr, and a true positive that is difficult to visually parse in the case of ROPgadget, which may lead to it being overlooked.
I address this in kropr, as can be seen in the above output, by adding symbol names for thunked calls/jumps/returns.
False Positives From Thunks
So, as we saw, thunks can introduce false negatives, but as it turns out they can also introduce false positives!
Here is an example of two gadgets found by kropr:
0xffffffff810c41ff: jmp 0xffffffff82230460 <__x86_return_thunk>;
0xffffffff810c4200: pop rsp; ret 0x116;
Notice that these gadgets are 1 byte apart in memory: the second gadget actually starts with an unaligned instruction in the second byte of the jump instruction of the first gadget.
This is kind of interesting, because normally ret is a single-byte instruction (0xc3), meaning there cannot be an unaligned instruction inside of it, but as a result of these mitigations we now have these extra unaligned gadgets.
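Decoding the bytes by hand shows exactly where the unaligned gadget comes from. The jmp is a five-byte e9 instruction whose rel32 displacement is the target minus the address of the next instruction (0xffffffff82230460 - 0xffffffff810c4204 = 0x116c25c), and those displacement bytes happen to decode as useful instructions on their own:

0xffffffff810c41ff: e9 5c c2 16 01    jmp 0xffffffff82230460 <__x86_return_thunk>
0xffffffff810c4200:    5c             pop rsp     ; displacement byte 1
0xffffffff810c4201:       c2 16 01    ret 0x116   ; displacement bytes 2-4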
So… that second gadget is real, right?
Well, maybe?
The thing is, __x86_return_thunk is, as was stated, a mitigation against the Retbleed vulnerability. Retbleed only impacted AMD’s Zen 1-2 CPUs, and this mitigation comes with a performance hit. To dodge that perf hit on unaffected CPUs, kernel developers made these thunks conditionally applied at runtime: the kernel will actually patch itself during startup depending on what CPU you are running it on.
If you are running a Zen 1-2 CPU affected by Retbleed, then it is a real gadget; it will be present at runtime. On other CPUs, which are not affected by Retbleed, these gadgets are false positives because the thunk will be patched to something else.
As an example, let’s check on my Zen 3 CPU running this kernel under Qemu with KVM enabled and the --cpu host argument being passed:
gef> x/i 0xffffffff810c41ff
0xffffffff810c41ff: jmp 0xffffffff8250410b <srso_alias_return_thunk>
gef> x/i 0xffffffff810c41ff+1
0xffffffff810c4200: (bad)
wat.
So, Zen 3 is actually vulnerable to an entirely different return-instruction-related speculative execution vulnerability called Speculative Return Stack Overflow (SRSO, aka Inception). This vulnerability has its own thunk, srso_alias_return_thunk, that gets patched over the jmp __x86_return_thunk instructions at boot if your CPU is vulnerable to SRSO.
So, I guess it’s at this point that I wrap up the blog and admit that static ROP gadget discovery for the Linux kernel is impractical to do without some false negatives/positives or full knowledge of all of the CPU features and mitigations applicable to the target system.
Or it would be, but actually I’m not done yapping quite yet!!!
Thunk Patching
Just because it is impractical to account for all possible CPUs someone might be using doesn’t mean it isn’t worth trying to make a reliable ROP gadget scanner! What I want is a happy-medium default configuration: keeping false negatives low while eliminating as many false positives as possible.
In kropr, to deal with the thunk problem I actually patch out all of the thunk calls/jumps/returns by default, eliminating false positives from unaligned instructions inside thunks, with the nice side effect of making gadgets that contain thunks look more like you would expect them to.
If you remember, earlier in the post I said to ignore the arguments in this command:
┌──(jmill@ubun)-[~]
└─$ kropr --patch-rets=false --patch-retpolines=false ./ubuntu-vmlinux | grep 0xffffffff8191f11c
0xffffffff8191f11c: pop rdi; jmp 0xffffffff82230460 <__x86_return_thunk>;
Well, here is the output without those arguments (though I needed to add --nouniq to prevent the gadget from being deduplicated with the other pop rdi; ret gadgets):
┌──(jmill@ubun)-[~]
└─$ kropr --nouniq ./ubuntu-vmlinux | grep 0xffffffff8191f11c
0xffffffff8191f11c: pop rdi; ret;
Kinda nice, eh? It’s not a thunk anymore, it’s just a normal return!
And the same is true of retpoline thunks:
┌──(jmill@ubun)-[~]
└─$ kropr --patch-retpolines=false ./ubuntu-vmlinux | grep 0xffffffff810efe0b
0xffffffff810efe0b: jmp 0xffffffff8222fda0 <__x86_indirect_thunk_rdi>;
┌──(jmill@ubun)-[~]
└─$ kropr --nouniq ./ubuntu-vmlinux | grep 0xffffffff810efe0b
0xffffffff810efe0b: jmp rdi;
It’s just a normal jump now!
Additionally, the case earlier with the unaligned instruction inside the ret thunk has also been addressed, because the return is back to being a single-byte instruction:
Before:
0xffffffff810c41ff: jmp 0xffffffff82230460 <__x86_return_thunk>;
0xffffffff810c4200: pop rsp; ret 0x116;
After:
0xffffffff810c41ff: ret;
So, what is this witchcraft? Am I doing some cursed post-processing string replacement?
Nope, but honestly that might have been easier!
Instead, what I do in kropr is partially re-implement the kernel’s self-patching routine that happens when a CPU is not vulnerable to any of the vulnerabilities that necessitate thunked calls, jumps, or returns. The code that does this in Linux can be found here for returns, and here for retpoline jumps/calls.
In short, for returns it iterates over the entries in the .return_sites section of the kernel image, which contains offsets to all of the jump instructions to thunked returns. It then replaces each thunk jump with a ret instruction followed by four int3 instructions, overwriting the entire five-byte jump instruction.
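A minimal sketch of that return patching, assuming the .return_sites entries (stored as 32-bit offsets relative to each entry’s own location) have already been resolved into byte offsets within the .text buffer; the function name here is hypothetical:

/// Patch 5-byte `jmp __x86_return_thunk` instructions (e9 + rel32) into
/// `ret` followed by four `int3` bytes, mirroring what the kernel does at
/// boot on CPUs that do not need the return thunk.
fn patch_return_sites(text: &mut [u8], site_offsets: &[usize]) {
    for &off in site_offsets {
        // Only patch if there is room and this really is a near jmp (0xe9).
        if off + 5 <= text.len() && text[off] == 0xe9 {
            text[off] = 0xc3;                  // ret
            text[off + 1..off + 5].fill(0xcc); // int3 padding over the old rel32
        }
    }
}

The int3 padding mirrors the kernel’s own choice, which (as I understand it) keeps anything that lands just past the ret, including straight-line speculation, on a trapping instruction.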
For retpolines it iterates over the entries in the .retpoline_sites section of the kernel image, which contains offsets to all of the calls/jumps to retpoline thunks. For each entry it will decode the instruction to determine which register the thunk corresponds to and whether it is a call or jump instruction. It then patches over the existing thunk instruction with the typical version (e.g. call __x86_indirect_thunk_rdi becomes call rdi) and then fills the remaining space after the new instruction with nop instructions.
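Here is a simplified sketch of that retpoline patching, assuming each site is a plain five-byte e8/e9 call/jmp and that the thunk’s register number has already been resolved from the symbol table (the real kernel routine uses a full instruction decoder and handles more cases):

/// Rewrite a 5-byte `call/jmp __x86_indirect_thunk_<reg>` into a direct
/// `call reg`/`jmp reg`, NOP-padding the leftover bytes.
/// `reg` is the x86-64 register number (rax = 0 ... r15 = 15).
fn patch_retpoline_site(text: &mut [u8], off: usize, reg: u8) {
    if off + 5 > text.len() || reg > 15 {
        return;
    }
    let is_call = match text[off] {
        0xe8 => true,  // call rel32
        0xe9 => false, // jmp rel32
        _ => return,
    };
    // FF /2 = call reg, FF /4 = jmp reg; r8-r15 need a REX.B prefix.
    let modrm_base: u8 = if is_call { 0xd0 } else { 0xe0 };
    let mut i = off;
    if reg >= 8 {
        text[i] = 0x41; // REX.B
        i += 1;
    }
    text[i] = 0xff;
    text[i + 1] = modrm_base + (reg & 7);
    // Pad the rest of the original 5-byte instruction with NOPs.
    for b in &mut text[i + 2..off + 5] {
        *b = 0x90;
    }
}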
There is some additional complexity in each of these routines that I’m glossing over, which deals with various kernel configurations and other microarchitectural mitigations, but it isn’t all that important for the purposes of finding reliable gadgets.
The Alternatives Problem
As though the Thunk Problem wasn’t enough of a doozy, dealing with ‘alternatives’ is even more painful; I don’t even try to account for them in kropr at the moment!
So, what is an ‘alternative’?
The Linux kernel supports many different x86_64 processors with many different hardware features, and some of these features determine whether certain instructions are valid on the processor or not. Newer generations of processors may introduce new instructions which are related to security features or even just provide a faster alternative to an existing instruction.
Alternatives, in the context of the Linux kernel, account for the case where an address should hold some instruction by default but a different instruction if the CPU’s features allow it.
For example, CPUs started to support SMAP (Supervisor Mode Access Prevention) in 2012, which introduced two instructions – stac and clac. The stac instruction sets a bit in the eflags register which temporarily disables the enforcement of SMAP, and the clac instruction clears that bit, re-enabling the enforcement of SMAP.
If you’ve ever wondered how the function copy_from_user in the kernel works when SMAP is enabled, this is how. They do stac -> memory read -> clac, temporarily bypassing SMAP enforcement for the access of userspace memory.
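Conceptually, that uaccess window looks something like this (simplified; in the real kernel the stac/clac are emitted through the alternatives machinery so they can be patched out on CPUs without SMAP):

stac                  ; set EFLAGS.AC - SMAP checks suspended
mov rax, [rsi]        ; rsi = userspace pointer, the read is now permitted
clac                  ; clear EFLAGS.AC - SMAP enforced again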
On a CPU that doesn’t support the SMAP feature, these instructions would raise an Invalid Opcode exception. This means that the kernel needs to only use the stac and clac instructions in copy_from_user if the SMAP feature is actually supported. Alternatives are what make this possible.
There is a section of the kernel image for alternatives called .altinstructions which specifies:
- The location of an instruction that should be conditionally replaced
- An offset into another section called .altinstr_replacement which contains the alternate instruction’s code
- A ‘cpuid’ value representing a CPU feature related to this alternative
- A ‘flags’ value used to specify additional information about when the alternative should be applied
- The length of the original instruction
- The length of the replacement instruction
The actual struct for these entries in the .altinstructions section that is used by the kernel can be found here.
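The exact layout shifts between kernel versions, but in recent kernels the entries correspond roughly to the following (expressed here as a packed Rust struct purely for illustration; both offsets are relative to the address of the field itself):

/// Approximation of a recent kernel's struct alt_instr; older kernels differ.
#[repr(C, packed)]
struct AltInstr {
    instr_offset: i32,  // location of the original instruction, relative to this field
    repl_offset: i32,   // replacement code in .altinstr_replacement, relative to this field
    ft_flags: u32,      // CPU feature id in the low 16 bits, flags in the high 16 bits
    instrlen: u8,       // length of the original instruction
    replacementlen: u8, // length of the replacement instruction
}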
Since these instructions can be replaced during boot, they serve as a source of both false positives and false negatives: an instruction in a gadget might be replaced at runtime, creating a false positive, while a useful instruction might only be present at runtime, creating a false negative.
When doing static ROP gadget scanning there is no way to know what CPU the user is targeting without their input. The data from the host’s CPU could be enumerated via cpuid, but that will be different from the set of CPU features supported in a Qemu VM, even if using KVM and passing --cpu host! If the VM specifies --cpu kvm64 or --cpu qemu64, the set of features will be even less similar to that of the host.
I think there are a few options to address this problem:
- Allow the user to provide a CPUID dump or /proc/cpuinfo content from the target kernel
- Allow the user to specify a CPU model and have a database of what features common CPUs support
- Provide an option that filters out any gadgets that overlap with any of the instructions in .altinstructions to remove any false positives (a sketch of this check follows the list)
- Provide a reasonable default configuration of alternatives to apply, e.g., I think we can assume that most CPUs support the SMAP-related instructions these days
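To give a taste of the third option, here is a minimal sketch with hypothetical names, assuming the .altinstructions entries have already been parsed into virtual-address ranges covering each original instruction:

/// An instruction range that may be rewritten at boot, taken from an
/// .altinstructions entry: [vaddr, vaddr + len).
struct AltSite {
    vaddr: u64,
    len: u64,
}

/// Keep only gadgets (given as (start_vaddr, byte_len) pairs) whose bytes
/// cannot be rewritten at boot, i.e. that overlap no alternative site.
fn filter_alt_overlaps(gadgets: Vec<(u64, u64)>, alt_sites: &[AltSite]) -> Vec<(u64, u64)> {
    gadgets
        .into_iter()
        .filter(|&(start, len)| {
            !alt_sites
                .iter()
                .any(|a| start < a.vaddr + a.len && a.vaddr < start + len)
        })
        .collect()
}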
While I do want to support some of these in kropr eventually, none of these options are currently implemented. I don’t think alternatives have a major impact on the number of false positives/negatives, at least not anywhere near as bad as the other problems I discussed. All of these replacements are related to the CPU architecture, which means there aren’t that many of them, and the replacements mostly add instructions that are not typically used in ROP chains.
The Conclusion Problem
Thanks for reading, that’s all I’ve got :3
This whole rabbit hole of trying to improve Linux kernel ROP gadget discovery was really fun to go down, and led to the creation of what I think is a pretty useful tool!
At this point kropr has existed as a fork for a bit over a year and I’ve been using it for kernel pwn since its creation. Despite never really advertising it outside of my lab, it’s actually gained a decent amount of attention, which is always nice to see. Anyways, if you do any Linux kernel pwn you should check it out and open an issue on the repo if you run into any problems while using it!
Github Link: https://github.com/zolutal/kropr