In this post, we'll look at what the eBPF verifier is, why it's necessary, and how to work within the limits it imposes.
We'll begin with why the verifier is needed and the rules it enforces. After that, I'll show some flawed BPF programs along with the verifier errors they produce, and then we'll correct them.
The idea for this post came to me while developing Alaz, Ddosify's eBPF agent. All the source code referenced in this article is available here.
If you are interested in function tracing with eBPF, you can check out our Exploring Function Tracing With eBPF and Uprobes - Episode 1 blog.
Prerequisites
- Clang for compiling our eBPF programs.
- A Linux machine to work on.
- bpftool for loading our compiled BPF programs into the kernel.
What are eBPF and the Verifier?
eBPF (extended Berkeley Packet Filter) is a revolutionary technology originating from the Linux kernel. It allows sandboxed programs to run in privileged contexts inside the kernel.
Compiled eBPF bytecode runs on the eBPF virtual machine residing inside the kernel. It takes in a program in the form of eBPF bytecode instructions, and these have to be converted to native machine instructions that run on the CPU.
eBPF bytecode consists of a set of instructions, and those instructions act on (virtual) eBPF registers. The eBPF instruction set and register model were designed to map neatly to common CPU architectures so that the step of compiling or interpreting from bytecode to machine code is reasonably straightforward. - (Learning eBPF by Liz Rice)
The verifier is the component that ensures BPF programs can be loaded into the kernel safely. It has to guarantee that a program cannot crash the kernel or access anything it should not.
Image has been retrieved from ebpf.io
Let's briefly discuss eBPF registers before delving into the verifier. We'll come across them in the following examples and verifier logs.
eBPF Registers
eBPF bytecode runs on the eBPF VM inside the kernel. This VM has its own instruction set and software-defined registers, and the instructions operate on those registers.
The eBPF virtual machine has a total of 11 registers, R0 through R10. All of them are 64 bits wide.
R0 holds the return value.
R1-R5 hold function arguments.
R6-R9 are callee-saved registers.
R10 is the read-only stack frame pointer.
When the program starts, its context is passed in R1.
Because only R1-R5 carry arguments, a function call can take at most 5 arguments.
You can look at the BPF registers and the instruction structure in bpf.h in the kernel source.
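To make the register and instruction layout concrete, here is a minimal userspace sketch that mirrors the 8-byte instruction layout from the kernel's bpf.h and encodes the two-instruction sequence `r0 = 0; exit`. The `toy_` and `TOY_` names are made up for illustration and are not part of any kernel API. The opcode bytes it produces, 0xb7 and 0x95, are the same values shown in parentheses in the verifier logs later in this post.

```c
#include <stdint.h>

/* Illustrative mirror of the kernel's struct bpf_insn (linux/bpf.h).
 * All toy_ names are hypothetical; real programs use the kernel header. */
struct toy_insn {
    uint8_t code;        /* opcode */
    uint8_t dst_reg : 4; /* destination register, R0..R10 */
    uint8_t src_reg : 4; /* source register, R0..R10 */
    int16_t off;         /* signed offset */
    int32_t imm;         /* signed immediate */
};

/* Register numbers: 11 registers in total, R0 through R10 */
enum { TOY_R0 = 0, TOY_R1 = 1, TOY_R10 = 10, TOY_NUM_REGS = 11 };

/* Opcode building blocks, same values as the kernel's BPF_* macros */
#define TOY_ALU64 0x07 /* instruction class: 64-bit ALU */
#define TOY_JMP   0x05 /* instruction class: jump */
#define TOY_MOV   0xb0 /* ALU operation: move */
#define TOY_EXIT  0x90 /* jump operation: exit */
#define TOY_K     0x00 /* operand source: immediate */

/* Encodes "dst = imm" as a 64-bit move of an immediate */
static struct toy_insn toy_mov64_imm(uint8_t dst, int32_t imm)
{
    struct toy_insn insn = {
        .code = TOY_ALU64 | TOY_MOV | TOY_K,
        .dst_reg = dst & 0x0f,
        .src_reg = 0,
        .off = 0,
        .imm = imm,
    };
    return insn;
}

/* Encodes "exit", which returns whatever is in R0 */
static struct toy_insn toy_exit(void)
{
    struct toy_insn insn = {
        .code = TOY_JMP | TOY_EXIT,
        .dst_reg = 0,
        .src_reg = 0,
        .off = 0,
        .imm = 0,
    };
    return insn;
}
```

Calling `toy_mov64_imm(TOY_R0, 0)` followed by `toy_exit()` gives the smallest program that leaves R0 initialized before returning.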
Register types are listed in enum bpf_reg_type. As a summary, I'll show the first 3; the rest are pointers to different types of structs.
enum bpf_reg_type {
NOT_INIT = 0, /* nothing was written into register */
SCALAR_VALUE, /* reg doesn't contain a valid pointer */
PTR_TO_CTX, /* reg points to bpf_context */
PTR_TO_...
...
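How does the verifier do it?
The eBPF verifier first runs a DAG (Directed Acyclic Graph) check on the program to disallow loops (newer kernels accept loops whose bound the verifier can prove), along with other control-flow checks such as detecting unreachable instructions.
Then, starting from the first instruction, it simulates the possible execution paths (not literally all of them, since paths it can prove equivalent to ones already checked are pruned) and tracks the state of the BPF registers and the BPF stack along the way.
You can read more about it in the official documentation.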
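The register tracking described above can be sketched as a toy model in plain userspace C. This is not the kernel's algorithm: it only handles straight-line programs with four made-up operations, and every `toy_` name is an assumption for illustration. What it does show is the core idea of walking the instructions while recording whether each register has been written, and rejecting any read of a register that has not.

```c
#include <stddef.h>

/* Toy register states, loosely modelled on the first entries of bpf_reg_type */
enum toy_reg_state {
    TOY_NOT_INIT = 0,
    TOY_SCALAR_VALUE,
    TOY_PTR_TO_CTX,
    TOY_PTR_TO_STACK
};

/* Hypothetical, heavily simplified instruction set */
enum toy_op { TOY_OP_MOV_IMM, TOY_OP_MOV_REG, TOY_OP_CALL, TOY_OP_EXIT };

struct toy_step {
    enum toy_op op;
    int dst; /* destination register for the MOV operations */
    int src; /* source register for TOY_OP_MOV_REG */
};

/* Returns 0 if the program is accepted, otherwise the 1-based position
 * of the rejected instruction (len + 1 if the program never exits). */
static int toy_verify(const struct toy_step *prog, size_t len)
{
    enum toy_reg_state regs[11] = { TOY_NOT_INIT };

    regs[1] = TOY_PTR_TO_CTX;    /* the context arrives in R1 */
    regs[10] = TOY_PTR_TO_STACK; /* R10 is the frame pointer */

    for (size_t i = 0; i < len; i++) {
        const struct toy_step *s = &prog[i];

        switch (s->op) {
        case TOY_OP_MOV_IMM:
            if (s->dst < 0 || s->dst > 9) /* R10 is read-only */
                return (int)(i + 1);
            regs[s->dst] = TOY_SCALAR_VALUE;
            break;
        case TOY_OP_MOV_REG:
            if (s->dst < 0 || s->dst > 9 || s->src < 0 || s->src > 10)
                return (int)(i + 1);
            if (regs[s->src] == TOY_NOT_INIT) /* reading an unwritten register */
                return (int)(i + 1);
            regs[s->dst] = regs[s->src];
            break;
        case TOY_OP_CALL:
            /* A helper call leaves its result in R0 and clobbers R1-R5 */
            regs[0] = TOY_SCALAR_VALUE;
            for (int r = 1; r <= 5; r++)
                regs[r] = TOY_NOT_INIT;
            break;
        case TOY_OP_EXIT:
            if (regs[0] == TOY_NOT_INIT) /* exit reads R0 as the return value */
                return (int)(i + 1);
            return 0;
        default:
            return (int)(i + 1);
        }
    }
    return (int)(len + 1);
}
```

A program consisting only of `TOY_OP_EXIT` is rejected at position 1, while the same program preceded by `TOY_OP_MOV_IMM` into R0 or by `TOY_OP_CALL` is accepted.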
Example Verifier Errors
In the following sections, we'll take a look at 3 BPF programs, each with a different kind of verifier error. We'll investigate them and make corrections. All flawed and corrected programs reside in the GitHub repository here, so you can follow along.
First, here are the commands used to compile the BPF programs, load them into the kernel, and view the verifier logs.
Commands for compiling bpf programs
We'll use clang to compile our BPF programs.
clang-14 -O2 -g -Wall -Werror -target bpf -c bpf_flawed.c -I ../
clang-14 -O2 -g -Wall -Werror -target bpf -c bpf_corrected.c -I ../
Please note that the -g option tells the compiler to include debug information in the object file.
This lets us see source code lines alongside the BPF instructions in the verifier logs.
Commands for loading programs into kernel
Using bpftool, we'll try to load the programs into the kernel.
At this point the verifier runs its checks on them.
Try to load the flawed program into the kernel:
bpftool prog load bpf_flawed.o /sys/fs/bpf/flawed
bpftool --debug prog load bpf_corrected.o /sys/fs/bpf/corrected # debug flag is for seeing verifier logs
You'll need root permissions to load programs into the kernel.
1 - Unreleased reference
In the following flawed snippet, we reserve a ringbuf slot and submit an event to userspace through it, but only for pids that are odd numbers.
/* reserve sample from BPF ringbuf */
e = bpf_ringbuf_reserve(&rb, sizeof(*e), 0);
if (!e)
    return 0;
// Just for the demonstration, we'll only submit an event for pids that end with odd numbers
if (pid % 2 == 0) {
    return 0;
}
e->pid = pid;
bpf_ringbuf_submit(e, 0);
return 0;
The verifier log that awaits us:
Unreleased reference id=3 alloc_insn=10
Let's look at instruction 10, which alloc_insn=10 points to:
10: (85) call bpf_ringbuf_reserve#131 ; R0=ringbuf_mem_or_null(id=3,ref_obj_id=3,off=0,imm=0) refs=3
On the right side, the verifier logs its internal state. The reference to the reserved ringbuf slot is tracked with id 3 (refs=3).
Later, the odd-number check is made at instruction 13, and for even pids the program jumps to the basic block that exits.
13: (57) r1 &= 1 ; R1_w=scalar(umax=1,var_off=(0x0; 0x1)) refs=3
14: (15) if r1 == 0x0 goto pc+4 19: R0=ringbuf_mem(ref_obj_id=3,off=0,imm=0) R1_w=0 R6=scalar(id=4,umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 refs=3
Let's look at the instruction right before the exit.
19: (b7) r0 = 0 ; R0_w=0 refs=3
20: (95) exit
At the exit, the program still holds the reference to the object with id 3 that we got from the bpf_ringbuf_reserve call. That's why the verifier stops us from loading the program into the kernel.
We need to release the reference, and that's what we'll do in the corrected program.
// Just for the demonstration, we'll only submit an event for pids that end with odd numbers
if (pid % 2 == 0) {
    // Release reserved ringbuf location
    bpf_ringbuf_discard(e, 0);
    return 0;
}
When pid is an even number, we now call bpf_ringbuf_discard before returning.
Note that there are no refs left before the exit instruction this time.
13: (57) r1 &= 1 ; R1_w=scalar(umax=1,var_off=(0x0; 0x1)) refs=3
14: (55) if r1 != 0x0 goto pc+4 ; R1_w=0 refs=3
17: (85) call bpf_ringbuf_discard#133 ;
23: (b7) r0 = 0 ; R0_w=0
24: (95) exit
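Another way to avoid the leak is to run the filter before reserving, so the early return never holds a reference in the first place. The sketch below is plain userspace C, not BPF code: `toy_ringbuf_reserve`, `toy_ringbuf_submit`, and `toy_ringbuf_discard` are hypothetical stand-ins for the real helpers, and all they do is count outstanding reservations, roughly like the `refs=3` bookkeeping in the logs above. Both handler shapes end every path with zero outstanding reservations.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_event {
    uint32_t pid;
};

/* Userspace stand-ins for the ringbuf helpers (hypothetical names).
 * They only track how many reservations are still unreleased. */
static struct toy_event toy_slot;
static int toy_outstanding; /* reserved but not yet submitted or discarded */
static int toy_submitted;   /* events handed over to "userspace" */

static struct toy_event *toy_ringbuf_reserve(void)
{
    toy_outstanding++;
    return &toy_slot;
}

static void toy_ringbuf_submit(struct toy_event *e)
{
    (void)e;
    toy_outstanding--;
    toy_submitted++;
}

static void toy_ringbuf_discard(struct toy_event *e)
{
    (void)e;
    toy_outstanding--;
}

/* Variant A: filter first, reserve only when an event will be sent */
static int toy_handle_filter_first(uint32_t pid)
{
    if (pid % 2 == 0)
        return 0; /* nothing reserved yet, so nothing to release */

    struct toy_event *e = toy_ringbuf_reserve();
    if (!e)
        return 0;
    e->pid = pid;
    toy_ringbuf_submit(e);
    return 0;
}

/* Variant B: the corrected shape from this post, reserve then discard */
static int toy_handle_discard(uint32_t pid)
{
    struct toy_event *e = toy_ringbuf_reserve();
    if (!e)
        return 0;
    if (pid % 2 == 0) {
        toy_ringbuf_discard(e); /* release before the early return */
        return 0;
    }
    e->pid = pid;
    toy_ringbuf_submit(e);
    return 0;
}
```

Variant A skips the reserve and discard round trip for filtered events, while variant B keeps the structure of the original program. Which one fits better depends on whether the filter needs data that is only available after the reservation.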
2 - R0 !read_ok
When I ran into this error during development, it really confused me, but it's actually very simple. Let's look at an example, and I'll tell you why I was confused.
SEC("uprobe/test_uprobe_func")
void BPF_UPROBE(test_uprobe_func) {
    // totally empty
}
For demonstration, I've implemented a user probe that does nothing. As you'd expect, it compiles fine. But when we try to load it into the kernel, we get the following:
0: (95) exit
R0 !read_ok
The thing is, eBPF programs return their value in register R0 no matter what your function's return type is. The exit instruction reads R0 as the return value, but here it was never initialized. That's why the verifier throws the error.
To get rid of this error, you don't necessarily need to change your function's return type to int. You just need to initialize R0 before exit, that's all :)
SEC("uprobe/test_uprobe_func")
void BPF_UPROBE(test_uprobe_func) {
    // These func calls will initialize R0
    bpf_printk("Add these log lines"); // 4: R0_w=scalar()
    bpf_printk("So that");
    bpf_printk("R0 gets initialized"); // R0=scalar()
}
Simply adding some prints makes the error go away, because R0 also receives the return value of a helper call.
Looking at the verifier logs, we see that R0 is initialized this time:
4: R0_w=scalar()
...
8: R0_w=scalar()
; bpf_printk("R0 gets initialized"); // R0=scalar()
At some point in development, you may run into this error unexpectedly if you use void as the return type like I did, so I suggest always using int to avoid it. The following is more suitable:
SEC("uprobe/test_uprobe_func1")
int BPF_UPROBE(test_uprobe_func1) {
    return 0;
}
3 - Invalid Mem Access
This one is pretty straightforward. Typically, it occurs when a NULL check is missing or when the program tries to read out-of-bounds memory. Let's look at an example.
SEC("ksyscall/execve")
int kprobe_exec(void *ctx)
{
    struct msg_t *p;
    u64 id = bpf_get_current_pid_tgid();
    p = bpf_map_lookup_elem(&id_name_map, &id);
    char a = p->message[0];
    bpf_printk("%d", a);
    return 0;
}
In the verifier logs, we can see that the error is raised for char a = p->message[0]. The program dereferences R0, which holds the result of bpf_map_lookup_elem, but we never checked it for NULL. That is the reason.
; char a = p->message[0];
8: (71) r3 = *(u8 *)(r0 +0)
R0 invalid mem access 'map_value_or_null'
Once we add the NULL check before dereferencing the pointer, the verifier is happy.
p = bpf_map_lookup_elem(&id_name_map, &id);
// NULL check
if (p) {
    char a = p->message[0];
    bpf_printk("%d", a);
}
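The two causes mentioned for this error, a missing NULL check and an out-of-bounds read, can be guarded against together. The following is a userspace sketch under stated assumptions: `toy_map_lookup` is a hypothetical stand-in for bpf_map_lookup_elem that returns NULL on a miss, and `TOY_MSG_LEN` is an assumed size because the post does not show the definition of struct msg_t. It illustrates the order of the checks rather than real BPF map code.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MSG_LEN 16 /* assumed size; struct msg_t is not shown in the post */

struct toy_msg {
    char message[TOY_MSG_LEN];
};

struct toy_entry {
    uint64_t key;
    struct toy_msg value;
    int used;
};

/* A tiny fixed-size table standing in for a BPF hash map */
static struct toy_entry toy_map[4];

/* Stand-in for bpf_map_lookup_elem: pointer to the value, or NULL on a miss */
static struct toy_msg *toy_map_lookup(uint64_t key)
{
    for (size_t i = 0; i < sizeof(toy_map) / sizeof(toy_map[0]); i++) {
        if (toy_map[i].used && toy_map[i].key == key)
            return &toy_map[i].value;
    }
    return NULL;
}

/* Reads message[idx] for the given key into *out.
 * Returns 0 on success, -1 on a lookup miss or an out-of-range index. */
static int toy_read_msg_byte(uint64_t key, size_t idx, char *out)
{
    struct toy_msg *p = toy_map_lookup(key);

    if (!p)
        return -1; /* the NULL check the verifier insists on */
    if (idx >= TOY_MSG_LEN)
        return -1; /* keep the access inside the value */

    *out = p->message[idx];
    return 0;
}
```

In a real BPF program, the verifier needs to see both checks on the path that leads to the load, so the checks should sit directly before the dereference as they do here.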
Conclusion
We've introduced the verifier, explored BPF registers, and worked through some common errors. In the upcoming blog post, we'll cover a range of less common error cases. Additionally, I will be updating the repository independently of the blogs. Please feel free to contribute more examples to the repository!
On the Ddosify Platform, our eBPF Agent, Alaz, utilizes eBPF to gather insights and collect observability data from Kubernetes clusters. The technique demonstrated in this blog is actually employed in capturing encrypted traffic. Don't forget to drop a star on GitHub.
If you require any assistance with Ddosify or wish to discuss this topic further, feel free to join our Discord Server.