A tutorial to explain use of Rust-Aya for writing tracepoint programs #157

Open
wants to merge 3 commits into
base: main
167 changes: 166 additions & 1 deletion docs/book/programs/tracepoints.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,168 @@
# Tracepoints

This page is a work in progress, please feel free to open a Pull Request!
!!! example "Source Code"

Full code for the example in this chapter is available [here](https://github.com/nsengupta/book/tree/main/examples/aya-tracepoint-echo-open).

## What are tracepoints in eBPF?

In the Linux kernel, tracepoints are hooks that the kernel developers have placed at predefined points in the code. They are statically defined: a given kernel build ships with a fixed set of them. You can attach code to these hooks, and crucially, you can do so at _runtime_. If code is attached to a tracepoint, the kernel calls it every time that tracepoint is reached.

You can find more information about tracepoints in the [kernel documentation](https://docs.kernel.org/trace/tracepoints.html).

--------------------------------

As a side note: tracepoints are not exclusive to eBPF. The need for such hooks into the kernel was felt as far back as 2005, according to this [article](https://lwn.net/Articles/852112/).

--------------------------------

How do you find out which tracepoints are available? On Ubuntu 22.04, running Linux kernel version 6.5.0-28-generic, you can list them with the command:

`sudo cat /sys/kernel/debug/tracing/available_events`

The output looks like this:

```bash
.....
irq:softirq_exit
irq:softirq_entry
irq:irq_handler_exit
irq:irq_handler_entry
syscalls:sys_exit_capset
syscalls:sys_enter_capset
syscalls:sys_exit_capget
.....
```

On this system, there are 2180 events available!

To avoid name collisions, each event is named using the pattern "\<subsystem name\>:\<tracepoint name\>", as can be seen in the output above.
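The naming pattern can be illustrated with a few lines of plain Rust. This is only a sketch: `split_event` is a hypothetical helper written for this page, not part of Aya or of any kernel interface, and the sample lines are copied from the output above.

```rust
/// Splits an event name such as "syscalls:sys_enter_capset" into
/// (subsystem, tracepoint). Returns None if the ':' separator is missing
/// or if either side of it is empty.
fn split_event(event: &str) -> Option<(&str, &str)> {
    let (subsystem, name) = event.trim().split_once(':')?;
    if subsystem.is_empty() || name.is_empty() {
        return None;
    }
    Some((subsystem, name))
}

fn main() {
    // A few lines taken from the sample listing above.
    let sample = "irq:softirq_exit\nirq:softirq_entry\nsyscalls:sys_enter_capset\n";
    for line in sample.lines() {
        if let Some((subsystem, name)) = split_event(line) {
            println!("subsystem={subsystem} tracepoint={name}");
        }
    }
}
```

The two halves of each name are what you need whenever a tool asks for a tracepoint "category" and a tracepoint "name" separately.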

## Example project

To illustrate tracepoints using Aya, let's write a program which informs us whenever a file is opened by any process running on the system.
The kernel tracepoint `syscalls:sys_enter_openat` lets us do this. We will use the `cargo generate` utility to build the template code structure:

```bash
cargo generate https://github.com/aya-rs/aya-template
Favorite `https://github.com/aya-rs/aya-template` not found in config, using it as a git repository: https://github.com/aya-rs/aya-template
Project Name: aya-tracepoint-echo-open
Destination: /home/nirmalya/Workspace-Rust/eBPF/my-second-ebpf/aya-tracepoint-echo-open ...
project-name: aya-tracepoint-echo-open ...
Generating template ...
✔ Which type of eBPF program? · tracepoint
Which tracepoint category? (e.g sched, net etc...): syscalls
Which tracepoint name? (e.g sched_switch, net_dev_queue): sys_enter_openat
Moving generated files into: `<Current working directory>/aya-tracepoint-echo-open`...
Initializing a fresh Git repository
Done! New project created <Current working directory>/aya-tracepoint-echo-open
```

Note that the project's name is _aya-tracepoint-echo-open_, the type of eBPF program is _tracepoint_ (from the menu), the category is _syscalls_ and the name of the tracepoint is _sys_enter_openat_.

The directory structure is similar to what is described [here](https://aya-rs.dev/book/start/#the-lifecycle-of-an-ebpf-program).

To build the application, move to the directory `aya-tracepoint-echo-open` and run the following commands:

```bash
# First, build the application
cargo xtask build-ebpf
# And, then run
RUST_LOG=info cargo xtask run
```

The output on the screen will be:

```bash
[2024-05-12T02:30:29Z INFO aya_tracepoint_echo_open] Waiting for Ctrl-C...
[2024-05-12T02:30:29Z INFO aya_tracepoint_echo_open] tracepoint sys_enter_openat called
[2024-05-12T02:30:29Z INFO aya_tracepoint_echo_open] tracepoint sys_enter_openat called
[2024-05-12T02:30:29Z INFO aya_tracepoint_echo_open] tracepoint sys_enter_openat called
[2024-05-12T02:30:29Z INFO aya_tracepoint_echo_open] tracepoint sys_enter_openat called
...
```

So, the program runs, but it is incomplete: we don't know which files are being opened. Let's modify the code to print the names of those files.

## The modified code

```rust linenums="1" title="aya-tracepoint-echo-open-ebpf/src/main.rs"
--8<-- "examples/aya-tracepoint-echo-open/aya-tracepoint-echo-open-ebpf/src/main.rs"
```

1. Design Attempt 1: Delegate to a simple implementation (commented out because it is not the right solution).
2. Design Attempt 1: A simple implementation which assumes that the path of an opened file is at most 16 bytes long.
3. Design Attempt 1: Using the address of the filename string (it lives in user-space memory, hence `bpf_probe_read_user_str_bytes`), copy the string into a buffer on the stack.
4. Design Attempt 2: The maximum length of the path to the file being opened.
5. Design Attempt 2: A `struct` that encapsulates the space for holding the Path String.
6. Design Attempt 2: The 'map' is created.
7. Design Attempt 2: The _buffer_ in the map is accessed.
8. Design Attempt 2: The bytes pointed to by `filename_addr` are copied into that buffer.

Note: We are going to rely on `aya-log` to print filenames from the eBPF program.

## eBPF code

- This is the code (its skeleton is generated by `cargo generate`) that runs in the Kernel's eBPF Virtual Machine. The pattern highlights how Aya programs are structured: `aya_tracepoint_echo_open(ctx: TracePointContext)` is a public function; it delegates the actual eBPF task to another function (`try_aya_tracepoint_echo_open(ctx: TracePointContext) -> Result<u32, i64>`).
- The `TracePointContext` is one of the goodies that Aya brings in. It works as a Rust-aware facade over the internal nuts and bolts that interact with the kernel's own APIs written in 'C'.
- The name of the file is a `string` in 'C' (a null-terminated array of `char`s). To access that string, we need the address of the byte at the start of the string. The offset at which this address resides is 24, according to the tracepoint's format description (printed by `sudo cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/format`, as mentioned in the comments of the code above). Using the context's `read_at()` function, the address is obtained and held in `filename_addr`.
- What `filename_addr` points to is the Path String of the file being opened, which is what we are interested in.
- Our aim is to access that content and print it as a UTF-8 string.
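To make the "read an address at offset 24" step concrete, here is a user-space sketch in plain Rust. It does not use Aya: `read_u64_at` is a hypothetical stand-in for what `ctx.read_at::<u64>(FILENAME_OFFSET)` does conceptually, and the 48-byte record and the address stored in it are made up for illustration. Only the offset 24 comes from the text above; check the `format` file on your own system before relying on it.

```rust
/// Offset of the `filename` pointer within the tracepoint record (from the text above).
const FILENAME_OFFSET: usize = 24;

/// Hypothetical stand-in for `TracePointContext::read_at::<u64>()`:
/// reads a native-endian u64 at `offset`, or returns None if the read
/// would go past the end of the record.
fn read_u64_at(record: &[u8], offset: usize) -> Option<u64> {
    let end = offset.checked_add(8)?;
    let bytes: [u8; 8] = record.get(offset..end)?.try_into().ok()?;
    Some(u64::from_ne_bytes(bytes))
}

fn main() {
    // A fake record: only the 8 bytes at offset 24 are populated.
    let mut record = [0u8; 48];
    let fake_addr: u64 = 0x7ffd_1234_5678; // made-up user-space address
    record[FILENAME_OFFSET..FILENAME_OFFSET + 8].copy_from_slice(&fake_addr.to_ne_bytes());

    let filename_addr = read_u64_at(&record, FILENAME_OFFSET).expect("record too short");
    println!("filename_addr = {filename_addr:#x}");
}
```

The value obtained this way is only an address. The string itself still has to be copied from the memory it points to, which is what the two design attempts below deal with.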

As it turns out, we can attempt a simpler yet incomplete implementation (attempt 1) and then improve upon it
(attempt 2).

## Design (attempt 1)

At the tracepoint `sys_enter_openat`, the kernel knows the name of the file (the path to the file, which may be a relative path), so we should be able to ask the kernel to share it with us. Then we can print the complete name of the file that is being opened.

Refer to the function:
```rust
fn try_aya_tracepoint_echo_open_small_file_path(ctx: TracePointContext) -> Result<u32, i64> {
    // ..
}
```
When it is called from inside
```rust
pub fn aya_tracepoint_echo_open(ctx: TracePointContext) -> u32 {
    // ..
}
```
in place of `try_aya_tracepoint_echo_open`, it works. The output looks like this:

```console
$ RUST_LOG=info cargo xtask run
[2024-05-16T15:32:29Z INFO aya_tracepoint_echo_open] Waiting for Ctrl-C...
[2024-05-16T15:32:30Z INFO aya_tracepoint_echo_open] tracepoint sys_enter_openat called, filename /proc/meminfo
[2024-05-16T15:32:30Z INFO aya_tracepoint_echo_open] tracepoint sys_enter_openat called, filename /sys/fs/cgroup/
[2024-05-16T15:32:30Z INFO aya_tracepoint_echo_open] tracepoint sys_enter_openat called, filename /sys/fs/cgroup/
[2024-05-16T15:32:30Z INFO aya_tracepoint_echo_open] tracepoint sys_enter_openat called, filename /sys/fs/cgroup/
```

But the program is not designed correctly. Why? There are two reasons:

1. The `buf` array is only 16 bytes long. The Path Strings of files being opened are likely to be much longer than this
(note: the kernel can accommodate 4096 bytes of Path String). Such a string cannot be copied in full into a small `buf`,
so we will not see the complete paths of all the files that are opened.

2. It is not possible to hold 4096 bytes on the eBPF stack (the limit is 512 bytes). Therefore, we have to resort to
some other mechanism to deal with this.
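The first problem can be reproduced in user space. The sketch below is a plain-Rust model, not the kernel helper: `copy_c_str` is a hypothetical function that copies a NUL-terminated string into a fixed-size buffer the way a bounded string copy does (at most `dest.len() - 1` bytes plus a terminating NUL). The path `/sys/fs/cgroup/cpu.max` is just an illustrative input.

```rust
/// Hypothetical user-space model of a bounded C-string copy: copies at most
/// `dest.len() - 1` bytes of `src` (stopping at the first NUL), appends a
/// NUL, and returns the copied bytes without the terminator.
fn copy_c_str<'a>(src: &[u8], dest: &'a mut [u8]) -> &'a [u8] {
    if dest.is_empty() {
        return &[];
    }
    // Length of the source string, not counting its NUL terminator.
    let src_len = src.iter().position(|&b| b == 0).unwrap_or(src.len());
    // Leave room for the terminator in the destination.
    let n = src_len.min(dest.len() - 1);
    dest[..n].copy_from_slice(&src[..n]);
    dest[n] = 0;
    &dest[..n]
}

fn main() {
    let path = b"/sys/fs/cgroup/cpu.max\0"; // illustrative path, 22 bytes + NUL
    let mut small = [0u8; 16];
    let copied = copy_c_str(path, &mut small);
    // Only the first 15 bytes fit next to the terminating NUL.
    println!("{}", String::from_utf8_lossy(copied)); // prints "/sys/fs/cgroup/"
}
```

With a 16-byte buffer, every path is cut off after 15 bytes. That is possibly why several lines in the output above end at `/sys/fs/cgroup/`, which is exactly 15 bytes long.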

As it happens, _Aya_ gives us access to a mechanism for this, in the form of **eBPF Maps**. Data stored in a map does not live on the eBPF stack, so it is not bound by the 512-byte limit. Moreover, maps are a means to share data between eBPF programs and user-space programs.

## Design (attempt 2)

We create a data structure `Buf` that can hold a buffer 4K long. Thereafter, an eBPF Map `BUF: PerCpuArray<Buf>` is
initialized. The `filename` can be as long as 4K (4096) bytes, but we can hold it in the map's own space. We
are no longer bound by eBPF's stack-size limitation.
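The size arithmetic behind this decision can be checked in ordinary Rust. This is a sketch only: `Buf` mirrors the struct used in the eBPF code, while `EBPF_STACK_LIMIT` and `fits_on_ebpf_stack` are hypothetical names introduced here to express the 512-byte limit mentioned earlier. Nothing in it talks to the kernel.

```rust
use std::mem::size_of;

/// Maximum path length handled by the example.
const MAX_PATH: usize = 4096;
/// eBPF stack limit in bytes, as stated in the text (hypothetical constant name).
const EBPF_STACK_LIMIT: usize = 512;

/// Same shape as the `Buf` struct stored in the per-CPU map.
#[repr(C)]
pub struct Buf {
    pub buf: [u8; MAX_PATH],
}

/// Returns true if a value of type T would fit within the stack limit.
fn fits_on_ebpf_stack<T>() -> bool {
    size_of::<T>() <= EBPF_STACK_LIMIT
}

fn main() {
    println!("size_of::<Buf>()  = {} bytes", size_of::<Buf>());
    println!("Buf fits on stack = {}", fits_on_ebpf_stack::<Buf>());
    println!("[u8; 16] fits     = {}", fits_on_ebpf_stack::<[u8; 16]>());
}
```

The 16-byte array of attempt 1 fits comfortably, while `Buf` is eight times larger than the whole stack, which is why it has to live in a map.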

When run, the output has the same format as earlier, this time including long paths:

```console
$ RUST_LOG=info cargo xtask run
[2024-05-17T10:13:16Z INFO aya_tracepoint_echo_open] Waiting for Ctrl-C...
[2024-05-17T10:13:16Z INFO aya_tracepoint_echo_open] Kernel tracepoint sys_enter_openat called, filename /proc/33769/oom_score_adj
[2024-05-17T10:13:16Z INFO aya_tracepoint_echo_open] Kernel tracepoint sys_enter_openat called, filename /snap/firefox/4259/usr/lib/firefox/glibc-hwcaps/x86-64-v4/libmozsandbox.so
[2024-05-17T10:13:16Z INFO aya_tracepoint_echo_open] Kernel tracepoint sys_enter_openat called, filename /snap/firefox/4259/usr/lib/firefox/glibc-hwcaps/x86-64-v3/libmozsandbox.so
[2024-05-17T10:13:16Z INFO aya_tracepoint_echo_open] Kernel tracepoint sys_enter_openat called, filename /snap/firefox/4259/usr/lib/firefox/glibc-hwcaps/x86-64-v2/libmozsandbox.so
```

2 changes: 2 additions & 0 deletions examples/aya-tracepoint-echo-open/.cargo/config.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
[alias]
xtask = "run --package xtask --"
19 changes: 19 additions & 0 deletions examples/aya-tracepoint-echo-open/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
### https://raw.github.com/github/gitignore/master/Rust.gitignore

# Generated by Cargo
# will have compiled files and executables
debug/
target/

# Remove Cargo.lock from gitignore if creating an executable, leave it for libraries
# More information here https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html
Cargo.lock

# These are IDE-specific configurations
**/.vscode
**/.vscode/*
**/.vim
**/.vim/*

# These are backup files generated by rustfmt
**/*.rs.bk
2 changes: 2 additions & 0 deletions examples/aya-tracepoint-echo-open/Cargo.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
[workspace]
members = ["xtask", "aya-tracepoint-echo-open", "aya-tracepoint-echo-open-common"]
25 changes: 25 additions & 0 deletions examples/aya-tracepoint-echo-open/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@

# aya-tracepoint-echo-open

This is an experiment with eBPF tracepoints, first using a small buffer on the stack and then using eBPF Maps.
The intention is to understand how to use the _tracepoints_ made available by the kernel.
The original code was generated using Rust-Aya 0.1.0 and then modified to help with experimentation.

## Prerequisites

1. Install bpf-linker: `cargo install bpf-linker`

## Build eBPF

```bash
cargo xtask build-ebpf
```

To perform a release build you can use the `--release` flag.
You may also change the target architecture with the `--target` flag.

## Run

```bash
RUST_LOG=info cargo xtask run
```
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
[package]
name = "aya-tracepoint-echo-open-common"
version = "0.1.0"
edition = "2021"

[features]
default = []
user = ["aya"]

[dependencies]
aya = { version = "0.12", optional = true }

[lib]
path = "src/lib.rs"
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
#![no_std]
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
[build]
target-dir = "../target"
target = "bpfel-unknown-none"

[unstable]
build-std = ["core"]
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
[package]
name = "aya-tracepoint-echo-open-ebpf"
version = "0.1.0"
edition = "2021"

[dependencies]
aya-ebpf = "0.1.0"
aya-log-ebpf = "0.1.0"
aya-tracepoint-echo-open-common = { path = "../aya-tracepoint-echo-open-common" }

[[bin]]
name = "aya-tracepoint-echo-open"
path = "src/main.rs"

[profile.dev]
opt-level = 3
debug = false
debug-assertions = false
overflow-checks = false
lto = true
panic = "abort"
incremental = false
codegen-units = 1
rpath = false

[profile.release]
lto = true
panic = "abort"
codegen-units = 1

[workspace]
members = []
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
[toolchain]
channel = "nightly"
# The source code of rustc, provided by the rust-src component, is needed for
# building eBPF programs.
components = [
"cargo",
"clippy",
"rust-docs",
"rust-src",
"rust-std",
"rustc",
"rustfmt",
]
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
#![no_std]
#![no_main]

use aya_ebpf::{
    helpers::bpf_probe_read_user_str_bytes,
    macros::{map, tracepoint},
    maps::PerCpuArray,
    programs::TracePointContext,
};
use aya_log_ebpf::info;

// (4)
const MAX_PATH: usize = 4096;

// (5)
#[repr(C)]
pub struct Buf {
    pub buf: [u8; MAX_PATH],
}

#[map]
pub static mut BUF: PerCpuArray<Buf> = PerCpuArray::with_max_entries(1, 0); // (6)

#[tracepoint]
pub fn aya_tracepoint_echo_open(ctx: TracePointContext) -> u32 {
    // match try_aya_tracepoint_echo_open_small_file_path(ctx) { // (1)
    match try_aya_tracepoint_echo_open(ctx) {
        Ok(ret) => ret,
        Err(ret) => ret as u32,
    }
}

fn try_aya_tracepoint_echo_open(ctx: TracePointContext) -> Result<u32, i64> {
    // Load the pointer to the filename. The offset value can be found by running:
    // sudo cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/format
    const FILENAME_OFFSET: usize = 24;

    if let Ok(filename_addr) = unsafe { ctx.read_at::<u64>(FILENAME_OFFSET) } {
        // get the map-backed buffer that we're going to use as storage for the filename
        let buf = unsafe {
            let ptr = BUF.get_ptr_mut(0).ok_or(0)?; // (7)
            &mut *ptr
        };

        // read the filename
        let filename = unsafe {
            core::str::from_utf8_unchecked(
                bpf_probe_read_user_str_bytes( // (8)
                    filename_addr as *const u8,
                    &mut buf.buf,
                )?
            )
        };

        if filename.len() < MAX_PATH {
            // log the filename
            info!(
                &ctx,
                "Kernel tracepoint sys_enter_openat called, filename {}", filename
            );
        }
    }
    Ok(0)
}

// This function assumes that a file's path is at most 16 bytes long. It is meant
// to be read as an example only. Refer to the accompanying `tracepoints.md` for the
// reason it is included in the code.
#[allow(dead_code)] // only called when line (1) above is uncommented
fn try_aya_tracepoint_echo_open_small_file_path(ctx: TracePointContext) -> Result<u32, i64> { // (2)
    // An array length must be a `usize`
    const MAX_SMALL_PATH: usize = 16;
    let mut buf: [u8; MAX_SMALL_PATH] = [0; MAX_SMALL_PATH];

    // Load the pointer to the filename. The offset value can be found by running:
    // sudo cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/format
    const FILENAME_OFFSET: usize = 24;
    if let Ok(filename_addr) = unsafe { ctx.read_at::<u64>(FILENAME_OFFSET) } {
        // read the filename
        let filename = unsafe {
            // Get a UTF-8 string from a slice of bytes
            core::str::from_utf8_unchecked(
                // Use the address of the filename string // (3)
                // to copy its contents into the array named 'buf'
                match bpf_probe_read_user_str_bytes(
                    filename_addr as *const u8,
                    &mut buf,
                ) {
                    // Keep only the bytes that were read (without the trailing NUL)
                    Ok(bytes) => bytes,
                    Err(e) => {
                        info!(&ctx, "tracepoint sys_enter_openat called, bpf_probe_read failed {}", e);
                        return Err(e);
                    },
                }
            )
        };
        info!(&ctx, "tracepoint sys_enter_openat called, filename {}", filename);
    }
    Ok(0)
}

#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    unsafe { core::hint::unreachable_unchecked() }
}