Contributor

@will-isovalent will-isovalent commented Aug 19, 2025

Introduce a next-generation debug framework that provides structured debug messages from
BPF programs via perf buffers instead of the traditional bpf_printk() mechanism.

Why do this? It would be nice to include helpful debugging messages throughout our eBPF
codebase. However, until now we had no way to selectively output debugging based on
subsystem, and our previous bpf_printk lacked a standardized way of providing much-needed
context in its output. Moreover, reading from the tracelog with bpftool can be tedious and
error-prone in practice, particularly when juggling multiple terminals. The new debugging
framework solves all of these problems by providing a standardized way to encode and
submit debug messages to userspace from bpf code. Subsequent commits in this series will
leverage the new framework to implement fine-grained subsystem filtering for BPF debug messages.

Key changes:

  • Add PERF_DEBUG build flag and --enable-perf-debug CLI option
  • Create new debug.h header with structured debug event framework
  • Add debug reader in userspace to parse and display structured debug messages
  • Debug reader is implemented via a separate perf event map to avoid overfilling the events buffer and risking lost events
  • Refactor set_in_init_tree() to accept context parameter for debug integration

The new framework provides:

  • Structured debug messages with timestamp, PID, and CPU information
  • Efficient message formatting using BPF snprintf() helper
  • Conditional compilation based on TETRAGON_PERF_DEBUG flag
  • Fallback to traditional bpf_printk() when perf debug is disabled
  • Runtime control via --enable-perf-debug command line flag
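
As a rough illustration of the userspace side, here is a minimal Go sketch of decoding one structured debug record into its timestamp, PID, CPU, and message fields. The record layout (little-endian, a 128-byte NUL-terminated message buffer) and the names `rawDebugEvent`, `DebugEvent`, and `DecodeDebugEvent` are assumptions made for this example, not the layout or API defined in debug.h or the actual reader.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// msgLen is a hypothetical fixed size for the formatted message buffer.
const msgLen = 128

// rawDebugEvent mirrors an ASSUMED C struct layout; the real struct in
// debug.h may use different fields, sizes, or ordering.
type rawDebugEvent struct {
	TimestampNs uint64
	PID         uint32
	CPU         uint32
	Msg         [msgLen]byte
}

// DebugEvent is the decoded, display-friendly form of a record.
type DebugEvent struct {
	TimestampNs uint64
	PID         uint32
	CPU         uint32
	Message     string
}

var errShortRecord = errors.New("debug record shorter than expected")

// DecodeDebugEvent parses one raw record as it might arrive from the
// dedicated debug perf event map.
func DecodeDebugEvent(record []byte) (DebugEvent, error) {
	var raw rawDebugEvent
	if len(record) < binary.Size(raw) {
		return DebugEvent{}, errShortRecord
	}
	if err := binary.Read(bytes.NewReader(record), binary.LittleEndian, &raw); err != nil {
		return DebugEvent{}, err
	}
	// The message is NUL-terminated inside a fixed-size buffer.
	msg := raw.Msg[:]
	if i := bytes.IndexByte(msg, 0); i >= 0 {
		msg = msg[:i]
	}
	return DebugEvent{
		TimestampNs: raw.TimestampNs,
		PID:         raw.PID,
		CPU:         raw.CPU,
		Message:     string(msg),
	}, nil
}

// String renders the event with its context fields in front of the message.
func (e DebugEvent) String() string {
	return fmt.Sprintf("[%d] cpu=%d pid=%d %s", e.TimestampNs, e.CPU, e.PID, e.Message)
}
```

A reader loop would call `DecodeDebugEvent` on each raw sample and print the result. Keeping the context fields in a fixed header is what lets every message carry the same metadata regardless of its format string.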

@will-isovalent will-isovalent added the release-note/misc This PR makes changes that have no direct user impact. label Aug 19, 2025
@will-isovalent will-isovalent force-pushed the pr/will/next-generation-debugging branch 2 times, most recently from c07ed4a to c357edd Compare August 20, 2025 15:55
@olsajiri
Contributor

olsajiri commented Aug 21, 2025

> Debug reader is implemented via a separate perf event map to avoid overfilling the events buffer and risk losing events

given the last discussion with @kevsecurity on ring buffers, would the new ring buffer be a better choice here?

there's also stdout/stderr stream support that was recently added to the kernel https://lore.kernel.org/bpf/[email protected]/ which I was thinking of using eventually. It might be faster, but I guess it will take some time before it hits customers' stable releases. Maybe just keep that in mind, so we could use the same framework in the future with streams inside.

@kevsecurity
Contributor

> given the last discussion with @kevsecurity on ring buffers, would the new ring buffer be better choice in here?

I should have a PR soon where we default to the BPF ring buffer instead of the perf ring buffer (and fall back when told to or when it is not available). I'd imagine you could do the same by copying or refactoring.

But in terms of benefits, bpf ring buffer's reserve and commit approach will save adding more heap maps (as we have a hard upper limit of 64 maps/program and this will become important sometime). Also, one less copy has to be good!

@will-isovalent
Contributor Author

Nice I actually wanted to use a ring buffer originally but thought it would be out of scope to add support for it. I'm happy to update the PR when we have initial ringbuffer support.

@kevsecurity
Contributor

> Nice I actually wanted to use a ring buffer originally but thought it would be out of scope to add support for it. I'm happy to update the PR when we have initial ringbuffer support.

Awesome. Shouldn't be long; just some tests to fix.

@will-isovalent will-isovalent force-pushed the pr/will/next-generation-debugging branch from c357edd to e550e66 Compare August 21, 2025 15:40
netlify bot commented Aug 21, 2025

Deploy Preview for tetragon ready!

Name Link
🔨 Latest commit db2bdbb
🔍 Latest deploy log https://app.netlify.com/projects/tetragon/deploys/68a748a1b95d740008e22b0e
😎 Deploy Preview https://deploy-preview-4024--tetragon.netlify.app

@will-isovalent will-isovalent force-pushed the pr/will/next-generation-debugging branch 10 times, most recently from d911e31 to 1309e52 Compare August 26, 2025 02:53
Introduce a next-generation debug framework that provides structured debug messages from
BPF programs via perf buffers instead of the traditional bpf_printk() mechanism.

Why do this? It would be nice to include helpful debugging messages throughout our eBPF
codebase. However, until now we had no way to selectively output debugging based on
subsystem, and our previous bpf_printk lacked a standardized way of providing much-needed
context in its output. Moreover, reading from the tracelog with bpftool can be tedious and
error-prone in practice, particularly when juggling multiple terminals. The new debugging
framework solves all of these problems by providing a standardized way to encode and
submit debug messages to userspace from bpf code. Subsequent commits in this series will
leverage the new framework to implement fine-grained subsystem filtering for BPF debug messages.

Key changes:
    - Add PERF_DEBUG build flag and --enable-perf-debug CLI option
    - Create new debug.h header with structured debug event framework
    - Add debug reader in userspace to parse and display structured debug messages
    - Debug reader is implemented via a separate perf event map to avoid overfilling the events buffer and risking lost events
    - Refactor set_in_init_tree() to accept context parameter for debug integration

The new framework provides:
    - Structured debug messages with timestamp, PID, and CPU information
    - Efficient message formatting using BPF snprintf() helper
    - Conditional compilation based on TETRAGON_PERF_DEBUG flag
    - Fallback to traditional bpf_printk() when perf debug is disabled
    - Runtime control via --enable-perf-debug command line flag

Signed-off-by: William Findlay <[email protected]>
This commit introduces a new v513 kernel variant to automatically enable
the perf buffer-based debug system for Linux kernel 5.13 and later,
while preserving fallback behavior for older kernels.

Changes made:

1. **Added complete v513 variant to bpf/Makefile**:
   - Added v513 variants for all object types
   - Integrated v513 into DEFINE_VARIANT macro system

2. **Automatic TETRAGON_PERF_DEBUG for 5.13+ kernels**:
   - CFLAGS_v513, CFLAGS_v61, and CFLAGS_v612 now include -DTETRAGON_PERF_DEBUG
   - Modern kernels (5.13+) get perf debug by default without manual configuration
   - Older variants (v310, v53, v511) continue using printk-based debug by default

3. **Updated conditional compilation logic in bpf/lib/debug.h**:
   - Changed from 'ifdef TETRAGON_PERF_DEBUG' to 'if defined(__V513_BPF_PROG) || defined(TETRAGON_PERF_DEBUG)'
   - Kernels 5.13+ automatically use perf buffer debug system
   - Older kernels fall back to bpf_printk unless manually overridden
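
The selection rule described above can be sketched in Go roughly as follows. This is only an illustration under stated assumptions: `kernelVersion`, `variantChoice`, and `selectVariant` are hypothetical names, the `"pre-v513"` label is a placeholder standing in for the older variants (v310, v53, v511), and the real choice is made in the Makefile, in debug.h, and in the config package rather than in a single function like this.

```go
package main

// kernelVersion is a simplified (major, minor) pair; patch level is ignored.
type kernelVersion struct {
	Major, Minor int
}

// atLeast reports whether k is the given version or newer.
func (k kernelVersion) atLeast(major, minor int) bool {
	return k.Major > major || (k.Major == major && k.Minor >= minor)
}

// variantChoice is a hypothetical result type for this sketch.
type variantChoice struct {
	Name      string // BPF object variant, e.g. "v513"
	PerfDebug bool   // true when the perf buffer debug path is compiled in
}

// selectVariant picks the newest variant the kernel supports.
// forcePerfDebug models a manual -DTETRAGON_PERF_DEBUG override.
func selectVariant(k kernelVersion, forcePerfDebug bool) variantChoice {
	switch {
	case k.atLeast(6, 12):
		return variantChoice{Name: "v612", PerfDebug: true}
	case k.atLeast(6, 1):
		return variantChoice{Name: "v61", PerfDebug: true}
	case k.atLeast(5, 13):
		return variantChoice{Name: "v513", PerfDebug: true}
	default:
		// Placeholder for the older variants (v310, v53, v511), which
		// keep bpf_printk unless the flag is set by hand.
		return variantChoice{Name: "pre-v513", PerfDebug: forcePerfDebug}
	}
}
```

For example, a 6.5 kernel would map to `v61` with perf debug enabled, while a 5.12 kernel would stay on the printk path unless the override is passed.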

Signed-off-by: William Findlay <[email protected]>
Add unit tests for the config package to validate BPF object selection
logic across different kernel versions and variants:

- TestKernelVersionSelection: Tests Enable*Progs() functions for
  v612, v61, v513, and RHEL7 variant selection logic
- TestBaseSensorObjSelection: Tests ExitObj() and ForkObj() selection
  across all kernel versions and ForceSmallProgs scenarios
- TestAdditionalSensorObjSelection: Tests comprehensive BPF object
  selection including BprmCommitObj(), EnforcerObj(), LoaderObj(),
  CgroupObj(), and CgtrackerObj()

Tests use the testify framework for clean assertions and cover:
- All kernel version thresholds (6.12, 6.1, 5.13, 4.19, 3.10)
- ForceSmallProgs override behavior
- Proper fallback to base variants when appropriate
- Complete v513 variant system validation

This provides confidence that the v513 variant implementation works
correctly across all BPF objects and kernel versions.

Signed-off-by: William Findlay <[email protected]>
Signed-off-by: William Findlay <[email protected]>
Signed-off-by: William Findlay <[email protected]>
@will-isovalent will-isovalent force-pushed the pr/will/next-generation-debugging branch from 1309e52 to e923028 Compare August 26, 2025 18:57