eBPF Observability Tools Are Not Security Tools

eBPF has many uses in improving computer security, but just taking eBPF observability tools as-is and using them for security monitoring would be like driving your car into the ocean and expecting it to float.

Observability tools are designed to have the lowest overhead possible so that they are safe to run in production while analyzing an active performance issue. Keeping overhead low can require tradeoffs in other areas: tcpdump(8), for example, will drop packets if the system is overloaded, resulting in incomplete visibility. This creates an obvious security risk for tcpdump(8)-based security monitoring: An attacker could overwhelm the system with mostly innocent packets, hoping that a few malicious packets get dropped and are left undetected. Long ago I encountered systems which met strict security auditing requirements with the following behavior: If the kernel could not log an event, it would immediately halt! While this was vulnerable to DoS attacks, it met the system's security auditing non-repudiation requirements, and logs were 100% complete.
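To make that tradeoff concrete, here is a minimal Python sketch of a bounded event buffer with two overflow policies. It is a toy simulation, not code from tcpdump(8) or any eBPF tool: the names EventBuffer, BufferFullHalt, and flood_then_attack, and the "drop" and "halt" policy strings, are all invented for illustration, and the consumer is assumed to be completely stalled while the flood arrives.

```python
from collections import deque


class BufferFullHalt(Exception):
    """Raised by the hypothetical "halt" policy instead of losing an event."""


class EventBuffer:
    """Toy bounded buffer between an event source and a stalled consumer."""

    def __init__(self, capacity, policy="drop"):
        if policy not in ("drop", "halt"):
            raise ValueError(f"unknown policy: {policy}")
        self.capacity = capacity
        self.policy = policy
        self.events = deque()
        self.dropped = 0  # number of events lost to overflow

    def submit(self, event):
        if len(self.events) < self.capacity:
            self.events.append(event)
            return True
        if self.policy == "halt":
            # Fail closed: stop rather than lose an audit record.
            raise BufferFullHalt(f"cannot log {event!r}")
        # Observability-style behavior: stay cheap and lose the event.
        self.dropped += 1
        return False


def flood_then_attack(buf, noise=1000):
    """Send harmless noise first, then the one event the attacker wants hidden."""
    for i in range(noise):
        buf.submit(f"benign-{i}")
    buf.submit("malicious")
    return "malicious" in buf.events


lossy = EventBuffer(capacity=64, policy="drop")
seen = flood_then_attack(lossy)
print(f"drop policy: malicious seen={seen}, dropped={lossy.dropped}")

strict = EventBuffer(capacity=64, policy="halt")
try:
    flood_then_attack(strict)
    halted = False
except BufferFullHalt:
    halted = True
print(f"halt policy: halted={halted}, logged={len(strict.events)}")
```

With the "drop" policy the malicious event never reaches the log, and the only trace is a drop counter. With the "halt" policy nothing is lost, but the flood itself stops the system, which is the DoS exposure described above.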

There are ways to evade detection in other tools as well, like top(1) (since it samples processes and relies on its comm field) and even ls(1) (putting escape characters in files). Rootkits do this. These techniques have been known in the industry for decades and haven't been "fixed" because they aren't "broken." They are cars, not boats. Similar methods can be used to evade detection in the eBPF bcc and bpftrace observability tools as well: overwhelming them with events, doing time-of-check-time-of-use attacks (TOCTOU), escape characters, etc.
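The escape-character trick is easy to illustrate. The following is a small Python sketch, not taken from ls(1), bcc, or bpftrace: the helper name visible() and the example filename are made up. It renders control characters as visible escapes, so a name arriving from a directory listing or a traced event cannot rewrite what the terminal displays.

```python
def visible(name: str) -> str:
    """Return name with control characters shown as escapes (hypothetical helper)."""
    out = []
    for ch in name:
        if ch == "\\":
            out.append("\\\\")  # keep real backslashes unambiguous
        elif ch.isprintable():
            out.append(ch)
        else:
            # For example, ESC becomes the four characters \x1b
            out.append(ch.encode("unicode_escape").decode("ascii"))
    return "".join(out)


# ESC[2K erases the current line and \r returns the cursor to column 0, so on
# many terminals printing this name raw may leave only "README.txt" visible.
name = "evil.sh\x1b[2K\rREADME.txt"
print(visible(name))
```

A tool that prints names raw trusts the attacker's bytes; one that escapes them pays a little extra per event, which is the kind of cost an observability tool tends to avoid.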

When will the eBPF community "fix" these tools? Well, when will Tesla fix my Model 3 so I can drive it under the Oakland bridge instead of over it? (I joke, and I don't drive a Tesla.) What you actually want is a security monitoring tool that meets a different set of requirements. Trying to adapt observability tools into security tools generally increases overhead (e.g., adding extra probes), which negates the main reason I developed these using eBPF in the first place. That would be like taking the wheels off a car to help make it float. There are other issues as well, like decreasing maintainability when moving probes from stable tracepoints to unstable kernel internals for time-of-use (TOU) tracing. Had I written these as security tools to start with, I would have done them differently: I'd start with LSM hooks, use a plugin model instead of standalone CLI tools, support configurable policies for event drop behavior, optimize event logging (which we still haven't done), and lots more.
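To show why probe placement matters for TOCTOU, here is a deliberately simplified, single-threaded Python simulation. It is only a sketch under stated assumptions: UserMemory, traced_open, and swap_path are hypothetical names, the paths are made up, and a real race involves a second thread and kernel copy timing rather than a callback. The point is only to contrast a probe that reads the user buffer at syscall entry with one that sees the value the kernel actually uses.

```python
class UserMemory:
    """Stand-in for a path buffer in the traced process, which it can rewrite."""

    def __init__(self, path):
        self.path = path


def traced_open(mem, probe_at, racer):
    """Simulate one open-style call with a probe at "entry" or at "use"."""
    logged = None
    if probe_at == "entry":
        logged = mem.path  # time of check: tracer reads the user buffer
    racer(mem)             # attacker rewrites the buffer in between
    used = mem.path        # time of use: kernel copies the path it acts on
    if probe_at == "use":
        logged = used      # tracer observes the kernel's own copy
    return logged, used


def swap_path(mem):
    """Hypothetical attacker step: replace the path after the entry probe ran."""
    mem.path = "/etc/shadow"


entry_logged, entry_used = traced_open(UserMemory("/tmp/harmless"), "entry", swap_path)
use_logged, use_used = traced_open(UserMemory("/tmp/harmless"), "use", swap_path)

print(f"entry probe: logged={entry_logged} used={entry_used}")
print(f"use probe:   logged={use_logged} used={use_used}")
```

The entry probe logs a path that was never opened, while the use probe matches what was acted on. In a real kernel, getting that second view is what pushes a tool off stable tracepoints and toward internals or LSM hooks.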

None of this should be news to experienced security engineers. I'm writing this post because others see the tools and examples I've shared and believe that, with a bit of shell scripting, they could have a good security monitoring product. I get that it looks that way, but in reality there's a bunch of work to do. Ideally I'd link to an example in bcc for security monitoring (we could create a subdirectory for them) but that currently doesn't exist. In the meantime my best advice is: If you are making a security monitoring product, hire a good security engineer (e.g., someone with solid pen-testing experience).

BPF for security monitoring was first explored by Alex Maestretti, a Netflix security engineer, and me in a 2017 BSides talk (some slides below). Since then I've worked with other security engineers on the topic (hi Michael, Nabil, Sargun, KP). (I also did security work many years ago, so I'm not completely new to the topic.)

BSidesSF2017 BPF security monitoring: Alex Maestretti, Brendan Gregg

There is potential for an awesome eBPF security product, and it's not just the visibility that's valuable (all those arrows); it's also the low overhead. These slides included our overhead evaluation showing bcc/eBPF was far more efficient than auditd or go-audit. (It was pioneering work, but unfortunately the slides are all we have: Alex, I, and others left Netflix before open sourcing it.) There are now other eBPF security products, including open source projects (e.g., tetragon), but I don't know enough about them all to have a recommendation.

Note that I'm talking about the observability tools here and not the eBPF kernel runtime itself, which has been designed as a secure sandbox. Nor am I talking about privilege escalation, since to run the tools you already need root access (that car has sailed!).

Brendan Gregg's Blog
