BPF Performance Tools (book)
This is the official site for the book BPF Performance Tools: Linux System and Application Observability, published by Addison-Wesley (2019). This book can help you get the most out of your systems and applications, helping you improve performance, reduce costs, and solve software issues. Here I'll describe the book, link to related content, and list errata and updates.
The book is available on Amazon.com (paperback, Kindle), InformIT (paperback, PDF, etc.), and Safari. The paper book was released in December 2019 but sold out immediately; more copies will be printed soon. ISBN-13: 9780136554820.
The Amazon Kindle preview shows the first 100 pages out of this 880 page book.
As an example of a new tool from the book, readahead.bt provides a new view of file system read-ahead performance: the age of read-ahead pages when they are finally referenced, and the number of read-ahead pages left unused while tracing:
    # readahead.bt
    Attaching 5 probes...
    ^C
    Readahead unused pages: 128

    Readahead used page age (ms):
    @age_ms:
    [1]                 2455 |@@@@@@@@@@@@@@@                                     |
    [2, 4)              8424 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
    [4, 8)              4417 |@@@@@@@@@@@@@@@@@@@@@@@@@@@                         |
    [8, 16)             7680 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@     |
    [16, 32)            4352 |@@@@@@@@@@@@@@@@@@@@@@@@@@                          |
    [32, 64)               0 |                                                    |
    [64, 128)              0 |                                                    |
    [128, 256)           384 |@@                                                  |
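To read a histogram like this one, it can help to turn the rows into numbers. The following is a minimal Python sketch, not something from the book: it copies the counts from the output above, keyed by the exclusive upper bound of each power-of-2 bucket in milliseconds, and computes what share of the used read-ahead pages were referenced within a given age. The names `AGE_MS_COUNTS` and `share_within` are made up for this illustration.

```python
# Counts copied from the readahead.bt output above, keyed by each
# bucket's exclusive upper bound in milliseconds (power-of-2 buckets).
AGE_MS_COUNTS = {
    2: 2455,
    4: 8424,
    8: 4417,
    16: 7680,
    32: 4352,
    64: 0,
    128: 0,
    256: 384,
}


def share_within(counts, limit_ms):
    """Return the fraction of pages whose bucket ends at or below limit_ms."""
    total = sum(counts.values())
    within = sum(n for upper, n in counts.items() if upper <= limit_ms)
    return within / total if total else 0.0


if __name__ == "__main__":
    print(f"used pages: {sum(AGE_MS_COUNTS.values())}")
    print(f"referenced within 32 ms: {share_within(AGE_MS_COUNTS, 32):.1%}")
```

For this trace, the sketch counts 27,712 used pages, about 98.6% of which were referenced within 32 ms of being read ahead; the small group in the 128 to 256 ms bucket and the 128 unused pages are the outliers.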
The book covers many of the existing tools as well, for example, tcplife for efficiently logging TCP session details:
    # tcplife
    PID   COMM       LADDR           LPORT RADDR           RPORT TX_KB RX_KB MS
    4169  java       184.108.40.206  40158 220.127.116.11  6001      7    33 3590.91
    4169  java       18.104.22.168   56940 22.214.171.124  6101      0     0 2.48
    4169  java       126.96.36.199   6001  188.8.131.52    49482     0     0 17.94
    4169  java       184.108.40.206  18926 220.127.116.11  6101      0     0 0.90
    4169  java       18.104.22.168   44530 22.214.171.124  6001      0     0 2.64
    4169  java       126.96.36.199   44406 188.8.131.52    6001     11    28 3982.11
    34781 sshd       184.108.40.206  22    220.127.116.11  41566     5     7 2317.30
    [...]
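Because tcplife prints one line per session, its output is easy to post-process. The following is a rough Python sketch, not part of the book or of BCC: it splits the rows shown above on whitespace (assuming the COMM field contains no spaces) and summarizes them. `Session`, `SAMPLE`, and `parse_sessions` are hypothetical names chosen for this example.

```python
from collections import namedtuple

# Field names follow the tcplife header shown above.
Session = namedtuple(
    "Session", "pid comm laddr lport raddr rport tx_kb rx_kb ms"
)

# Rows copied from the tcplife output above.
SAMPLE = """\
PID COMM LADDR LPORT RADDR RPORT TX_KB RX_KB MS
4169 java 184.108.40.206 40158 220.127.116.11 6001 7 33 3590.91
4169 java 18.104.22.168 56940 22.214.171.124 6101 0 0 2.48
4169 java 126.96.36.199 6001 188.8.131.52 49482 0 0 17.94
4169 java 184.108.40.206 18926 220.127.116.11 6101 0 0 0.90
4169 java 18.104.22.168 44530 22.214.171.124 6001 0 0 2.64
4169 java 126.96.36.199 44406 188.8.131.52 6001 11 28 3982.11
34781 sshd 184.108.40.206 22 220.127.116.11 41566 5 7 2317.30
[...]
"""


def parse_sessions(text):
    """Parse whitespace-separated tcplife rows into Session records."""
    sessions = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and the "[...]" truncation marker.
        if len(fields) != 9 or not fields[0].isdigit():
            continue
        pid, comm, laddr, lport, raddr, rport, tx_kb, rx_kb, ms = fields
        sessions.append(
            Session(int(pid), comm, laddr, int(lport), raddr, int(rport),
                    int(tx_kb), int(rx_kb), float(ms))
        )
    return sessions


if __name__ == "__main__":
    sessions = parse_sessions(SAMPLE)
    long_lived = [s for s in sessions if s.ms > 1000]
    print(f"sessions: {len(sessions)}, longer than 1 s: {len(long_lived)}")
    print(f"total TX_KB: {sum(s.tx_kb for s in sessions)}, "
          f"total RX_KB: {sum(s.rx_kb for s in sessions)}")
```

On the seven rows shown, three sessions lasted longer than one second, and those three account for all of the transferred data; the short-lived sessions moved less than 1 Kbyte each.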
The book explains these and over 150 other BPF tools, as well as summarizing over 30 traditional performance analysis tools (top, vmstat, iostat, perf, Ftrace, etc) so that you can use the right tool for the job.
What is BPF?
Extended BPF is a built-in Linux kernel technology, added in parts since Linux 3.18. At least Linux 4.9 is needed to use the tools in this book. All Linux distributions can use the BPF tools (Ubuntu, CentOS, Fedora, Red Hat, etc.), although the status of BCC and bpftrace varies by distribution: some have packages, while others still require a build from source. See the install instructions for BCC and bpftrace.
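As a quick first check of the version requirement above, the running kernel's release string can be compared against 4.9. This is a minimal Python sketch under stated assumptions, not an official check: `MIN_KERNEL`, `parse_release`, and `meets_minimum` are names invented here, and a version number alone does not prove that the needed BPF features are enabled, since distributions may backport features or build kernels with different options.

```python
import platform
import re

MIN_KERNEL = (4, 9)  # minimum version stated in the text above


def parse_release(release):
    """Extract (major, minor) from a release string such as '5.4.0-42-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(release, minimum=MIN_KERNEL):
    """Return True if the release is at or above the minimum (major, minor)."""
    return parse_release(release) >= minimum


if __name__ == "__main__":
    # platform.release() reports the running kernel release on Linux.
    release = platform.release()
    try:
        verdict = "meets" if meets_minimum(release) else "is below"
        print(f"kernel {release} {verdict} the 4.9 minimum")
    except ValueError as err:
        print(f"could not check kernel version: {err}")
```

Comparing `(major, minor)` tuples rather than strings avoids mistakes such as treating 4.10 as older than 4.9.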
Other operating systems including BSD (where BPF originated) are not covered in this book. As extended BPF is being ported elsewhere, a future edition of this book may cover more than Linux.
This book is primarily for engineers, developers, and support staff in enterprise and cloud environments. No programming is required unless you want to write your own tools, as you can use this book as either:
- A reference of ready-to-run performance analysis and debugging tools.
- A guide for learning how to develop new tools.
This book is also useful for students as a way to learn system internals in an interactive way: you can run and develop tools to examine the workings of the system.
Over 150 BPF tools are covered in the book, for performance analysis, troubleshooting, and other uses (e.g., security forensics). These tools provide observability for CPUs, memory, disks, file systems, networking, languages, applications, containers, hypervisors, security, and the Linux kernel. To explain how to analyze different languages, three types of execution are studied: compiled, JIT-compiled, and interpreted, using C, Java, and the bash shell as examples. The same approaches can be applied to other languages, and summaries for Node.js, C++, and Golang are included.
To cover all these targets, many new tools needed to be developed for this book. The diagram on the top right shows these new tools colored red. The source to these is included in the book, and can also be found here:
The /originals directory contains an as-is snapshot of the published tools, and /updated contains those tools plus updated versions.
"Achievement unlocked, finished chapter 1 of BPF performance tools, found and disabled several services that were spamming the system with several open/close/exec loops" - bsingharora (@bsingharora), November 26, 2019
"Found a misconfiguration in nginx with gzip_static being enabled via `opensnoop`, the new BPF Performance Tools book by @brendangregg is great reading so far. We saw a 10% latency drop immediately 😮" - Kyle Scott Mcgill (@kylescottmcgill), November 28, 2019
Table of Contents
Part I: Technologies
2. Technology Background
3. Performance Analysis
8. File Systems
9. Disk I/O
17. Other BPF Tools
18. Tips and Tricks
Apx B. bpftrace Cheat Sheet
Apx C. bcc Tool Development
Apx D. C BPF
Apx E. BPF Instructions
The Safari online book store features early drafts of books for feedback, called "rough cuts." I'd never published one before, but did this time to see if it helped. It did not. This happened:
- I received next to no feedback from the rough cut.
- A badly-formatted ePUB version immediately appeared on pirate sites, months before the book was finished.
This pirate version is missing bug fixes and content I later added. It is really frustrating as I've worked hard to give readers the best possible experience, but some of you may be studying this draft instead, thinking that it's the final book. There is also (obviously) no way for the publisher to ask the pirates to update their version. Please only read the finished book, preferably "second printing" or later (as the second printing should include the errata fixes, listed below). One tell-tale sign: the cover of the final book includes the text "Foreword by Alexei Starovoitov...," and the early draft versions did not.
- bpftrace: The BPF front-end used for code examples in the book.
- BCC: The BPF front-end used for complex tools in the book.
- Linux eBPF Tracing Tools: My page about BPF tracing tools for performance analysis.
- BPF Performance Tools (blog post): My blog post to launch this book.
- pxxvii, Preface: the footnote 1 text is somehow from chapter 6 by mistake; it should be: "The exercises include some advanced and "unsolved" problems, for which I have yet to see a working solution. It is possible that some of these problems are impossible to solve without kernel or application changes."
- pxxxiv, Preface: the Kindle version has a conversion error where two early page numbers are inserted into the text, appearing as "xxxivtracepoints" and "xxxvmost of whom".
- 2.6, p45: "This figure also shows the Linux kernel versions ...": that was the old figure, but not this new one.
- 2.10.2, p60: "The location of the probe from the previous readelf(1) output was 0x6a2."; that previous output was deleted.
- 2.13, p64: "Linux 2.6.21" -> "Linux 2.6.31".
- 5.9.6, p154: "A a rate of 99" extra "a".
- 6.2.3, p192: "perf list" should be "perf script" (twice).
- 6.2.5, p196: "perf script to show the rate" should be "perf stat ..." (matching the screenshot).
- 9.1.3, p346: the Safari version misnumbers step 2a as another step 1.
- 9.3.2, p359: "biostoop(8)" -> "biosnoop(8)".
- 9.3.7, p370: "kprobe:blk_start_request,kprobe:blk_mq_start_request" -> "kprobe:blk_account_io_done", to trace the full I/O latency (and not just OS queued time).
- ApxC, p749: "line 4 imports the BPF library" should be line 2.
- ApxC, p749: "predate his capability" typo his->this.
- ApxC, p767: "make $(getconf" -> "make -j $(getconf" (missing -j).
- ApxC, p767: "thesamples" -> "the samples".
These are updates to BPF and its front-ends, many of which were mentioned in the book as "planned" and have since been implemented:
- 5.15.1, p173: bpf_probe_read_kernel() and bpf_probe_read_user() have now been implemented and may show up in Linux 5.5.
- 5.15.2, p174: bpftrace added signal() (thanks Bas Smit).
- 5.15.2, p175: bpftrace added override_return() (thanks Bas Smit).
- bpftrace added strncmp() (thanks Jay Kamat, Bas Smit).
- 5.10.3, p155: bpftrace added if else support (thanks Daniel Xu).
- 5.10.4, p155: bpftrace added while() loops (thanks Bas Smit).
- bpftrace curtask is now a task_struct if type info is available (headers or BTF).
Thanks to all the reviewers, and to Deirdré Straughan for editing another one of my books!