USENIX LISA 2016: Linux 4.x Tracing Tools, Using BPF Superpowers
Video: https://www.youtube.com/watch?v=UmOU3I36T2U&t=1151s
Talk for USENIX LISA 2016 by Brendan Gregg.
Description: "The Linux 4.x series heralds a new era of Linux performance analysis, with the long-awaited integration of a programmable tracer: Enhanced BPF (eBPF). Formerly the Berkeley Packet Filter, BPF has been enhanced in Linux to provide system tracing capabilities, and integrates with dynamic tracing (kprobes and uprobes) and static tracing (tracepoints and USDT). This has allowed dozens of new observability tools to be developed so far: for example, measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more. These lead to performance wins large and small, especially when instrumenting areas that previously had zero visibility. Tracing superpowers have finally arrived.
In this talk I'll show you how to use BPF in the Linux 4.x series, and I'll summarize the different tools and front ends available, with a focus on iovisor bcc. bcc is an open source project to provide a Python front end for BPF, and comes with dozens of new observability tools (many of which I developed). These tools include new BPF versions of old classics, and many new tools, including: execsnoop, opensnoop, funccount, trace, biosnoop, bitesize, ext4slower, ext4dist, tcpconnect, tcpretrans, runqlat, offcputime, offwaketime, and many more. I'll also summarize use cases and some long-standing issues that can now be solved, and how we are using these capabilities at Netflix."
PDF: LISA2016_BPF_tools_16_9.pdf
Keywords (from pdftotext):
slide 1:
Linux 4.x Tracing Tools: Using BPF Superpowers. Brendan Gregg, Netflix, bgregg@netflix.com. December 4–9, 2016 | Boston, MA. www.usenix.org/lisa16 #lisa16
slide 2:
slide 3:
slide 4:
Demo time: GIVE ME 15 MINUTES AND I'LL CHANGE YOUR VIEW OF LINUX TRACING (inspired by Greg Law's "Give me fifteen minutes and I'll change your view of GDB")
slide 5:
Demo
slide 6:
LISA 2014: perf-tools (ftrace)
slide 7:
LISA 2016: bcc tools (BPF)
slide 8:
Wielding Superpowers: WHAT DYNAMIC TRACING CAN DO
slide 9:
Previously
• Metrics were vendor chosen, closed source, and incomplete
• The art of inference & making do (example: an old ps alx listing of swapper, /etc/init, and -sh)
slide 10:
Crystal Ball Observability: Dynamic Tracing
slide 11:
Linux Event Sources
slide 12:
Event Tracing Efficiency, e.g. tracing TCP retransmits. Old way: packet capture; tcpdump (1. read, 2. dump to a buffer via the file system and disks) plus an analyzer (1. read, 2. process, 3. print). New way: dynamic tracing; a tracer 1. configures a probe on tcp_retransmit_skb() and 2. reads just those events.
slide 13:
New CLI Tools
# biolatency
Tracing block device I/O... Hit Ctrl-C to end.
usecs : count distribution
4 -> 7 : 0
8 -> 15 : 0
16 -> 31 : 0
32 -> 63 : 0
64 -> 127 : 1
128 -> 255 : 12 |********
256 -> 511 : 15 |**********
512 -> 1023 : 43 |*******************************
1024 -> 2047 : 52 |**************************************|
2048 -> 4095 : 47 |**********************************
4096 -> 8191 : 52 |**************************************|
8192 -> 16383 : 36 |**************************
16384 -> 32767 : 15 |**********
32768 -> 65535 : 2
65536 -> 131071 : 2
slide 14:
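The power-of-2 latency histograms printed by tools like biolatency above are built from log2 bucketing inside a BPF map. The bucketing idea can be sketched in plain Python (an illustration of the concept, not bcc's implementation; the function names here are mine):

```python
def log2_bucket(value):
    """Power-of-2 bucket index: 0 covers 0-1, 1 covers 2-3, 2 covers 4-7, ..."""
    bucket = 0
    while value > 1:
        value >>= 1
        bucket += 1
    return bucket

def log2_histogram(samples):
    """Count samples into power-of-2 buckets, as a {bucket: count} dict."""
    counts = {}
    for s in samples:
        b = log2_bucket(s)
        counts[b] = counts.get(b, 0) + 1
    return counts

def bucket_label(bucket):
    """Render a bucket as the 'low -> high' range shown by the tools."""
    low = 0 if bucket == 0 else 1 << bucket
    high = (1 << (bucket + 1)) - 1
    return f"{low} -> {high}"
```

In the real tools the `log2_bucket` step runs in kernel context in BPF (bcc provides a `bpf_log2l()` helper), so only the small bucket-count map crosses to user space, which is what makes these histograms cheap.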
New Visualizations and GUIs
slide 15:
Netflix intended usage: self-service UI with Flame Graphs and tracing reports; should be open sourced; you may also build/buy your own
slide 16:
Conquer Performance: measure anything
slide 17:
Introducing BPF: BPF TRACING
slide 18:
A Linux Tracing Timeline
1990s: static tracers, prototype dynamic tracers
2000: LTT + DProbes (dynamic tracing; not integrated)
2004: kprobes (2.6.9)
2005: DTrace (not Linux), SystemTap (out-of-tree)
2008: ftrace (2.6.27)
2009: perf (2.6.31)
2009: tracepoints (2.6.32)
2010-2016: ftrace & perf_events enhancements
2014-2016: BPF patches
also: LTTng, ktap, sysdig, ...
slide 19:
Ye Olde BPF: Berkeley Packet Filter
# tcpdump host 127.0.0.1 and port 22 -d
(000) ldh [12]
(001) jeq #0x800 jt 2 jf 18
(002) ld [26]
(003) jeq #0x7f000001 jt 6 jf 4
(004) ld [30]
(005) jeq #0x7f000001 jt 6 jf 18
(006) ldb [23]
(007) jeq #0x84 jt 10 jf 8
(008) jeq #0x6 jt 10 jf 9
(009) jeq #0x11 jt 10 jf 18
(010) ldh [20]
(011) jset #0x1fff jt 18 jf 12
(012) ldxb 4*([14]&0xf)
(013) ldh [x + 14]
(014) jeq #0x16 jt 17 jf 15
(015) ldh [x + 16]
(016) jeq #0x16 jt 17 jf 18
(017) ret #65535
(018) ret #0
slide 20:
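The `tcpdump -d` listing above is classic BPF bytecode: a tiny load/jump/return machine run over the raw packet bytes. A simplified interpreter for a few of those opcodes (a sketch of the execution model only, not the kernel's implementation) might look like:

```python
import struct

def run_filter(program, packet):
    """Interpret a tiny subset of classic BPF over a packet.

    Instructions are tuples: ("ldh", k) loads a big-endian halfword at
    byte offset k into the accumulator; ("jeq", k, jt, jf) jumps jt or jf
    instructions forward (relative to the next instruction) on equality;
    ("ret", k) accepts k bytes of the packet (0 == reject).
    """
    acc = 0
    pc = 0
    while True:
        insn = program[pc]
        op = insn[0]
        if op == "ldh":
            acc = struct.unpack_from(">H", packet, insn[1])[0]
        elif op == "jeq":
            pc += insn[2] if acc == insn[1] else insn[3]
        elif op == "ret":
            return insn[1]
        pc += 1

# Accept only IPv4 frames (EtherType 0x0800 at offset 12), mirroring
# the first two instructions of the tcpdump listing above:
prog = [
    ("ldh", 12),            # load the EtherType field
    ("jeq", 0x0800, 0, 1),  # IPv4? fall through : skip to reject
    ("ret", 65535),         # accept
    ("ret", 0),             # reject
]
```

Enhanced BPF keeps this safe in-kernel virtual machine model but adds 64-bit registers, maps, and helper calls, which is what turns a packet filter into a general tracing engine.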
BPF Enhancements by Linux Version
3.18: bpf syscall
3.19: sockets
4.1: kprobes
4.4: bpf_perf_event_output
4.6: stack traces
4.7: tracepoints
4.9: profiling
eg, Ubuntu:
slide 21:
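Because these features landed incrementally, tools sometimes need to gate on the running kernel version. A sketch of that check, using the version table from the slide above (the table values come from the slide; the helper itself is illustrative):

```python
# Minimum kernel version for each BPF tracing feature, per the slide above.
BPF_FEATURES = {
    "bpf syscall": (3, 18),
    "sockets": (3, 19),
    "kprobes": (4, 1),
    "bpf_perf_event_output": (4, 4),
    "stack traces": (4, 6),
    "tracepoints": (4, 7),
    "profiling": (4, 9),
}

def available_features(release):
    """Return BPF tracing features usable on a kernel release string,
    e.g. '4.4.0-45-generic' (Ubuntu style)."""
    major, minor = (int(x) for x in release.split(".")[:2])
    return sorted(name for name, min_ver in BPF_FEATURES.items()
                  if (major, minor) >= min_ver)
```

For example, a Xenial-era 4.4 kernel gets kprobe tracing and per-event output, but not tracepoint support or profiling, which is why the slide recommends newer Ubuntu releases.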
Enhanced BPF is in Linux
slide 22:
BPF
• aka eBPF == enhanced Berkeley Packet Filter
  – Lead developer: Alexei Starovoitov (Facebook)
• Many uses
  – Virtual networking
  – Security
  – Programmatic tracing
• Different front-ends
  – C, perf, bcc, ply, …
(BPF mascot)
slide 23:
BPF for Tracing: a user program 1. generates BPF bytecode and 2. loads it, passing the kernel verifier, to attach to kprobes, uprobes, and tracepoints; 3. per-event data returns via perf_output, and statistics are read asynchronously from BPF maps.
slide 24:
Raw BPF: samples/bpf/sock_example.c (87 lines, truncated)
slide 25:
C/BPF: samples/bpf/tracex1_kern.c (58 lines, truncated)
slide 26:
bcc: BPF Compiler Collection
  – https://github.com/iovisor/bcc
  – Lead developer: Brenden Blanco (PLUMgrid)
• Includes tracing tools
• Front-ends: Python, Lua, C helper libraries
Tracing layers: bcc tools sit on the bcc Python/lua front-ends in user space, over kernel events and BPF in the kernel
slide 27:
bcc/BPF: bcc examples/tracing/bitehist.py (entire program)
slide 28:
ply/BPF: https://github.com/wkz/ply/blob/master/README.md (entire program)
slide 29:
The Tracing Landscape, Dec 2016 (my opinion): a chart of ease of use (brutal to less brutal) against scope & capability, annotated with stage of development (alpha to mature); it places ftrace, perf, stap, ktap, sysdig, dtrace4linux, bcc/BPF, ply/BPF, C/BPF, and raw BPF.
slide 30:
State of BPF, Dec 2016
• Dynamic tracing, kernel-level (BPF support for kprobes)
• Dynamic tracing, user-level (BPF support for uprobes)
• Static tracing, kernel-level (BPF support for tracepoints)
• Timed sampling events (BPF with perf_event_open)
• PMC events (BPF with perf_event_open)
• Filtering (via BPF programs)
• Debug output (bpf_trace_printk())
• Per-event output (bpf_perf_event_output())
• Basic variables (global & per-thread variables, via BPF maps)
• Associative arrays (via BPF maps)
• Frequency counting (via BPF maps)
• Histograms (power-of-2, linear, and custom, via BPF maps)
• Timestamps and time deltas (bpf_ktime_get_ns() and BPF)
• Stack traces, kernel (BPF stackmap)
• Stack traces, user (BPF stackmap)
• Overwrite ring buffers
• String factory (stringmap)
• Optional: bounded loops,
slide 31:
For end-users: HOW TO USE BCC/BPF
slide 32:
Installation: https://github.com/iovisor/bcc/blob/master/INSTALL.md
• eg, Ubuntu Xenial:
# echo "deb [trusted=yes] https://repo.iovisor.org/apt/xenial xenial-nightly main" | \
  sudo tee /etc/apt/sources.list.d/iovisor.list
# sudo apt-get update
# sudo apt-get install bcc-tools
  – puts tools in /usr/share/bcc/tools, and tools/old for older kernels
  – 16.04 is good, 16.10 better: more tools work
  – bcc should also arrive as an official Ubuntu snap
slide 33:
Pre-bcc Performance Checklist
1. uptime
2. dmesg | tail
3. vmstat 1
4. mpstat -P ALL 1
5. pidstat 1
6. iostat -xz 1
7. free -m
8. sar -n DEV 1
9. sar -n TCP,ETCP 1
10. top
http://techblog.netflix.com/2015/11/linux-performance-analysis-in-60s.html
slide 34:
bcc General Performance Checklist: execsnoop, opensnoop, ext4slower (…), biolatency, biosnoop, cachestat, tcpconnect, tcpaccept, tcpretrans, gethostlatency, runqlat, profile
slide 35:
1.
execsnoop
# execsnoop output: columns PCOMM, PID, RET, ARGS; a man ls invocation exec'ing preconv, tbl, nroff, pager, locale, groff, troff, and grotty (e.g. /usr/bin/nroff -mandoc -rLL=169n -rLT=169n -Tutf8)
slide 36:
2. opensnoop
# opensnoop output: columns PID, COMM, FD, ERR, PATH; catalina.sh, redis-server, sshd, and snmp-pass opening files such as /apps/tomcat8/bin/setclasspath.sh, /etc/group, /root/.ssh/authorized_keys, /var/run/nologin, /etc/login.defs, /etc/passwd, /etc/shadow, /etc/localtime, /proc/cpuinfo
slide 37:
3. ext4slower
# ext4slower 1
Tracing ext4 operations slower than 1 ms; columns TIME, COMM, PID, T, BYTES, OFF_KB, LAT(ms), FILENAME; bash and cksum reads of files such as 2to3-2.7, 411toppm, a2p, ab, aclocal-1.14, acpi_listen, add-apt-repository, addpart, addr2line, and ag, with latencies from 1.34 to 19.31 ms
also: btrfsslower, xfsslower, zfsslower
slide 38:
4. biolatency
# biolatency -mT 1
Tracing block device I/O... Hit Ctrl-C to end.
06:20:16
msecs : count distribution
0 -> 1 : 36 |**************************************|
2 -> 3 : 1
4 -> 7 : 3 |***
8 -> 15 : 17 |*****************
16 -> 31 : 33 |**********************************
32 -> 63 : 7 |*******
64 -> 127 : 6 |******
slide 39:
5.
biosnoop
# biosnoop output: columns TIME(s), COMM, PID, DISK, SECTOR, BYTES, LAT(ms); supervise, xfsaild/md0, and kworker I/O to xvda1, xvdb, and xvdc
slide 40:
6. cachestat
# cachestat output: columns HITS, MISSES, DIRTIES, READ_HIT%, WRITE_HIT%, BUFFERS_MB, CACHED_MB; read hit ratios ranging from 55.2% to 100.0%
slide 41:
7. tcpconnect
# tcpconnect output: columns PID, COMM, IP, SADDR, DADDR, DPORT; recordProgra and curl connecting over IPv4 (127.0.0.1, 100.66.3.172) and ssh over IPv6 (::1, fe80::8a3:9dff:fed5:6b19 port 22)
slide 42:
8. tcpaccept
# tcpaccept output: columns PID, COMM, IP, RADDR, LADDR, LPORT; sshd accepting from ::1 and fe80::8a3:9dff:fed5:6b19 on port 22, and redis-server on 127.0.0.1
slide 43:
9. tcpretrans
# tcpretrans
TIME     PID IP LADDR:LPORT        T> RADDR:RPORT        STATE
01:55:05 0   4  10.153.223.157:22  R> 69.53.245.40:34619 ESTABLISHED
01:55:05 0   4  10.153.223.157:22  R> 69.53.245.40:34619 ESTABLISHED
01:55:17 0   4  10.153.223.157:22  R> 69.53.245.40:22957 ESTABLISHED
slide 44:
10. gethostlatency
# gethostlatency
TIME     PID   COMM LATms HOST
06:10:24 28011 wget 90.00 www.iovisor.org
06:10:28 28127 wget 0.00  www.iovisor.org
06:10:41 28404 wget 9.00  www.netflix.com
06:10:48 28544 curl 35.00 www.netflix.com.au
06:11:10 29054 curl 31.00 www.plumgrid.com
06:11:16 29195 curl 3.00  www.facebook.com
06:11:24 25313 wget 3.00  www.usenix.org
06:11:25 29404 curl 72.00 foo
06:11:28 29475 curl 1.00  foo
slide 45:
11. runqlat
# runqlat -m 5
Tracing run queue latency.
Hit Ctrl-C to end.
msecs : count distribution
0 -> 1 : 3818 |****************************************|
2 -> 3 : 39
4 -> 7 : 39
8 -> 15 : 62
16 -> 31 : 2214 |***********************
32 -> 63 : 226 |**
[…]
slide 46:
12. profile
# profile
Sampling at 49 Hertz of all threads by user + kernel stack... Hit Ctrl-C to end.
[…]
ffffffff813d0af8 __clear_user
ffffffff813d5277 iov_iter_zero
ffffffff814ec5f2 read_iter_zero
ffffffff8120be9d __vfs_read
ffffffff8120c385 vfs_read
ffffffff8120d786 sys_read
ffffffff817cc076 entry_SYSCALL_64_fastpath
00007fc5652ad9b0 read
dd (25036)
[…]
slide 47:
Other bcc Tracing Tools
• Single-purpose: bitesize, capable, memleak, ext4dist (btrfs, …)
• Multi-tools: funccount, argdist, trace, stackcount
https://github.com/iovisor/bcc#tools
slide 48:
trace
• Trace custom events. Ad hoc analysis:
# trace 'sys_read (arg3 > 20000) "read %d bytes", arg3'
TIME     PID  COMM FUNC
05:18:23 4490 sys_read read 1048576 bytes
05:18:23 4490 sys_read read 1048576 bytes
05:18:23 4490 sys_read read 1048576 bytes
05:18:23 4490 sys_read read 1048576 bytes
by Sasha Goldshtein
slide 49:
trace One-Liners
trace -K blk_account_io_start
  Trace this kernel function, and print info with a kernel stack trace
trace 'do_sys_open "%s", arg2'
  Trace the open syscall and print the filename being opened
trace 'sys_read (arg3 > 20000) "read %d bytes", arg3'
  Trace the read syscall and print a message for reads > 20000 bytes
trace r::do_sys_return
  Trace the return from the open syscall
trace 'c:open (arg2 == 42) "%s %d", arg1, arg2'
  Trace the open() call from libc only if the flags (arg2) argument is 42
trace 'p:c:write (arg1 == 1) "writing %d bytes to STDOUT", arg3'
  Trace the write() call from libc to monitor writes to STDOUT
trace 'r:c:malloc (retval) "allocated = %p", retval'
  Trace returns from malloc and print non-NULL allocated buffers
trace 't:block:block_rq_complete "sectors=%d", args->nr_sector'
  Trace the block_rq_complete kernel tracepoint and print the number of transferred sectors
trace 'u:pthread:pthread_create (arg4 != 0)'
  Trace the USDT probe pthread_create when its 4th argument is non-zero
from: trace -h
slide 50:
argdist
# argdist -H 'p::tcp_cleanup_rbuf(struct sock *sk, int copied):int:copied'
[15:34:45]
copied : count distribution
0 -> 1 : 15088 |**********************************
2 -> 3 : 0
4 -> 7 : 0
8 -> 15 : 0
16 -> 31 : 0
32 -> 63 : 0
64 -> 127 : 4786 |***********
128 -> 255 : 1
256 -> 511 : 1
512 -> 1023 : 4
1024 -> 2047 : 11
2048 -> 4095 : 5
4096 -> 8191 : 27
8192 -> 16383 : 105
16384 -> 32767 : 0
32768 -> 65535 : 10086 |***********************
65536 -> 131071 : 60
131072 -> 262143 : 17285 |****************************************|
[...]
by Sasha Goldshtein
slide 51:
argdist One-Liners
argdist -H 'p::__kmalloc(u64 size):u64:size'
  Print a histogram of allocation sizes passed to kmalloc
argdist -p 1005 -C 'p:c:malloc(size_t size):size_t:size:size==16'
  Print a frequency count of how many times process 1005 called malloc for 16 bytes
argdist -C 'r:c:gets():char*:$retval#snooped strings'
  Snoop on all strings returned by gets()
argdist -H 'r::__kmalloc(size_t size):u64:$latency/$entry(size)#ns per byte'
  Print a histogram of nanoseconds per byte from kmalloc allocations
argdist -C 'p::__kmalloc(size_t size, gfp_t flags):size_t:size:flags&GFP_ATOMIC'
  Print frequency count of kmalloc allocation sizes that have GFP_ATOMIC
argdist -p 1005 -C 'p:c:write(int fd):int:fd' -T 5
  Print frequency counts of how many times writes were issued to a particular file descriptor number, in process 1005, but only show the top 5 busiest fds
argdist -p 1005 -H 'r:c:read()'
  Print a histogram of error codes returned by read() in process 1005
argdist -C 'r::__vfs_read():u32:$PID:$latency > 100000'
  Print frequency of reads by process where the latency was > 0.1 ms
from: argdist -h
slide 52:
Coming to a GUI near you: BCC/BPF VISUALIZATIONS
slide 53:
Latency Heatmaps
slide 54:
CPU + Off-CPU Flame Graphs
• Can now be BPF optimized
http://www.brendangregg.com/flamegraphs.html
slide 55:
Off-Wake Flame Graphs
• Shows blocking stack with waker stack
  – Better understand why blocked
  – Merged in-kernel using BPF
  – Include multiple waker stacks == chain graphs
• We couldn't do this before
slide 56:
Overview for tool developers: HOW TO PROGRAM BCC/BPF
slide 57:
Linux Event Sources, with the kernel version each feature arrived: BPF output (Linux 4.4), BPF stacks (Linux 4.6), and probe support added across Linux 4.1, 4.3, 4.7, and 4.9
slide 58:
Methodology
• Find/draw a functional diagram (e.g., the storage I/O subsystem)
• Apply performance methods (http://www.brendangregg.com/methodology.html):
  1. Workload Characterization
  2. Latency Analysis
  3. USE Method
• Start with the Q's, then find the A's
slide 59:
bitehist.py Output
# ./bitehist.py
Tracing... Hit Ctrl-C to end.
kbytes : count distribution
0 -> 1 : 3
2 -> 3 : 0
4 -> 7 : 211 |**********
8 -> 15 : 0
16 -> 31 : 0
32 -> 63 : 0
64 -> 127 : 1
128 -> 255 : 800 |**************************************|
slide 60:
bitehist.py Code: bcc examples/tracing/bitehist.py
slide 61:
bitehist.py Internals: the user-level Python program compiles a C BPF program to BPF bytecode and attaches it to an event with BPF.attach_kprobe(); the kernel verifier accepts the bytecode or returns an error, and statistics are read asynchronously from a map and printed
slide 62:
bitehist.py Annotated: "kprobe__" is a shortcut for BPF.attach_kprobe() (bcc examples/tracing/bitehist.py)
slide 63:
Current Complications
• Initialize all variables
• Extra bpf_probe_read()s
• BPF_PERF_OUTPUT()
• Verifier errors
slide 64:
bcc Tutorials
1. https://github.com/iovisor/bcc/blob/master/INSTALL.md
2. …/docs/tutorial.md
3. …/docs/tutorial_bcc_python_developer.md
4. …/docs/reference_guide.md
5. .../CONTRIBUTING-SCRIPTS.md
slide 65:
bcc lua: bcc examples/lua/strlen_count.lua
slide 66:
Summary: BPF Tracing in Linux
• 3.19: sockets
• 3.19: maps
• 4.1: kprobes
• 4.3: uprobes
• 4.4: BPF output
• 4.6: stacks
• 4.7: tracepoints
• 4.9: profiling
• 4.9: PMCs
Future work: more tooling, bug fixes, better errors, visualizations, GUIs, high-level language
https://github.com/iovisor/bcc#tools
slide 67:
Links & References
iovisor bcc:
• https://github.com/iovisor/bcc
• https://github.com/iovisor/bcc/tree/master/docs
• http://www.brendangregg.com/blog/ (search for "bcc")
• http://blogs.microsoft.co.il/sasha/2016/02/14/two-new-ebpf-tools-memleak-and-argdist/
• On designing tracing tools: https://www.youtube.com/watch?v=uibLwoVKjec
BPF:
• https://www.kernel.org/doc/Documentation/networking/filter.txt
• https://github.com/iovisor/bpf-docs
• https://suchakra.wordpress.com/tag/bpf/
Flame Graphs:
• http://www.brendangregg.com/flamegraphs.html
• http://www.brendangregg.com/blog/2016-01-20/ebpf-offcpu-flame-graph.html
• http://www.brendangregg.com/blog/2016-02-01/linux-wakeup-offwake-profiling.html
Dynamic Instrumentation:
• http://ftp.cs.wisc.edu/par-distr-sys/papers/Hollingsworth94Dynamic.pdf
• https://en.wikipedia.org/wiki/DTrace
• DTrace: Dynamic Tracing in Oracle Solaris, Mac OS X and FreeBSD; Brendan Gregg, Jim Mauro; Prentice Hall, 2011
Netflix Tech Blog on Vector:
• http://techblog.netflix.com/2015/04/introducing-vector-netflixs-on-host.html
• Greg Law's GDB talk: https://www.youtube.com/watch?v=PorfLSr3DDI
• Linux Performance: http://www.brendangregg.com/linuxperf.html
slide 68:
Thanks. Questions?
iovisor bcc: https://github.com/iovisor/bcc
http://www.brendangregg.com
http://slideshare.net/brendangregg
bgregg@netflix.com
@brendangregg
Thanks to Alexei Starovoitov (Facebook), Brenden Blanco (PLUMgrid), Sasha Goldshtein (Sela), Daniel Borkmann (Cisco), Wang Nan (Huawei), and other BPF and bcc contributors!