USENIX LISA 2016: Linux 4.x Tracing Tools, Using BPF Superpowers

Video: https://www.youtube.com/watch?v=UmOU3I36T2U&t=1151s

Talk for USENIX LISA 2016 by Brendan Gregg.

Description: "The Linux 4.x series heralds a new era of Linux performance analysis, with the long-awaited integration of a programmable tracer: Enhanced BPF (eBPF). Formerly the Berkeley Packet Filter, BPF has been enhanced in Linux to provide system tracing capabilities, and integrates with dynamic tracing (kprobes and uprobes) and static tracing (tracepoints and USDT). This has allowed dozens of new observability tools to be developed so far: for example, measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more. These lead to performance wins large and small, especially when instrumenting areas that previously had zero visibility. Tracing superpowers have finally arrived.

In this talk I'll show you how to use BPF in the Linux 4.x series, and I'll summarize the different tools and front ends available, with a focus on iovisor bcc. bcc is an open source project to provide a Python front end for BPF, and comes with dozens of new observability tools (many of which I developed). These tools include new BPF versions of old classics, and many new tools, including: execsnoop, opensnoop, funccount, trace, biosnoop, bitesize, ext4slower, ext4dist, tcpconnect, tcpretrans, runqlat, offcputime, offwaketime, and many more. I'll also summarize use cases and some long-standing issues that can now be solved, and how we are using these capabilities at Netflix."

PDF: LISA2016_BPF_tools_16_9.pdf

Keywords (from pdftotext):

slide 1:
    Linux 4.x Tracing Tools
    Using BPF Superpowers
    Brendan Gregg, Netflix
    bgregg@netflix.com
    December 4–9, 2016 | Boston, MA
    www.usenix.org/lisa16
    #lisa16
    
slide 2:
slide 3:
slide 4:
    Demo time
    GIVE ME 15 MINUTES
    AND I'LL CHANGE YOUR VIEW
    OF LINUX TRACING
    inspired by Greg Law's: Give me fifteen minutes and I'll change your view of GDB
    
slide 5:
    Demo
    
slide 6:
    LISA 2014
    perf-tools
    (ftrace)
    
slide 7:
    LISA 2016
    bcc tools
    (BPF)
    
slide 8:
    Wielding Superpowers
    WHAT DYNAMIC TRACING CAN DO
    
slide 9:
    Previously
    • Metrics were vendor chosen, closed source, and incomplete
    • The art of inference & making do
    # ps alx
    F S UID
    3 S
    1 S
    1 S
    […]
    PID
    PPID CPU PRI NICE
    0 30
    0 30
    ADDR
    WCHAN TTY TIME CMD
    4412 ? 186:14 swapper
    46520 ?
    0:00 /etc/init
    46554 co 0:00 –sh
    
slide 10:
    Crystal Ball Observability
    Dynamic Tracing
    
slide 11:
    Linux Event Sources
    
slide 12:
    Event Tracing Efficiency
    Eg, tracing TCP retransmits
    Old way: packet capture
    – tcpdump: 1. read, 2. dump
    – Analyzer: 1. read, 2. process, 3. print
    – (diagram labels: Kernel, buffer, file system, disks, send, receive)
    New way: dynamic tracing
    – Tracer: 1. configure, 2. read
    – (diagram label: tcp_retransmit_skb())
    
slide 13:
    New CLI Tools
    # biolatency
    Tracing block device I/O... Hit Ctrl-C to end.
         usecs               : count     distribution
             4 -> 7          : 0
             8 -> 15         : 0
            16 -> 31         : 0
            32 -> 63         : 0
            64 -> 127        : 1
           128 -> 255        : 12       |********
           256 -> 511        : 15       |**********
           512 -> 1023       : 43       |*******************************
          1024 -> 2047       : 52       |**************************************|
          2048 -> 4095       : 47       |**********************************
          4096 -> 8191       : 52       |**************************************|
          8192 -> 16383      : 36       |**************************
         16384 -> 32767      : 15       |**********
         32768 -> 65535      : 2
         65536 -> 131071     : 2
    
slide 14:
    New Visualizations and GUIs
    
slide 15:
    Netflix Intended Usage
    Self-service UI:
    Flame Graphs
    Tracing Reports
    should be open sourced; you may also build/buy your own
    
slide 16:
    Conquer Performance
    Measure anything
    
slide 17:
    Introducing BPF
    BPF TRACING
    
slide 18:
    A Linux Tracing Timeline
    1990's: Static tracers, prototype dynamic tracers
    2000: LTT + DProbes (dynamic tracing; not integrated)
    2004: kprobes (2.6.9)
    2005: DTrace (not Linux), SystemTap (out-of-tree)
    2008: ftrace (2.6.27)
    2009: perf (2.6.31)
    2009: tracepoints (2.6.32)
    2010-2016: ftrace & perf_events enhancements
    2014-2016: BPF patches
    also: LTTng, ktap, sysdig, ...
    
slide 19:
    Ye Olde BPF
    Berkeley Packet Filter
    # tcpdump host 127.0.0.1 and port 22 -d
    (000) ldh      [12]
    (001) jeq      #0x800           jt 2    jf 18
    (002) ld       [26]
    (003) jeq      #0x7f000001      jt 6    jf 4
    (004) ld       [30]
    (005) jeq      #0x7f000001      jt 6    jf 18
    (006) ldb      [23]
    (007) jeq      #0x84            jt 10   jf 8
    (008) jeq      #0x6             jt 10   jf 9
    (009) jeq      #0x11            jt 10   jf 18
    (010) ldh      [20]
    (011) jset     #0x1fff          jt 18   jf 12
    (012) ldxb     4*([14]&0xf)
    (013) ldh      [x + 14]
    (014) jeq      #0x16            jt 17   jf 15
    (015) ldh      [x + 16]
    (016) jeq      #0x16            jt 17   jf 18
    (017) ret      #65535
    (018) ret
    
slide 20:
    BPF Enhancements by Linux Version
    3.18: bpf syscall
    3.19: sockets
    4.1: kprobes
    4.4: bpf_perf_event_output
    4.6: stack traces
    4.7: tracepoints
    4.9: profiling
    eg, Ubuntu:
    
slide 21:
    Enhanced
    BPF
    is in Linux
    
slide 22:
    BPF
    • aka eBPF == enhanced Berkeley Packet Filter
    – Lead developer: Alexei Starovoitov (Facebook)
    • Many uses
    – Virtual networking
    – Security
    – Programmatic tracing
    • Different front-ends
    – C, perf, bcc, ply, …
    BPF mascot
    
slide 23:
    BPF for Tracing
    User Program
    1. generate BPF bytecode
    2. load
    3. perf_output: per-event data
    3. async read: statistics
    Kernel
    verifier, BPF, maps
    event sources: kprobes, uprobes, tracepoints
    
slide 24:
    Raw BPF
    samples/bpf/sock_example.c
    87 lines truncated
    
slide 25:
    C/BPF
    samples/bpf/tracex1_kern.c
    58 lines truncated
    
slide 26:
    bcc
    • BPF Compiler Collection
    – https://github.com/iovisor/bcc
    – Lead developer: Brenden Blanco (PlumGRID)
    • Includes tracing tools
    • Front-ends
    – Python
    – Lua
    – C helper libraries
    Tracing layers (diagram labels): bcc tool, bcc tool; bcc front-ends (Python, lua); user / kernel; Kernel: Events, BPF
    
slide 27:
    bcc/BPF
    bcc examples/tracing/bitehist.py
    entire program
    
slide 28:
    ply/BPF
    https://github.com/wkz/ply/blob/master/README.md
    entire program
    
slide 29:
    The Tracing Landscape, Dec 2016
    (my opinion)
    Axes: Ease of use, from (brutal) to (less brutal); Stage of Development, from (alpha) to (mature); Scope & Capability
    Tracers plotted: dtrace4L., ktap, sysdig, perf, stap, ftrace, bcc/BPF, ply/BPF, C/BPF, Raw BPF
    
slide 30:
    State of BPF, Dec 2016
    Dynamic tracing, kernel-level (BPF support for kprobes)
    Dynamic tracing, user-level (BPF support for uprobes)
    Static tracing, kernel-level (BPF support for tracepoints)
    Timed sampling events (BPF with perf_event_open)
    PMC events (BPF with perf_event_open)
    Filtering (via BPF programs)
    Debug output (bpf_trace_printk())
    Per-event output (bpf_perf_event_output())
    Basic variables (global & per-thread variables, via BPF maps)
    Associative arrays (via BPF maps)
    Frequency counting (via BPF maps)
    Histograms (power-of-2, linear, and custom, via BPF maps)
    Timestamps and time deltas (bpf_ktime_get_ns() and BPF)
    Stack traces, kernel (BPF stackmap)
    Stack traces, user (BPF stackmap)
    Overwrite ring buffers
    String factory (stringmap)
    Optional: bounded loops,
slide 31:
    For end-users
    HOW TO USE BCC/BPF
    
slide 32:
    Installation
    https://github.com/iovisor/bcc/blob/master/INSTALL.md
    • eg, Ubuntu Xenial:
    # echo "deb [trusted=yes] https://repo.iovisor.org/apt/xenial xenial-nightly main" | \
    sudo tee /etc/apt/sources.list.d/iovisor.list
    # sudo apt-get update
    # sudo apt-get install bcc-tools
    – puts tools in /usr/share/bcc/tools, and tools/old for older kernels
    – 16.04 is good, 16.10 better: more tools work
    – bcc should also arrive as an official Ubuntu snap
    
slide 33:
    Pre-bcc Performance Checklist
    1. uptime
    2. dmesg | tail
    3. vmstat 1
    4. mpstat -P ALL 1
    5. pidstat 1
    6. iostat -xz 1
    7. free -m
    8. sar -n DEV 1
    9. sar -n TCP,ETCP 1
    10. top
    http://techblog.netflix.com/2015/11/linux-performance-analysis-in-60s.html
    
slide 34:
    bcc General Performance Checklist
    execsnoop
    opensnoop
    ext4slower (…)
    biolatency
    biosnoop
    cachestat
    tcpconnect
    tcpaccept
    tcpretrans
    gethostlatency
    runqlat
    profile
    
slide 35:
    1. execsnoop
    # execsnoop
    PCOMM
    bash
    preconv
    man
    man
    man
    nroff
    nroff
    groff
    groff
    […]
    PID
    RET ARGS
    0 /usr/bin/man ls
    0 /usr/bin/preconv -e UTF-8
    0 /usr/bin/tbl
    0 /usr/bin/nroff -mandoc -rLL=169n -rLT=169n -Tutf8
    0 /usr/bin/pager -s
    0 /usr/bin/locale charmap
    0 /usr/bin/groff -mtty-char -Tutf8 -mandoc -rLL=169n -rLT=169n
    0 /usr/bin/troff -mtty-char -mandoc -rLL=169n -rLT=169n -Tutf8
    0 /usr/bin/grotty
    
slide 36:
    2. opensnoop
    # opensnoop
    PID
    COMM
    27159 catalina.sh
    redis-server
    redis-server
    30668 sshd
    30668 sshd
    30668 sshd
    30668 sshd
    30668 sshd
    30668 sshd
    30668 sshd
    30668 sshd
    30668 sshd
    30668 sshd
    snmp-pass
    […]
    FD ERR PATH
    0 /apps/tomcat8/bin/setclasspath.sh
    0 /proc/4057/stat
    0 /proc/2360/stat
    0 /proc/sys/kernel/ngroups_max
    0 /etc/group
    0 /root/.ssh/authorized_keys
    0 /root/.ssh/authorized_keys
    2 /var/run/nologin
    2 /etc/nologin
    0 /etc/login.defs
    0 /etc/passwd
    0 /etc/shadow
    0 /etc/localtime
    0 /proc/cpuinfo
    
slide 37:
    3. ext4slower
    # ext4slower 1
    Tracing ext4 operations slower than 1 ms
    TIME
    COMM
    PID
    T BYTES
    OFF_KB
    06:49:17 bash
    R 128
    06:49:17 cksum
    R 39552
    06:49:17 cksum
    R 96
    06:49:17 cksum
    R 96
    06:49:17 cksum
    R 10320
    06:49:17 cksum
    R 65536
    06:49:17 cksum
    R 55400
    06:49:17 cksum
    R 36792
    06:49:17 cksum
    R 15008
    06:49:17 cksum
    R 6123
    06:49:17 cksum
    R 6280
    06:49:17 cksum
    R 27696
    06:49:17 cksum
    R 58080
    […]
    LAT(ms) FILENAME
    7.75 cksum
    1.34 [
    5.36 2to3-2.7
    14.94 2to3-3.4
    6.82 411toppm
    4.01 a2p
    8.77 ab
    16.34 aclocal-1.14
    19.31 acpi_listen
    17.23 add-apt-repository
    18.40 addpart
    2.16 addr2line
    10.11 ag
    also: btrfsslower, xfsslower, zfsslower
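The "slower than 1 ms" filter above comes down to a timestamp delta: record a timestamp when an operation starts, subtract it when the operation returns, and report only the deltas over the threshold. The following is a minimal user-space Python sketch of that idea, not the ext4slower source. The class name SlowOpTracker, its method names, and the use of a plain dict in place of a BPF map are all assumptions made for illustration.

```python
class SlowOpTracker:
    """Hypothetical helper: pair entry/return timestamps and keep slow operations."""

    def __init__(self, min_ms=1.0):
        self.min_ns = int(min_ms * 1_000_000)  # threshold in nanoseconds
        self.start_ns = {}                     # thread id -> entry timestamp (ns)

    def on_entry(self, tid, ts_ns):
        # Remember when this thread entered the traced operation.
        self.start_ns[tid] = ts_ns

    def on_return(self, tid, ts_ns):
        # Return the latency in ms if it exceeds the threshold, else None.
        start = self.start_ns.pop(tid, None)
        if start is None:
            return None  # entry was not seen (e.g. tracing started mid-call)
        delta_ns = ts_ns - start
        if delta_ns <= self.min_ns:
            return None  # fast enough: filtered out
        return delta_ns / 1_000_000


tracker = SlowOpTracker(min_ms=1.0)
tracker.on_entry(tid=7, ts_ns=1_000_000)
print(tracker.on_return(tid=7, ts_ns=8_750_000))  # a 7.75 ms operation is reported
tracker.on_entry(tid=8, ts_ns=0)
print(tracker.on_return(tid=8, ts_ns=500_000))    # a 0.5 ms operation is dropped (None)
```

Filtering at the point of measurement means only the slow events need to be printed, which keeps the output and overhead small on busy systems.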
    
slide 38:
    4. biolatency
    # biolatency -mT 1
    Tracing block device I/O... Hit Ctrl-C to end.
    06:20:16
         msecs               : count     distribution
             0 -> 1          : 36       |**************************************|
             2 -> 3          : 1
             4 -> 7          : 3        |***
             8 -> 15         : 17       |*****************
            16 -> 31         : 33       |**********************************
            32 -> 63         : 7        |*******
            64 -> 127        : 6        |******
    […]
    
slide 39:
    5. biosnoop
    # biosnoop
    TIME(s)
    […]
    COMM
    supervise
    supervise
    supervise
    supervise
    supervise
    supervise
    supervise
    supervise
    xfsaild/md0
    xfsaild/md0
    xfsaild/md0
    xfsaild/md0
    kworker/0:3
    supervise
    PID
    DISK
    xvda1
    xvda1
    xvda1
    xvda1
    xvda1
    xvda1
    xvda1
    xvda1
    xvdc
    xvdb
    xvdb
    xvdb
    xvdb
    xvda1
    SECTOR
    BYTES
    LAT(ms)
    
slide 40:
    6. cachestat
    # cachestat
    HITS
    MISSES
    […]
    DIRTIES
    READ_HIT% WRITE_HIT%
    80.4%
    19.6%
    96.2%
    3.7%
    89.6%
    10.4%
    100.0%
    0.0%
    100.0%
    0.0%
    55.2%
    4.5%
    100.0%
    0.0%
    99.9%
    0.0%
    100.0%
    0.0%
    100.0%
    0.0%
    BUFFERS_MB
    CACHED_MB
    
slide 41:
    7. tcpconnect
    # tcpconnect
    PID
    COMM
    IP SADDR
    DADDR
    DPORT
    25333 recordProgra 4 127.0.0.1
    25338 curl
    4 100.66.3.172
    25340 curl
    4 100.66.3.172
    25342 curl
    4 100.66.3.172
    25344 curl
    4 100.66.3.172
    25365 recordProgra 4 127.0.0.1
    26119 ssh
    6 ::1
    ::1
    25388 recordProgra 4 127.0.0.1
    25220 ssh
    6 fe80::8a3:9dff:fed5:6b19 fe80::8a3:9dff:fed5:6b19 22
    […]
    
slide 42:
    8. tcpaccept
    # tcpaccept
    PID
    COMM
    IP RADDR
    LADDR
    LPORT
    sshd
    4 11.16.213.254
    redis-server 4 127.0.0.1
    redis-server 4 127.0.0.1
    redis-server 4 127.0.0.1
    redis-server 4 127.0.0.1
    sshd
    6 ::1
    ::1
    redis-server 4 127.0.0.1
    redis-server 4 127.0.0.1
    sshd
    6 fe80::8a3:9dff:fed5:6b19 fe80::8a3:9dff:fed5:6b19 22
    redis-server 4 127.0.0.1
    […]
    
slide 43:
    9. tcpretrans
    # tcpretrans
    TIME     PID    IP LADDR:LPORT          T> RADDR:RPORT          STATE
    01:55:05 0      4  10.153.223.157:22    R> 69.53.245.40:34619   ESTABLISHED
    01:55:05 0      4  10.153.223.157:22    R> 69.53.245.40:34619   ESTABLISHED
    01:55:17 0      4  10.153.223.157:22    R> 69.53.245.40:22957   ESTABLISHED
    […]
    
slide 44:
    10. gethostlatency
    # gethostlatency
    TIME      PID    COMM    LATms HOST
    06:10:24  28011  wget    90.00 www.iovisor.org
    06:10:28  28127  wget     0.00 www.iovisor.org
    06:10:41  28404  wget     9.00 www.netflix.com
    06:10:48  28544  curl    35.00 www.netflix.com.au
    06:11:10  29054  curl    31.00 www.plumgrid.com
    06:11:16  29195  curl     3.00 www.facebook.com
    06:11:24  25313  wget     3.00 www.usenix.org
    06:11:25  29404  curl    72.00 foo
    06:11:28  29475  curl     1.00 foo
    […]
    
slide 45:
    11. runqlat
    # runqlat -m 5
    Tracing run queue latency... Hit Ctrl-C to end.
         msecs               : count     distribution
             0 -> 1          : 3818     |****************************************|
             2 -> 3          : 39
             4 -> 7          : 39
             8 -> 15         : 62
            16 -> 31         : 2214     |***********************
            32 -> 63         : 226      |**
    […]
    
slide 46:
    12. profile
    # profile
    Sampling at 49 Hertz of all threads by user + kernel stack... Hit Ctrl-C to end.
    […]
    ffffffff813d0af8 __clear_user
    ffffffff813d5277 iov_iter_zero
    ffffffff814ec5f2 read_iter_zero
    ffffffff8120be9d __vfs_read
    ffffffff8120c385 vfs_read
    ffffffff8120d786 sys_read
    ffffffff817cc076 entry_SYSCALL_64_fastpath
    00007fc5652ad9b0 read
    dd (25036)
    […]
    
slide 47:
    Other bcc Tracing Tools
    • Single-purpose
    – bitesize
    – capable
    – memleak
    – ext4dist (btrfs, …)
    • Multi tools
    – funccount
    – argdist
    – trace
    – stackcount
    https://github.com/iovisor/bcc#tools
    
slide 48:
    trace
    • Trace custom events. Ad hoc analysis:
    # trace 'sys_read (arg3 > 20000) "read %d bytes", arg3'
    TIME
    PID
    COMM
    FUNC
    05:18:23 4490
    sys_read
    read 1048576 bytes
    05:18:23 4490
    sys_read
    read 1048576 bytes
    05:18:23 4490
    sys_read
    read 1048576 bytes
    05:18:23 4490
    sys_read
    read 1048576 bytes
    by Sasha Goldshtein
    
slide 49:
    trace One-Liners
    trace -K blk_account_io_start
    Trace this kernel function, and print info with a kernel stack trace
    trace 'do_sys_open "%s", arg2'
    Trace the open syscall and print the filename being opened
    trace 'sys_read (arg3 > 20000) "read %d bytes", arg3'
    Trace the read syscall and print a message for reads >20000 bytes
    trace r::do_sys_return
    Trace the return from the open syscall
    trace 'c:open (arg2 == 42) "%s %d", arg1, arg2'
    Trace the open() call from libc only if the flags (arg2) argument is 42
    trace 'p:c:write (arg1 == 1) "writing %d bytes to STDOUT", arg3'
    Trace the write() call from libc to monitor writes to STDOUT
    trace 'r:c:malloc (retval) "allocated = %p", retval'
    Trace returns from malloc and print non-NULL allocated buffers
    trace 't:block:block_rq_complete "sectors=%d", args->nr_sector'
    Trace the block_rq_complete kernel tracepoint and print # of tx sectors
    trace 'u:pthread:pthread_create (arg4 != 0)'
    Trace the USDT probe pthread_create when its 4th argument is non-zero
    from: trace -h
    
slide 50:
    argdist
    # argdist -H 'p::tcp_cleanup_rbuf(struct sock *sk, int copied):int:copied'
    [15:34:45]
         copied              : count     distribution
             0 -> 1          : 15088    |**********************************
             2 -> 3          : 0
             4 -> 7          : 0
             8 -> 15         : 0
            16 -> 31         : 0
            32 -> 63         : 0
            64 -> 127        : 4786     |***********
           128 -> 255        : 1
           256 -> 511        : 1
           512 -> 1023       : 4
          1024 -> 2047       : 11
          2048 -> 4095       : 5
          4096 -> 8191       : 27
          8192 -> 16383      : 105
         16384 -> 32767      : 0
         32768 -> 65535      : 10086    |***********************
         65536 -> 131071     : 60
        131072 -> 262143     : 17285    |****************************************|
    [...]
    by Sasha Goldshtein
    
slide 51:
    argdist One-Liners
    argdist -H 'p::__kmalloc(u64 size):u64:size'
    Print a histogram of allocation sizes passed to kmalloc
    argdist -p 1005 -C 'p:c:malloc(size_t size):size_t:size:size==16'
    Print a frequency count of how many times process 1005 called malloc for 16 bytes
    argdist -C 'r:c:gets():char*:$retval#snooped strings'
    Snoop on all strings returned by gets()
    argdist -H 'r::__kmalloc(size_t size):u64:$latency/$entry(size)#ns per byte'
    Print a histogram of nanoseconds per byte from kmalloc allocations
    argdist -C 'p::__kmalloc(size_t size, gfp_t flags):size_t:size:flags&GFP_ATOMIC'
    Print frequency count of kmalloc allocation sizes that have GFP_ATOMIC
    argdist -p 1005 -C 'p:c:write(int fd):int:fd' -T 5
    Print frequency counts of how many times writes were issued to a particular file descriptor
    number, in process 1005, but only show the top 5 busiest fds
    argdist -p 1005 -H 'r:c:read()'
    Print a histogram of error codes returned by read() in process 1005
    argdist -C 'r::__vfs_read():u32:$PID:$latency > 100000'
    Print frequency of reads by process where the latency was >0.1ms
    from: argdist -h
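The frequency-count one-liners above (-C, optionally with a filter and -T for the top entries) share one pattern: apply a predicate, count by a key, keep the busiest keys. The Python sketch below illustrates only that pattern in user space. It is not how argdist is implemented, since argdist counts in kernel context using BPF maps. The function name top_counts and the event dictionaries are hypothetical.

```python
from collections import Counter


def top_counts(events, key, predicate=None, top=None):
    """Count events by key, optionally filtered, and return the busiest keys first."""
    counts = Counter()
    for event in events:
        if predicate is None or predicate(event):
            counts[key(event)] += 1
    # most_common(None) returns every key; most_common(n) keeps the top n
    return counts.most_common(top)


# Made-up write() events: which file descriptors does process 1005 write to most?
events = [
    {"pid": 1005, "fd": 1},
    {"pid": 1005, "fd": 1},
    {"pid": 1005, "fd": 4},
    {"pid": 2222, "fd": 9},
    {"pid": 1005, "fd": 1},
    {"pid": 1005, "fd": 4},
    {"pid": 1005, "fd": 7},
]
busiest = top_counts(events,
                     key=lambda e: e["fd"],
                     predicate=lambda e: e["pid"] == 1005,
                     top=2)
print(busiest)
```

Counting at the source and reporting only the summary is what makes this style of tool cheap compared with dumping every event and post-processing it.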
    
slide 52:
    Coming to a GUI near you
    BCC/BPF VISUALIZATIONS
    
slide 53:
    Latency Heatmaps
    
slide 54:
    CPU + Off-CPU Flame Graphs
    • Can now be BPF optimized
    http://www.brendangregg.com/flamegraphs.html
    
slide 55:
    Off-Wake Flame Graphs
    • Shows blocking stack with waker stack
    – Better understand why blocked
    – Merged in-kernel using BPF
    – Include multiple waker stacks == chain graphs
    • We couldn't do this before
    
slide 56:
    Overview for tool developers
    HOW TO PROGRAM BCC/BPF
    
slide 57:
    Linux Event Sources
    (version feature arrived)
    BPF output: Linux 4.4
    BPF stacks: Linux 4.6
    Other diagram labels: Linux 4.1, Linux 4.3, Linux 4.7, Linux 4.9, Linux 4.9
    
slide 58:
    Methodology
    • Find/draw a functional diagram
    – Eg, storage I/O subsystem:
    • Apply performance methods
    – http://www.brendangregg.com/methodology.html
    1. Workload Characterization
    2. Latency Analysis
    3. USE Method
    • Start with the Q's, then find the A's
    
slide 59:
    bitehist.py Output
    # ./bitehist.py
    Tracing... Hit Ctrl-C to end.
         kbytes              : count     distribution
             0 -> 1          : 3
             2 -> 3          : 0
             4 -> 7          : 211      |**********
             8 -> 15         : 0
            16 -> 31         : 0
            32 -> 63         : 0
            64 -> 127        : 1
           128 -> 255        : 800      |**************************************|
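The output above is a power-of-2 histogram: each value lands in the bucket whose range doubles from the one before, and the bar length is scaled to the largest count. In bitehist.py the counting is done in kernel context in a BPF map; the pure-Python sketch below only imitates the bucketing and rendering so the output format is easier to read. The function names and the sample sizes are assumptions chosen to resemble the counts shown, not data from the talk.

```python
from collections import Counter


def bucket_bounds(value):
    """Return the (low, high) power-of-2 bucket that holds a non-negative int."""
    if value < 2:
        return (0, 1)  # 0 and 1 share the first bucket, as in the output above
    bits = value.bit_length()  # e.g. 5 needs 3 bits, so it falls in 4..7
    return (1 << (bits - 1), (1 << bits) - 1)


def log2_hist(values):
    """Count how many values fall into each power-of-2 bucket."""
    return Counter(bucket_bounds(v) for v in values)


def format_hist(counts, label="kbytes", width=38):
    """Render the buckets as text rows, lowest bucket first, including empty ones."""
    if not counts:
        return []
    top = max(high for _, high in counts)  # upper bound of the last used bucket
    peak = max(counts.values())            # the largest count gets a full-width bar
    rows = ["%20s : %-8s %s" % (label, "count", "distribution")]
    low, high = 0, 1
    while high <= top:
        n = counts.get((low, high), 0)
        stars = "*" * (width * n // peak)
        rows.append("%9d -> %-8d : %-8d |%-*s|" % (low, high, n, width, stars))
        low, high = high + 1, high * 2 + 1  # next bucket: 2..3, 4..7, 8..15, ...
    return rows


# Made-up I/O sizes in Kbytes, picked to resemble the counts in the slide
sizes_kb = [0] * 3 + [4] * 211 + [64] + [128] * 800
for row in format_hist(log2_hist(sizes_kb)):
    print(row)
```

Storing only a handful of bucket counters, rather than every I/O size, is why this kind of summary can be kept in the kernel and read once at the end.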
    
slide 60:
    bitehist.py Code
    bcc examples/tracing/bitehist.py
    
slide 61:
    bitehist.py Internals
    User-Level
    Kernel
    C BPF Program
    compile
    BPF Bytecode
    Event
    BPF.attach_kprobe()
    Verifier
    BPF Bytecode
    error
    Statistics
    Python Program
    print
    async read
    Map
    
slide 62:
    bitehist.py Annotated
    Map
    C BPF Program
    Python Program
    Event
    "kprobe__" is a shortcut for BPF.attach_kprobe()
    Statistics
    bcc examples/tracing/bitehist.py
    
slide 63:
    Current Complications
    Initialize all variables
    Extra bpf_probe_read()s
    BPF_PERF_OUTPUT()
    Verifier errors
    
slide 64:
    bcc Tutorials
    1. https://github.com/iovisor/bcc/blob/master/INSTALL.md
    2. …/docs/tutorial.md
    3. …/docs/tutorial_bcc_python_developer.md
    4. …/docs/reference_guide.md
    5. .../CONTRIBUTING-SCRIPTS.md
    
slide 65:
    bcc lua
    bcc examples/lua/strlen_count.lua
    
slide 66:
    Summary
    BPF Tracing in Linux
    • 3.19: sockets
    • 3.19: maps
    • 4.1: kprobes
    • 4.3: uprobes
    • 4.4: BPF output
    • 4.6: stacks
    • 4.7: tracepoints
    • 4.9: profiling
    • 4.9: PMCs
    Future Work
    • More tooling
    • Bug fixes
    • Better errors
    • Visualizations
    • GUIs
    • High-level language
    https://github.com/iovisor/bcc#tools
    
slide 67:
    Links & References
    iovisor bcc:
    • https://github.com/iovisor/bcc https://github.com/iovisor/bcc/tree/master/docs
    • http://www.brendangregg.com/blog/ (search for "bcc")
    • http://blogs.microsoft.co.il/sasha/2016/02/14/two-new-ebpf-tools-memleak-and-argdist/
    • On designing tracing tools: https://www.youtube.com/watch?v=uibLwoVKjec
    BPF:
    • https://www.kernel.org/doc/Documentation/networking/filter.txt
    • https://github.com/iovisor/bpf-docs
    • https://suchakra.wordpress.com/tag/bpf/
    Flame Graphs:
    • http://www.brendangregg.com/flamegraphs.html
    • http://www.brendangregg.com/blog/2016-01-20/ebpf-offcpu-flame-graph.html
    • http://www.brendangregg.com/blog/2016-02-01/linux-wakeup-offwake-profiling.html
    Dynamic Instrumentation:
    • http://ftp.cs.wisc.edu/par-distr-sys/papers/Hollingsworth94Dynamic.pdf
    • https://en.wikipedia.org/wiki/DTrace
    • DTrace: Dynamic Tracing in Oracle Solaris, Mac OS X and FreeBSD, Brendan Gregg, Jim Mauro; Prentice Hall 2011
    Netflix Tech Blog on Vector:
    • http://techblog.netflix.com/2015/04/introducing-vector-netflixs-on-host.html
    Greg Law's GDB talk:
    • https://www.youtube.com/watch?v=PorfLSr3DDI
    Linux Performance:
    • http://www.brendangregg.com/linuxperf.html
    
slide 68:
    Thanks
    Questions?
    iovisor bcc: https://github.com/iovisor/bcc
    http://www.brendangregg.com
    http://slideshare.net/brendangregg
    bgregg@netflix.com
    @brendangregg
    Thanks to Alexei Starovoitov (Facebook), Brenden Blanco
    (PLUMgrid), Sasha Goldshtein (Sela), Daniel Borkmann (Cisco),
    Wang Nan (Huawei), and other BPF and bcc contributors!