sysdig CCWFS 2016: Designing Tracing Tools
Video: https://www.youtube.com/watch?v=uibLwoVKjecKeynote by Brendan Gregg for Sysdig Camp-Con-World-Fest-Summit 2016.
Description: "You have a system with an advanced programmatic tracer: do you know what to do with it? Brendan has used numerous tracers in production environments, and has published hundreds of tracing-based tools. In this talk he will share tips and know-how for creating CLI tracing tools and GUI visualizations, to solve real problems effectively. Programmatic tracing is an amazing superpower, and this talk will show you how to wield it!"
next prev 1/40 | |
next prev 2/40 | |
next prev 3/40 | |
next prev 4/40 | |
next prev 5/40 | |
next prev 6/40 | |
next prev 7/40 | |
next prev 8/40 | |
next prev 9/40 | |
next prev 10/40 | |
next prev 11/40 | |
next prev 12/40 | |
next prev 13/40 | |
next prev 14/40 | |
next prev 15/40 | |
next prev 16/40 | |
next prev 17/40 | |
next prev 18/40 | |
next prev 19/40 | |
next prev 20/40 | |
next prev 21/40 | |
next prev 22/40 | |
next prev 23/40 | |
next prev 24/40 | |
next prev 25/40 | |
next prev 26/40 | |
next prev 27/40 | |
next prev 28/40 | |
next prev 29/40 | |
next prev 30/40 | |
next prev 31/40 | |
next prev 32/40 | |
next prev 33/40 | |
next prev 34/40 | |
next prev 35/40 | |
next prev 36/40 | |
next prev 37/40 | |
next prev 38/40 | |
next prev 39/40 | |
next prev 40/40 |
PDF: SysdigSummit2016_DesigningTracingTools.pdf
Keywords (from pdftotext):
slide 1:
Designing Tracing Tools Brendan Gregg, Senior Performance Architectslide 2:
Wielding Superpowersslide 3:
I'm currently developing more tracing tools (bcc/BPF)slide 4:
Tool Design • For tool developers • For everyone else: what you can ask for – Tool templates – GUI visualizations • The following is applicable to all tracers – sysdig, bcc/BPF, DTrace, SystemTap, LTTng, …slide 5:
Methodologiesslide 6:
Methodology-driven Design • Methodologies provide ideas for purposeful tools • Find/draw a functional diagram, apply methods Operating Systems See: http://www.brendangregg.com/methodology.htmlslide 7:
Methodology Examples Eg, at the syscall layer (well known & documented): • Workload Characterization – exec() or open() per-event trace (execsnoop, opensnoop) – connect()/accept() per-event trace (tcpconnect, tcpaccept) – read()/write() size histogram (one-liners) • Latency Analysis – read()/write() latency histogram (funclatency, …) • USE Method – network utilization by thread (not done yet) – syscall errors (fserrors, soerrors)slide 8:
CLI Tracing Toolsslide 9:
CLI Templates 1. Per event output *snoop, *slower 0, … 2. Filtered event output *slower 3. Interval summary *stat, *top 4. Count summary *count 5. Histogram summary *dist, *latency 6. Heatmap summary spectrogram.lua, subsecoffset.lua, …slide 10:
Template 1: Per Event Output Examples: *snoop, *slower 0, … # opensnoop PID COMM 10085 sshd 10085 sshd 10085 sshd 10085 sshd 10085 sshd 10085 sshd 10085 sshd 10085 sshd 10085 sshd 10085 sshd 10085 sshd 10085 sshd 10085 sshd 10085 sshd 10085 sshd 10085 sshd […] FD ERR PATH 0 /lib/x86_64-linux-gnu/libkeyutils.so.1 0 /lib/x86_64-linux-gnu/libresolv.so.2 0 /lib/x86_64-linux-gnu/libgpg-error.so.0 0 /dev/urandom 2 /lib/x86_64-linux-gnu/.libcrypto.so.1.0.0.hmac 2 /proc/sys/crypto/fips_enabled 0 /proc/filesystems 0 /dev/null 0 /proc/10085/fd 0 /usr/lib/ssl/openssl.cnf 0 /etc/gai.conf 0 /etc/nsswitch.conf 0 /etc/ld.so.cache 0 /lib/x86_64-linux-gnu/libnss_compat.so.2 0 /etc/ld.so.cache 0 /lib/x86_64-linux-gnu/libnss_nis.so.2slide 11:
Template 2: Filtered Event Output Examples: *slower # sysdig -c fileslower 1 TIME PROCESS 2014-04-13 20:40:43.973 cksum 2014-04-13 20:40:44.187 cksum 2014-04-13 20:40:44.689 cksum 2014-04-13 20:40:45.005 cksum 2014-04-13 20:40:45.193 cksum […] TYPE read read read read read LAT(ms) FILE 2 /mnt/partial.0.0 1 /mnt/partial.0.0 2 /mnt/partial.0.0 2 /mnt/partial.0.0 1 /mnt/partial.0.0 Tools like this can also do all event output: # sysdig -c fileslower 0 TIME PROCESS 2014-04-13 20:59:04.414 ls 2014-04-13 20:59:04.414 ls 2014-04-13 20:59:04.414 ls 2014-04-13 20:59:04.414 ls 2014-04-13 20:59:04.414 ls 2014-04-13 20:59:04.415 ls 2014-04-13 20:59:04.415 ls [...] TYPE read read read read read read read LAT(ms) FILE 0 /lib/x86_64-linux-gnu/librt.so.1 0 /lib/x86_64-linux-gnu/libacl.so.1 0 /lib/x86_64-linux-gnu/libc.so.6 0 /lib/x86_64-linux-gnu/libdl.so.2 0 /lib/x86_64-linux-gnu/libattr.so.1 0 /proc/filesystems 0 /proc/filesystemsslide 12:
Template 3: Interval Summary Examples: *stat, *top # dcstat TIME 08:11:47: 08:11:48: 08:11:49: 08:11:50: 08:11:51: 08:11:52: 08:11:53: 08:11:54: 08:11:55: 08:11:56: 08:11:57: […] REFS/s SLOW/s MISS/s HIT%slide 13:
Template 4: Count Summary Examples: *count # funccount 'vfs_*' Tracing... Ctrl-C to end. ADDR FUNC ffffffff811efe81 vfs_create ffffffff811f24a1 vfs_rename ffffffff81215191 vfs_fsync_range ffffffff81231df1 vfs_lock_file ffffffff811e8dd1 vfs_fstatat ffffffff811e8d71 vfs_fstat ffffffff811e4381 vfs_write ffffffff811e8c71 vfs_getattr_nosec ffffffff811e8d41 vfs_getattr ffffffff811e3221 vfs_open ffffffff811e4251 vfs_read Detaching... COUNTslide 14:
Template 5: Histogram Summary Examples: *dist, *latency # biolatency Tracing block device I/O... Hit Ctrl-C to end. usecs : count distribution 4 ->gt; 7 : 0 8 ->gt; 15 : 0 16 ->gt; 31 : 0 32 ->gt; 63 : 0 64 ->gt; 127 : 1 128 ->gt; 255 : 12 |******** 256 ->gt; 511 : 15 |********** 512 ->gt; 1023 : 43 |******************************* 1024 ->gt; 2047 : 52 |**************************************| 2048 ->gt; 4095 : 47 |********************************** 4096 ->gt; 8191 : 52 |**************************************| 8192 ->gt; 16383 : 36 |************************** 16384 ->gt; 32767 : 15 |********** 32768 ->gt; 65535 : 2 65536 ->gt; 131071 : 2slide 15:
Template 6: Heatmap Summary Example: subsecoffset.lua (aka "spectrogram")slide 16:
slide 17:
Valuable Know what already exists, and what doesn'tslide 18:
Low Overhead (or documented) sysdig Kernel sysdig driver 1. enable 3. output lua program 2. async read syscalls ring buffer • Understand tracing internals – For example, sysdig's design has ~20x lower overhead than strace (it still has overhead: test and measure to see if it's acceptable) – Tracing overhead is usually relative to event rate • Design for low overhead, and document expectationsslide 19:
Documentation • Good tools have 3 docs: .TH Title heading Code comments .SH Section heading Man page .IP Indented paragraph Examples file .TP Indented paragraph with label Bold • Man page – troff, docbook, … • Examples file: common man macros (see groff_man(7)) Demonstrations of biosnoop, the Linux eBPF/bcc version. biosnoop traces block device I/O (disk I/O), and prints a line of output per I/O. Example: # ./biosnoop TIME(s) [...] COMM supervise PID DISK xvda1 SECTOR BYTES LAT(ms)slide 20:
Concise, intuitive, self-explanatory # ./iolatency Tracing block I/O. Output every 1 seconds. Ctrl-C to end. >gt;=(ms) ..slide 21:gt; 1 1 ->gt; 2 2 ->gt; 4 4 ->gt; 8 8 ->gt; 16 […] : I/O : 4381 : 9 : 5 : 0 : 1 |Distribution |######################################| • Useful startup message – What I'm tracing, when there's output, when I'll end • Vigorous tooling is concise – No wasted text; leave less useful output for non-default options • Unix philosophy: do one thing and do it well
POSIX-style Arguments # ./biolatency -h usage: biolatency [-h] [-T] [-Q] [-m] [-D] [interval] [count] Summarize block device I/O latency as a histogram positional arguments: interval output interval, in seconds count number of outputs optional arguments: -h, --help -T, --timestamp -Q, --queued -m, --milliseconds -D, --disks examples: ./biolatency ./biolatency 1 10 ./biolatency -mT 1 ./biolatency -Q ./biolatency -D show this help message and exit include timestamp on output include OS queued time in I/O time millisecond histogram print a histogram per disk device # summarize block I/O latency as a histogram # print 1 second summaries, 10 times # 1s summaries, milliseconds, and timestamps # include OS queued time in I/O time # show each disk device separatelyslide 22:
POSIX-style Arguments Option Alternate Expectation --all all events -c CMD --cmd … run this command -d SECONDS --duration … duration of tool execution --help help -i FILE --input … input file -i SECONDS --interval … summary interval -n name --name … this process name only -o FILE --output … output file -p PID --pid … this process ID only --by-process per-process ID breakdown -P PORT --port … this TCP port only -t or -T --[no]timestamp include or exclude timestamps --verbose verbose output --extended, --errors extended output, or only failures [interval [count]] summary interval, and # of outputsslide 23:
Testing, Testing, Testing • If you can't write the workload, you can't write the tool – Be it 10 lines of C, some shell, or dd – dd if=/dev/urandom of=/dev/null bs=1k count=23 • Known workload testing: create 23 events • Testing can be time consuming – I spend more time testing a tool than writing it – Automatic tool testing is a difficult problemslide 24:
Example: gethostlatency # gethostlatency TIME PID COMM 06:10:24 28011 wget 06:10:28 28127 wget 06:10:41 28404 wget 06:10:48 28544 curl 06:11:10 29054 curl 06:11:16 29195 curl 06:11:25 29404 curl 06:11:28 29475 curl LATms HOST 90.00 www.iovisor.org 0.00 www.iovisor.org 9.00 www.netflix.com 35.00 www.netflix.com.au 31.00 www.plumgrid.com 3.00 www.facebook.com 72.00 foo 1.00 fooslide 25:
Example: ext4slower # ext4slower 1 Tracing ext4 operations slower than 1 ms TIME COMM PID T BYTES OFF_KB 06:49:17 bash R 128 06:49:17 cksum R 39552 06:49:17 cksum R 96 06:49:17 cksum R 96 06:49:17 cksum R 10320 06:49:17 cksum R 65536 06:49:17 cksum R 55400 06:49:17 cksum R 36792 06:49:17 cksum R 15008 06:49:17 cksum R 6123 repository 06:49:17 cksum R 6280 06:49:17 cksum R 27696 06:49:17 cksum R 58080 06:49:17 cksum R 906 […] LAT(ms) FILENAME 7.75 cksum 1.34 [ 5.36 2to3-2.7 14.94 2to3-3.4 6.82 411toppm 4.01 a2p 8.77 ab 16.34 aclocal-1.14 19.31 acpi_listen 17.23 add-apt18.40 addpart 2.16 addr2line 10.11 ag 6.30 ec2-meta-dataslide 26:
Example: biolatency # biolatency -m 1 5 Tracing block device I/O... Hit Ctrl-C to end. msecs 0 ->gt; 1 2 ->gt; 3 4 ->gt; 7 8 ->gt; 15 16 ->gt; 31 32 ->gt; 63 64 ->gt; 127 […] : count : 36 : 1 : 3 : 17 : 33 : 7 : 6 distribution |**************************************| |*** |***************** |********************************** |******* |******slide 27:
GUI Tracing Toolsslide 28:
GUI Visualizations 1. Event logs 2. Tables 3. Line graphs 4. Histograms 5. Heatmaps (spectrographs) 6. Waterfall charts 7. Directed graphs 8. Flame graphs 9. Flame chartsslide 29:
Visualization 1: Event Logs https://commons.wikimedia.org/wiki/File:Wireshark_screenshot.pngslide 30:
Visualization 2: Tablesslide 31:
Visualization 3: Line Graphs http://www.paradyn.org/html/screen-shots.htmlslide 32:
Visualization 4: Histograms Or a density plot Or as a frequency trail (can cascade)slide 33:
Visualization 5: Heat Maps eg, Oracle ZFS Storage Appliance Analytics (DTrace-based)slide 34:
Visualization 5: Spectrogramsslide 35:
Visualization 6: Waterfall Chartsslide 36:
Visualization 7: Directed Graphsslide 37:
Visualization 8: Flame Graphs fstat from disk directory read from disk file read from disk path read from disk pipe write Commonly used with CPU profilers. Also useful for tracers: off-CPU time, ...slide 38:
Visualization 9: Flame Chartsslide 39:
Desirable Attributes • Valuable – Methodologies provide ideas for purposeful metrics • Documented – Tool tips, wikis • Tested • Real Time • Dashboards – To support methodologiesslide 40:
Thank You! http://www.brendangregg.com http://slideshare.net/brendangregg bgregg@netflix.com @brendangregg References & Links: – http://www.brendangregg.com/heatmaps.html – http://www.brendangregg.com/flamegraphs.html – http://www.slideshare.net/brendangregg/monitorama-2015-netflix-instance-analysis