Systems Performance: Enterprise and the Cloud, 2nd Edition (2020)
This is the official site for the book Systems Performance: Enterprise and the Cloud, 2nd Edition, published by Addison Wesley (2020). Here I'll describe the book, link to related content, and list errata.
The first edition has been very successful: it has become recommended or required reading at many companies, and has been published in Chinese, Japanese, Polish, and Korean translations. I've had many emails from people studying the book for the Facebook engineering interview (I'm glad it's helpful).
What is New in the Second Edition?
The second edition adds content on BPF, BCC, bpftrace, perf, and Ftrace, mostly removes Solaris, makes numerous updates to Linux and cloud computing, and includes general improvements and additions. Since writing the first edition I now have over six years of experience as a senior performance engineer at Netflix, working on new technologies with other engineering experts. This experience has helped me to improve this book.
Chapters are structured to first cover durable skills (models, architecture, and methodologies) and then implementation with tools and tuning. This will be evident to those who read the first edition: most chapters begin with only light changes since the first edition, but the changes increase as each chapter progresses.
Why Systems Performance
Systems performance is an important skill for all computer users, whether you're trying to understand why your laptop is slow or optimizing the performance of a large-scale compute environment (for example, Facebook's datacenters or the Netflix cloud). Systems performance is the study of application, operating system, kernel, and hardware performance.
There are two general goals:
- Improving price/performance
- Reducing latency outliers
Other activities of systems performance include benchmarking to evaluate systems, capacity planning, bottleneck elimination, and scalability analysis – so that you discover scalability limiters early, in time to fix them.
Topics are introduced in an OS-agnostic way, then Linux is covered as the primary example.
This book is primarily for system administrators, system reliability engineers, performance engineers, support staff, and other operators in enterprise and cloud environments. It is also a useful reference for developers, database administrators, and web server administrators who would like to understand operating system and application performance.
Why This Book is Different
While it covers performance tools and the background for understanding them, what makes this book different is the inclusion of many performance methodologies, including those covered briefly in my USENIX 2012 talk. I've been teaching and developing systems performance classes on and off for over ten years, and have found methodologies to be crucial for giving students a starting point and then guiding them through performance activities. The USE Method is a methodology I developed for this purpose.
Table of Contents
3. Operating Systems
4. Observability Tools
8. File Systems
11. Cloud Computing
16. Case Study
The draft is roughly 800 pages.
An eBook will be available at some point. It will not be a Safari "rough cut" after my experience with the BPF book.
Thanks to all the reviewers, and to Deirdré Straughan for editing another one of my books!