Linux perf
The “perf” command is the official profiler and tracer for Linux, and its source is included in the Linux kernel tree (under tools/perf). It analyzes performance on Linux systems, collecting information by accessing the kernel through the perf_event_open system call.
The events that perf can list and measure fall into four broad types:
- Hardware events and hardware cache events
- Software events provided by the kernel (page faults, context switches, ...)
- Tracepoint events
- Custom probe events
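As a rough illustration, the event classes above can be enumerated with perf list. The sketch below wraps it in a small helper; the function name list_events is made up for this example, and the events actually shown depend on the kernel and CPU.

```shell
# List the events perf knows about, optionally restricted to one class.
# Classes accepted by perf list include hw, sw, cache and tracepoint;
# with no argument it prints every event it can find.
list_events() {
    perf list "$@"
}

# Example calls (uncomment to run; listing tracepoints usually needs root):
# list_events hw          # hardware events such as cpu-cycles
# list_events cache       # hardware cache events
# list_events sw          # software events such as page-faults
# list_events tracepoint  # static kernel tracepoints
```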
Characteristics
- Performance analysis of a specific program or of the whole system
- Control of the PMU (performance monitoring unit) features supported by various CPUs
- Count-based collection of various events (cpu-cycles, cache-misses)
- Statistical views of the collected performance data (TUI, GUI, etc.)
- Tracing of kernel API calls, including chains of nested calls
Effect
- Performance analysis across complex and diverse kernel versions
- Performance analysis with low overhead on the kernel and on the profiled programs
- Available on current CPU architectures (x86, ARM…)
- Makes it easier to analyze kernel behavior, system compatibility issues, and causes of performance degradation
Usage
To use perf, first check that it is installed by running “perf” with no arguments.
It should print a usage message that lists the available subcommands. On Ubuntu systems, if perf is not installed, the shell suggests the packages to install.
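The same check can be scripted. The following is a minimal sketch, assuming a POSIX shell; the variable name perf_status is only illustrative, and the package to install differs between distributions.

```shell
# Report whether a usable perf binary is on PATH.
# "perf --version" is run as well because some distributions ship a
# wrapper script that exists but fails when the real tool is missing.
if command -v perf >/dev/null 2>&1 && perf --version; then
    perf_status="installed"
else
    perf_status="missing"
    echo "perf not found or not runnable; install your distribution's perf package" >&2
fi
```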
Operation
perf has four basic modes of operation:
- counting: counting events in kernel context and printing a report (low overhead), e.g., “perf stat”.
- capture: recording events and writing them to a perf.data file, e.g., “perf record”.
- reporting: reading a perf.data file and dumping or summarizing it, e.g., “perf report”.
- live recording: recording and summarizing events live, e.g., “perf top”.
Whenever the perf.data file is in use, writing it adds overhead that grows with the traced event rate. perf uses ring buffers and dynamic wakeups to lower this overhead.
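The four modes listed above can be sketched as one small wrapper function each. The function names are hypothetical, and the 99 Hz rate and 5-second default are illustrative choices rather than recommendations; system-wide collection (-a) normally needs root.

```shell
# One wrapper per mode; the optional argument is a duration in seconds.

# counting: count events in kernel context and print a summary on exit.
perf_count() {
    perf stat -a -- sleep "${1:-5}"
}

# capture: sample all CPUs at 99 Hz with call graphs into ./perf.data.
perf_capture() {
    perf record -F 99 -a -g -- sleep "${1:-5}"
}

# reporting: summarize ./perf.data as plain text on stdout.
perf_summary() {
    perf report --stdio
}

# live recording: interactive, continuously refreshed profile.
perf_live() {
    perf top
}
```

Only perf_capture writes perf.data, and perf_summary reads it back, so the file-writing overhead described above applies to that pair.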
One-Liners
Common perf one-liners:
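A few representative invocations, written as helper functions so they can be reused. This is a sketch rather than a complete list: the function names are invented for illustration, and the event set, sample rate and durations are assumptions to adjust.

```shell
# CPU counter statistics for a command, e.g.: stat_command ls -l
stat_command() {
    perf stat -- "$@"
}

# Count selected hardware events system-wide for N seconds (default 10).
stat_events() {
    perf stat -e cycles,instructions,cache-misses -a -- sleep "${1:-10}"
}

# Sample the stacks of one process at 99 Hz for 10 seconds.
# The PID is a required argument.
profile_pid() {
    perf record -F 99 -g -p "$1" -- sleep 10
}

# Show the recorded profile as text, including sample counts.
show_profile() {
    perf report -n --stdio
}
```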
More one-liners appear in the Tracing section, and even more are listed at http://www.brendangregg.com/perf.html.
CPU Flame Graphs
Flame graphs are generated in three steps:
- Capture stacks
- Fold stacks
- Render the folded stacks with flamegraph.pl
Using Linux perf, the following samples stack traces at 99 Hertz for 30 seconds, and then generates a flame graph of all sampled stacks (except those containing “cpu_idle”: the idle threads):
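A sketch of that pipeline, assuming the FlameGraph scripts (stackcollapse-perf.pl and flamegraph.pl) are checked out in the directory named by FLAMEGRAPH_DIR. The variable, the function names and the intermediate file out.folded are choices made for this example.

```shell
# Directory holding stackcollapse-perf.pl and flamegraph.pl.
# The default location is an assumption; point it at your own checkout.
FLAMEGRAPH_DIR="${FLAMEGRAPH_DIR:-./FlameGraph}"

# Drop folded stacks that contain cpu_idle (the idle threads).
drop_idle() {
    grep -v cpu_idle
}

make_flamegraph() {
    # 1. Capture stacks: all CPUs, 99 Hz, 30 seconds, written to perf.data.
    perf record -F 99 -a -g -- sleep 30 || return 1
    # 2. Fold stacks: one line per unique stack, followed by its sample count.
    perf script | "$FLAMEGRAPH_DIR/stackcollapse-perf.pl" | drop_idle > out.folded || return 1
    # 3. Render the folded stacks as an SVG.
    "$FLAMEGRAPH_DIR/flamegraph.pl" out.folded > out.svg
}
```

Keeping out.folded on disk means the graph can be re-rendered with different options without capturing again.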
The “out.svg” file can then be loaded in a web browser.
Broken Stacks
Broken/incomplete stack traces are a common problem with profilers. perf has multiple ways to walk (fetch) a stack. The easiest to get working is usually frame-pointer based walking. Enabling this for different languages:
- C: gcc’s -fno-omit-frame-pointer option
- Java: -XX:+PreserveFramePointer
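The options above might be applied as follows. This is a hedged sketch: the helper names are hypothetical, the -O2 -g flags are an illustrative choice, and --call-graph fp simply asks perf record for frame-pointer unwinding explicitly.

```shell
# Compile C code with frame pointers kept; extra arguments go to gcc.
build_with_fp() {
    gcc -O2 -g -fno-omit-frame-pointer "$@"
}

# Start a JVM with frame pointers preserved; extra arguments go to java.
run_java_with_fp() {
    java -XX:+PreserveFramePointer "$@"
}

# Profile a command at 99 Hz using frame-pointer based stack walking.
record_with_fp() {
    perf record -F 99 --call-graph fp -- "$@"
}

# Example: build_with_fp -o myapp myapp.c && record_with_fp ./myapp
```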
Missing Symbols
Missing symbols are a common problem when profiling JIT runtimes. perf supports supplemental symbol tables in /tmp/perf-PID.map. Enabling this map for different languages:
- Java: https://github.com/jrudolph/perf-map-agent
- Node.js: --perf_basic_prof_only_functions
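The map is a plain-text file in which each line holds a start address and a size (both in hex) followed by the symbol name. The helpers below are an illustrative sketch for checking that a runtime is actually writing one; their names are made up for this example.

```shell
# Path where perf looks for the supplemental symbol map of a given PID.
perf_map_path() {
    printf '/tmp/perf-%s.map\n' "$1"
}

# Succeeds if a non-empty map file exists for that PID.
has_perf_map() {
    [ -s "$(perf_map_path "$1")" ]
}

# Start Node.js so that it writes map entries for JavaScript functions;
# extra arguments (script name, options) are passed through to node.
run_node_with_map() {
    node --perf_basic_prof_only_functions "$@"
}
```

If has_perf_map fails for a running JIT process, its frames will most likely show up as raw addresses in the profile.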
Customizations
See flamegraph.pl --help. A common customization is to use an alternate palette scheme: e.g., “--color java” for Java profiles.
Tracing
Example static tracing one-liners:
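A few illustrative examples using kernel tracepoints, wrapped as helper functions. The function names are hypothetical and the tracepoints chosen (sched:sched_switch, block:block_rq_issue, syscalls:sys_enter_openat) are just common ones; which tracepoints exist depends on the kernel, and most of these need root.

```shell
# Count context switches system-wide for N seconds (default 5).
count_sched_switch() {
    perf stat -e sched:sched_switch -a -- sleep "${1:-5}"
}

# Record block I/O issue events with stack traces for N seconds.
trace_block_issue() {
    perf record -e block:block_rq_issue -a -g -- sleep "${1:-5}"
}

# Count openat() syscall entries made by one command,
# e.g.: count_openat ls /tmp
count_openat() {
    perf stat -e syscalls:sys_enter_openat -- "$@"
}
```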
Example dynamic tracing one-liners:
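Dynamic tracing with perf typically follows an add, record, delete cycle around perf probe. The sketch below assumes a kernel function name is passed in (tcp_sendmsg in the example comment is only an illustration); the helper names are invented, and all three steps need root.

```shell
# Add a dynamic probe at the entry of a kernel function.
add_kprobe() {
    perf probe --add "$1"
}

# Record hits of that probe system-wide, with stacks, for N seconds.
# $1 is the probe name, $2 the duration (default 5).
trace_kprobe() {
    perf record -e "probe:$1" -a -g -- sleep "${2:-5}"
}

# Remove the probe when finished.
del_kprobe() {
    perf probe --del "$1"
}

# Example:
# add_kprobe tcp_sendmsg && trace_kprobe tcp_sendmsg 5; del_kprobe tcp_sendmsg
```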
References