Gprof

Gprof is a performance analysis tool for Unix applications. It uses a hybrid of instrumentation and sampling and was created as an extended version of the older "prof" tool. Unlike prof, gprof can collect and print a limited call graph.

History
GPROF was originally written by a group led by Susan L. Graham at the University of California, Berkeley, for Berkeley Unix (4.2BSD). Another implementation was written as part of the GNU project for GNU Binutils in 1988 by Jay Fenlason.

Implementation
Instrumentation code is automatically inserted into the program code during compilation (for example, by using the ' ' option of the gcc compiler) to gather caller-function data. A call to the monitor function 'mcount' is inserted at the entry of each instrumented function, so that every call records which function made it.
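The kind of caller-function data gathered this way can be illustrated in miniature. The following Python sketch is only an analogy, not how mcount is implemented: the decorator `instrumented`, the table `arc_counts`, and the functions `main`, `parse` and `helper` are hypothetical names, and the real mcount works on return addresses in compiled code rather than on function names.

```python
import sys
from collections import Counter
from functools import wraps

# Hypothetical arc table: (caller name, callee name) -> number of calls.
arc_counts = Counter()

def instrumented(func):
    """Wrap func so that every call records one caller -> callee arc."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        # Frame 0 is this wrapper; frame 1 is the code that made the call.
        caller = sys._getframe(1).f_code.co_name
        arc_counts[(caller, func.__name__)] += 1
        return func(*args, **kwargs)
    return wrapper

@instrumented
def helper():
    return 1

@instrumented
def parse():
    return helper() + helper()

@instrumented
def main():
    return parse() + helper()

if __name__ == "__main__":
    main()
    for (caller, callee), count in sorted(arc_counts.items()):
        print(f"{caller} -> {callee}: {count}")
```

Only arc counts are kept here; timing is a separate matter and, in gprof, comes from sampling rather than from the inserted calls.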

Sampling data is saved in a 'gmon.out' or 'progname.gmon' file just before the program exits, and can be analyzed with the ' ' command-line tool. Several gmon files can be combined with ' ' to accumulate data from several runs of a program.

GPROF output consists of two parts: the flat profile and the call graph. The flat profile gives the total execution time spent in each function and its percentage of the total running time. Function call counts are also reported. Output is sorted by percentage, with hot spots at the top of the list.
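How a flat profile follows from raw data can be sketched as below. This is a simplified illustration, not gprof's actual code: the function `flat_profile`, its parameters, and the sample data are assumptions made for the example, and the default sampling period of 0.01 second is just one typical value.

```python
def flat_profile(samples, calls, period=0.01):
    """Return rows (percent, seconds, calls, name), hottest function first.

    samples: hypothetical map of function name -> number of PC samples.
    calls:   hypothetical map of function name -> call count.
    period:  assumed sampling period in seconds.
    """
    total = sum(samples.values())
    rows = []
    for name, n in samples.items():
        percent = 100.0 * n / total if total else 0.0
        rows.append((percent, n * period, calls.get(name, 0), name))
    # Sort by percentage of total time, largest first.
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows

if __name__ == "__main__":
    # Made-up data for illustration only.
    samples = {"read_input": 20, "compress": 70, "main": 10}
    calls = {"read_input": 1, "compress": 4, "main": 1}
    for percent, seconds, count, name in flat_profile(samples, calls):
        print(f"{percent:6.2f}%  {seconds:6.2f}s  {count:5d}  {name}")
```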

The second part of the output is the textual call graph, which shows for each function which functions called it (its parents) and which functions it called (its children). An external tool called gprof2dot can convert the call graph from gprof into graphical form.
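The parent and child listing can be derived from caller-to-callee arc counts alone. The sketch below shows one way to do so; `call_graph_entries` and the arc data are hypothetical and do not reflect gprof's internal data structures or its output format.

```python
from collections import defaultdict

def call_graph_entries(arcs):
    """Map each function to its parents and children, given arc counts.

    arcs: hypothetical map of (caller, callee) -> number of calls.
    """
    entries = defaultdict(lambda: {"parents": {}, "children": {}})
    for (caller, callee), count in arcs.items():
        entries[callee]["parents"][caller] = count
        entries[caller]["children"][callee] = count
    return dict(entries)

if __name__ == "__main__":
    # Made-up arcs for illustration only.
    arcs = {("main", "parse"): 1, ("parse", "helper"): 2, ("main", "helper"): 1}
    for name, entry in sorted(call_graph_entries(arcs).items()):
        print(name, "parents:", entry["parents"], "children:", entry["children"])
```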

Limitations and accuracy
At run time, timing values are obtained by statistical sampling. Sampling is done by probing the target program's program counter at regular intervals using operating system interrupts (programmed via the profil(2) or setitimer(2) system calls). The resulting data is not exact but a statistical approximation. The amount of error is usually more than one sampling period. If a value is n times the sampling period, the expected error in the value is the square root of n sampling periods. A typical sampling period is 0.01 second (10 milliseconds) or 0.001 second (1 millisecond), that is, 100 or 1000 samples per second of CPU running time.
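The square-root rule can be worked through numerically. The helper below is a small sketch of that arithmetic; the name `expected_sampling_error` and its return layout are chosen for this example only.

```python
import math

def expected_sampling_error(measured_seconds, period=0.01):
    """Return (n, error_seconds, relative_error) for a sampled time value.

    n is the value expressed in sampling periods; the expected error is
    sqrt(n) sampling periods, converted back to seconds here.
    """
    if measured_seconds <= 0 or period <= 0:
        raise ValueError("measured_seconds and period must be positive")
    n = measured_seconds / period
    error_seconds = math.sqrt(n) * period
    relative_error = error_seconds / measured_seconds
    return n, error_seconds, relative_error

if __name__ == "__main__":
    for seconds in (0.01, 1.0, 100.0):
        n, err, rel = expected_sampling_error(seconds)
        print(f"{seconds:7.2f}s  n={n:8.0f}  error={err:.3f}s  ({rel:.0%})")
```

With a 0.01 second period, a function reported at 1 second corresponds to n = 100 samples, so the expected error is about 10 samples, or 0.1 second (10%). At 100 seconds the expected error is about 1 second (1%), while a value of a single sampling period has an expected error as large as the value itself. Relative accuracy therefore improves with longer runs.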

In some versions, such as BSD, profiling of shared libraries can be limited because of restrictions of the profil function, which may be implemented as a library function or as a system call. An analogous utility in glibc, called 'sprof', was provided to profile dynamic libraries.

Gprof cannot measure time spent in kernel mode (system calls, waiting for the CPU, or waiting for I/O); only user-space code is profiled.

The mcount function may not be thread-safe in some implementations, so profiles of multi-threaded applications can be incorrect (typically, only the main thread of the application is profiled).

Instrumentation overhead can be high (estimated at 30% to 260%) for higher-order or object-oriented programs. Mutual recursion and non-trivial cycles are not resolvable by the gprof approach (a context-insensitive call graph), because it records only arc traversals, not full call chains.
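Why arc traversal counts lose information can be seen with a small constructed example. In the sketch below, two hypothetical runs follow different call chains yet collapse to identical arc counts, so a profile built from arcs alone cannot tell whether the slow work was reached through `f` or through `g`. All function names and chains are made up, and each chain is treated as one independent path from `main` to a leaf.

```python
from collections import Counter

def arcs_from_chains(chains):
    """Collapse full call chains into per-arc traversal counts."""
    arcs = Counter()
    for chain in chains:
        # Each adjacent pair in a chain is one caller -> callee arc.
        for caller, callee in zip(chain, chain[1:]):
            arcs[(caller, callee)] += 1
    return arcs

# Two hypothetical runs: "slow" is reached via f in run A, via g in run B.
run_a = [("main", "f", "shared", "slow"), ("main", "g", "shared", "fast")]
run_b = [("main", "f", "shared", "fast"), ("main", "g", "shared", "slow")]

if __name__ == "__main__":
    print(arcs_from_chains(run_a) == arcs_from_chains(run_b))
```

Because both runs yield the same arc table, the time spent in `shared` and its children can only be apportioned between `f` and `g` by call counts, which here would split it evenly in both cases.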

Gprof with call-graph collection can be used only with compatible compilers, such as GCC, clang/LLVM and some others.

Reception
In 2004, a GPROF paper appeared on the list of the 50 most influential PLDI papers of all time, as one of four papers from 1982.

According to Thiel, "GPROF ... revolutionized the performance analysis field and quickly became the tool of choice for developers around the world ... the tool still maintains a large following ... the tool is still actively maintained and remains relevant in the modern world."