How to Debug the Execution of a Program in Linux

strace is a useful diagnostic, instructional, and debugging tool. System administrators, diagnosticians and trouble-shooters will find it invaluable for solving problems with programs for which the source is not readily available since they do not need to be recompiled in order to trace them. Students, hackers and the overly-curious will find that a great deal can be learned about a system and its system calls by tracing even ordinary programs. And programmers will find that since system calls and signals are events that happen at the user/kernel interface, a close examination of this boundary is very useful for bug isolation, sanity checking and attempting to capture race conditions.

Trace the Execution

You can use strace command to trace the execution of any executable. The following example shows the output of strace for the Linux uname command.

Counting number of syscalls

Run the ls command counting the number of times each system call was made and print totals showing the number and time spent in each call (useful for basic profiling or bottleneck isolation):

Save the Trace Execution to a File Using Option -o

The following examples stores the strace output to output.txt file.

Print Timestamp for Each Trace Output Line Using Option -t

To print the timestamp for each strace output line, use the option -t as shown below.

Tracing only network related system calls

Trace just the network related system calls of ping command

 

Viewing files opened by a process/daemon using tracefile

tracefile: Output from cmd on stdout can mess up output from strace.

Notes

It is a pity that so much tracing clutter is produced by systems employing shared libraries.

It is instructive to think about system call inputs and outputs as data-flow across the user/kernel boundary. Because user-space and kernel-space are separate and address-protected, it is sometimes possible to make deductive inferences about process behavior using inputs and outputs as propositions.

In some cases, a system call will differ from the documented behavior or have a different name. For example, on System V-derived systems the true time(2) system call does not take an argument and the stat function is called xstat and takes an extra leading argument. These discrepancies are normal but idiosyncratic characteristics of the system call interface and are accounted for by C library wrapper functions.

On some platforms a process that has a system call trace applied to it with the -p option will receive a SIGSTOP . This signal may interrupt a system call that is not restartable. This may have an unpredictable effect on the process if the process takes no action to restart the system call.

 

Leave a Reply

Your email address will not be published. Required fields are marked *