TrumanWong

strace

Tracing system calls and signals

Supplementary instructions

strace command is a tool that integrates diagnosis, debugging, and statistics. We can use strace to trace the system calls and signal transmission of the application to analyze the application in order to solve the problem or understand the application working process. the goal of. Of course, strace is incomparable with professional debugging tools such as gdb, because it is not a professional debugger.

The simplest use of strace is to execute a specified command, and it will exit after the specified command ends. During the execution of the command, strace will record and parse all system calls of the command process and all signal values received by the process.

grammar

strace [ -dffhiqrtttTvxx ] [ -acolumn ] [ -eexpr ] ...
     [ -ofile ] [-ppid ] ... [ -sstrsize ] [ -uusername ]
     [ -Evar=val ] ... [ -Evar ]...
      [command [ arg ... ] ]

strace -c [ -eexpr ] ... [ -Ooverhead ] [ -Ssortby ]
     [ command [ arg... ] ]

Options

-c counts the execution time, number of times, and number of errors for each system call.
-d Output strace debugging information on standard error.
-f tracks child processes spawned by fork calls.
-ff If -o filename is provided, the tracking results of all processes are output to the corresponding filename.pid, where pid is the process number of each process.
-F Attempts to trace vfork calls. With -f, vfork is not traced.
-h outputs brief help information.
-i Output the entry pointer of the system call.
-q Suppresses the output of messages about detachments.
-r prints the relative time for each system call.
-t Prepend each line in the output with time information.
-tt Prepends time information in microseconds before each line in the output.
-ttt microsecond level output, expressing time in seconds.
-T displays the time spent on each call.
-v outputs all system calls. Some calls regarding environment variables, status, input and output, etc. are not output by default due to frequent use.
-V outputs strace version information.
-x output non-standard strings in hexadecimal form
-xx All strings are output in hexadecimal form.
-a column sets the output position of the return value. The default is 40.
-e expr specifies an expression to control how to track. Format: [qualifier=][!]value1[,value2]...
The qualifier can only be one of trace, abbrev, verbose, raw, signal, read, and write. Value is a symbol or number used to qualify. The default qualifier is trace. The exclamation mark is a negative symbol. For example: -eopen is equivalent to - e trace=open means that only open calls are traced. And -etrace!=open means that other calls besides open are traced. There are two special symbols all and none. Note that some shells use! to execute commands in the history, so need to use\\.
-e trace=set only traces the specified system calls. For example: -e trace=open, close, rean, write means that only these four system calls are traced. The default is set=all.
-e trace=file Only trace system calls related to file operations.
-e trace=process Only trace system calls related to process control.
-e trace=network Trace all system calls related to the network.
-e strace=signal traces all system calls related to system signals
-e trace=ipc traces all system calls related to process communication
-e abbrev=set sets the result set of the system call output by strace. -v is equivalent to abbrev=none. The default is abbrev=all.
-e raw=set Display the parameters of the specified system call in hexadecimal.
-e signal=set specifies the system signals to be tracked. The default is all. For example, signal=!SIGIO (or signal=!io) means that the SIGIO signal is not tracked.
-e read=set Output the data read from the specified file. For example: -e read=3,5
-e write=set Output the data written to the specified file.
-o filename writes strace output to file filename
-p pid traces the specified process pid.
-s strsize specifies the maximum length of the output string. The default is 32. The entire file name is always output.
-u username executes the tracked command with the UID and GID of username

Example

Tracking System Calls

Now we make a very simple program to demonstrate the basic usage of strace. The C language code of this program is as follows:

# filename test.c
#include <stdio.h>

int main()
{
     int a;
     scanf("%d", &a);
     printf("%09d\n", a);
     return 0;
}

Then we compile it with gcc -o test test.c and get an executable file test. Then use strace to call execution:

strace ./test

During execution, you will be asked to enter an integer. We entered 99 and finally got the following result:

// Directly execute the result of test
oracle@orainst[orcl]:~ $./test

//Execution result
99
000000099

//The result of executing test through strace
oracle@orainst[orcl]:~ $strace ./test

// trace results of strace
execve("./test", ["./test"], [/* 41 vars */]) = 0
uname({sys="Linux", node="orainst.desktop.mycompany.com", ...}) = 0
brk(0) = 0x8078000
fstat64(3, {st_mode=S_IFREG|0644, st_size=65900, ...}) = 0
old_mmap(NULL, 65900, PROT_READ, MAP_PRIVATE, 3, 0) = 0xbf5ef000
close(3) = 0
open("/lib/tls/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\200X\ 1"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1571692, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xbf5ee000
old_mmap(NULL, 1275340, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xa02000
old_mmap(0xb34000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x132000) = 0xb34000
old_mmap(0xb37000, 9676, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb37000
close(3) = 0
set_thread_area({entry_number:-1 -> 6, base_addr:0xbf5ee740, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
munmap(0xbf5ef000, 65900) = 0
fstat64(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xbf5ff000
read(0, 99
"99\n", 1024) = 3
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xbf5fe000
write(1, "000000099\n", 10000000099
) = 10
munmap(0xbf5fe000, 4096) = 0
exit_group(0) = ?

As can be seen from the trace structure, the system first calls execve to start a new process, then performs some environment initialization operations, and finally stops at "read(0,"), which means that our scanf function has been executed and is waiting for us to input numbers. Well, after inputting 99, call the write function to output the formatted value "000000099" to the screen, and finally call exit_group to exit, completing the execution of the entire program.

Track Signal Delivery

We still use the test program above to observe the process receiving signals. Or run strace ./test first, don’t enter anything while waiting for the input screen, then open another window and enter the following command

killall test

At this time you can see that our program has been launched, and the final trace results are as follows:

oracle@orainst[orcl]:~
$strace ./test

execve("./test", ["./test"], [/* 41 vars */]) = 0
uname({sys="Linux", node="orainst.desktop.mycompany.com", ...}) = 0
brk(0) = 0x9ae2000
old_mmap(NULL, 65900, PROT_READ, MAP_PRIVATE, 3, 0) = 0xbf5ef000
close(3) = 0
open("/lib/tls/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\200X\ 1"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1571692, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xbf5ee000
old_mmap(NULL, 1275340, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x2e9000
old_mmap(0x41b000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x132000) = 0x41b000
old_mmap(0x41e000, 9676, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x41e000
close(3) = 0
set_thread_area({entry_number:-1 -> 6, base_addr:0xbf5ee740, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
munmap(0xbf5ef000, 65900) = 0
fstat64(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xbf5ff000
read(0, 0xbf5ff000, 1024) = ? ERESTARTSYS (To be restarted)
--- SIGTERM (Terminated) @ 0 (0) ---
+++ killed by SIGTERM +++

The trace clearly tells you that the test process was "+++ killed by SIGTERM +++".

System call statistics

Strace can not only track system calls. By using the -c parameter, it can also perform a statistical analysis of all system calls of the process for you. Let's take a look at the statistics of strace. This time we execute strace with the -c parameter:

strace -c ./test

Finally, you can get the trace result like this:

oracle@orainst[orcl]:~
$strace -c ./test
execve("./test", ["./test"], [/* 41 vars */]) = 0
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ---- ----------
  45.90 0.000140 5 27 25 open
  34.43 0.000105 4 24 21 stat64
   7.54 0.000023 5 5 old_mmap
   2.62 0.000008 8 1 munmap
   1.97 0.000006 6 1 uname
   1.97 0.000006 2 3 fstat64
   1.64 0.000005 3 2 1 read
   1.31 0.000004 2 2 close
   0.98 0.000003 3 1 brk
   0.98 0.000003 3 1 mmap2
   0.66 0.000002 2 1 set_thread_area
------ ----------- ----------- --------- --------- ---- ----------
100.00 0.000305 68 47 total

It clearly tells you which system functions were called, how many times they were called, how much time was consumed, and other information. This is very useful for us to analyze a program.

Description of common parameters

In addition to the -c parameter, strace also provides us with other useful parameters, so that we can easily get the information we want. The following is an introduction to those commonly used parameters one by one.

Redirect Output

The parameter -o is used to output the strace results to a file. If the -o parameter is not specified, the default output device is STDERR, which means that the results of using "-o filename" and "2>filename" are the same.

# Both commands output strace results to the file test.txt
strace -c -o test.txt ./test
strace -c ./test 2>test.txt

Timing system calls

strace can use the parameter -T to print out the time spent on each system call. The time spent on each call is now in the angle brackets on the rightmost side of the call line.

oracle@orainst[orcl]:~
$strace -T ./test

// Only part of the results are excerpted here
read(0, 1
"1\n", 1024) = 2 <2.673455>
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0 <0.000014>
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xbf5fe000 <0.000017>
write(1, "000000001\n", 10000000001
) = 10 <0.000016>
munmap(0xbf5fe000, 4096) = 0 <0.000020>
exit_group(0) = ?

System call time

This is a very useful function. strace will record the occurrence time of each system call. Just use the three parameters -t/tt/ttt to see the effect. You can try the specific example yourself.

Parameter name Output style Description
-t 10:33:04 exit_group(0) Output results accurate to seconds
-tt 10:33:48.159682 exit_group(0) Output results are accurate to the subtle
-ttt 1262169244.788478 exit_group(0) Accurate to microseconds, and time expressed as unix timestamp

Truncate output

The -s parameter is used to specify the length of the string output in each line of the trace result. Let's take a look at the impact of the -s parameter in the test program on the results. Now specify -s as 20, and then enter a Numeric string longer than 20 characters

strace -s 20 ./test

read(0, 22222222222222222222222222 // We entered a total of 25 2s
"22222222222222222222"..., 1024) = 26 // And there are only 20 results of 2 that we see

trace an existing process

strace can not only initialize a process by itself for trace, but can also track existing processes. The parameter -p serves this purpose, and its usage is also very simple, as detailed below.

strace -p pid

Comprehensive example

Having said so many functions and parameters, now let’s do something more practical, which is to study Oracle’s lgwr process to see if this process writes a log file every 3 seconds as the document says. Considering that lgwr writes logs There are many trigger conditions. We need to find an idle Oracle instance to do this experiment.

We first need to get the pid of the lgwr process and run the following command

ps -ef|grep lgwr

oracle 5912 1 0 Nov12 ? 00:14:56 ora_lgwr_orcl

Get the pid of lgwr is 5912. Now start strace, then output several traces to the lgwr.txt file, and execute the following command

strace -tt -s 10 -o lgwr.txt -p 5912

Stop strace after a while and check the results. Since there are many output results, for convenience we only look at the pwrite function calls used by Oracle when writing log files.

grep pwrite\(20 lgwr.txt

Wait, why do you use "pwrite(2" when greping? Because I know that the handle numbers of the current log files opened by my machine all start with 2. The specific search method is to use the following statement to find out What are the currently active log files:

select member, v$log.status from v$log, v$logfile
where v$log.group#=v$logfile.group#;

get

MEMBER STATUS
-------------------------------------------------- ----------------
/db/databases/orcl/redo-01-a/redo-t01-g03-m1.log INACTIVE
/db/databases/orcl/redo-03-a/redo-t01-g03-m2.log INACTIVE
/db/databases/orcl/redo-02-a/redo-t01-g02-m1.log CURRENT
/db/databases/orcl/redo-04-a/redo-t01-g02-m2.log CURRENT
/db/databases/orcl/redo-01-a/redo-t01-g01-m1.log INACTIVE
/db/databases/orcl/redo-03-a/redo-t01-g01-m2.log INACTIVE
/db/databases/orcl/redo-02-a/redo-t01-g04-m1.log INACTIVE
/db/databases/orcl/redo-04-a/redo-t01-g04-m2.log INACTIVE

Then go to /proc to find the handle to the open file:

ll /proc/.5912/fd/

get

lrwx------ 1 oracle dba 64 Dec 30 10:55 18 -> /db/databases/orcl/redo-01-a/redo-t01-g01-m1.log
lrwx------ 1 oracle dba 64 Dec 30 10:55 19 -> /db/databases/orcl/redo-03-a/redo-t01-g01-m2.log
lrwx------ 1 oracle dba 64 Dec 30 10:55 20 -> /db/databases/orcl/redo-02-a/redo-t01-g02-m1.log
lrwx------ 1 oracle dba 64 Dec 30 10:55 21 -> /db/databases/orcl/redo-04-a/redo-t01-g02-m2.log
lrwx------ 1 oracle dba 64 Dec 30 10:55 22 -> /db/databases/orcl/redo-01-a/redo-t01-g03-m1.log
lrwx------ 1 oracle dba 64 Dec 30 10:55 23 -> /db/databases/orcl/redo-03-a/redo-t01-g03-m2.log
lrwx------ 1 oracle dba 64 Dec 30 10:55 24 -> /db/databases/orcl/redo-02-a/redo-t01-g04-m1.log
lrwx------ 1 oracle dba 64 Dec 30 10:55 25 -> /db/databases/orcl/redo-04-a/redo-t01-g04-m2.log

Now I can see that the handles of the current log files on my machine are 20 and 21 respectively.

Now we get the following result

11:13:55.603245 pwrite(20, "\1\"\0\0J!"..., 1536, 4363264) = 1536
11:13:55.603569 pwrite(21, "\1\"\0\0J!"..., 1536, 4363264) = 1536
11:13:55.606888 pwrite(20, "\1\"\0\0M!"..., 1536, 4364800) = 1536
11:13:55.607172 pwrite(21, "\1\"\0\0M!"..., 1536, 4364800) = 1536
11:13:55.607934 pwrite(20, "\1\"\0\0P!"..., 1536, 4366336) = 1536
11:13:55.608199 pwrite(21, "\1\"\0\0P!"..., 1536, 4366336) = 1536
11:13:55.610260 pwrite(20, "\1\"\0\0S!"..., 1536, 4367872) = 1536
11:13:55.610530 pwrite(21, "\1\"\0\0S!"..., 1536, 4367872) = 1536
11:14:00.602446 pwrite(20, "\1\"\0\0V!"..., 1536, 4369408) = 1536
11:14:00.602750 pwrite(21, "\1\"\0\0V!"..., 1536, 4369408) = 1536
11:14:00.606386 pwrite(20, "\1\"\0\0Y!"..., 1536, 4370944) = 1536
11:14:00.606676 pwrite(21, "\1\"\0\0Y!"..., 1536, 4370944) = 1536
11:14:00.607900 pwrite(20, "\1\"\0\0\\"..., 1024, 4372480) = 1024
11:14:00.608161 pwrite(21, "\1\"\0\0\\"..., 1024, 4372480) = 1024
11:14:00.608816 pwrite(20, "\1\"\0\0^!"..., 1024, 4373504) = 1024
11:14:00.609071 pwrite(21, "\1\"\0\0^!"..., 1024, 4373504) = 1024
11:14:00.611142 pwrite(20, "\1\"\0\0`!"..., 1536, 4374528) = 1536
11:14:00.611454 pwrite(21, "\1\"\0\0`!"..., 1536, 4374528) = 1536
11:14:05.602804 pwrite(20, "\1\"\0\0c!"..., 1024, 4376064) = 1024
11:14:05.603119 pwrite(21, "\1\"\0\0c!"..., 1024, 4376064) = 1024
11:14:05.607731 pwrite(20, "\1\"\0\0e!"..., 1024, 4377088) = 1024
11:14:05.608020 pwrite(21, "\1\"\0\0e!"..., 1024, 4377088) = 1024
11:14:05.608690 pwrite(20, "\1\"\0\0g!"..., 1024, 4378112) = 1024
11:14:05.608962 pwrite(21, "\1\"\0\0g!"..., 1024, 4378112) = 1024
11:14:05.611022 pwrite(20, "\1\"\0\0i!"..., 1536, 4379136) = 1536
11:14:05.611283 pwrite(21, "\1\"\0\0i!"..., 1536, 4379136) = 1536