For example: tuning to improve memory utilization may degrade file system performance; choosing RAID disk configurations for data integrity may be less expensive than alternative mirroring solutions that often improve performance. It may be more cost-effective to purchase a CPU upgrade than to spend days or weeks analyzing how the application could be changed to improve performance.
Option -s: Prints sar data for the interval in which the program ran.
Option -p: List process accounting records for command and all its children. This option works only if the process accounting software is installed and /usr/lib/acct/turnacct has been invoked to create /var/adm/pacct.
interval: Display successive lines which are summaries of the last interval seconds. The first line reported is for the time since a reboot and each subsequent line is for the last interval only.
count: Repeat the statistics count times.
-t: Report terminal statistics as well as disk statistics.
Terminal character processes (MUX- and LAN-based)
Compute-intensive processes and real-time processes
X-terminal and X-server processes
Analyzing CPU utilization with SAR
The command forms for analyzing CPU utilization with SAR:
#sar -u, which reports data collected periodically in the background by sa1;
#sar -u 5 100, which samples every 5 seconds, 100 times in total;
SAR -u: Report CPU utilization (the default); portion of time running in one of several modes. On a multi-processor system, if the -M option is used together with the -u option, per-CPU utilization as well as the average CPU utilization of all the processors are reported. If the -M option is not used, only the average CPU utilization of all the processors is reported:
cpu: cpu number (only on a multi-processor system with the -M option);
%usr: user mode;
%sys: system mode;
%wio: idle with some process waiting for I/O (only block I/O, raw I/O, or VM pageins/swapins indicated);
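As a rough illustration of how these columns can be read programmatically, the following Python sketch splits one sar -u data line into named fields and adds user and system time together. The sample line and the helper names parse_sar_u and cpu_busy are hypothetical, and the layout (a timestamp followed by %usr, %sys, %wio and a final idle column) is an assumption about the output produced without -M.

```python
# Illustrative sample line, not real measurements. Assumed layout:
# time  %usr  %sys  %wio  %idle
SAMPLE_LINE = "10:00:05      12       5      20      63"


def parse_sar_u(line):
    """Split one 'sar -u' data line into a dict of named fields."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    names = ("%usr", "%sys", "%wio", "%idle")
    record = {"time": fields[0]}
    record.update({n: float(v) for n, v in zip(names, fields[1:])})
    return record


def cpu_busy(record):
    """Time spent doing work: user mode plus system mode."""
    return record["%usr"] + record["%sys"]


if __name__ == "__main__":
    rec = parse_sar_u(SAMPLE_LINE)
    print(rec["time"], "busy:", cpu_busy(rec), "waiting on I/O:", rec["%wio"])
```

Keeping %wio separate from the busy figure matters because a high %wio points at the I/O subsystem rather than at the CPU itself.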
SAR -q: Report average queue length while occupied, and percent of time occupied. On a multi-processor machine, if the -M option is used together with the -q option, the per-CPU run queue as well as the average run queue of all the processors are reported. If the -M option is not used, only the average run queue information of all the processors is reported:
cpu: cpu number (only on a multi-processor system with the -M option);
runq-sz: Average length of the run queue(s) of processes (in memory and runnable);
%runocc: The percentage of time the run queue(s) were occupied by processes (in memory and runnable);
swpq-sz: Average length of the swap queue of runnable processes (processes swapped out but ready to run);
%swpocc: The percentage of time the swap queue of runnable processes (processes swapped out but ready to run) was occupied.
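Use the sar -w command to analyze process deactivation/reactivation and switching activity on the system;
GlancePlus can also be used.
Analyzing system calls with SAR
The command forms for analyzing system calls with SAR:
#sar -c, which reports data collected periodically in the background by sa1;
#sar -c 5 100, which samples every 5 seconds, 100 times in total;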
SAR -c: Report system calls:
scall/s: Number of system calls of all types per second;
sread/s: Number of read() and/or readv() system calls per second;
swrit/s: Number of write() and/or writev() system calls per second;
fork/s: Number of fork() and/or vfork() system calls per second;
exec/s: Number of exec() system calls per second;
rchar/s: Number of characters transferred by read system calls (block devices only) per second;
wchar/s: Number of characters transferred by write system calls (block devices only) per second.
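Analysis of the results:
If the value in the scall/s column is large, the reason for so many system calls must be analyzed carefully.
We can check the values in the fork/s and exec/s columns to see whether the system is creating a large number of new processes.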
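One way to start such an analysis is to derive a few ratios from a single sar -c sample. The Python sketch below is only an illustration: the function name syscall_summary, the choice of ratios, and the sample numbers are all assumptions, not part of sar itself.

```python
def syscall_summary(scall, sread, swrit, fork, exec_, rchar, wchar):
    """Derive simple ratios from one 'sar -c' sample (all values per second)."""
    return {
        # Share of all system calls that are reads or writes.
        "read_write_share": (sread + swrit) / scall if scall else 0.0,
        # Average characters moved per read or write call.
        "chars_per_read": rchar / sread if sread else 0.0,
        "chars_per_write": wchar / swrit if swrit else 0.0,
        # Process creation relative to total system call volume.
        "forks_per_1000_calls": 1000 * fork / scall if scall else 0.0,
        "execs_per_fork": exec_ / fork if fork else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    summary = syscall_summary(scall=2000, sread=400, swrit=100,
                              fork=2, exec_=1, rchar=819200, wchar=51200)
    for name, value in summary.items():
        print(f"{name}: {value}")
```

A small chars_per_read value alongside a high sread/s would suggest many tiny reads, while a high forks_per_1000_calls value points at process creation as the source of the load.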
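Using the time command to measure the execution efficiency of a command or program
We can use the time command to measure how efficiently a command executes. The syntax is: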
time command
command is executed. Upon completion, time prints the elapsed time during the command, the time spent in the system, and the time spent executing the command. Times are reported in seconds.
Execution time can depend on the performance of the memory in which the program is running.
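The three figures that time reports can also be collected from a script. The following Python sketch is an approximation of that behavior, not a reimplementation of time: the helper name time_command and the sample workload are hypothetical, and the child CPU counters returned by os.times() are only meaningful on Unix-like systems.

```python
import os
import subprocess
import sys
import time


def time_command(argv):
    """Run argv and return (real, user, sys) seconds, similar to time(1)."""
    before = os.times()           # CPU time of finished children so far
    start = time.monotonic()
    subprocess.run(argv, check=False)
    real = time.monotonic() - start
    after = os.times()
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system


if __name__ == "__main__":
    # Hypothetical workload: a short Python loop run as a child process.
    real, user, system = time_command(
        [sys.executable, "-c", "sum(range(10**6))"])
    print(f"real {real:.2f}")
    print(f"user {user:.2f}")
    print(f"sys  {system:.2f}")
```

When real is much larger than user plus sys, the command spent most of its time waiting (for I/O, for example) rather than computing.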
top [-s time] [-d count] [-q] [-u] [-h] [-n number]
The options have the following meanings:
-s time: Set the screen refresh interval to time seconds; the default is 5 seconds;
-d count: Refresh the screen count times, after which top exits by itself;
-q: This option runs the top program at the same priority as if it is executed via a nice -20 command so that it will execute faster (see nice(1)). This can be very useful in discovering any system problem when the system is very sluggish. This option is accessible only to users who have appropriate privileges.
-u: User ID (uid) numbers are displayed instead of usernames. This improves execution speed by eliminating the additional time required to map uid numbers to user names.
-h: Hides the individual CPU state information for systems having multiple processors. Only the average CPU status will be displayed.
-n number: Show only number processes per screen. Note that this option is ignored if number is greater than the maximum number of processes that can be displayed per screen.
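The RES column (the current size of the process resident in memory) shows how much memory each process occupies.
The NICE column shows whether nice values are being used to adjust the workload balance for that process.
Using the uptime command to view the overall state of the system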
uptime prints the current time, the length of time the system has been up, the number of users logged on to the system, and the average number of jobs in the run queue over the last 1, 5, and 15 minutes.
w is linked to uptime and prints the same output as uptime -w, displaying a summary of the current activity on the system.
Its syntax is:
uptime [-hlsuw] [user]
w [-hlsuw] [user]
The options have the following meanings:
-h: Suppress the first line and the heading line. This option should not be used with the -u option. This option assumes the use of the -w option to uptime.
-l: Use long output. This option assumes the use of the -w option to uptime.
-s: Use the short form of output for displaying terminal information. The terminal name is abbreviated; the login time and CPU times are suppressed.
-u: Print only the first line describing the overall state of the system. This is the default for the uptime command.
-w: Print a summary of the current activity on the system for each user. This is the default for the w command.
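The run-queue averages in the first line of uptime output are easy to extract for logging or alerting. The Python sketch below assumes the line contains the text "load average:" followed by three comma-separated numbers; the sample line and the helper names load_averages and is_rising are hypothetical.

```python
import re

# Matches "load average: a, b, c" (some systems print "load averages:").
LOAD_RE = re.compile(r"load averages?:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)")

# Illustrative sample line, not captured from a real system.
SAMPLE = "  3:42pm  up 12 days,  4:05,  3 users,  load average: 1.50, 0.80, 0.40"


def load_averages(uptime_line):
    """Return the (1, 5, 15)-minute run-queue averages as floats."""
    match = LOAD_RE.search(uptime_line)
    if match is None:
        raise ValueError("no load average found in uptime output")
    return tuple(float(value) for value in match.groups())


def is_rising(averages):
    """True when the 1-minute average exceeds the 15-minute average."""
    one_minute, _five_minute, fifteen_minute = averages
    return one_minute > fifteen_minute


if __name__ == "__main__":
    averages = load_averages(SAMPLE)
    print(averages, "rising" if is_rising(averages) else "steady or falling")
```

Comparing the 1-minute figure with the 15-minute figure gives a quick sense of whether the load is building up or draining away.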
Read/write rates per spindle/logical volume
Raw I/O: mainly used by database applications
Swap queue length
Buffer cache hit ratio
NFS and diskless rates (server)
Symptoms that I/O resources have become the system performance bottleneck
When I/O becomes the bottleneck, the following typical symptoms appear:
High disk utilization
Large disk queue length
Large percentage of time waiting for disk I/O
Large physical I/O rate (not sufficient in itself)
Low buffer cache hit ratio (not sufficient in itself)
Large run queue with idle CPU
Which activities are the heavy consumers of I/O resources?
The following activities consume a large amount of I/O resources:
Paging: paging causes not only memory problems but may also cause disk problems;
open, creat, and stat system calls: these system calls cause a large amount of disk I/O;
multiuser I/O and random I/O
relational database
core dumps
Analyzing I/O utilization with iostat
iostat - report I/O statistics
iostat iteratively reports I/O statistics for each active disk on the system.
If two or more disks are present, data is presented on successive lines for each disk.
With the advent of new disk technologies, such as data striping, where a single data transfer is spread across several disks, the number of milliseconds per average seek becomes impossible to compute accurately. At best it is only an approximation, varying greatly, based on several dynamic system conditions. For this reason and to maintain backward compatibility, the milliseconds per average seek ( msps ) field is set to the value 1.0.
Its syntax is:
iostat [-t] [interval [count]]
Its options have the following meanings:
-t: Report terminal statistics as well as disk statistics.
interval: Display successive lines which are summaries of the last interval seconds. The first line reported is for the time since a reboot and each subsequent line is for the last interval only.
count: Repeat the statistics count times.
Analysis of the results:
By looking at the values in the bps and sps columns, we can tell which disks are busy and which are relatively idle.
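When many disks are listed, sorting them by bps makes the busy ones stand out. The Python sketch below assumes each data line has exactly four whitespace-separated fields (device, bps, sps, msps) under a header line beginning with "device"; the sample text, device names, and the helper name busiest_disks are hypothetical.

```python
# Illustrative sample output, not captured from a real system.
SAMPLE = """\
  device    bps     sps    msps
  c0t6d0    105    12.4     1.0
  c0t5d0      3     0.5     1.0
  c1t2d0    540    61.0     1.0
"""


def busiest_disks(iostat_text):
    """Return (device, bps, sps) tuples sorted by bps, busiest first."""
    rows = []
    for line in iostat_text.splitlines():
        parts = line.split()
        # Skip blank lines, the header, and anything with another shape.
        if len(parts) != 4 or parts[0] == "device":
            continue
        device, bps, sps, _msps = parts
        rows.append((device, float(bps), float(sps)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for device, bps, sps in busiest_disks(SAMPLE):
        print(f"{device:10s} bps={bps:8.1f} sps={sps:6.1f}")
```

The msps field is ignored here because, as described above, it is fixed at 1.0 and carries no information.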
Analyzing disk activity with the SAR command
With the command sar -d, we can analyze the activity of each disk and tape in the system.
Report activity for each block device, e.g., disk or tape drive. One line is printed for each device that had activity during the last interval. If no devices were active, a blank line is printed. Each line contains the following data:
device: The device name;
%busy: Portion of time device was busy servicing a request;
avque: Average number of requests outstanding for the device;
r+w/s: Number of data transfers per second (read and writes) from and to the device;
blks/s: Number of bytes transferred (in 512-byte units) from and to the device;
avwait: Average time (in milliseconds) that transfer requests waited idly on queue for the device;
avserv: Average time (in milliseconds) to service each transfer request (includes seek, rotational latency, and data transfer times) for the device.
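Analysis of the results:
If the %busy value for a disk is greater than 50%, the disk may be a bottleneck;
If the avwait value for a disk is greater than its avserv value, that also suggests the disk may be a bottleneck.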
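The two rules of thumb above can be applied mechanically to each sar -d row. The Python sketch below is a minimal illustration: the helper name disk_warnings, the dictionary layout, and the sample rows are hypothetical, while the two checks are the ones just described.

```python
BUSY_THRESHOLD = 50.0  # percent, from the %busy rule of thumb above


def disk_warnings(row):
    """Return the bottleneck warnings that apply to one 'sar -d' row.

    row is a dict with the keys '%busy', 'avwait' and 'avserv'.
    """
    warnings = []
    if row["%busy"] > BUSY_THRESHOLD:
        warnings.append("%busy above 50%")
    if row["avwait"] > row["avserv"]:
        warnings.append("avwait exceeds avserv")
    return warnings


if __name__ == "__main__":
    # Hypothetical rows, for illustration only.
    rows = [
        {"device": "c0t6d0", "%busy": 72.0, "avwait": 18.5, "avserv": 11.2},
        {"device": "c0t5d0", "%busy": 8.0, "avwait": 0.4, "avserv": 9.7},
    ]
    for row in rows:
        found = disk_warnings(row)
        print(row["device"], "; ".join(found) if found else "no warning")
```

A warning from either rule is only a hint that the disk deserves a closer look, not proof of a bottleneck.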
Analyzing buffer activity with the SAR command
With the command sar -b, we can analyze buffer activity in the system.
Report buffer activity. Each line contains the following data:
bread/s: Number of physical reads per second from the disk (or other block devices) to the buffer cache;
bwrit/s: Number of physical writes per second from the buffer cache to the disk (or other block device);
lread/s: Number of reads per second from buffer cache;
lwrit/s: Number of writes per second to buffer cache;
%rcache: Buffer cache hit ratio for read requests e.g., 1 - bread/lread;
%wcache: Buffer cache hit ratio for write requests e.g., 1 - bwrit/lwrit;
pread/s: Number of reads per second from character device using the physio() (raw I/O) mechanism;
pwrit/s: Number of writes per second to character device using the physio() (i.e., raw I/O) mechanism.
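The %rcache and %wcache definitions above (1 - bread/lread and 1 - bwrit/lwrit) can be checked against the raw counters. The Python sketch below expresses the ratio as a percentage; the helper name hit_ratio and the sample values are hypothetical.

```python
def hit_ratio(physical, logical):
    """Buffer cache hit ratio in percent: 100 * (1 - physical / logical).

    Returns None when there were no logical requests in the interval.
    """
    if logical == 0:
        return None
    # Algebraically the same as 100 * (1 - physical / logical).
    return 100.0 * (logical - physical) / logical


if __name__ == "__main__":
    # Hypothetical per-second values, for illustration only.
    sample = {"bread/s": 5, "lread/s": 100, "bwrit/s": 20, "lwrit/s": 80}
    print("%rcache:", hit_ratio(sample["bread/s"], sample["lread/s"]))
    print("%wcache:", hit_ratio(sample["bwrit/s"], sample["lwrit/s"]))
```

With these sample values, 5 physical reads out of 100 logical reads give a read hit ratio of 95 percent, and 20 physical writes out of 80 logical writes give a write hit ratio of 75 percent.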
SAR -w: Report system swapping and switching activity:
swpin/s: Number of process swapins per second;
swpot/s: Number of process swapouts per second;
bswin/s: Number of 512-byte units transferred for swapins per second;
bswot/s: Number of 512-byte units transferred for swapouts per second;
pswch/s: Number of process context switches per second.