Solaris Performance Troubleshooting: Disk (I/O) Performance Issues

by admin

An I/O performance bottleneck can be caused by a disk, or even by an HBA or an HBA driver. The iostat (input/output statistics) command helps us get started with analyzing a disk I/O bottleneck.

A standard iostat output looks like this:

# iostat -xn 1 5
    extended device statistics 
r/s   w/s kr/s    kw/s wait actv wsvc_t asvc_t %w  %b  device 
293.1 0.0 37510.5 0.0  0.0  31.7 0.0     108.3  1 100  c0t0d0
294.0 0.0 37632.9 0.0  0.0  31.9 0.0     108.6  0 100  c0t0d0
293.0 0.0 37504.4 0.0  0.0  31.9 0.0    1032.0  0 100  c0t0d0
294.0 0.0 37631.3 0.0  0.0  31.8 0.0     108.1  1 100  c0t0d0
294.0 0.0 37628.1 0.0  0.0  31.9 0.0     108.6  1 100  c0t0d0

The most commonly used options with iostat are listed below (a combined example follows the list):

-x --> Extended disk statistics. Prints one line per device with a breakdown that includes r/s, w/s, kr/s, kw/s, wait, actv, svc_t, %w, and %b.
-t --> Print terminal I/O statistics.
-n --> Use logical disk names rather than instance names.
-c --> Print the standard system time percentages: us, sy, wt, id.
-z --> Do not print lines whose statistics are all zeros.
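
These options are usually combined. As a quick illustration (a sketch, assuming a standard Solaris 10/11 system), the first command below samples every 2 seconds, five times, using logical device names and hiding idle devices; adding -e also shows the per-device error counters:

# iostat -xnz 2 5
# iostat -xnze 2 5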

The meaning of each column in the iostat output is:

r/s       reads per second
w/s       writes per second
kr/s      kilobytes read per second
kw/s      kilobytes written per second
wait      average number of transactions waiting for service (queue length)
actv      average number of transactions actively being 
          serviced (removed from the queue but not yet completed)
svc_t     average response time of transactions, in milliseconds
%w        percent of time there are transactions waiting for service (queue non-empty)
%b        percent of time the disk is busy (transactions  in progress)
wsvc_t    average service time in the wait queue, in milliseconds
asvc_t    average service time of active transactions, in milliseconds
wt        the I/O wait time is no longer calculated as a percentage
          of CPU time, and this statistic always returns zero.

The first report in the iostat output is a summary since boot, which gives you a rough idea of the average I/O on the server. It is a useful baseline to compare against the values seen at the time of a performance bottleneck. The subsequent reports show current activity; here the interval was 1 second with a count of 5. If you look at the asvc_t column above, you will see a consistently high value. Generally, anything above 30 to 40 ms is considered high, although an occasional short spike (say, 200 ms in a single interval) can safely be ignored; it is the sustained high values that point to a problem.
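
A quick way to spot such devices without scanning the output by eye is to filter on the asvc_t column. The one-liner below is only a sketch: it assumes the 11-column -xn layout shown above (asvc_t is field 8, the device name field 11) and uses the 30 ms rule of thumb; keep in mind that the first report it prints is still the since-boot summary.

# iostat -xnz 5 | awk 'NF == 11 && $8+0 > 30 { print $11, "asvc_t =", $8, "ms" }'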

Check for Disk Failures

Disk failure can also be a major cause, and in many cases the only cause, of a disk I/O bottleneck. To check for disk failures:

# iostat -xne
                            extended device statistics       ---- errors ---
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b s/w h/w trn tot device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0   0   0   0   0 fd0
    1.8    0.5   34.7    2.6  0.0  0.0    0.0   19.3   0   2   0   0   0   0 c1t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0   0   0   0   0 c0t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0   0   0   0   0 geeklab01:vold(pid555)

Check the columns s/w (soft errors), h/w (hard errors), trn (transport errors) and tot (total errors). The meanings of the various errors are:

Soft error : A disk sector fails the CRC check and needs to be re-read
Hard error : The re-read fails the CRC check several times
Transport error : Errors reported by the I/O bus
Total errors : Soft errors + hard errors + transport errors

A large number of any of these errors (especially an increasing hard error count) may indicate that the disk has already failed or is on its way to failing. Another command to check the errors on a disk is:

# iostat -E
sd0       Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: VMware,  Product: VMware Virtual S Revision: 1.0  Serial No:
Size: 10.74GB [10737418240 bytes]
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 9 Predictive Failure Analysis: 0
sd1       Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: VMware,  Product: VMware Virtual S Revision: 1.0  Serial No:
Size: 24.70GB [24696061952 bytes]
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 6 Predictive Failure Analysis: 0
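
To keep an eye on these counters over time, the error columns of iostat -xne can be filtered in the same way as the asvc_t column earlier. Again this is only a sketch, assuming the 15-column -xne layout shown above (s/w, h/w, trn and tot are fields 11 to 14, the device name field 15):

# iostat -xne | awk 'NF == 15 && $14+0 > 0 { print $15, "s/w="$11, "h/w="$12, "trn="$13 }'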

System activity reporter (sar) to check disk I/O

sar (system activity reporter) is another command to check disk I/O. Before you start using sar, enable the sar service with svcadm if it is not already enabled:

# svcadm enable sar
# svcs sar
STATE          STIME    FMRI
online          4:15:08 svc:/system/sar:default
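
For historical reporting with sar, data must also be collected periodically; on Solaris this is normally driven by the sa1/sa2 entries in the sys crontab, which ship commented out. The entries below are the typical defaults and are shown only as an illustration; check your own sys crontab before relying on them:

# crontab -l sys
#0 * * * 0-6 /usr/lib/sa/sa1
#20,40 8-17 * * 1-5 /usr/lib/sa/sa1
#5 18 * * 1-5 /usr/lib/sa/sa2 -s 8:00 -e 18:01 -i 1200 -A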

To check disk I/O statistics using sar, run it with an interval of 2 seconds and a count of 10:

# sar -d 2 10
SunOS geeklab 5.11 11.1 i86pc    12/13/2013

10:11:46   device        %busy   avque   r+w/s  blks/s  avwait  avserv

10:11:48   ata1              0     0.0       0       0     0.0     0.0
           iscsi0            0     0.0       0       0     0.0     0.0
           mpt0              0     0.0       0       0     0.0     0.0
           scsi_vhc          0     0.0       0       0     0.0     0.0
           sd0               0     0.0       0       0     0.0     0.0
           sd0,a             0     0.0       0       0     0.0     0.0
           sd0,b             0     0.0       0       0     0.0     0.0
           sd0,h             0     0.0       0       0     0.0     0.0
           sd0,i             0     0.0       0       0     0.0     0.0
           sd0,q             0     0.0       0       0     0.0     0.0
           sd0,r             0     0.0       0       0     0.0     0.0

The sar -d command reports almost the same data as iostat, except that it shows reads plus writes per second (r+w/s) and the number of 512-byte blocks transferred per second (blks/s). The other important columns are the average wait queue length (avque), the average time spent waiting in the queue (avwait), the average service time (avserv) and the percentage of time the device was busy (%busy).
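
Because the data files written under /var/adm/sa can also be read back, sar is handy for looking at disk activity around the time a slowdown was reported. The command below is a sketch; the file name (sa13, i.e. data for the 13th day of the month) and the 10:00 to 11:00 window are assumptions for illustration:

# sar -d -f /var/adm/sa/sa13 -s 10:00 -e 11:00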

You can also use the top command to get the percentage of I/O wait time. Solaris 11 has the top package installed by default; on Solaris 10 you will have to install a third-party top package.

# top
last pid:  7448;  load avg:  0.01,  0.13,  0.11;  up 0+13:54:41
60 processes: 59 sleeping, 1 on cpu
CPU states: 99.5% idle,  0.0% user,  0.5% kernel,  0.0% iowait,  0.0% swap
Kernel: 187 ctxsw, 1 trap, 516 intr, 421 syscall, 1 flt
Memory: 2048M phys mem, 205M free mem, 1024M total swap, 1024M free swap
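
If top is missing, it can usually be added from a package repository. The commands below are only illustrative and the package names are assumptions; check what your repository actually provides (for example, IPS on Solaris 11 and OpenCSW on Solaris 10):

# pkg install diagnostic/top        (Solaris 11, IPS; package name assumed)
# /opt/csw/bin/pkgutil -i top       (Solaris 10, OpenCSW; package name assumed)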
