Troubleshoot high iowait issue on Linux
Updated: Aug 30
High iowait usually points to an I/O performance problem, most often slow disks or slow NFS. Today we will look at how to check both disk performance and NFS performance.
Check Disk IO Performance on Linux
On Linux, we can use the iostat command to get performance data for disks. If the issue happened in the past, we can use sar to get historical data and analyze what was going on at that time. We can also use a monitoring tool such as telegraf to collect metrics like disk IOPS, disk I/O bytes, and disk time.
Next, we can use iotop to check which processes are generating the workload on our disks. More info about iotop here.
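The metrics named above (IOPS, bytes, disk time) can all be derived from the counters the kernel exposes in /proc/diskstats. The following is a minimal sketch of that arithmetic, not how any particular tool implements it. The function name disk_rates and the snapshot file paths are made up for illustration.

```shell
# disk_rates BEFORE AFTER SECONDS (hypothetical helper): given two saved copies
# of /proc/diskstats taken SECONDS apart, print for each device:
#   name  read_iops  write_iops  read_kB/s  write_kB/s  busy_percent
disk_rates() {
    awk -v secs="$3" '
        # Columns: major minor name reads _ sectors_read _ writes _ sectors_written _ _ ms_doing_io
        # First file: remember the baseline counters per device.
        FNR == NR { r[$3] = $4; rs[$3] = $6; w[$3] = $8; ws[$3] = $10; busy[$3] = $13; next }
        # Second file: report the per-second deltas.
        # Sectors are 512 bytes, so sectors / 2 gives kB.
        # Busy time is in ms, so ms / (secs * 10) gives a percentage.
        $3 in r {
            printf "%s %.1f %.1f %.1f %.1f %.1f\n", $3,
                ($4 - r[$3]) / secs, ($8 - w[$3]) / secs,
                ($6 - rs[$3]) / 2 / secs, ($10 - ws[$3]) / 2 / secs,
                ($13 - busy[$3]) / (secs * 10)
        }
    ' "$1" "$2"
}

# Example use (paths are arbitrary):
#   cat /proc/diskstats > /tmp/ds.1; sleep 5; cat /proc/diskstats > /tmp/ds.2
#   disk_rates /tmp/ds.1 /tmp/ds.2 5
```

For day-to-day use, iostat and sar already report these values; the sketch is only meant to show where the numbers come from.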
10 iostat commands on Linux
Commonly used iostat commands for reporting CPU and I/O statistics are listed below. The most commonly used form is -xk plus an interval. For example, iostat -xk /dev/sda 3 prints performance data for disk sda every 3 seconds until we press Ctrl+C.
iostat: Show the CPU and device report.
iostat -x: Show extended (more detailed) statistics.
iostat -c: Show only the CPU statistics.
iostat -d: Show only the device report.
iostat -xd: Show extended I/O statistics for devices only.
iostat -k: Show statistics in kilobytes per second (use -m for megabytes).
iostat -k 2 3: Show CPU and device statistics every 2 seconds, 3 reports in total.
iostat -j ID mmcblk0 sda6 -x -m 2 2: Show statistics using persistent device names.
iostat -p: Show statistics for block devices and their partitions.
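When a host has many disks, the extended report is easier to read after filtering. Below is a rough sketch that keeps only the devices whose %util is at or above a threshold. The helper name busy_disks and the threshold of 80 are assumptions chosen for illustration. The %util column is located by its header name, since its position differs between sysstat versions.

```shell
# busy_disks THRESHOLD (hypothetical helper): read `iostat -x` output on stdin
# and print "device %util" for every device at or above THRESHOLD.
busy_disks() {
    awk -v limit="$1" '
        # Header row of a device table: remember which column holds %util.
        $1 == "Device" || $1 == "Device:" {
            col = 0
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        # A blank line ends the current table.
        NF == 0 { col = 0; next }
        # Device rows: compare the %util column numerically.
        col && ($col + 0 >= limit + 0) { print $1, $col }
    '
}

# Example use: two reports, 3 seconds apart. LC_ALL=C keeps "." as the decimal mark.
#   LC_ALL=C iostat -xk 3 2 | busy_disks 80
```

Keep in mind that the first report iostat prints covers the time since boot, so the second report is the one that reflects current load.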
Check NFS IO performance issue on Linux
Poor NFS performance can also cause high iowait. nfsiostat is a commonly used command for checking NFS performance. It reports the workload on each NFS mount, such as IOPS, network latency, and kernel latency. More details about this command are here.
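Before running nfsiostat it helps to know which NFS mounts exist on the host. Here is a small sketch, assuming the mounts show up with type nfs or nfs4 in /proc/mounts. The helper name nfs_mounts is made up for this example.

```shell
# nfs_mounts [FILE] (hypothetical helper): print the mount point of every NFS
# filesystem listed in FILE, which defaults to /proc/mounts.
nfs_mounts() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "${1:-/proc/mounts}"
}

# Example use: sample each NFS mount twice, 3 seconds apart.
#   nfs_mounts | while read -r mnt; do nfsiostat 3 2 "$mnt"; done
```

Running nfsiostat per mount point keeps the output short enough to compare mounts side by side.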
High iowaits on specific CPU cores
The I/O workload is not evenly distributed across CPU cores. This is expected behavior of the Linux kernel. When a task running on a CPU needs to perform an I/O operation, the CPU issues a request to the I/O controller, and it is then the controller's responsibility to serve that request. For as long as the controller takes to complete the request, the task stays in the 'D' state and the CPU just waits for I/O (this time is counted as iowait).
On a system with multiple processors, the CPU serving that particular task waits for the I/O and sits idle for that amount of time, while the other processors are assigned to other running tasks. Seeing iowait on particular CPUs is therefore expected behavior of the Linux kernel.
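To see how iowait is spread across cores, we can compare two snapshots of /proc/stat. The sketch below is one possible way to do it, with a hypothetical helper name percpu_iowait and arbitrary snapshot paths. It sums the user through steal columns as the total, because guest time is already included in user time.

```shell
# percpu_iowait BEFORE AFTER (hypothetical helper): given two saved copies of
# /proc/stat, print "cpuN percent" where percent is the share of time that CPU
# spent in iowait between the two snapshots.
percpu_iowait() {
    awk '
        # Per-CPU rows: cpuN user nice system idle iowait irq softirq steal ...
        /^cpu[0-9]+ / {
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            if (FNR == NR) {
                # First file: remember the baseline.
                base_total[$1] = total
                base_wait[$1] = $6
            } else if ($1 in base_total) {
                # Second file: report the change since the baseline.
                dt = total - base_total[$1]
                dw = $6 - base_wait[$1]
                pct = (dt > 0) ? 100 * dw / dt : 0
                printf "%s %.1f\n", $1, pct
            }
        }
    ' "$1" "$2"
}

# Example use (paths are arbitrary):
#   cat /proc/stat > /tmp/stat.1; sleep 5; cat /proc/stat > /tmp/stat.2
#   percpu_iowait /tmp/stat.1 /tmp/stat.2
# Tasks currently in the D state can be listed with:
#   ps -eo state,pid,comm | awk '$1 ~ /^D/'
```

If the sysstat package is installed, mpstat -P ALL 3 shows a similar per-CPU %iowait column without any scripting.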