I have been looking into the iowait value (wa) shown in the output of the top utility, as in the sample below.
top - 07:30:58 up 3:37, 1 user, load average: 0.00, 0.01, 0.05
Tasks: 86 total, 1 running, 85 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.3 sy, 0.0 ni, 99.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
iowait is generally defined as follows:
“It is the time during which CPU is idle and there is some IO pending.”
It is my understanding that a process runs on a single CPU at a time. After it gets de-scheduled, either because it has used up its time slice or because it has blocked, it can eventually be scheduled again on any CPU.
In case of IO request, a CPU that puts a process in uninterruptible sleep is responsible for tracking the iowait time. The other CPUs would be reporting the same time as idle time on their end as they really are idle. Is this assumption correct?
Furthermore, assuming there is a long IO request (meaning the process had several opportunities to get scheduled but didn’t get scheduled because the IO wasn’t complete), how does a CPU know there is “pending IO”? Where is that kind of information fetched from? How can a CPU find out that some process was put to sleep some time ago to wait for an IO to complete, given that any of the CPUs could have put that process to sleep? How is this status of “pending IO” confirmed?
Answers:
Method 1
The CPU doesn’t know any of this, the task scheduler does.
The definition you quote is somewhat misleading; the current procfs(5) manpage has a more accurate definition, with caveats:
iowait (since Linux 2.5.41)
(5) Time waiting for I/O to complete. This value is not reliable, for the following reasons:
1. The CPU will not wait for I/O to complete; iowait is the time that a task is waiting for I/O to complete. When a CPU goes into idle state for outstanding task I/O, another task will be scheduled on this CPU.
2. On a multi-core CPU, the task waiting for I/O to complete is not running on any CPU, so the iowait of each CPU is difficult to calculate.
3. The value in this field may decrease in certain conditions.
iowait tries to measure time spent waiting for I/O, in general. It’s not tracked by a specific CPU, nor can it be (point 2 above — which also matches what you’re wondering about). It is measured per CPU though, as far as possible.
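To see the per-CPU values for yourself, you can read the cpu lines of /proc/stat directly. The following is a minimal sketch, assuming the field order documented in procfs(5) (user, nice, system, idle, iowait, irq, softirq, steal); the function names are made up for illustration, and the percentage is only an approximation of what top reports as wa.

```python
import time

# Field order of the "cpu" lines in /proc/stat, per procfs(5).
# The trailing guest fields are skipped: they are already included in user/nice.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_lines(stat_text):
    """Return {cpu_name: {field: ticks}} for each cpu line in /proc/stat text."""
    cpus = {}
    for line in stat_text.splitlines():
        parts = line.split()
        if not parts or not parts[0].startswith("cpu"):
            continue
        values = [int(v) for v in parts[1:1 + len(FIELDS)]]
        cpus[parts[0]] = dict(zip(FIELDS, values))
    return cpus


def iowait_percent(before, after):
    """Share of elapsed ticks accounted as iowait between two samples."""
    # The counters may decrease (see the caveats above), so clamp each delta.
    deltas = {f: max(after.get(f, 0) - before.get(f, 0), 0) for f in FIELDS}
    total = sum(deltas.values())
    if total == 0:
        return 0.0
    return 100.0 * deltas["iowait"] / total


def sample_live(interval=1.0):
    """Sample /proc/stat twice (Linux only) and return iowait % per CPU."""
    with open("/proc/stat") as f:
        first = parse_cpu_lines(f.read())
    time.sleep(interval)
    with open("/proc/stat") as f:
        second = parse_cpu_lines(f.read())
    return {cpu: iowait_percent(first[cpu], second[cpu])
            for cpu in second if cpu in first}
```

Calling sample_live() on a Linux machine returns one entry for the aggregate cpu line and one per cpuN line, which makes it easy to see that the iowait share can differ widely between CPUs during the same interval.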
The task scheduler “knows” there is pending I/O, because it knows that it suspended a given task because it’s waiting for I/O. This is tracked per task in the in_iowait field of the task_struct; you can look for in_iowait in the scheduler core to see how it is set, tracked and cleared. Brendan Gregg’s recent article on Linux load averages includes useful background information. The iowait entry in /proc/stat, which is what ends up in top, is incremented whenever a timer tick is accounted for while the current process “on” the CPU is idle and at least one task that blocked on that CPU is still waiting for I/O; otherwise the tick is counted as plain idle time. You can see this by looking for account_idle_time in the scheduler’s CPU time-tracking code.
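The accounting described above can be illustrated with a toy model. This is a simplified sketch and not kernel code: the Cpu and Task classes are hypothetical, and the nr_iowait counter only mimics my understanding of the per-runqueue counter the kernel consults in account_idle_time. It shows why a CPU other than the one where the task blocked reports the same period as idle.

```python
class Cpu:
    """Toy stand-in for a per-CPU runqueue plus its /proc/stat counters."""

    def __init__(self):
        self.nr_iowait = 0  # tasks that blocked on I/O while running here
        self.idle = 0       # idle ticks with no I/O outstanding
        self.iowait = 0     # idle ticks with I/O outstanding

    def account_idle_tick(self):
        # An idle tick counts as iowait only if some task that went to
        # sleep on this CPU is still waiting for I/O.
        if self.nr_iowait > 0:
            self.iowait += 1
        else:
            self.idle += 1


class Task:
    """Toy task carrying an in_iowait flag, as in the description above."""

    def __init__(self):
        self.in_iowait = False
        self.blocked_on = None

    def block_on_io(self, cpu):
        self.in_iowait = True
        self.blocked_on = cpu
        cpu.nr_iowait += 1

    def io_done(self):
        self.blocked_on.nr_iowait -= 1
        self.in_iowait = False
        self.blocked_on = None


def demo():
    cpu0, cpu1 = Cpu(), Cpu()
    task = Task()
    task.block_on_io(cpu0)       # task sleeps waiting for I/O on cpu0
    for _ in range(3):           # both CPUs sit idle for three ticks
        cpu0.account_idle_tick()
        cpu1.account_idle_tick()
    task.io_done()               # I/O completes
    for _ in range(2):           # two more idle ticks on both CPUs
        cpu0.account_idle_tick()
        cpu1.account_idle_tick()
    return (cpu0.iowait, cpu0.idle), (cpu1.iowait, cpu1.idle)
```

In this model demo() returns (3, 2) for cpu0 and (0, 5) for cpu1: only the CPU the task blocked on accumulates iowait, and only while it has nothing else to run.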
So a more accurate definition would be “time spent on this CPU waiting for I/O, when there was nothing better to do”…
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.