The Python threading documentation states that “…threading is still an
appropriate model if you want to run multiple I/O-bound tasks
simultaneously”, apparently because I/O-bound tasks can avoid the GIL
that prevents threads from executing concurrently in CPU-bound tasks.
But what I don't understand is that an I/O task still uses the CPU. So
how could it not encounter the same issue? Is it because the I/O-bound
task does not require memory management?
Answers:
Method 1
All of Python’s blocking I/O primitives release the GIL while waiting for the I/O block to resolve — it’s as simple as that! They will of course need to acquire the GIL again before going on to execute further Python code, but for the long-in-terms-of-machine-cycles intervals in which they’re just waiting for some I/O syscall, they don’t need the GIL, so they don’t hold on to it!
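To make that concrete, here is a minimal sketch rather than a benchmark. It assumes `time.sleep` is a fair stand-in for a blocking I/O call (it also releases the GIL while waiting); the names `fake_io_task` and `run_in_threads` are made up for illustration. If the waits overlap, five threads that each wait 0.2 s should finish in roughly 0.2 s overall, not 1 s.

```python
import threading
import time


def fake_io_task(delay):
    # Assumption: time.sleep stands in for a blocking read/recv/etc.
    # Like those calls, it releases the GIL while the thread waits.
    time.sleep(delay)


def run_in_threads(n_tasks, delay):
    # Start n_tasks threads that each "wait on I/O", then time the whole batch.
    threads = [
        threading.Thread(target=fake_io_task, args=(delay,))
        for _ in range(n_tasks)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


elapsed = run_in_threads(n_tasks=5, delay=0.2)
print(f"5 waits of 0.2 s took {elapsed:.2f} s in total")
```

The small amount of CPU work each thread does before and after the wait still runs under the GIL, but it is tiny compared with the time spent waiting.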
Method 2
The GIL in CPython [1] is only concerned with Python code being executed. A thread-safe C extension that uses a lot of CPU might release the GIL as long as it doesn’t need to interact with the Python runtime.
As soon as the C code needs to ‘talk’ to Python (read: call back into the Python runtime), it needs to acquire the GIL again. That is, the GIL exists to protect the “interpreter” (and I use the term loosely) and make its operations atomic; it is not there to prevent native/non-Python code from running concurrently.
Releasing the GIL around I/O (blocking or not, using CPU or not) is the same thing – until the data is moved into Python there is no reason to acquire the GIL.
[1] The GIL is controversial because it prevents multithreaded CPython programs from taking full advantage of multiprocessor systems in certain situations. Note that potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL. Therefore it is only in multithreaded programs that spend a lot of time inside the GIL, interpreting CPython bytecode, that the GIL becomes a bottleneck.
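As a rough illustration of CPU-heavy C code running outside the GIL, the sketch below hashes several large buffers from a thread pool. It relies on the note in the `hashlib` documentation that the GIL is released while hashing data larger than 2047 bytes; the `digest` helper and the buffer sizes are arbitrary choices for this example. It only checks that the threaded results match the sequential ones; any speedup depends on the machine and the Python build.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(data):
    # The hashing runs in C; per the hashlib docs the GIL is released
    # for inputs larger than 2047 bytes, so other threads can run meanwhile.
    return hashlib.sha256(data).hexdigest()


# Hypothetical workload: four 4 MiB buffers with different contents.
chunks = [bytes([i]) * (4 * 1024 * 1024) for i in range(4)]

# Hash the buffers from four threads; pool.map preserves input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(digest, chunks))

# Same work done one buffer at a time, for comparison.
sequential = [digest(c) for c in chunks]

print("results match:", threaded == sequential)
```

Each thread needs the GIL again only at the end, when the digest is handed back to Python as a string.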
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.