Python-based harvesters sometimes deadlock
Sometimes, the Python-based harvesters deadlock while waiting for the parent process to write data to their standard input, i.e. in sys.stdin.readline().
This suggests that their standard output is not properly flushed by the print(json.dumps(..), flush=True) sequence. This is further corroborated by the problem apparently going away when Python's -u command line flag or PYTHONUNBUFFERED environment variable is used. However, since we always write a long JSON string followed by a single newline, buffering is highly beneficial to efficiency and should stay enabled, either automatically or managed explicitly.

It actually happens even with -u set...
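For reference, here is a minimal sketch of the exchange described above: read one request line, write one JSON document plus a single newline into the buffered stream, then flush exactly once per message. The names serve and handle_request and the one-JSON-document-per-line framing are assumptions for illustration, not the actual harvester code, and since the deadlock was seen even with flush=True, this shows the intended pattern rather than a confirmed fix.

```python
import io
import json


def handle_request(request):
    # Hypothetical placeholder: wrap the request in a response envelope.
    return {"ok": True, "echo": request}


def serve(inp, out):
    """Answer one JSON document per input line until the input reaches EOF."""
    while True:
        line = inp.readline()  # blocks until the parent has written a full line
        if not line:  # an empty string means the parent closed the pipe
            break
        response = handle_request(json.loads(line))
        # Keep buffering enabled: write the message and its newline into the
        # buffer, then flush once so the parent sees the complete line.
        out.write(json.dumps(response))
        out.write("\n")
        out.flush()


# In-memory streams stand in for the pipes to the parent process.
demo_in = io.StringIO('{"id": 1}\n{"id": 2}\n')
demo_out = io.StringIO()
serve(demo_in, demo_out)
```

A real harvester would call serve(sys.stdin, sys.stdout) instead of using the in-memory streams. Flushing once per message keeps the efficiency benefit of buffering for the long JSON string while still bounding how long a complete response can sit in the buffer.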
Edited by Adam Reichold