Currently, preprocessing workers are implemented as separate processes, which makes it harder to exchange complex data with them. This led to dependent item preprocessing being batched on workers when the preprocessing cache was added, which severely hurts preprocessing parallelism in some cases.
To improve this situation, the preprocessing manager and workers must be rewritten to use thread-based workers. The design goal is to replicate the current preprocessing functionality with minimum overhead and maximum parallelism. In the future this could be improved further by moving user macro resolving to the workers and by adding dynamic worker management (creating and destroying workers as required by the preprocessing load).
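The thread-based approach can be illustrated with a minimal sketch. It is written in Python for brevity and is not the actual implementation: the class name `PreprocessingManager`, the task tuple layout, and the `preprocess` helper are all hypothetical. It only shows the general idea that threads share memory, so a master item's result can be handed to dependent items by reference, and each dependent item can be queued as its own task for any idle worker.

```python
import queue
import threading


def preprocess(value, steps):
    """Apply hypothetical preprocessing steps (plain callables) in order."""
    for step in steps:
        value = step(value)
    return value


class PreprocessingManager:
    """Hypothetical manager that feeds a shared task queue to worker threads."""

    def __init__(self, worker_count):
        self.tasks = queue.Queue()
        self.results = {}
        self.lock = threading.Lock()
        self.workers = [
            threading.Thread(target=self._worker, daemon=True)
            for _ in range(worker_count)
        ]
        for worker in self.workers:
            worker.start()

    def _worker(self):
        while True:
            task = self.tasks.get()
            if task is None:  # sentinel: stop this worker
                self.tasks.task_done()
                break
            item_id, value, steps, dependents = task
            result = preprocess(value, steps)
            with self.lock:
                self.results[item_id] = result
            # Each dependent item becomes a separate task, so it can run on
            # any idle worker; the master result is passed by reference.
            for dep_id, dep_steps, dep_dependents in dependents:
                self.tasks.put((dep_id, result, dep_steps, dep_dependents))
            self.tasks.task_done()

    def submit(self, item_id, value, steps, dependents=()):
        self.tasks.put((item_id, value, steps, dependents))

    def shutdown(self):
        self.tasks.join()  # wait until all queued tasks are processed
        for _ in self.workers:
            self.tasks.put(None)
        for worker in self.workers:
            worker.join()
```

In this sketch dependent tasks are queued before the master task is marked as done, so `shutdown()` cannot return while dependent items are still pending. Dynamic worker management is not shown.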