I’ve been thinking recently about Python API design (as one does, in their mid-20s). I’m someone who cares deeply about writing performant code, so I often turn to threading, multiprocessing or asyncio when dealing with IO-bound work (which is the majority of Python applications).

I was looking at the API design for concurrent.futures.ProcessPoolExecutor/ThreadPoolExecutor.submit(), which looks something like:

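from concurrent.futures import ProcessPoolExecutor

def cpu_bound_function(val, *, keyword_arg):
    ...

# or ThreadPoolExecutor()
with ProcessPoolExecutor() as executor:
    # submit() takes the callable and its arguments separately and returns a Future
    task = executor.submit(cpu_bound_function, 5, keyword_arg=10)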

And asyncio.TaskGroup.create_task(), which looks something like:

from asyncio import TaskGroup  # Python 3.11+

async def io_bound_function(val, *, keyword_arg):
    ...

async def main():
    async with TaskGroup() as tg:
        task = tg.create_task(io_bound_function(5, keyword_arg=10))

The question is: why are these APIs different? Or, why can’t ProcessPoolExecutor/ThreadPoolExecutor use the nicer, more ergonomic API of asyncio.TaskGroup.create_task()?

I like the question because if you understand how async code works (not even necessarily in Python; the same logic applies to JavaScript promises or Rust futures) then the answer is immediately obvious. Likewise, if you have even a minimal understanding of threading/multiprocessing as a concept, it’s obvious.

If it isn’t obvious to you: time to study :)
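
For anyone starting that study, here is a minimal sketch to poke at. It is only an illustration, not a definitive explanation: sync_function, async_function, calls and demo are hypothetical names I made up for this post. All it does is record when each function body actually starts running, so you can compare what a plain call does with what calling an async function does.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# Hypothetical bookkeeping: records which function bodies have started running.
calls = []


def sync_function(val, *, keyword_arg):
    calls.append("sync")
    return val + keyword_arg


async def async_function(val, *, keyword_arg):
    calls.append("async")
    return val + keyword_arg


def demo():
    calls.clear()

    # Calling a regular function runs its body right here, in the calling thread.
    value = sync_function(5, keyword_arg=10)
    ran_sync_on_call = "sync" in calls

    # Calling a coroutine function only builds a coroutine object;
    # the body has not started yet.
    coro = async_function(5, keyword_arg=10)
    ran_async_on_call = "async" in calls
    is_coroutine = asyncio.iscoroutine(coro)

    # The body only runs once an event loop drives the coroutine.
    async_result = asyncio.run(coro)

    # An executor gets the callable and its arguments separately,
    # so the call itself can be made by a worker.
    with ThreadPoolExecutor() as executor:
        future = executor.submit(sync_function, 5, keyword_arg=10)
        pool_result = future.result()

    return (value, ran_sync_on_call, ran_async_on_call,
            is_coroutine, async_result, pool_result)


if __name__ == "__main__":
    print(demo())
```

Try changing the submit() line to executor.submit(sync_function(5, keyword_arg=10)) and think about which thread ends up doing the work, and what the executor is then asked to call.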

Appendix: What LLMs say

This isn’t a particularly esoteric question or anything, but I wanted to see what LLMs think about it. The prompt was exactly the text above, with the first and last paragraphs excluded. Here’s what a few of them say.

Verdict: the LLMs know, and they all give what I would classify as great answers.