This talk was part of Developer Growth Summit 2022. Go to the DGS2022 page to view recordings of all sessions.
About the talk
If you've heard that Python is not good at solving async problems, you'll be surprised by what modern Python has to offer. You'll learn about the async and await keywords and how to leverage async to solve parallel problems faster.
This talk will cover:
- Intro to async programming support in modern Python
- Async and await keywords
- Other helpful libraries you can use
About the speaker
Michael is an avid Python and MongoDB enthusiast, entrepreneur, host of Talk Python to Me & Python Bytes, and a Python Software Foundation Fellow.
Highlights of the talk
What is asynchronous programming?
Since version 3.5, Python has had the async and await keywords, building on the asyncio library introduced in 3.4. When you think about traditional concurrent programming, you think about work happening at the same time in multiple places. For example, on a multi-core processor you might run two threads, one on each core, instead of one; that is work happening at the same time.
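Here's a minimal sketch of what those keywords look like in practice (the function names are illustrative, and asyncio.run requires Python 3.7 or later):

```python
import asyncio

# A coroutine is defined with "async def" and paused with "await".
async def greet(name: str) -> str:
    await asyncio.sleep(1)  # yields control to the event loop while "waiting"
    return f"Hello, {name}"

async def main():
    message = await greet("world")
    print(message)

asyncio.run(main())  # starts the event loop and runs main() to completion
```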
According to Wikipedia, asynchrony in computer programming refers to the occurrence of events independent of the main program flow and ways to deal with such events. These may be “outside” events such as the arrival of signals, or actions instigated by a program that take place concurrently with program execution, without the program blocking to wait for results. Nowhere in that definition does it say you must create multiple threads that run on different cores, create multiple processes that coordinate in some way, or scale out across machines in some grid computing setup. It's just stuff happening at the same time without blocking.
There are two core uses for asynchronous programming. One is speed: you want a single piece of work to take better advantage of the hardware you're on. Around 2005, the Moore's law curves diverged: clock speeds flattened out, so today's computers are roughly as fast per core as they were 5-10 years ago but have way more cores. If you want to take advantage of the processors in the computer you've got, you have to use some other mechanism.
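One such mechanism is the multiprocessing module, which spreads CPU-bound work across separate processes, one per core. A minimal sketch, where the crunch function and its workload are purely illustrative:

```python
import math
from multiprocessing import Pool

# A stand-in CPU-bound task: sum square roots over a large range.
def crunch(n: int) -> float:
    return sum(math.sqrt(i) for i in range(n))

if __name__ == "__main__":
    # Each chunk runs in its own process, so the work can use multiple cores.
    with Pool() as pool:
        results = pool.map(crunch, [10_000_000] * 4)
    print(sum(results))
```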
Oftentimes, people talk about asynchronous programming not for making an individual thing go faster, but for allowing more of it to happen at once. If you're in the data science world and you want your computation to go faster, this CPU-bound story is what you're thinking about: you've got a million records, you want to do a bunch of computation, and it'd be nice if that happened in one minute instead of ten. In most other places it's about "I've got some service and I want it to handle more users," or "as I add more users, I don't want it to slow down." Or, in a more focused way, maybe you need to call a couple of microservices, some external APIs, and a database, and you want those calls to happen at the same time, but not in a computational way.
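A minimal sketch of that last scenario, using asyncio.gather to run several simulated service calls concurrently. The service names are made up, and asyncio.sleep stands in for real network latency:

```python
import asyncio

# Hypothetical service call; asyncio.sleep simulates waiting on the network.
async def call_service(name: str, delay: float) -> str:
    await asyncio.sleep(delay)
    return f"{name}: done"

async def main():
    # All three "requests" wait at the same time, so the total time is
    # roughly the slowest call (~0.3s), not the sum (~0.6s).
    results = await asyncio.gather(
        call_service("users-api", 0.3),
        call_service("orders-api", 0.2),
        call_service("database", 0.1),
    )
    print(results)

asyncio.run(main())
```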
Why don’t threads add computational speed?
The reason threads don't live in both categories is the GIL, the global interpreter lock. It's a constraint on Python, and it feels like a threading constraint that Python put in place, but that's not what it is. It's actually a memory management feature that has threading implications. When you allocate objects or assign variables in Python, there's no manual memory management, no malloc and free. Python does the tracking automatically, and it does that through reference counting.
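You can watch reference counting at work with sys.getrefcount. A small illustrative example:

```python
import sys

x = ["some", "object"]
print(sys.getrefcount(x))  # 2: the name "x" plus the temporary reference made by the call

y = x  # assigning another name increments the reference count
print(sys.getrefcount(x))  # 3

del y  # removing a name decrements it; at zero, the object is freed immediately
print(sys.getrefcount(x))  # 2
```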
Any time a reference count is incremented or decremented, that operation has to be thread safe. Rather than lock every assignment, Python simply allows only one interpreter instruction to run at a time (that's the global interpreter lock), so the memory management doesn't need its own locks. But it also means the threading story for CPU-bound work in Python isn't that great.
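A quick way to see the consequence is to time a CPU-bound function run sequentially versus on two threads. On standard CPython the threaded version typically isn't any faster (exact timings will vary by machine):

```python
import threading
import time

# A purely CPU-bound function: no I/O, so threads gain nothing under the GIL.
def count_down(n: int) -> None:
    while n > 0:
        n -= 1

N = 20_000_000

# Sequential: one thread does both batches of work.
start = time.perf_counter()
count_down(N)
count_down(N)
print(f"sequential: {time.perf_counter() - start:.2f}s")

# Two threads: only one can run Python bytecode at a time, so this usually
# takes about as long as the sequential version, or longer due to contention.
start = time.perf_counter()
t1 = threading.Thread(target=count_down, args=(N,))
t2 = threading.Thread(target=count_down, args=(N,))
t1.start()
t2.start()
t1.join()
t2.join()
print(f"threaded:   {time.perf_counter() - start:.2f}s")
```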