Intro to Python Asyncio

by John | January 10, 2024

 

Asyncio can seem like a tough topic to understand. At times it can feel like a completely new language, even though the majority of the code is familiar. The asyncio documentation is also likely to lead to despair, so hopefully this post will serve as a useful, beginner-friendly introduction for those who want to learn more about asyncio!

 

Let's first break down the two terms that make up the Asyncio module name. 

 

What does Asynchronous mean?

 

There is a very nice explanation of the etymology of this term at techtarget.com:

"In general, asynchronous -- pronounced ay-SIHN-kro-nuhs, from Greek asyn-, meaning "not with," and chronos, meaning "time" -- is an adjective describing objects or events that are not coordinated in time."

 

In terms of computer programming, this means a set of instructions that can be executed in any order because they do not depend on each other. If you are reading this article, I will assume you are used to procedural programming, in which one instruction follows another in a sequential (synchronous) flow. The closest thing you may have met already is threading. If you are not familiar with the term, it essentially means that another thread of execution is started alongside the main program to run one or more tasks.

 

What does IO actually stand for? 

 

Now that we have the first part (async) defined, what does the io actually mean? It refers to the input/output tasks that our programs carry out. It may be useful to think of these as the operations that load and save data in some form or another, such as reading from files or waiting for HTTP responses.

 

 

When should we use Async?

 

Now that we have the terms defined, we can think of some examples of times when we might want to use this method of programming. 

 

  • When our programs are I/O bound, it may be a good opportunity to use asynchronous programming. 

 

  • Perhaps it is a good rule of thumb to think of asyncio as a replacement/alternative to threading. 

 

  • When the events we are using asyncio for are independent of one another. For example, say we want to make 5 database calls. These can be carried out in any order, as long as they have all completed by the end, and we probably want that done as soon as possible.

 

If any of the above is true, then maybe we can speed up our code by using asyncio. 

 

 

 

Generator Example 

 

Before moving on to the actual async and await keywords, it may be useful to get a feel for some terminology through a code example. Here we want to demonstrate the concept of a function yielding control, that is, suspending its execution.

 

def suspender():
    nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    for num in nums:
        yield num  # suspend here after handing back the value


f = suspender()

for i in [1, 2]:
    print(next(f))


'''
1
2

'''

 

In the example above, we create a generator object with f = suspender(). This doesn't run any of the function body yet; if we were to print(f) we would just see <generator object> along with the memory address.

Once we call next() on the generator we get the values back one at a time. What is happening here is that the generator suspends execution once it has yielded a number and waits to be resumed. Note that we can of course do any other tasks we want and then come back to the generator, and it will resume the loop at the place it left off.

 

##################
# more code here
##################

#come back to generator

third_val = next(f)

#####
#do more unrelated stuff
#####

fourth_val = next(f) 



print(third_val, fourth_val)

'''
3 4

'''

 

 

This is a useful example for understanding how the execution of a function is suspended and then resumed. Although the behaviour is not identical, it may be useful to think of the await keyword discussed below as somewhat similar to the yield keyword in the generator example above: both allow other parts of the program to run.

 

Another similarity is the concept of non-blocking code. Code that blocks is code that must finish before the program as a whole is able to continue. Recall from the example above that the suspender function only runs when we tell it to; when not directly instructed to run, it simply waits idly by without blocking other parts of the code from running.
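
To make the non-blocking idea a little more concrete, here is a minimal sketch in which two generators take turns doing their work. The names counter and round_robin are made up for this illustration and are not part of asyncio; the loop inside round_robin plays a role only loosely similar to the one the event loop plays later in this post.

```python
def counter(name, limit):
    """Yield labelled numbers, suspending after each one."""
    for n in range(1, limit + 1):
        yield f"{name}{n}"  # suspend here until next() is called again


def round_robin(*generators):
    """Take one value from each generator in turn until all are exhausted."""
    active = list(generators)
    results = []
    while active:
        for gen in list(active):  # iterate over a copy so we can remove items
            try:
                results.append(next(gen))
            except StopIteration:
                active.remove(gen)  # this generator has nothing left to do
    return results


print(round_robin(counter("a", 3), counter("b", 2)))

'''
['a1', 'b1', 'a2', 'b2', 'a3']
'''
```

Neither generator has to finish before the other gets a turn, which is the essence of not blocking.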

 

 

 

What is a Coroutine?

 

First it may be useful to define what a 'routine' is. In computer programming a routine is like a function or a procedure: it is designed to carry out a set of instructions when it is called. Therefore we can think of a standard function as a routine.

 

A coroutine is also a routine, just a special type. In keeping with the etymology we started this post with, the prefix co/con is derived from Latin and means 'with', 'joint' or 'together'. You may recognize it from words like collaborate, cooperate, cohabitate, coworker and concert, among many others.

 

I think coworkers is the most useful analogy for our purposes: since async functions suspend and yield control back so that other coroutines can begin their work, we can think of async functions as a group of coworkers. Yes dear reader, if learning asyncio isn't traumatic enough, we'll get you on the ancient languages, or better yet, remind you of your coworkers!

 

 

 

First Asyncio Coroutine

 

In this section we will just give some examples. Note that to declare a function as async we simply put the async keyword before def.

 

import asyncio

async def asyncio_function(simulated_work_time):
    await asyncio.sleep(simulated_work_time)
    return 1

 

OK, now that we have our function that doesn't do much of anything, let's try calling it with some value.

 

print(asyncio_function(10))

'''

RuntimeWarning: coroutine 'asyncio_function' was never awaited
print(asyncio_function(10))
RuntimeWarning: Enable tracemalloc to get the object allocation traceback


'''
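
As a quick check of what that call actually produced, here is a minimal sketch using the standard asyncio.iscoroutine() helper. The variable name coro is just for illustration, and the sleep time is shortened.

```python
import asyncio


async def asyncio_function(simulated_work_time):
    await asyncio.sleep(simulated_work_time)
    return 1


coro = asyncio_function(0.1)      # nothing inside the function has run yet
print(asyncio.iscoroutine(coro))  # we are only holding a coroutine object
coro.close()                      # tidy up: we are not going to run this one

'''
True
'''
```

So calling an async function is more like creating the generator earlier than like calling a normal function: the body only runs once something drives it.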

 

When we see this warning, it means that a coroutine object was created but never actually run, usually because we have missed an await somewhere in the code. We may be tempted to try something like the following:

 

result = await asyncio_function(10)

print(result)

'''
result = await asyncio_function(10)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
SyntaxError: 'await' outside function
'''

 

Yet another annoying error, a hello world in asyncio is surely more difficult than we are used to! 

 

So how do we run an async function?

 

Well, we need to use a special asyncio function, run(), which starts something called the event loop and finally runs our code!

 

simulated_work_time = 0.5
result = asyncio.run(asyncio_function(simulated_work_time=simulated_work_time))


print(f"The simulated work time took {simulated_work_time} seconds")


'''
The simulated work time took 0.5 seconds

'''


print(f"The result was {result}")

'''
The result was 1
'''

 

 

OK, now that we know how to declare and run an async function, let's take a look at an example of how it can be useful. We will stick with the same example for the time being and demonstrate how the await keyword works.

 

If the sleep is confusing, it might be useful to think of each asyncio.sleep() call as a stand-in for making an HTTP request, querying a database or reading from a list of files.

 

 

How is Asyncio Useful?

 

Well, the main use case of asyncio lies in the fact that we can run many async functions together with the gather function. Let's take an example.

 

import asyncio
import time


async def asyncio_function(simulated_work_time):
    await asyncio.sleep(simulated_work_time) #here we call database, read file etc
    return 1


async def main():
    t = time.time()

    res = await asyncio.gather(
        asyncio_function(5), asyncio_function(5), asyncio_function(5)
    )

    time_taken = time.time() - t

    print(
        f"It took {time_taken} seconds to run 3 async functions with a work time of 5 seconds each "
    )


asyncio.run(main())



'''

It took 5.00431752204895 seconds to run 3 async functions with a work time of 5 seconds each 
'''

 

 

It looks like all 3 of the functions, each doing simulated work for 5 seconds, finished in a total of about 5 seconds, rather than the 15 seconds it would have taken if we had run them sequentially.
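
To see the sequential case for ourselves, here is a rough sketch that times both approaches side by side. The helper names run_sequentially and run_concurrently are made up for this comparison, and the work times are shortened to 0.2 seconds so it finishes quickly.

```python
import asyncio
import time


async def asyncio_function(simulated_work_time):
    await asyncio.sleep(simulated_work_time)
    return 1


async def run_sequentially(work_times):
    """Await each coroutine in turn: the next only starts when the previous ends."""
    t = time.perf_counter()
    results = [await asyncio_function(w) for w in work_times]
    return results, time.perf_counter() - t


async def run_concurrently(work_times):
    """Hand all the coroutines to gather so that their waiting time overlaps."""
    t = time.perf_counter()
    results = await asyncio.gather(*(asyncio_function(w) for w in work_times))
    return results, time.perf_counter() - t


async def main():
    work_times = [0.2, 0.2, 0.2]  # shortened so the comparison runs quickly
    _, sequential = await run_sequentially(work_times)
    _, concurrent = await run_concurrently(work_times)
    print(f"Sequential: {sequential:.2f}s, concurrent: {concurrent:.2f}s")
    return sequential, concurrent


asyncio.run(main())
```

The exact numbers will vary from run to run, but the sequential version should take roughly the sum of the work times and the concurrent version roughly the longest single one. Note that await on its own does not make things concurrent; it is gather that lets the waiting overlap.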

 

In fact, we can show that even with a very large number of these coroutines, the program only takes about as long as the longest simulated work of any individual coroutine. See the example below.

 

import asyncio
import time
import random


async def asyncio_function(simulated_work_time):
    await asyncio.sleep(simulated_work_time)
    return 1


async def main_mod(num_coroutines):
    t = time.time()

    # make a list of random sleep times

    rand_times = [random.randint(1, 30) for _ in range(num_coroutines)]

    # assign each random time to a coroutine
    coros = [asyncio_function(x) for x in rand_times]

    # unpack the coroutines with gather
    res = await asyncio.gather(*coros)

    time_taken = time.time() - t

    print(f"It took {time_taken} seconds to run { len(coros)} async functions ")

    print(f"The maximum time a single function worked for {max(rand_times)} seconds")


asyncio.run(main_mod(100))


'''

It took 30.002750158309937 seconds to run 100 async functions 
The maximum time a single function worked for 30 seconds

'''

 

We have run 100 functions, each sleeping for a random time of between 1 and 30 seconds, and the entire program has only taken about 30 seconds!
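
One detail worth knowing when the coroutines finish at different times: gather hands the results back in the order the coroutines were passed in, not the order in which they finished. A small sketch to illustrate, where labelled_work is a made-up helper that simply returns its label:

```python
import asyncio


async def labelled_work(label, simulated_work_time):
    await asyncio.sleep(simulated_work_time)
    return label


async def main():
    # "slow" is listed first but finishes last
    return await asyncio.gather(
        labelled_work("slow", 0.3),
        labelled_work("medium", 0.2),
        labelled_work("fast", 0.1),
    )


print(asyncio.run(main()))

'''
['slow', 'medium', 'fast']
'''
```

This means we can safely match each result back to the input that produced it by position.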

 

 

Here may be a good place to pause and highlight the similarities and differences between the suspender() generator example and the asyncio_function() we have created above. The async function does something similar to the generator in that it yields control at the await statement. It is at this point that the other coroutines (think coworkers) are able to begin execution. Once the sleep has finished, the event loop resumes the function just after the await, and it returns 1 (unrealistic, I know).

 

async def asyncio_function(simulated_work_time):
    await asyncio.sleep(simulated_work_time) #yield control here, let other stuff happen
    return 1

 

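The difference between this and the generator is in who does the resuming: the generator stays suspended until we call next() on it ourselves, whereas an awaiting coroutine is resumed automatically by the event loop as soon as the thing it is waiting on has finished.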
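
We can watch the suspending and resuming happen by recording events in a list. This is only an illustrative sketch; worker and log are names invented for it.

```python
import asyncio


async def worker(name, simulated_work_time, log):
    log.append(f"{name} start")
    await asyncio.sleep(simulated_work_time)  # yield control to the event loop
    log.append(f"{name} end")                 # the event loop resumed us here


async def main():
    log = []
    await asyncio.gather(worker("a", 0.2, log), worker("b", 0.1, log))
    return log


print(asyncio.run(main()))

'''
['a start', 'b start', 'b end', 'a end']
'''
```

Worker a starts first, but while it is suspended at its await, worker b gets to start and even finish. Nobody had to call next(); the event loop picked each worker back up when its sleep was over.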

 

 

 

When not to use Asyncio

 

Take the example below, in which the cruncher function calculates the average value of a list with 1 million elements. 

 

import asyncio
import random
import time


async def asyncio_num_cruncher():
    res = [random.randint(0, 100) for x in range(1_000_000)]
    return sum(res) / len(res)


async def main():
    t = time.time()
    results = await asyncio.gather(*[asyncio_num_cruncher() for _ in range(20)])
    print(f"Asyncio cruncher took {time.time() - t}")


asyncio.run(main())


'''
Asyncio cruncher took 5.991809844970703

'''

 

OK, it looks like that has worked, but it is important to note that there was no benefit to running this code async. Below we show the normal Python equivalent for comparison.

 



import random
import time


def normal_function():
    res = [random.randint(0, 100) for x in range(1_000_000)]
    return sum(res) / len(res)


def normal_main():
    t = time.time()
    results = []

    for i in range(20):
        results.append(normal_function())

    print(f"Normal took {time.time() - t}")


normal_main()

'''

Normal took 5.970175266265869
'''

 

 

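The reason there is no performance increase when using async in this case is that the task we are running is CPU bound (read: number crunching) rather than I/O bound, and it is I/O bound work that asyncio is useful for. The cruncher never reaches an await, so it never yields control, and each call simply runs to completion before the next one starts.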
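
A small sketch makes this visible. Here cruncher and log are names invented for the illustration, and the amount of work is kept tiny so it runs instantly.

```python
import asyncio


async def cruncher(name, log):
    log.append(f"{name} start")
    total = sum(range(100_000))  # CPU work with no await, so control is never yielded
    log.append(f"{name} end")
    return total


async def main():
    log = []
    await asyncio.gather(cruncher("a", log), cruncher("b", log))
    return log


print(asyncio.run(main()))

'''
['a start', 'a end', 'b start', 'b end']
'''
```

Even though both coroutines were handed to gather, the log shows a finishing completely before b is allowed to start, which is exactly what the sequential version does.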

 

 

Below is another example, showing how easy it is to presume that our code is running asynchronously when it actually is not.

 

 

import asyncio
import time


async def asyncio_function():
    await asyncio.sleep(2)
    return 1


async def main():
    t = time.time()
    results = await asyncio.gather(*[asyncio_function() for _ in range(20)])
    print(f"Asyncio.sleep took {time.time() - t}")
    print(results)


async def main_py_sleep():
    time.sleep(2)  # this is the problem: it blocks the event loop, so nothing else can run

    return 1


async def main_py():
    t = time.time()
    results = await asyncio.gather(*[main_py_sleep() for _ in range(20)])
    print(f"Async with python time.sleep took  {time.time() - t}")
    print(results)


if __name__ == "__main__":
    asyncio.run(main())

    asyncio.run(main_py())



'''
Asyncio.sleep took 2.0011250972747803
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Async with python time.sleep took  40.003366470336914
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

'''

 

What is happening here is that time.sleep() is not awaitable, so it does not yield control of the program to let other things start while it is waiting. In this sense it is blocking. Any slow operation inside an asyncio coroutine therefore needs to be awaited in a non-blocking form (here, asyncio.sleep() instead of time.sleep()), otherwise it blocks the event loop and we get the same performance as if we had run the code sequentially.
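
Sometimes a blocking call has no awaitable equivalent. One possible workaround, assuming Python 3.9 or newer, is asyncio.to_thread(), which runs the blocking function in a worker thread and gives us something we can await. The sketch below reuses time.sleep as the stand-in for blocking work; blocking_in_thread is a made-up name and the sleep is shortened to 0.2 seconds.

```python
import asyncio
import time


async def blocking_in_thread():
    # time.sleep still blocks, but it blocks a worker thread, not the event loop
    await asyncio.to_thread(time.sleep, 0.2)
    return 1


async def main():
    t = time.perf_counter()
    results = await asyncio.gather(*[blocking_in_thread() for _ in range(4)])
    elapsed = time.perf_counter() - t
    print(f"4 blocking sleeps of 0.2s took {elapsed:.2f} seconds")
    return results, elapsed


asyncio.run(main())
```

On a typical run this should take roughly 0.2 seconds rather than 0.8, because the sleeps overlap in separate threads. This helps with blocking I/O style calls; it is not a cure for the CPU bound case described earlier.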

 

 

 

Summary

 

- Async functions are useful for I/O bound tasks; however, there will be no benefit for CPU bound tasks.

 

- We can use the asyncio.run() function to run a coroutine.

 

- We can use the asyncio.gather() function to run a number of coroutines concurrently.

 

- We should be careful that we haven't mistakenly put blocking code into our coroutines, as then we have all the pain of writing async with none of the benefits.

 

 

 

 

