Using asyncio for concurrent HTTP requests in Python (asyncio.ensure_future and asyncio.gather)
Python offers several packages for fetching data over HTTP, such as "requests" and "aiohttp". In this article we will build a practical example that shows how to run 100 requests concurrently as asyncio tasks and then save the results to a file as JSON, using asyncio.ensure_future and asyncio.gather.
Why would I want to use asyncio.ensure_future and asyncio.gather?
If your requests do not have to be executed sequentially, this asynchronous approach can dramatically increase performance.
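Before getting to real HTTP calls, here is a minimal sketch of the pattern itself, with `asyncio.sleep` standing in for a slow network request (a hypothetical `fake_request` helper, not part of the article's code): every coroutine is scheduled as a task with `asyncio.ensure_future`, and `asyncio.gather` waits for all of them and returns their results in order.

```python
import asyncio
import time

async def fake_request(i):
    # Simulate a slow I/O call (e.g. an HTTP request) with a 0.2 s sleep
    await asyncio.sleep(0.2)
    return i

async def main():
    # Schedule every coroutine as a task so they all run concurrently
    tasks = [asyncio.ensure_future(fake_request(i)) for i in range(10)]
    # gather waits until every task has finished; results keep the task order
    return await asyncio.gather(*tasks)

start = time.time()
results = asyncio.run(main())
elapsed = time.time() - start
print(results)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(elapsed)  # roughly 0.2 s, not 10 * 0.2 s
```

Because the ten sleeps overlap, the whole run takes about as long as a single one, which is exactly the effect we want from the 100 HTTP requests below.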
This is not an article where I explain every single step in detail, but I will add code comments as a guide.
- This article assumes that you know how to install Python packages and how to set up a virtual environment (optional)
- The solution below does not include error handling, to keep it simple and focused on the scope of this task.
- The requests go to a dummy API, "https://meowfacts.herokuapp.com", which does not require an API key and can be used immediately.
- The benchmarking results are based on the very slow internet connection I have in France, where I am on holiday at the moment.
```python
import aiohttp
import asyncio
import time
import json

# Get the time at which the script starts the run
start_time = time.time()

# Utils functions
def saveToFile(content):
    saveFile = open("cat_facts.json", "w")
    json.dump(content, saveFile, indent=4)
    saveFile.close()

async def getCatFacts(session, url):
    async with session.get(url) as resp:
        cat_fact = await resp.json()
        return cat_fact['data']

# Main code to execute
async def main():
    # Initiate session
    async with aiohttp.ClientSession() as session:
        # Prepare the list of requests
        tasks = []
        for i in range(0, 100):
            url = "https://meowfacts.herokuapp.com"
            tasks.append(asyncio.ensure_future(getCatFacts(session, url)))
        result = await asyncio.gather(*tasks)
        # Save request to json
        saveToFile(result)
        # Print the results one by one to the console
        for facts in result:
            print(facts)

asyncio.run(main())

# Print the time required for the requests to complete
print("--- %s seconds ---" % (time.time() - start_time))
```
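As noted above, error handling was left out on purpose. If you do want the batch to survive individual failures, one hedged option is `asyncio.gather(..., return_exceptions=True)`, which collects raised exceptions as results instead of aborting the whole run on the first one. The sketch below uses a hypothetical `flaky` coroutine in place of `getCatFacts` so it runs without any network access:

```python
import asyncio

async def flaky(i):
    # Hypothetical stand-in for getCatFacts: every third "request" fails
    if i % 3 == 0:
        raise RuntimeError(f"request {i} failed")
    await asyncio.sleep(0.01)
    return i

async def main():
    tasks = [asyncio.ensure_future(flaky(i)) for i in range(10)]
    # return_exceptions=True makes gather return exceptions as values
    # instead of propagating the first failure and cancelling the rest
    results = await asyncio.gather(*tasks, return_exceptions=True)
    ok = [r for r in results if not isinstance(r, Exception)]
    failed = [r for r in results if isinstance(r, Exception)]
    return ok, failed

ok, failed = asyncio.run(main())
print(len(ok), len(failed))  # 6 4
```

You would then decide per result whether to retry, log, or skip, rather than losing the 99 successful responses because of one failed one.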
Requests completion time
```
...
When well treated, a cat can live twenty or more years but the average life span of a domestic cat is 14 years.
Statistics indicate that animal lovers in recent years have shown a preference for cats over dogs!
You check your cats pulse on the inside of the back thigh, where the leg joins to the body. Normal for cats: 110-170 beats per minute.
--- 2.289151906967163 seconds ---
```
vs the synchronous approach
```
Retractable claws are a physical phenomenon that sets cats apart from the rest of the animal kingdom. In the cat family, only cheetahs cannot retract their claws.
The worlds largest cat measured 48.5 inches long. https://www.youtube.com/watch?v=gc5M0aGc_EI
--- 54.436748027801514 seconds ---
```
A saving of over 52 seconds is a substantial performance gain.
Synchronous approach code
```python
import requests
import time

start_time = time.time()

for _i in range(0, 100):
    url = "https://meowfacts.herokuapp.com"
    resp = requests.get(url)
    data = resp.json()
    print(data['data'])

print("--- %s seconds ---" % (time.time() - start_time))
```
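One closing caveat about the asynchronous version: it fires all 100 requests at the same instant, which a real API may rate-limit or refuse. A common middle ground is to cap how many requests are in flight at once with `asyncio.Semaphore`. This is a sketch of the idea, again with `asyncio.sleep` standing in for `session.get(url)` so it runs offline; `MAX_CONCURRENT` and `fetch` are illustrative names, not part of the article's code:

```python
import asyncio

MAX_CONCURRENT = 10  # assumed limit; tune it to the API you are calling

async def fetch(sem, i):
    # The semaphore lets at most MAX_CONCURRENT coroutines pass this point,
    # so no more than that many (simulated) requests run at a time
    async with sem:
        await asyncio.sleep(0.01)  # stands in for session.get(url)
        return i

async def main():
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    tasks = [asyncio.ensure_future(fetch(sem, i)) for i in range(100)]
    return await asyncio.gather(*tasks)

results = asyncio.run(main())
print(len(results))  # 100
```

You still get most of the concurrency win, while staying polite to the server on the other end.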