“The Python Requests library is one of the most versatile and valuable libraries in the Python ecosystem. Its features and incredible simplicity make it worthwhile in powering numerous Python applications.”
This tutorial shows how to download a file from a given URL using the requests package.
Installation and Setup
Before using the requests package, you must ensure it is installed in your environment. You can do this by running the pip command as shown:
pip install requests
or
pip3 install requests
On Windows, you can run the command:
python -m pip install requests
If you have conda installed, run the command:
conda install requests
Example 1 – Download a Simple File
We can download a file using the requests module by specifying the URL of the file and using Python's built-in open() function to write the content to a given file name.
Create a Python file and add the following code:
import requests
url = 'https://upload.wikimedia.org/wikipedia/commons/a/af/Tux.png'
r = requests.get(url, allow_redirects=True)
with open('tux.png', 'wb') as f:
    f.write(r.content)
In the code above, we start by importing the requests module. We then create a variable holding the URL of the file we wish to download. In this case, we want to download an image.
In the third line, we make a GET request to the specified URL. We also set allow_redirects to True so that the client follows redirects, if any (this is already the default for GET requests). The resulting response object is saved in the variable r.
Finally, we write the response body (r.content) to a file named tux.png, opened in binary mode.
You can then check the directory from which you ran the script for the tux.png file.
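The example above writes whatever the server returns, even an error page, and hard-codes the output name. The following is a minimal sketch of one way to tighten that up. The helper names filename_from_url and save_response, and the fallback name download.bin, are illustrative assumptions and not part of requests; the only requests features relied on are the response's raise_for_status() method and content attribute.

```python
import posixpath
from urllib.parse import unquote, urlparse


def filename_from_url(url, default="download.bin"):
    """Return the last path segment of a URL, or a fallback name if there is none."""
    path = unquote(urlparse(url).path)
    return posixpath.basename(path) or default


def save_response(response, path):
    """Write a response body to disk, raising first if the server reported an error."""
    response.raise_for_status()  # raises an HTTPError for 4xx/5xx responses
    with open(path, "wb") as f:
        f.write(response.content)
    return path


# Intended use with the request from the example above:
#   r = requests.get(url, allow_redirects=True)
#   save_response(r, filename_from_url(url))
```

For the Tux URL used earlier, filename_from_url returns Tux.png, so the saved file keeps the name it has on the server.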
Example 2 – Download Large Files
In the above example, we use the r.content attribute, which holds the entire response body in memory as bytes. This is practical when downloading small files. However, for large files, loading the whole body into memory at once is inefficient and may exhaust the available memory.
To resolve this, we need to download the file as a stream. We pass stream=True to requests.get and read the body with the r.iter_content method.
With stream=True, requests downloads only the response headers and keeps the connection open; the body is not fetched until we read it. Accessing r.content at that point would still load the whole file into memory.
The r.iter_content method avoids this by yielding the body in chunks, which we write to disk one at a time.
An example is shown below:
import requests
url = "https://hastie.su.domains/ISLR2/ISLRv2_website.pdf"
# stream=True defers downloading the body until we iterate over it
r = requests.get(url, allow_redirects=True, stream=True)
with open('ISLRv2.pdf', 'wb') as file:
    for chunk in r.iter_content(chunk_size=1024):
        if chunk:  # skip empty chunks
            file.write(chunk)
The code above uses a for loop to write the data chunks (up to 1024 bytes each) to the specified file.
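The chunked loop can also be wrapped in a reusable function that reports how many bytes were written. This is only a sketch: stream_to_file is a hypothetical helper name, and the 8192-byte default chunk size is an arbitrary choice, not a requests recommendation. It accepts any response obtained with stream=True and uses only its raise_for_status() and iter_content() methods.

```python
def stream_to_file(response, path, chunk_size=8192):
    """Write a streamed response to disk chunk by chunk and return the byte count."""
    response.raise_for_status()  # stop early on 4xx/5xx responses
    written = 0
    with open(path, "wb") as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty chunks
                f.write(chunk)
                written += len(chunk)
    return written


# Intended use; the with block closes the connection when the download ends:
#   with requests.get(url, allow_redirects=True, stream=True) as r:
#       total = stream_to_file(r, "ISLRv2.pdf")
```

When the server sends a Content-Length header, the returned count can be compared against it as a basic completeness check.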
Example 3 – Checking File Type Before Download
We can check the type of content being served by inspecting the response headers. We can then use this value to verify that the file we wish to download is the one we expect.
For example, if we want to download an ISO file and receive an HTML or text content type instead, we know this is the wrong file and can abandon the download.
An example is shown below:
import requests
url = "https://cdimage.debian.org/debian-cd/current/amd64/iso-dvd/debian-11.4.0-amd64-DVD-1.iso"
# Fetch only the headers first so we can inspect the content type
h = requests.head(url, allow_redirects=True)
content_type = h.headers.get('content-type', '')
if 'html' in content_type.lower():
    print("Incorrect file type")
elif 'text' in content_type.lower():
    print("Incorrect file type")
else:
    # The content type looks right, so stream the file to disk
    r = requests.get(url, allow_redirects=True, stream=True)
    with open('debian-11.4.0-amd64-DVD-1.iso', 'wb') as file:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:
                file.write(chunk)
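The content-type test above can be factored into a small function that is easy to reuse and to check in isolation. The sketch below is one possible version, not a requests feature: looks_like_binary is a hypothetical name, and treating a missing header as acceptable is an assumption you may want to reverse for stricter checks.

```python
def looks_like_binary(content_type):
    """Return True unless the Content-Type header indicates text or HTML."""
    if not content_type:
        # Assumption: a missing header is treated as acceptable
        return True
    # Drop parameters such as "; charset=UTF-8" and normalise the case
    media_type = content_type.split(";")[0].strip().lower()
    return not (media_type.startswith("text/") or "html" in media_type)


# Intended use with the HEAD request from the example above:
#   h = requests.head(url, allow_redirects=True)
#   if looks_like_binary(h.headers.get("content-type")):
#       ...download the file...
```

Comparing only the media type, without its parameters, keeps a header such as text/html; charset=UTF-8 from slipping past the check because of its suffix.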
Closing
In this article, you learned how to download a file using the Python requests module, how to stream large files to disk in chunks, and how to check a file's content type before downloading it.
Thanks for reading!!