Python Hashlib MD5

A hash function maps a sequence of bytes of any length to a sequence of bytes with a fixed length. The result is called a hash, message digest, checksum, or hash value. Because almost any change to the input changes the result, the hash value can be used to determine whether the data has been modified. One example of a hash function is MD5.
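As a quick illustration of both properties (fixed output length and change detection), here is a minimal sketch using Python's built-in “hashlib” module, which the rest of this write-up covers. The sample byte strings are made up for the example.

```python
import hashlib

# Hypothetical sample inputs of very different lengths
short_msg = b"a"
long_msg = b"a" * 10_000

short_hash = hashlib.md5(short_msg).hexdigest()
long_hash = hashlib.md5(long_msg).hexdigest()

# Both hashes have the same fixed length, regardless of input size
print(len(short_hash), len(long_hash))

# Changing even one byte of the input gives a different hash
original = hashlib.md5(b"report v1").hexdigest()
modified = hashlib.md5(b"report v2").hexdigest()
print(original == modified)
```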

This write-up explains what the MD5 hash is in Python and shows how to calculate it for strings and files.

What is Python MD5 Hash?

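Python’s “hashlib” module provides “md5()”, an implementation of the MD5 hash function. It takes bytes as input and produces a 128-bit (16-byte) hash value, usually written as a 32-character hexadecimal string. Hashes in general are used to construct caches of large data sets, check passwords, fingerprint data, verify file integrity, and more. MD5 itself is no longer considered collision-resistant, so it is best kept for non-security uses such as detecting accidental file corruption. Because hashing algorithms operate on binary data, text must be converted to bytes before hashing, and it is essential to select the character encoding deliberately: the same text encoded differently gives a different hash.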
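To see why the encoding matters, the following sketch hashes the same text under two encodings. The sample word and the variable names are only illustrative.

```python
import hashlib

text = "café"  # hypothetical sample text with a non-ASCII character

# The same text becomes different bytes under different encodings
utf8_bytes = text.encode("utf-8")      # 5 bytes: 'é' takes two bytes
latin1_bytes = text.encode("latin-1")  # 4 bytes: 'é' takes one byte

utf8_hash = hashlib.md5(utf8_bytes).hexdigest()
latin1_hash = hashlib.md5(latin1_bytes).hexdigest()

# Different bytes in, different hash out
print(utf8_hash == latin1_hash)
```

If two systems need to compare hashes of the same text, they have to agree on the encoding first; UTF-8 is a common choice.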

Three methods are commonly used when calculating an MD5 hash:

  • encode(): A string method that turns text into bytes for the hash function to use.
  • digest(): A hash object method that returns the hash value as bytes.
  • hexdigest(): A hash object method that returns the hash value as a hexadecimal string.
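How the three methods fit together can be sketched as follows. The sample text and variable names are assumptions made for the illustration.

```python
import hashlib

message = "sample text"          # hypothetical input string
data = message.encode("utf-8")   # str -> bytes
hash_obj = hashlib.md5(data)     # hash object for those bytes

raw = hash_obj.digest()          # hash value as 16 raw bytes
hex_text = hash_obj.hexdigest()  # same value as 32 hexadecimal characters

print(len(raw), len(hex_text))
# hexdigest() is the hexadecimal form of digest()
print(raw.hex() == hex_text)
```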

Example 1: Calculating MD5 Hash of String Objects

The following code calculates the MD5 hash of a byte string:

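import hashlib
# The b prefix makes this a bytes literal, which md5() requires
str1 = b'Welcome to Python Guide!'
res = hashlib.md5(str1)
# digest() returns the 16-byte hash value
print(res.digest())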

In the above code:

  • The “hashlib” module is imported, and a specified byte string literal is initialized.
  • After that, the “hashlib.md5()” function of the “hashlib” module is used to create a hash object. Here, the md5() function implements the MD5 hash algorithm.
  • Lastly, the “digest()” method returns the hash value as a byte string.

Output

The MD5 hash of the input byte string is printed as a 16-byte bytes object.

We can also display the same hash value in hexadecimal using the “hexdigest()” method:

import hashlib
str1=b'Welcome to Python Guide!'
res = hashlib.md5(str1)
print(res.hexdigest())

The output is the hexadecimal representation of the MD5 hash value, a string of 32 hexadecimal characters.

Example 2: Calculating MD5 Hash of Files

We can also determine the MD5 hashes of files using the “hashlib” module. For larger files, the contents should be processed in chunks for memory efficiency.

The following example hashes the contents of a small file:

import hashlib
# Open the file in binary mode and hash its contents, not its name
with open('newfile.txt', 'rb') as f:
    res = hashlib.md5(f.read())
print(res.hexdigest())

In this code, the file is opened in read-binary mode, so “f.read()” returns its contents as bytes. These bytes are passed to the “hashlib.md5()” function to create a hash object. Lastly, the “hexdigest()” method is used to get the hash output in hexadecimal representation. Note that hashing 'newfile.txt'.encode('UTF-8') instead would hash the file name, not the file contents.

Output

The MD5 hash value of a file has been represented in hexadecimal.

We can also hash very large files, for example files larger than 10 GB such as video games or log files. To create an MD5 hash without loading the whole file into memory, we read the file in smaller chunks of bytes. A suitable chunk size depends on factors such as the size of the file and the computer’s available memory. We process each chunk one at a time and update the hash object as we go. If there are 100 chunks, the hash object is updated 100 times:

import hashlib
md5 = hashlib.md5()
with open(r"Video.mp4", "rb") as f:
    while chunk := f.read(4096):
        md5.update(chunk)
print(md5.hexdigest())

In the above code, we first create a hash object using the “hashlib.md5()” function. Then, the code opens the file in read-binary mode and reads its contents 4096 bytes at a time. The loop ends when “f.read()” returns an empty bytes object at the end of the file (the “:=” operator requires Python 3.8 or later). Each chunk is passed to the “update()” method of the hash object to update the hash value.

Once the entire file has been read, the “hexdigest()” method of the hash object is called to get the hash value in hexadecimal format.
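The chunked approach can also be wrapped in a reusable helper. The sketch below is one possible way to do it: the function name “md5_of_file”, its “chunk_size” parameter, and the temporary test file are assumptions for illustration, not part of “hashlib”. It also checks that hashing in chunks gives the same result as hashing all the bytes at once.

```python
import hashlib
import os
import tempfile

def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 of a file, reading chunk_size bytes at a time.

    md5_of_file and chunk_size are illustrative names, not part of hashlib.
    """
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            md5.update(chunk)
    return md5.hexdigest()

# Build a small temporary file so the example runs anywhere
payload = b"example data " * 1000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name

chunked = md5_of_file(tmp_path, chunk_size=1024)
one_shot = hashlib.md5(payload).hexdigest()
os.remove(tmp_path)

# Chunked updates give the same hash as a single call on all the bytes
print(chunked == one_shot)
```

A larger chunk size means fewer read calls at the cost of a little more memory per read; the resulting hash is the same either way.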

Conclusion

In Python, the “hashlib.md5()” function of the “hashlib” module creates an MD5 hash object from bytes, such as an encoded string or the contents of a file. It produces a 128-bit hash value, usually shown as 32 hexadecimal characters. This function is used along with the “encode()”, “update()”, “digest()”, and “hexdigest()” methods to calculate the MD5 hash of strings or files. This blog presented a tutorial on Python “hashlib” MD5 using several examples.

About the author

Haroon Javed

Hi, I'm Haroon. I am an electronics engineer and a technical content writer. I am a tech geek who loves to help people to the best of my knowledge.