Python Regex Examples

Regex is short for regular expression: a string pattern that is used to match, search for, or replace text within a string. Most programming languages support regular expressions, and Python provides them through its built-in “re” module. This module offers functions for different string operations. Regex patterns are built from metacharacters and special sequences. This tutorial shows the purposes of some commonly used metacharacters, special sequences, and “re” functions in Python scripts.

Some commonly used metacharacters in regex:

Characters Purpose
‘+’ It is used to match one or more occurrences of the preceding character or group.
‘*’ It is used to match zero or more occurrences of the preceding character or group.
‘?’ It is used to match zero or one occurrence of the preceding character or group.
‘^’ It is used to match the pattern that follows it at the beginning of the string.
‘$’ It is used to match the pattern that precedes it at the end of the string.
‘|’ It is used to match any one of several alternative patterns. It works like the OR logic.
‘[]’ It is used to match any single character from a set or range of characters, such as “[a-z]”.
‘{}’ It is used to match a specific number of repetitions of the preceding character or group, such as “{2}” or “{2,4}”.
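
The metacharacters listed above can be tried out with a short script. This is only a minimal sketch: the sample strings and variable names are hypothetical and chosen for illustration, and findall() is used simply because it returns every match as a list.

```python
import re

# Each sample string below is a made-up illustration.
plus_result = re.findall(r"ab+", "a ab abb")              # 'b' one or more times
star_result = re.findall(r"ab*", "a ab abb")              # 'b' zero or more times
optional_result = re.findall(r"colou?r", "color colour")  # 'u' zero or one time
start_result = re.findall(r"^Hello", "Hello, Hello")      # only at the start of the string
end_result = re.findall(r"world$", "world, world")        # only at the end of the string
either_result = re.findall(r"cat|dog", "cat, dog, bird")  # either alternative
set_result = re.findall(r"[aeiou]", "regex")              # any one character from the set
count_result = re.findall(r"[0-9]{3}", "12 345 6789")     # exactly three digits in a row

print(plus_result)      # ['ab', 'abb']
print(star_result)      # ['a', 'ab', 'abb']
print(optional_result)  # ['color', 'colour']
print(start_result)     # ['Hello']
print(end_result)       # ['world']
print(either_result)    # ['cat', 'dog']
print(set_result)       # ['e', 'e']
print(count_result)     # ['345', '678']
```

The patterns are written as raw strings (the r prefix) so that Python does not process backslashes before the “re” module sees them.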

 

Some commonly used special sequences in regex:

Sequences Purpose
‘\A’ It is used to match the pattern that follows it only at the start of the string. It works like the “^” character when the re.MULTILINE flag is not set.
‘\b’, ‘\B’ The “\b” is used to match the position at the beginning or end of a word (a word boundary). The “\B” works opposite to “\b” and matches a position that is not a word boundary.
‘\d’, ‘\D’ The “\d” is used to match any decimal digit in the string, similar to “[0-9]”. The “\D” works opposite to “\d” and matches any character that is not a digit.
‘\s’, ‘\S’ The “\s” is used to match any whitespace character in the string, similar to “[ \t\n\r\f\v]”. The “\S” works opposite to “\s”.
‘\w’, ‘\W’ The “\w” is used to match the alphabetic characters, numeric characters, and the underscore in the string, similar to “[a-zA-Z0-9_]”. The “\W” works opposite to “\w”.
‘\Z’ It is used to match the pattern that precedes it only at the end of the string. It works like the “$” character, except that “$” also matches just before a newline at the end of the string.
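
The special sequences above can be sketched in the same way. The sample text and the variable names here are hypothetical and exist only to make each sequence visible.

```python
import re

# Made-up sample text containing digits, a tab, and an underscore
text = "Order 66 shipped\tto catfish_7 cat"

digits = re.findall(r"\d+", text)          # runs of decimal digits
words = re.findall(r"\w+", text)           # runs of letters, digits, and underscores
space_count = len(re.findall(r"\s", text)) # spaces and the tab
whole_word = re.findall(r"\bcat\b", text)  # 'cat' as a separate word only
inside_word = re.findall(r"\Bfish", text)  # 'fish' that does not start a word

print(digits)       # ['66', '7']
print(words)        # ['Order', '66', 'shipped', 'to', 'catfish_7', 'cat']
print(space_count)  # 5
print(whole_word)   # ['cat']
print(inside_word)  # ['fish']

# '$' also matches just before a trailing newline, but '\Z' does not
dollar_match = re.search(r"cat$", "cat\n")
z_match = re.search(r"cat\Z", "cat\n")
print(dollar_match is not None)  # True
print(z_match is not None)       # False
```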

Example 1: Match the String Using the Match() Function

The match() function is used to match a regex pattern at the beginning of the string. The syntax of this function is given as follows:

Syntax:

re.match(pattern, string, flags=0)

 

Here, the first argument defines the regex pattern. The second argument defines the main string. The third argument is optional and accepts flags, such as re.IGNORECASE, that change how the pattern is matched.
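
The effect of the optional flags argument can be sketched as follows. The sample string is an assumption made for illustration; re.IGNORECASE is a standard flag of the “re” module.

```python
import re

text = "First in first out."

# Without flags, matching is case-sensitive, so 'first' does not match 'First'
case_sensitive = re.match(r"first", text)
# With re.IGNORECASE, the case of the letters is ignored
case_insensitive = re.match(r"first", text, flags=re.IGNORECASE)
# match() only checks the beginning of the string, so 'in' is not found
not_at_start = re.match(r"in", text)

print(case_sensitive)            # None
print(case_insensitive.group())  # First
print(not_at_start)              # None
```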

Create a Python file with the following script that matches a regex pattern against a defined string using the match() function. First, a fixed regex pattern is matched. Next, a search word is taken from the user and is used as the regex pattern to match against the string value. If a match is found, the matching text is printed. Otherwise, the “No matching value found.” string is printed.

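#Import necessary module
import re

#Define the function to print the matching result
def matchString(mat, strValue):
    #The match() function returns None if the pattern does not match
    if mat is not None:
        print("'" + mat.group() + "' is found in '" + strValue + "'")
    else:
        print("No matching value found.")

#Define the string value
strValue = "First in first out."
#Match the pattern at the beginning of the string
mat = re.match('^First', strValue)
#Call function to print the match result
matchString(mat, strValue)

#Take the search string
inValue = input("Enter the search value: ")
#Use the input as the pattern; the match is case-sensitive by default
mat = re.match(inValue, strValue)
#Call function to print the match result
matchString(mat, strValue)

The following output appears for the “first” input value:

'First' is found in 'First in first out.'
Enter the search value: first
No matching value found.

The second match fails because the match() function is case-sensitive by default and the string begins with “First”, not “first”.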

Example 2: Find the String Using the Findall() Function

The findall() function returns all non-overlapping matches of the pattern that are found in the main string as a list.

Syntax:

re.findall(pattern, string, flags=0)

 

Here, the first argument is used to define the regex pattern. The second argument is used to define the main string. The third argument is optional and is used to define different types of flags.
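
The shape of the value returned by findall() depends on whether the pattern contains capturing groups. The following is a minimal sketch of that behavior; the sample string and variable names are hypothetical.

```python
import re

text = "alice=30, bob=25"

# No capturing group: a list of the whole matches
whole = re.findall(r"\w+=\d+", text)
# One capturing group: a list of the text matched by that group
names = re.findall(r"(\w+)=\d+", text)
# Two capturing groups: a list of tuples, one tuple per match
pairs = re.findall(r"(\w+)=(\d+)", text)
# No match at all: an empty list
nothing = re.findall(r"z+", text)

print(whole)    # ['alice=30', 'bob=25']
print(names)    # ['alice', 'bob']
print(pairs)    # [('alice', '30'), ('bob', '25')]
print(nothing)  # []
```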

Create a Python file with the following script that takes a main string value and a search string value from the user. Next, the search word is used in the regex pattern to find the search word in the main string. The total number of matches is printed in the output.

#Import necessary module
import re

#Take a string value
inValue = input("Enter a string: ")

#Take a search word
srcValue = input("Enter a search word: ")

#Search the word in the string; \w* also matches any word characters that follow it
srcResult = re.findall(srcValue + r"\w*", inValue)
#Print the search result
print("The word '" + srcValue + "' is found in the string "
      + str(len(srcResult)) + " times.")

 

For example, if the main string is “We eat to live and don’t live to eat” and the search word is “eat”, the script reports that the word is found 2 times.
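
When a search word comes from the user, any metacharacters in it are interpreted as part of the pattern. If the word should be matched literally, one option is the re.escape() function. The following is a sketch under that assumption; the sample string and the search word are hypothetical stand-ins for user input.

```python
import re

main_string = "Version 1.5 differs from version 105."
search_word = "1.5"  # hypothetical user input containing the '.' metacharacter

# Unescaped: '.' matches any character, so '105' is matched as well
unescaped = re.findall(search_word, main_string)
# Escaped: '.' is treated as a literal dot
escaped = re.findall(re.escape(search_word), main_string)

print(unescaped)  # ['1.5', '105']
print(escaped)    # ['1.5']
```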

Example 3: Search the String Using the Search() Function

The search() function scans the whole string and returns the first location where the pattern matches. It takes the same arguments as the match() and findall() functions. Create a Python file with the following script that searches for the word “Python” in a string value that is taken from the user. If the search word exists in the input value, a success message is printed. Otherwise, a failure message is printed.

#Import re module
import re

#Take a string value
inValue = input("Enter a string: ")
#Search the particular word in the string value
srcResult = re.search(r'Python\w*', inValue)

#Check the search word is found or not
if srcResult:
    print("'" + srcResult.group() + "' is found in '" + inValue + "'")
else:
    print("The search string is not found.")

 

Output:

The following output appears if the input string is “I like Python programming”:

'Python' is found in 'I like Python programming'

The following output appears if the input string is “I like PHP programming”:

The search string is not found.
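
The difference between search() and match() can be sketched with the same input string. The variable names are chosen for illustration only.

```python
import re

text = "I like Python programming"

# match() checks only the beginning of the string
at_start = re.match(r"Python", text)
# search() scans the whole string for the first match
anywhere = re.search(r"Python", text)

print(at_start)          # None
print(anywhere.group())  # Python
print(anywhere.span())   # (7, 13)
```

The span() method of the match object returns the start and end positions of the match in the main string.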

Example 4: Replace the String Using the Sub() Function

The sub() function searches for the parts of a string that match the pattern and replaces them with another string. The syntax of this function is given as follows:

Syntax:

re.sub(pattern, replace_string, main_string)

 

The first argument of this function contains the pattern that is used to search the particular string in the main string.

The second argument of this function contains the “replace” string value.

The third argument of this function contains the main string.

This function returns a new string in which every match of the pattern is replaced. If no match exists in the main string, the main string is returned unchanged.
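
Beyond the three arguments described above, sub() also accepts an optional count argument, and the replace string can refer to captured groups with backreferences such as \1. The following is a minimal sketch with a hypothetical date string.

```python
import re

text = "2023-01-15 and 2024-12-31"

# Replace every four-digit run
all_years = re.sub(r"\d{4}", "YYYY", text)
# Replace only the first match by limiting the count
first_only = re.sub(r"\d{4}", "YYYY", text, count=1)
# Reorder the captured groups with backreferences
reordered = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", text)
# No match: the main string is returned unchanged
unchanged = re.sub(r"[a-z]{10}", "X", text)

print(all_years)   # YYYY-01-15 and YYYY-12-31
print(first_only)  # YYYY-01-15 and 2024-12-31
print(reordered)   # 15/01/2023 and 31/12/2024
print(unchanged)   # 2023-01-15 and 2024-12-31
```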

Create a Python file with the following script that searches for two digits at the end of the string. If the string ends with two digits, the digits are replaced by the “$50” string.

#Import re module
import re

#Define the main string
strValue = "The book price is 70"

#Define the search pattern; '$' anchors the two digits to the end of the string
pattern = r'[0-9]{2}$'

#Define the replace value
replaceValue = '$50'

#Search and replace the string based on the pattern
modified_strValue = re.sub(pattern, replaceValue, strValue)
#Print the original and modified string values
print("Original string: " + strValue)
print("Modified string: " + modified_strValue)

 

Output:

Original string: The book price is 70
Modified string: The book price is $50

The main string ends with “70”, so “70” is replaced by “$50” in the modified string.
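
Whether the pattern is anchored to the end of the string matters when the string contains more than one pair of digits. The following sketch compares an unanchored pattern with one that ends in “$”; the sample string is hypothetical.

```python
import re

text = "Room 12 costs 70"

# Unanchored: every pair of digits is replaced
anywhere = re.sub(r"[0-9]{2}", "$50", text)
# Anchored with '$': only the pair of digits at the end is replaced
at_end = re.sub(r"[0-9]{2}$", "$50", text)

print(anywhere)  # Room $50 costs $50
print(at_end)    # Room 12 costs $50
```

In the replace string, the “$” character has no special meaning, so “$50” is inserted literally.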

Example 5: Replace the String Using the Subn() Function

The subn() function works like the sub() function, except that it returns a tuple in which the first element is the modified string and the second element is the number of substitutions that were made.

Create a Python file with the following script that uses the subn() function to replace the uppercase letters A to L in the “LinuxHint.com” string with the “*” character:

#Import re module
import re

#Define the main string
strValue = "LinuxHint.com"

#Define the search pattern
pattern = '[A-L]'

#Define the replace value
replaceValue = '*'

#Search and replace the string based on the pattern
modified_strValue = re.subn(pattern, replaceValue, strValue)
#Print the original string and the output of the subn()
print("Original string:\n" + strValue)
print("Output of subn() function: ")
print(modified_strValue)

 

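Output:

Original string:
LinuxHint.com
Output of subn() function: 
('*inux*int.com', 2)

According to the output, the “L” and “H” characters are replaced by the “*” character, and the second element of the tuple shows that two substitutions were made.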
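
Because subn() returns a tuple, the two values can be unpacked into separate variables. The following is a small sketch of that usage; the variable names are chosen for illustration.

```python
import re

# Unpack the modified string and the number of substitutions
new_text, total = re.subn(r"[A-L]", "*", "LinuxHint.com")
print(new_text)  # *inux*int.com
print(total)     # 2

# When nothing matches, the string is unchanged and the count is 0
same_text, zero = re.subn(r"[0-9]", "#", "LinuxHint.com")
print(same_text)  # LinuxHint.com
print(zero)       # 0
```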

Example 6: Split the String Using the Split() Function

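The split() function divides a string at each match of the regex pattern and returns the parts as a list. Create a Python file with the following script that uses the split() function to divide the main string into multiple parts based on the regex pattern: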

#Import re module
import re

#Define string value
strVal = "Rupa Akter;Nira Chowdhury;Mazharul Islam"
#Define the pattern that will be used to split the data
pattern = '[^A-Za-z ]'
#Store the split values in a list
split_result = re.split(pattern, strVal)
print("Output of the split() function:")
print(split_result)

 

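Output:

Output of the split() function:
['Rupa Akter', 'Nira Chowdhury', 'Mazharul Islam']

The “[^A-Za-z ]” pattern matches any character that is not a letter or a space. According to the output, the main string is divided at each “;” character into three parts.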
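
The split() function also accepts an optional maxsplit argument, and a capturing group in the pattern keeps the separators in the result. The following sketch shows both, using the same names string; the simpler “;” pattern is an assumption made to keep the example short.

```python
import re

names = "Rupa Akter;Nira Chowdhury;Mazharul Islam"

# Split at the first separator only
first_split = re.split(r";", names, maxsplit=1)
# A capturing group keeps the separators in the list
with_separators = re.split(r"(;)", names)
# A character set can split on more than one separator
mixed = re.split(r"[;,]\s*", "a; b,c")

print(first_split)      # ['Rupa Akter', 'Nira Chowdhury;Mazharul Islam']
print(with_separators)  # ['Rupa Akter', ';', 'Nira Chowdhury', ';', 'Mazharul Islam']
print(mixed)            # ['a', 'b', 'c']
```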

Conclusion

This tutorial showed the purposes of the most commonly used metacharacters, special sequences, and functions of the Python “re” module for searching, replacing, and splitting strings using simple Python scripts.

About the author

Fahmida Yesmin

I am a trainer of web programming courses. I like to write articles and tutorials on various IT topics. I have a YouTube channel where many types of tutorials based on Ubuntu, Windows, Word, Excel, WordPress, Magento, Laravel, etc. are published: Tutorials4u Help.