
Python Regex Match

A regular expression (RE, or regex) is a special sequence of characters that describes a search pattern in Python. It is well suited to extracting data from many kinds of text. The first thing to understand about regular expressions is that everything is treated as a character: we write a pattern that matches a particular sequence of characters, commonly known as a string. Characters include letters, digits, punctuation marks, and all special characters. In this article, we will examine how to do a regex match in Python.

Example no 1:

The search() method of a compiled regex object scans the string it is given and returns a match object for the first place where the pattern matches, or None if there is no match. The group() method of the match object returns the text that was actually matched.

import re

MobNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')

num = MobNumRegex.search('Number is 031-837-5061.')

print('Mob number: ' + num.group())

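To start the program, we import the ‘re’ module, which provides the regex functions. We use a regex to find a sample mobile number. We call re.compile() and pass it the pattern that describes the format of the mobile number: each \d matches one digit, so the pattern matches three digits, a hyphen, three digits, a hyphen, and four digits. The ‘r’ before the opening quote makes the pattern a raw string, so Python does not treat the backslashes as escape characters.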

In the next step, we create a new variable ‘num’ to store the result. Here, we call MobNumRegex.search() and pass it the string that contains the mobile number. In the end, we call print() to display the output. The expression num.group() returns the entire matched number, which is concatenated to the label with +, so the program prints Mob number: 031-837-5061.
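Because search() returns None when the pattern is not found, calling group() directly on the result would raise an AttributeError for text that contains no number. The following is a minimal sketch of guarding against that case. The helper name find_mobile_number is hypothetical and is used only for illustration.

```python
import re

# Same pattern as in Example no 1: three digits, three digits, four digits.
mob_num_regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')


def find_mobile_number(text):
    """Return the first matching number in text, or None if there is none.

    find_mobile_number is a hypothetical helper, not part of the re module.
    """
    match = mob_num_regex.search(text)
    if match is None:
        # No match: there is no match object to call group() on.
        return None
    return match.group()


print(find_mobile_number('Number is 031-837-5061.'))  # 031-837-5061
print(find_mobile_number('No number here.'))          # None
```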

Example no 2:

In this example, we match the items using groups. Suppose we want to separate one section of the mobile number from the rest. In a regex, adding parentheses creates groups. We can then pass a group number to the group() method of the match object to get the text matched by just that one group.

import re

MobNumRegex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')

num = MobNumRegex.search('Number is 031-837-5061.')

print(num.group(2))


At the beginning of the code, we again import the ‘re’ module so that we can use the regex functions. We use the same sample mobile number as before. We call re.compile() and pass it the pattern for the mobile number, this time with two pairs of parentheses: the first group holds the first three digits and the second group holds the remainder.

The pattern is again prefixed with the letter ‘r’ to make it a raw string. We call MobNumRegex.search() with the string that contains the mobile number and store the resulting match object in a new variable called ‘num’. To show the result, we call print() at the end. To access the second portion of the mobile number, we pass the argument 2, as in num.group(2), which prints 837-5061.
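The other groups can be read in the same way. As a short sketch using the same pattern and input, group(0) (or group() with no argument) returns the whole match, group(1) returns the first group, and groups() returns all groups at once as a tuple. The variable names below are chosen for illustration only.

```python
import re

mob_num_regex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
match = mob_num_regex.search('Number is 031-837-5061.')

whole = match.group(0)   # entire match, same as match.group()
first = match.group(1)   # text matched by the first pair of parentheses
second = match.group(2)  # text matched by the second pair of parentheses

# groups() returns every group as a tuple, which can be unpacked directly.
area_code, main_number = match.groups()

print(whole)      # 031-837-5061
print(first)      # 031
print(second)     # 837-5061
print(area_code, main_number)
```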

Example no 3:

In pattern matching, parentheses have a special meaning, but sometimes we need to match a literal parenthesis in the text. For instance, the area code of the mobile number we are trying to match could be written in parentheses. In this case, a backslash is required to escape each parenthesis. In the raw string passed to the compile() function, the escaped characters \( and \) match actual parenthesis characters.

import re

MobNumRegex = re.compile(r'(\(\d\d\d\)) (\d\d\d-\d\d\d\d)')

num = MobNumRegex.search('My phone number is (015) 932-0394.')

print(num.group(1))

After importing the ‘re’ module, we first define the format of the number as a raw string, marked by the letter ‘r’. We divide the pattern into two groups using parentheses: the first group contains the area code together with its escaped, literal parentheses, and the second contains the rest of the number. Then we pass a string containing a number in this format to MobNumRegex.search(). The last line of the code contains the print() statement, in which we indicate the part of the number that we want to display with num.group(1). Here, 1 means that we want to retrieve the first group, so the program prints (015).
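To see why the backslashes matter, the sketch below compares the escaped pattern with a version where the parentheses are left unescaped. Without the backslashes, the parentheses only create a group and match no character, so the pattern no longer fits text where the area code is wrapped in literal parentheses. The variable names are chosen for illustration only.

```python
import re

text = 'My phone number is (015) 932-0394.'

# Escaped: \( and \) match literal parenthesis characters in the text.
escaped = re.compile(r'(\(\d\d\d\)) (\d\d\d-\d\d\d\d)')

# Unescaped: the parentheses only group; the pattern expects
# three digits followed directly by a space.
unescaped = re.compile(r'(\d\d\d) (\d\d\d-\d\d\d\d)')

escaped_match = escaped.search(text)
unescaped_match = unescaped.search(text)

print(escaped_match.group(1))  # (015)
print(escaped_match.group(2))  # 932-0394
print(unescaped_match)         # None, because '015' is followed by ')'
```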

Example no 4:

We are going to match one of several alternatives with the help of the symbol ‘|’. The ‘|’ symbol is known as a pipe. It is used when we need to match any one of several expressions.

import re

CountryRegex = re.compile(r'Australia|Spain')

cu1 = CountryRegex.search('Australia and Spain.')

print(cu1.group())


In this example, we import the ‘re’ module and then pass the expression ‘Australia|Spain’, which uses the | symbol, to compile(). The pattern matches either ‘Australia’ or ‘Spain’. We run the search with the CountryRegex.search() method and call print() on the matched text. If both Australia and Spain appear in the searched string, the match object holds whichever one occurs first in the text, so this program prints Australia.
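The match that search() returns depends on the position in the text, not on the order of the alternatives in the pattern. The sketch below illustrates this and also uses findall(), another method of compiled patterns in the ‘re’ module, to collect every occurrence instead of only the first. The variable names are chosen for illustration only.

```python
import re

country_regex = re.compile(r'Australia|Spain')

# search() stops at the first occurrence found in the text.
first = country_regex.search('Australia and Spain.').group()

# Swapping the order in the text changes which alternative is found first.
swapped = country_regex.search('Spain and Australia.').group()

# findall() returns all non-overlapping matches as a list of strings.
all_matches = country_regex.findall('Australia and Spain.')

print(first)        # Australia
print(swapped)      # Spain
print(all_matches)  # ['Australia', 'Spain']
```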

Example no 5:

Curly braces are used to match a specific number of repetitions. If we have a group that we would like to repeat a certain number of times, we put that number in curly braces after the group in the regex. We may specify a range rather than a single number by writing the minimum, a comma, and the maximum within the curly braces. To leave the minimum or maximum unbounded, we omit the first or second number from the curly braces.

import re

ITRegex = re.compile(r'(IT){6}')

au1 = ITRegex.search('ITITITITITIT')

print(au1.group())


Here, we match the repetitions by using curly braces. We pass the pattern (IT){6} to the compile() function. The value 6 means that the group IT must appear six times in a row. The regex (IT){6} matches the string ‘ITITITITITIT’, whereas it would not match ‘ITITITITIT’, because the (IT) group is repeated only five times in the latter. The print() statement prints the whole matched text, ITITITITITIT.
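The range forms of the curly braces can be sketched in the same way. In the example below, {2,4} asks for between two and four repetitions and {2,} asks for at least two. Both are greedy, so they match as many repetitions as are available within their limits. The pattern and variable names are chosen for illustration only.

```python
import re

exactly_six = re.compile(r'(IT){6}')
two_to_four = re.compile(r'(IT){2,4}')
at_least_two = re.compile(r'(IT){2,}')

six_times = 'ITITITITITIT'  # 'IT' repeated six times
five_times = 'ITITITITIT'   # 'IT' repeated five times

# Only five repetitions are available, so {6} cannot match.
no_match = exactly_six.search(five_times)

# Greedy matching stops at the upper bound of four repetitions.
capped = two_to_four.search(six_times).group()

# With no upper bound, all six repetitions are matched.
unbounded = at_least_two.search(six_times).group()

print(no_match)   # None
print(capped)     # ITITITIT
print(unbounded)  # ITITITITITIT
```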

Conclusion

In this article, we have discussed how to match regex patterns in Python and how to get a specific portion of the matched content. We ran several programs that demonstrate different matching techniques. We also saw how to use groups, how to match alternatives with the | pipe symbol, and how to match repetitions with curly braces. Regex is short for regular expression, a pattern that describes a sequence of text. The re module includes all the regex functionality. Regular expressions enable users to search text for a particular sequence of characters.

About the author

Kalsoom Bibi

Hello, I am a freelance writer and usually write about Linux and other technology-related topics.