A regular expression is an object that describes a pattern to match against a string.
The syntax for a regular expression literal is simple: /pattern/flags
The pattern is the sequence of characters that you want to find in another string.
Flags are optional and change how the match is performed. For example, the flag “g” stands for “global,” and there are several others.
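As a minimal sketch of the syntax described above, the snippet below builds the same expression in two ways. The pattern "great" and the variable names are illustrative assumptions, and the RegExp constructor form is shown only for comparison; the rest of this article uses the literal form.

```javascript
// Literal form: the pattern sits between two slashes, flags follow the closing slash.
const literal = /great/g;

// Constructor form: pattern and flags are passed as strings.
const constructed = new RegExp("great", "g");

console.log(literal.source, literal.flags);         // great g
console.log(constructed.source, constructed.flags); // great g
console.log(literal instanceof RegExp);             // true
```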
Regular expressions are a very broad topic. We will cover the basics that are most useful in everyday programming through a step-by-step explanation and some real-life examples.
We will first suppose a string in which the words “great” and “linuxhint” each appear twice. The purpose of this odd string will become obvious in a moment.
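A string along these lines fits that description. The exact wording is an assumption chosen for illustration, as is the variable name str. Passing a plain substring to search() returns the position where that substring starts.

```javascript
// Hypothetical sample string: "great" and "linuxhint" each appear twice,
// and the first "Linuxhint" is capitalized.
const str = "Linuxhint is great, and the linuxhint tutorials work great.";

// search() returns the index of the first match, or -1 if there is none.
const index = str.search("work");
console.log(index); // 48
```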
Calling the search() method with the plain substring “work” returns the index at which that substring begins. Now, we will move on and try doing the same thing with the regex syntax.
Step 1: Search and Replace a Substring
You can search for a matching string using a regular expression by simply placing the substring between the two slashes in the expression.
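A minimal sketch of that search, again using an assumed sample string:

```javascript
// Hypothetical sample string, chosen for illustration.
const str = "Linuxhint is great, and the linuxhint tutorials work great.";

// The substring goes between the two slashes, with no quotes around it.
const index = str.search(/work/);
console.log(index); // 48
```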
As you can see, it has also given us the same output.
Alright! Now, we will see what we can do with the regular expression. Let us try to replace the word “great” with, say, “awesome” using the replace() method.
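The replacement could look like the sketch below, where the sample string is an assumption made for illustration:

```javascript
// Hypothetical sample string, chosen for illustration.
const str = "Linuxhint is great, and the linuxhint tutorials work great.";

// Without any flag, replace() changes only the first match.
const result = str.replace(/great/, "awesome");
console.log(result);
// Linuxhint is awesome, and the linuxhint tutorials work great.
```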
Here, you can see the problem: the first occurrence of “great” has been replaced, but the second one has not.
In the first step, you simply learned how to search for a string using a regular expression. Now, we will move towards the next step and learn about the concept of flags.
Step 2: Flags
If you want to replace all the occurrences of “great,” you can use the regular expression with the ‘g’ flag, which is short for global.
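With the same assumed sample string, adding the flag after the closing slash might look like this:

```javascript
// Hypothetical sample string, chosen for illustration.
const str = "Linuxhint is great, and the linuxhint tutorials work great.";

// The "g" flag makes replace() act on every match, not only the first.
const result = str.replace(/great/g, "awesome");
console.log(result);
// Linuxhint is awesome, and the linuxhint tutorials work awesome.
```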
Perfect, all the occurrences of “great” are now changed. But, you may face a problem if you try to change all the occurrences of “linuxhint” to, say, “our website” using the same technique.
We will try to do that first, then we will see how we can resolve this issue.
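A sketch of that attempt, assuming a sample string whose first “Linuxhint” is capitalized:

```javascript
// Hypothetical sample string, chosen for illustration.
const str = "Linuxhint is great, and the linuxhint tutorials work great.";

// Only the global flag is set, so matching is still case-sensitive.
const result = str.replace(/linuxhint/g, "our website");
console.log(result);
// Linuxhint is great, and the our website tutorials work great.
```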
Although we have provided the global flag with the regular expression, the first occurrence does not change. This is because of case-sensitivity. So, we will also need to provide the case-insensitivity flag ‘i,’ in this case. You can do this simply by adding the ‘i’ flag along with the ‘g’ flag.
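Combining both flags might look like the sketch below (the sample string is again an assumption):

```javascript
// Hypothetical sample string, chosen for illustration.
const str = "Linuxhint is great, and the linuxhint tutorials work great.";

// "g" replaces every match; "i" ignores letter case while matching.
const result = str.replace(/linuxhint/gi, "our website");
console.log(result);
// our website is great, and the our website tutorials work great.
```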
Great. As you can see, all occurrences of the term “linuxhint” have been changed to the term “our website,” regardless of case.
Regular expressions also work with the split() method. Splitting the string on the term “linuxhint” returns an array of the substrings found around each match. But, if you want to include the separators, as well, in the array of substrings, you will have to play with the patterns.
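The split described here can be sketched as follows. The sample string and the choice of the ‘i’ flag are assumptions made for illustration.

```javascript
// Hypothetical sample string, chosen for illustration.
const str = "Linuxhint is great, and the linuxhint tutorials work great.";

// Every match of the pattern acts as a separator and is dropped from the result.
const parts = str.split(/linuxhint/i);
console.log(parts);
// [ '', ' is great, and the ', ' tutorials work great.' ]
// The first element is empty because the string starts with a separator.
```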
So, in this step, we have learned about the flags and how they help us. There are more flags available. For example, “m” is for multiline matching, “s” is for dotAll mode (the dot also matches newline characters), etc. Now, we will move on to the concept of patterns and learn how to use them.
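As a brief sketch of those two extra flags, the snippet below tests an assumed two-line string with and without each flag:

```javascript
// Hypothetical two-line string, chosen for illustration.
const text = "first line\nsecond line";

// Without "m", ^ matches only at the very start of the string.
console.log(/^second/.test(text));  // false
// With "m", ^ also matches at the start of each line.
console.log(/^second/m.test(text)); // true

// Without "s", the dot does not match a newline character.
console.log(/line.second/.test(text));  // false
// With "s", the dot matches a newline as well.
console.log(/line.second/s.test(text)); // true
```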
Step 3: Patterns
In this step, you will learn how to utilize the patterns and related options.
To include the separators in the array of substrings, simply add parentheses around the pattern. The parentheses form a capturing group, and split() adds whatever the group matched to the resulting array.
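A minimal sketch, using an assumed sample string:

```javascript
// Hypothetical sample string, chosen for illustration.
const str = "Linuxhint is great, and the linuxhint tutorials work great.";

// The parentheses capture each separator, so split() keeps it in the array.
const parts = str.split(/(linuxhint)/i);
console.log(parts);
// [ '', 'Linuxhint', ' is great, and the ', 'linuxhint', ' tutorials work great.' ]
```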
Perfect! As you can see, the separators are also included in the array of substrings.
To split on either of two separators, you can give multiple substrings in a regular expression using the OR “|” operator.
All right! The operator worked, and the string was split as expected.
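For instance, splitting an assumed sample string on either “linuxhint” or “great” could look like this:

```javascript
// Hypothetical sample string, chosen for illustration.
const str = "Linuxhint is great, and the linuxhint tutorials work great.";

// The | operator matches either alternative, so both words act as separators.
const parts = str.split(/linuxhint|great/i);
console.log(parts);
// [ '', ' is ', ', and the ', ' tutorials work ', '.' ]
```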
Now, suppose you want to split on either a space “ ” or a dot “.”. The dot is a special character in regular expressions, so to match it literally, add a backslash “\” before it. The same applies to any other special character.
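A short sketch with an assumed string that contains both separators:

```javascript
// Hypothetical string, chosen for illustration.
const str = "linuxhint.com is great";

// The space needs no escaping; the dot is escaped so it matches a literal ".".
const parts = str.split(/ |\./);
console.log(parts);
// [ 'linuxhint', 'com', 'is', 'great' ]
```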
Okay, so far, so good. As another example, say you want to change the dots into commas in a string:
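The sketch below uses an assumed string and also shows what happens when the backslash is left out:

```javascript
// Hypothetical string, chosen for illustration.
const str = "one.two.three";

// Escaped dot: only literal "." characters are replaced.
const withEscape = str.replace(/\./g, ",");
console.log(withEscape); // one,two,three

// Unescaped dot: matches any character, so everything is replaced.
const withoutEscape = "a.b".replace(/./g, ",");
console.log(withoutEscape); // ,,,
```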
Backslashes are also used for another purpose. To match any word character (a letter, digit, or underscore), any digit, or any whitespace character, you can use \w, \d, and \s, respectively. For example, to replace spaces with dashes, the following expression is used:
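A minimal sketch, with the input strings assumed for illustration:

```javascript
// \s matches any whitespace character, such as a space or a tab.
const dashed = "Linuxhint is great".replace(/\s/g, "-");
console.log(dashed); // Linuxhint-is-great

// \d matches any digit.
const masked = "Call 555 now".replace(/\d/g, "#");
console.log(masked); // Call ### now
```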
Awesome! You can really see the potential in regular expressions, now.
Square Brackets [ ]
If you want to replace several different characters in a string, you can list all of them inside a single pair of square brackets, and each one will be replaced by the given substring. For example, if you want to replace three letters in a string and you do not want to put a lot of OR “|” operators in the regular expression, you can use square bracket syntax, in which you can give multiple letters, like this:
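In the sketch below, the string and the three letters are assumptions chosen for illustration:

```javascript
// Hypothetical string, chosen for illustration.
const str = "hello world";

// [elo] matches any single character that is "e", "l", or "o".
const withBrackets = str.replace(/[elo]/g, "-");
console.log(withBrackets); // h---- w-r-d

// The same result written with OR operators.
const withOr = str.replace(/e|l|o/g, "-");
console.log(withOr); // h---- w-r-d
```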
You can even give a range of letters, like this:
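A small sketch of a letter range, using an assumed string:

```javascript
// Hypothetical string, chosen for illustration.
const str = "hello world";

// [a-e] matches any single lowercase letter from "a" through "e".
const result = str.replace(/[a-e]/g, "-");
console.log(result); // h-llo worl-
```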
Or, a range of numbers:
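A digit range works the same way. The string below is an assumption for illustration:

```javascript
// Hypothetical string, chosen for illustration.
const str = "Order 66 shipped in 2 days";

// [0-9] matches any single digit from 0 through 9.
const result = str.replace(/[0-9]/g, "#");
console.log(result); // Order ## shipped in # days
```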
And, if you want to match everything except the characters provided in the square brackets, you can put the caret character “^” right after the opening bracket, like this:
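As a sketch, the snippet below strips everything that is not a digit from an assumed phone-number string:

```javascript
// Hypothetical input, chosen for illustration.
const str = "Phone: 555-1234";

// [^0-9] matches any single character that is NOT a digit.
const digitsOnly = str.replace(/[^0-9]/g, "");
console.log(digitsOnly); // 5551234
```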
This comes in handy when getting data from users and testing and validating that data, especially in email, phone, or date validation.