In this article, we will explore the definition and syntax of regular expressions in C and how to use them in your code.
What is a Regular Expression in C
A regular expression is a sequence of characters that defines a search pattern, used to find or match substrings within a larger string. The C language itself has no built-in support for regular expressions; on POSIX systems, that support comes from the functions declared in the regex.h header. Regular expressions support operators such as “*” (zero or more occurrences of the preceding item) and “?” (zero or one occurrence, available when the pattern is compiled as an extended regular expression), and are mostly used in string manipulation.
How to Work with Regular Expressions in C
Because regular expressions in C are provided by the POSIX library, patterns follow POSIX syntax. Some constructs that appear in many C regular expression programs are:
- []: A bracket expression matches any single character listed inside the square brackets, for example [abc] or the range [0-9].
- [:digit:]: This character class matches any digit. POSIX does not define a [:number:] class, and a class name is only valid inside a bracket expression, so it is written as [[:digit:]].
- [:lower:]: This character class matches any lowercase letter and is written as [[:lower:]].
- [:word:]: This is not one of the standard POSIX character classes, so it may not be accepted by every implementation. A portable way to match a word made of letters, digits, or underscores is [[:alnum:]_]+ in an extended regular expression.
The following functions, declared in regex.h, are used to compile and run regular expressions in C:
- regcomp(): This function compiles a regular expression. It takes three parameters: a pointer to a regex_t structure in which the compiled pattern is stored, a string containing the pattern, and a flag value that controls the compilation, such as REG_EXTENDED or REG_ICASE. When the compilation succeeds, it returns 0; otherwise, it returns a nonzero error code.
- regexec(): This function matches a string against a compiled pattern. It takes five arguments: a pointer to the pattern compiled by regcomp(), the string to search, the number of entries in the match array, an array of regmatch_t structures that receives the offsets of the matches, and a flag value that modifies the matching behavior, such as REG_NOTBOL. If a match is found, regexec() returns 0; if nothing matches, it returns REG_NOMATCH.
- regfree(): This function frees the memory that regcomp() allocated for a compiled pattern. After the call, the regex_t structure no longer holds a compiled regular expression and must be compiled again before reuse.
- regerror(): This function converts an error code returned by regcomp() or regexec() into a readable error message. It writes the message into a buffer supplied by the caller, and the stored string always ends with a null character as long as the buffer size is not zero.
Example
The following example illustrates how C regular expressions work by checking whether a specific word occurs in a string.
#include <stdio.h>
#include <regex.h>

int main(void) {
    regex_t regex;
    int reti;
    char string[] = "Hello this is Linux Hint website";
    char pattern[] = "Hello";

    /* Compile the pattern; a flag value of 0 selects basic syntax. */
    reti = regcomp(&regex, pattern, 0);
    if (reti) {
        printf("Could not compile regex\n");
        return 1;
    }

    /* Search the string; no match offsets are requested here. */
    reti = regexec(&regex, string, 0, NULL, 0);
    if (!reti) {
        printf("Match found: %s\n", pattern);
    } else if (reti == REG_NOMATCH) {
        printf("No match found\n");
    } else {
        printf("Error: Could not execute regex\n");
    }

    /* Release the memory allocated by regcomp(). */
    regfree(&regex);
    return 0;
}
The above code shows a simple example of using regular expressions in C programming. Using the regcomp() and regexec() functions declared in the regex.h header, it searches for “Hello” in the string “Hello this is Linux Hint website”. If a match is found, it prints “Match found: Hello” to the console; otherwise, it prints “No match found”. Finally, it calls regfree() to free the memory allocated by regcomp().
Conclusion
Regular expressions are a valuable tool for searching and manipulating text in C programming. With the POSIX regex functions, C programmers can define complex search patterns and match them against strings efficiently. This article covered the definition and syntax of regular expressions in C, along with some commonly used expressions and functions. By using regular expressions, C programmers can streamline their string manipulation tasks and increase the efficiency of their code.