String: A string is an array of characters terminated by a null character ('\0'). An example of a string is:
"Bob is studying in Stanford University"
Delimiter: Any character or set of characters can serve as a delimiter. For a string to be split, the delimiter must occur in it; otherwise the whole string is returned as a single token.
Commonly used delimiters include " " (space), "," (comma) and "\n" (newline).
Splitting the String Based on Delimiter:
Consider the example string "Fox lives in woods" with " " (space) as the delimiter. Splitting on the delimiter yields four strings: "Fox", "lives", "in" and "woods".
With the concepts of string, delimiter and splitting in place, let us explore how splitting is implemented in C.
Standard C Function for Split Based on Delimiter:
C provides the strtok() function, which can be used to split the string into tokens based on the selected delimiter.
Function prototype:
char *strtok(char *str, const char *delim);
Header to be included:
#include <string.h>
C Program to Split the String Based on Delimiter Using strtok():
#include <stdio.h>
#include <string.h>

int main(void)
{
    char string[] = "Bob is studying in Stanford University";
    const char *delim = " ";
    unsigned count = 0;

    /* The first call to strtok() takes the string and the delimiter as its first and second parameters. */
    char *token = strtok(string, delim);
    count++;

    /* Subsequent calls to strtok() pass NULL as the first parameter and the delimiter as the second.
     * Each call returns the next token, or NULL when no tokens remain. */
    while (token != NULL)
    {
        printf("Token no. %u : %s \n", count, token);
        token = strtok(NULL, delim);
        count++;
    }
    return 0;
}
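Output of the program:
Token no. 1 : Bob
Token no. 2 : is
Token no. 3 : studying
Token no. 4 : in
Token no. 5 : Stanford
Token no. 6 : University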
Now, let us discuss our own implementation, which splits a string based on a delimiter without using the standard C function strtok().
We must search the string for the delimiter and return the address of the first character of the token that precedes it.
A C function that searches for the token based on the delimiter can be implemented as below:
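char *search_token(char *string, char *delim)
{
    /* Start of the unprocessed remainder, kept between calls. */
    static char *remember = NULL;
    size_t string_length = 0;
    size_t i = 0;
    int search_hit = 0;

    if (delim == NULL)
        return NULL;
    if ((string == NULL) && (remember == NULL))
        return NULL;
    /* A NULL string means: continue with the previous string. */
    if (string == NULL)
        string = remember;

    /* Only the first character of delim is used as the delimiter. The
     * terminating '\0' is excluded from the search so that an empty delim
     * can never match it and move remember past the end of the buffer. */
    string_length = strlen(string);
    for (i = 0; i < string_length; i++)
    {
        if (string[i] == delim[0])
        {
            search_hit = 1;
            break;
        }
    }

    /* No delimiter left: the rest of the string is the last token. */
    if (search_hit != 1)
    {
        remember = NULL;
        return string;
    }

    /* Terminate the token in place and remember where the next one starts. */
    string[i] = '\0';
    remember = string + i + 1;
    return string;
}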
The function above searches for the delimiter. When the delimiter is found, it is overwritten with '\0', so the characters before it form the token, which is returned to the caller directly from the source string buffer.
The complete C program with our implementation looks like this:
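#include <stdio.h>
#include <string.h>

char *search_token(char *string, char *delim)
{
    /* Start of the unprocessed remainder, kept between calls. */
    static char *remember = NULL;
    size_t string_length = 0;
    size_t i = 0;
    int search_hit = 0;

    if (delim == NULL)
        return NULL;
    if ((string == NULL) && (remember == NULL))
        return NULL;
    /* A NULL string means: continue with the previous string. */
    if (string == NULL)
        string = remember;

    /* Only the first character of delim is used as the delimiter. The
     * terminating '\0' is excluded from the search so that an empty delim
     * can never match it and move remember past the end of the buffer. */
    string_length = strlen(string);
    for (i = 0; i < string_length; i++)
    {
        if (string[i] == delim[0])
        {
            search_hit = 1;
            break;
        }
    }

    /* No delimiter left: the rest of the string is the last token. */
    if (search_hit != 1)
    {
        remember = NULL;
        return string;
    }

    /* Terminate the token in place and remember where the next one starts. */
    string[i] = '\0';
    remember = string + i + 1;
    return string;
}
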
int main(void)
{
    char string[] = "Bob is studying in Stanford University";
    char *delim = " ";
    unsigned count = 0;
    char *token;

    printf("Full String = %s \n", string);

    /* The first call to search_token() takes the string and the delimiter as its first and second parameters. */
    token = search_token(string, delim);
    count++;

    /* Subsequent calls to search_token() pass NULL as the first parameter and the delimiter as the second.
     * Each call returns the next token, or NULL when no tokens remain. */
    while (token != NULL)
    {
        printf("Token no. %u : %s \n", count, token);
        token = search_token(NULL, delim);
        count++;
    }
    return 0;
}
Output of the above program, using the same input as the strtok() example:
Full String = Bob is studying in Stanford University
Token no. 1 : Bob
Token no. 2 : is
Token no. 3 : studying
Token no. 4 : in
Token no. 5 : Stanford
Token no. 6 : University
Conclusion:
So far, we have discussed splitting a string based on a delimiter. The C library already provides a function for this, strtok(). We took an example use case and wrote an example program to understand how the library function is used.
In the second part, we implemented our own method of splitting a string based on a delimiter. We wrote a function similar to the C function strtok(), explained how it works, and demonstrated it with the same main function that was used for the library function. Example output of the program is also provided.
To summarize the concept: any character can be chosen as the delimiter. The string is searched until the delimiter is encountered, and the part of the string before it is returned to the caller as a token.