Pandas str Replace

Values in a pandas DataFrame or Series can be replaced with the replace() method, which is useful for data manipulation and data cleaning. This tutorial shows how to replace string values in a dataframe using the pandas str.replace() function. The Series.str.replace() function can match either a literal string or a regular expression (regex), and it works on a Series, such as a single column of a dataframe. It behaves much like Python’s built-in replace() method for strings, but it is applied to every element of the Series. It is reached through the .str accessor, which distinguishes it from the replace() method of a pandas Series.

How do we Substitute the String Data/Values in Pandas?

The most common way to swap an old value for a new one is the replace() method that pandas provides for dataframes and series; by default, it matches whole values. However, we will mainly use the str.replace() method in this post. The str.replace() method substitutes each occurrence of a string or regex pattern inside a string value with a replacement string. Take a look at the str.replace() function’s syntax.
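
The difference between the two methods can be sketched with a small example. The series below is hypothetical sample data and is not taken from the examples in this post.

```python
import pandas as pd

# Hypothetical sample data, used only to contrast the two methods.
s = pd.Series(["cat", "catalog"])

# Series.replace() matches whole values by default.
whole = s.replace("cat", "dog")

# Series.str.replace() substitutes inside each string value.
partial = s.str.replace("cat", "dog", regex=False)

print(whole.tolist())    # ['dog', 'catalog']
print(partial.tolist())  # ['dog', 'dogalog']
```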

Syntax

Series.str.replace(pat, repl, n, case, flags, regex)

Where,

pat: str or compiled regex.

The character sequence or regular expression to search for.

repl: str or callable.

The replacement string or a callable. A callable receives the regex match object and must return the replacement string.

n: int, -1 by default.

The number of replacements to make in each string, counted from the start. The default of -1 replaces all occurrences.

case: bool, None by default.

Determines whether the replacement is case-sensitive:

    • If True, matching is case-sensitive.
    • If False, matching is case-insensitive.
    • It cannot be set if pat is a compiled regex.

flags: int, 0 (no flags) by default.

Flags from the re module, such as re.IGNORECASE. It cannot be set if pat is a compiled regex.

regex: bool, False by default since pandas 2.0 (True in earlier versions).

Determines whether the passed-in pattern is treated as a regular expression:

    • If True, the passed pattern is considered to be a regular expression.
    • If False, the pattern is treated as a literal string. It cannot be False if pat is a compiled regex or repl is a callable.
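
The effect of these parameters can be sketched as follows. The series is hypothetical sample data, and regex is passed explicitly in every call so that the result does not depend on the default of the installed pandas version.

```python
import pandas as pd

# Hypothetical sample data, used only to illustrate the parameters.
s = pd.Series(["a.b.c", "A.B.C"])

# regex=False: "." is treated as a literal dot.
literal = s.str.replace(".", "-", regex=False)

# regex=True: "." is a regex that matches any character.
as_regex = s.str.replace(".", "-", regex=True)

# n=1: only the first match in each string is replaced.
first_only = s.str.replace(".", "-", n=1, regex=False)

# case=False: "a" also matches "A".
no_case = s.str.replace("a", "x", case=False, regex=True)

# Callable repl: receives the match object, returns the replacement string.
upper = s.str.replace(r"[a-c]", lambda m: m.group(0).upper(), regex=True)

print(literal.tolist())     # ['a-b-c', 'A-B-C']
print(as_regex.tolist())    # ['-----', '-----']
print(first_only.tolist())  # ['a-b.c', 'A-B.C']
print(no_case.tolist())     # ['x.b.c', 'x.B.C']
print(upper.tolist())       # ['A.B.C', 'A.B.C']
```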

Example # 1: Substituting the String Values With String Data Using the str.replace()

First, we import pandas as pd and create a dataframe with the pd.DataFrame() function.


The new dataframe is created with three columns: “name”, “age”, and “group”. The names (“Hanna”, “Gilly”, “Rob”, “Jamie”, “Arya”, “Hopper”, “Lance”) are stored in the “name” column as strings. The column “age” stores the ages of the individuals (19, 17, 22, 21, 19, 18, 23), and the values (“A”, “A”, “A”, “B”, “B”, “A”, “B”) are stored in the “group” column of the dataframe. Now we can replace specific values by using the str.replace() function.


We have simply selected the column “group” and then called the str.replace() function to replace the specified values. We used the value “A” as the first parameter (the value to be replaced). The second parameter is specified as “Alpha”, which replaces the value “A” in the “group” column. As a result, every “A” in the last column, “group”, is replaced by “Alpha”, while the “B” values stay unchanged.
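
A minimal sketch of this example, built from the column values listed above. The variable name df and the explicit regex=False argument are assumptions.

```python
import pandas as pd

# Dataframe with the values described in this example.
df = pd.DataFrame({
    "name": ["Hanna", "Gilly", "Rob", "Jamie", "Arya", "Hopper", "Lance"],
    "age": [19, 17, 22, 21, 19, 18, 23],
    "group": ["A", "A", "A", "B", "B", "A", "B"],
})

# Replace "A" with "Alpha" in the "group" column (literal match).
df["group"] = df["group"].str.replace("A", "Alpha", regex=False)

print(df)
```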

Example # 2: Replace a Particular Character in a Single Column of a Dataframe

In the prior example, we have seen how to replace a string value with another string value using the str.replace() function. We will replace a specific character from a string value in this example.


With the help of the pd.DataFrame() method, we have created a dataframe with three columns. The column “problem” consists of the values (1, 2, 3, 4, 5, 6, 7, 8), whereas the columns “division” and “result” consist of the values (“6/2”, “4/2”, “15/5”, “9/3”, “12/2”, “8/4”, “12/4”, “21/3”) and (3, 2, 3, 3, 6, 2, 3, 7) respectively. Let’s replace the “/” character in the “division” column, first with another special character.


We have replaced the character “/” with the division sign “÷”. In this example, we have replaced only a part of each string value rather than the entire string value. Now let’s replace the “÷” sign with a string value.


The character “÷” is now replaced by the string “divided by” in the column “division”.
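
A minimal sketch of both replacements, using the column values listed above. The variable names and the spaces around “divided by” are assumptions added for readability.

```python
import pandas as pd

# Dataframe with the values described in this example.
df = pd.DataFrame({
    "problem": [1, 2, 3, 4, 5, 6, 7, 8],
    "division": ["6/2", "4/2", "15/5", "9/3", "12/2", "8/4", "12/4", "21/3"],
    "result": [3, 2, 3, 3, 6, 2, 3, 7],
})

# Step 1: replace the "/" character with the division sign.
df["division"] = df["division"].str.replace("/", "÷", regex=False)
with_sign = df["division"].tolist()

# Step 2: replace the division sign with a string.
# The surrounding spaces are an assumption, added for readability.
df["division"] = df["division"].str.replace("÷", " divided by ", regex=False)
with_words = df["division"].tolist()

print(with_sign[:2])   # ['6÷2', '4÷2']
print(with_words[:2])  # ['6 divided by 2', '4 divided by 2']
```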

Example # 3: Replace a Substring or Characters Sequence With the String Values

Now we will replace a substring or a sequence of characters using the str.replace() method.


A new dataframe has been created with the columns “team”, “score”, and “win”. In the column “team”, the values (“star_1”, “warrior_1”, “tiger_1”, “thunder_1”, “stunner_1”, “war_1”, “killer_1”) are stored as strings. The numerical values (10, 11, 9, 8, 8, 10, 7) and (2, 0, 1, 3, 0, 0, 2) are stored in the columns “score” and “win” respectively. We will now replace a sequence of characters with another string.


In the column “team”, we have replaced the substring “r_1” with the string “r champions” using the str.replace() function.
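
A minimal sketch of this example, using the column values listed above. The variable name df and the explicit regex=False argument are assumptions.

```python
import pandas as pd

# Dataframe with the values described in this example.
df = pd.DataFrame({
    "team": ["star_1", "warrior_1", "tiger_1", "thunder_1",
             "stunner_1", "war_1", "killer_1"],
    "score": [10, 11, 9, 8, 8, 10, 7],
    "win": [2, 0, 1, 3, 0, 0, 2],
})

# Replace the substring "r_1" with "r champions" in every team name.
df["team"] = df["team"].str.replace("r_1", "r champions", regex=False)

print(df["team"].tolist()[:2])  # ['star champions', 'warrior champions']
```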

Example # 4: Replace Multiple Strings From the Dataframe Column Using replace() Function

To replace string values, we can also use replace(), a built-in method of pandas dataframes and series. So far, we have specified a single string or substring to replace. We will now use the replace() function to replace multiple string values in the given dataframe.


The dataframe has been created with the values (“Harry”, “Tony”, “Bruce”, “Peter”, “Robert”, “Nancy”, “Adam”), (“py”, “py”, “js”, “AI”, “Js”, “AI”, “js”) and (20, 19, 19, 17, 16, 18, 14) in the dataframe columns “student”, “subject”, and “marks” respectively. Now let’s substitute the multiple string values using the replace() method.


Inside the replace() function, we put two lists ([“py”, “js”] and [“Python”, “Javascript”]). The string items in the first list are the initial values already present in the given dataframe, while the string items in the second list replace those initial values. Matching is case-sensitive, so the value “Js” is left unchanged. We can also replace numeric values by using the replace() function. This time we will use a dictionary to replace a value in the column “marks” with a string value.
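
A minimal sketch of the list-based replacement, using the column values listed above. Applying replace() to the “subject” column only is an assumption; the variable name df is also illustrative.

```python
import pandas as pd

# Dataframe with the values described in this example.
df = pd.DataFrame({
    "student": ["Harry", "Tony", "Bruce", "Peter", "Robert", "Nancy", "Adam"],
    "subject": ["py", "py", "js", "AI", "Js", "AI", "js"],
    "marks": [20, 19, 19, 17, 16, 18, 14],
})

# Each item in the first list is replaced by the item at the same
# position in the second list. Only whole values are matched.
df["subject"] = df["subject"].replace(["py", "js"], ["Python", "Javascript"])

print(df["subject"].tolist())
# ['Python', 'Python', 'Javascript', 'AI', 'Js', 'AI', 'Javascript']
```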


In the column “marks”, we have replaced the numeric value 19 with the string “nineteen”. This shows that numeric values can also be replaced with string values by using the replace() function.
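
A minimal sketch of the dictionary-based replacement, using the column values listed above. The variable names are assumptions. The dictionary keys are the values to find, and the dictionary values are their replacements.

```python
import pandas as pd

# Dataframe with the values described in this example.
df = pd.DataFrame({
    "student": ["Harry", "Tony", "Bruce", "Peter", "Robert", "Nancy", "Adam"],
    "subject": ["py", "py", "js", "AI", "Js", "AI", "js"],
    "marks": [20, 19, 19, 17, 16, 18, 14],
})

# Replace the number 19 with the string "nineteen".
# The column then holds a mix of numbers and strings.
marks = df["marks"].replace({19: "nineteen"})

print(marks.tolist())  # [20, 'nineteen', 'nineteen', 17, 16, 18, 14]
```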

Example # 5: Replace Strings From the Series Using Regex

The str.replace() function treats the pattern as a regex when regex=True is passed, which was the default before pandas 2.0. So, we can specify a regex to replace values in a pandas series.


We have created a series with the values “Moris”, “Alex”, “Max”, “James”, “John”, and “Bran”. Now let’s specify a regex to replace the values.


After importing the “re” module, we have used a compiled regex to replace the character “a” with the string “o”. Because the flags parameter of str.replace() cannot be set when the pattern is a compiled regex, re.IGNORECASE is passed to re.compile() so that the character is replaced regardless of its case.
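
A minimal sketch of this example, using the series values listed above. The variable names are assumptions, and regex=True is passed explicitly because recent pandas versions require it for a compiled pattern.

```python
import re

import pandas as pd

# Series with the values described in this example.
names = pd.Series(["Moris", "Alex", "Max", "James", "John", "Bran"])

# Compile the pattern with the flag; str.replace() does not accept
# flags when it is given a compiled regex.
pattern = re.compile(r"a", flags=re.IGNORECASE)

# Replace every "a" or "A" with "o".
replaced = names.str.replace(pattern, "o", regex=True)

print(replaced.tolist())
# ['Moris', 'olex', 'Mox', 'Jomes', 'John', 'Bron']
```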

Conclusion

In this tutorial, we have shown how to replace string values in pandas. We discussed the syntax of the str.replace() method to understand its functionality. We then worked through a few examples: substituting string values with string data, replacing a particular character, replacing a substring or character sequence with a string value, replacing multiple strings in a dataframe column using the replace() function, and replacing strings in a series using a regex.

About the author

Aqsa Yasin