Pandas Change the Column Type to String

A DataFrame’s column types may need to be changed after creation for a variety of reasons, such as converting a column to a numerical format that can be used for modeling and classification. This tutorial shows you how to convert the values of a column to the string datatype using Python’s Pandas package. You will see how to change Pandas floating-point and integer values into strings, what the benefits of the string datatype are, and how string handling has evolved in Pandas. We will use different functions to change the datatypes of the DataFrame columns to string.

String Datatype in Pandas

Pandas uses the object datatype by default to store strings. The object datatype handles both strings and mixed data types, so it does not state explicitly that a column holds text. Since version 1.0, Pandas has had a dedicated string datatype. When it was introduced, this datatype did not yet provide any resource or performance improvements, but the Pandas development team said that these would come in the future. As a result, this tutorial uses the string datatype wherever possible. If you are using a Pandas version earlier than 1.0, use “str” in place of “string”.
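The difference between “str” and “string” can be sketched as follows. This is a minimal illustration, assuming Pandas 1.0 or later; the Series name “fees” is a hypothetical one chosen for this sketch. The dtype reported for the “str” conversion depends on the installed Pandas version, so no expected output is shown.

```python
import pandas as pd

# Hypothetical Series used only for this comparison.
fees = pd.Series([7000.0, 6500.0, 7100.0])

as_str = fees.astype(str)          # Python's built-in str type
as_string = fees.astype("string")  # dedicated string datatype (Pandas 1.0+)

# Both hold text, but the reported dtypes may differ.
print(as_str.dtype)
print(as_string.dtype)
```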

How to Change the Pandas Column to String

Different functions can be used to change a Pandas column to the string datatype. The astype() method is the most common way to do it. Let’s look at the astype() function to see how it works.

Syntax: df.astype({"Column_name": str}, errors='raise')

df.astype(): Calls the astype() method on the DataFrame “df”.

"Column_name": str: A dictionary entry that maps the name of the column to convert to the required datatype, which here is string. Any built-in Python datatype or Pandas/NumPy datatype is acceptable.

errors='raise': Defines how exceptions are handled during conversion. With “raise”, which is the default, an invalid conversion raises an exception. With “ignore”, the exception is suppressed and the original object is returned.
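The syntax above can be tried on a small frame. This is only a sketch: the one-column DataFrame and the variable name “converted” are assumptions made for illustration and are not part of the tutorial’s data.

```python
import pandas as pd

# Hypothetical one-column frame, used only to show the call shape.
df = pd.DataFrame({"Column_name": [1.5, 2.0, 3.25]})

# Map the column name to the target type; errors="raise" is the default.
converted = df.astype({"Column_name": str}, errors="raise")

# astype() returns a new object, so the original frame keeps its dtype.
print(converted["Column_name"].tolist())
print(df["Column_name"].dtype)
```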

We have seen the syntax of the astype() method. In the following examples, you will learn how to use astype(), along with other functions and attributes, to convert the DataFrame columns to strings.

Example 1: Using the Astype() Method

A Pandas object can be converted to a specific datatype using the astype() method. For instance, any appropriate existing column can be converted to a categorical type with it. The method is very useful when we need to convert the datatype of a specific column to another datatype. In this example, we change a column’s datatype to string using the astype() function. For that, we first have to create a DataFrame, so we start by importing the Pandas library to use its features and functionalities.


We created our DataFrame by passing a dictionary to the pd.DataFrame() function as an argument. The keys of the dictionary become the column labels, and the values of the dictionary become the values of the DataFrame’s columns. To display the DataFrame, we use the print() function.
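A sketch of that step is shown here. The column names and values are the ones the tutorial lists for this DataFrame, and the variable name “df” is the one used throughout.

```python
import pandas as pd

# Dictionary keys become column labels; the lists become column values.
df = pd.DataFrame({
    "Student": ["Jack", "Tony", "Marty", "Alex", "Rob"],
    "Age": [16, 15, 18, 17, 17],
    "fee": [7000.0, 6500.0, 7100.0, 7000.0, 6900.0],
    "Subject": ["English", "Statistics", "Maths", "English", "Science"],
})

print(df)
```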


The previous DataFrame has four columns. The first column, “Student”, contains the names of the students: “Jack”, “Tony”, “Marty”, “Alex”, “Rob”. The second column, “Age”, stores the age of each student: 16, 15, 18, 17, 17. The column “fee” stores the fee of each course: 7000.0, 6500.0, 7100.0, 7000.0, 6900.0. The column “Subject” consists of the names of the subjects: “English”, “Statistics”, “Maths”, “English”, “Science”. We can check the datatype of each column by using the dtypes attribute.
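The datatype check might look like the following sketch, which rebuilds the same DataFrame so that it runs on its own. The dtype printed for the text columns can vary with the Pandas version, so no expected output is shown.

```python
import pandas as pd

df = pd.DataFrame({
    "Student": ["Jack", "Tony", "Marty", "Alex", "Rob"],
    "Age": [16, 15, 18, 17, 17],
    "fee": [7000.0, 6500.0, 7100.0, 7000.0, 6900.0],
    "Subject": ["English", "Statistics", "Maths", "English", "Science"],
})

# dtypes is an attribute, not a method, so no parentheses are needed.
print(df.dtypes)
```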


The datatype of the columns “Student” and “Subject” is “object”, whereas the datatypes of the columns “Age” and “fee” are int64 and float64, respectively. Now, let’s change the datatype of the column “fee” from float to string.
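One way to write this conversion is sketched below, assuming Pandas 1.0 or later so that the “string” datatype is available. The result is assigned back to the column because astype() returns a new object instead of modifying the column in place.

```python
import pandas as pd

df = pd.DataFrame({
    "Student": ["Jack", "Tony", "Marty", "Alex", "Rob"],
    "Age": [16, 15, 18, 17, 17],
    "fee": [7000.0, 6500.0, 7100.0, 7000.0, 6900.0],
    "Subject": ["English", "Statistics", "Maths", "English", "Science"],
})

# Convert only the "fee" column and store the result back in the frame.
df["fee"] = df["fee"].astype("string")

print(df)
```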


As we applied the astype() method and passed the datatype “string” to change the datatype of the column “fee”, let’s see whether the column is converted to a string or not.
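The verification can be sketched as follows. The block repeats the conversion so that it is self-contained, and it uses pd.api.types.is_string_dtype() as an additional check that is not part of the original walkthrough.

```python
import pandas as pd

df = pd.DataFrame({
    "Student": ["Jack", "Tony", "Marty", "Alex", "Rob"],
    "Age": [16, 15, 18, 17, 17],
    "fee": [7000.0, 6500.0, 7100.0, 7000.0, 6900.0],
    "Subject": ["English", "Statistics", "Maths", "English", "Science"],
})
df["fee"] = df["fee"].astype("string")

# Inspect the datatypes of all columns after the conversion.
print(df.dtypes)

# Extra check (an addition for this sketch): is "fee" a string column?
fee_is_string = pd.api.types.is_string_dtype(df["fee"])
print(fee_is_string)
```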


Notice that the datatype of the column “fee” has been converted from float64 to string.

Example 2: Using the Map() and Apply() Method

The map() method substitutes each value in a Series with another value. That value may be derived from a Series, a dict, or a function. Similarly, users can pass a function to the Pandas apply() function to apply it to every value in a Series. Both functions can therefore be used to change the datatype of columns. Let’s use the same DataFrame that we created in the previous example.


Now, we check the datatypes of the columns using the dtypes attribute.


Now, let’s change the datatype of the column “Age” using the map() function and the datatype of the column “fee” using the apply() function.
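A minimal sketch of both conversions follows. Passing Python’s built-in str as the function is an assumption about how the conversion was written; it is the most direct way to turn each value into text.

```python
import pandas as pd

df = pd.DataFrame({
    "Student": ["Jack", "Tony", "Marty", "Alex", "Rob"],
    "Age": [16, 15, 18, 17, 17],
    "fee": [7000.0, 6500.0, 7100.0, 7000.0, 6900.0],
    "Subject": ["English", "Statistics", "Maths", "English", "Science"],
})

# map() and apply() both call str() on every value of the Series.
df["Age"] = df["Age"].map(str)
df["fee"] = df["fee"].apply(str)

print(df)
```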


We applied the map() function to the column “Age” and the apply() function to the column “fee”. Let’s use the dtypes attribute on our “df” DataFrame to see the results.
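The result can be inspected with a sketch like the one below. On the Pandas versions this tutorial targets, both converted columns are expected to report the object datatype; newer releases may infer a different string-like datatype, so the printed output is not shown here.

```python
import pandas as pd

df = pd.DataFrame({
    "Student": ["Jack", "Tony", "Marty", "Alex", "Rob"],
    "Age": [16, 15, 18, 17, 17],
    "fee": [7000.0, 6500.0, 7100.0, 7000.0, 6900.0],
    "Subject": ["English", "Statistics", "Maths", "English", "Science"],
})
df["Age"] = df["Age"].map(str)
df["fee"] = df["fee"].apply(str)

# Column-level datatypes after map() and apply().
print(df.dtypes)

# The individual values are Python strings regardless of the column dtype.
age_values_are_str = all(isinstance(v, str) for v in df["Age"])
print(age_values_are_str)
```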


Here, we can see that map() and apply() do not produce the “string” datatype. Both columns “Age” and “fee” are converted to the object datatype instead. Pandas uses the object datatype by default to store strings, but we want the result in the “string” datatype. Because of this, we wouldn’t advise using these methods in newer versions of Pandas to change the datatypes of the columns.

Example 3: Using the Astype() Method on All Columns

Last but not least, we use the astype() method to change the datatype of the whole DataFrame to string. In the previous examples, we saw how to change the datatypes of specific columns of the DataFrame to “string”. In this example, we change the datatypes of all columns to “string”. Again, we use the “df” DataFrame.


By using the dtypes attribute, let’s first check the datatypes of our “df” DataFrame columns.


None of the columns in the previous DataFrame has the string datatype. Now, we use the astype() method to change the datatype of all columns of the DataFrame to “string”.
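Converting every column at once might be written as in this sketch, again assuming Pandas 1.0 or later. Calling astype() on the DataFrame itself, without a dictionary, applies the datatype to all columns.

```python
import pandas as pd

df = pd.DataFrame({
    "Student": ["Jack", "Tony", "Marty", "Alex", "Rob"],
    "Age": [16, 15, 18, 17, 17],
    "fee": [7000.0, 6500.0, 7100.0, 7000.0, 6900.0],
    "Subject": ["English", "Statistics", "Maths", "English", "Science"],
})

# A single datatype passed to DataFrame.astype() applies to every column.
df = df.astype("string")

print(df)
```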


As can be seen, by simply calling the astype() method on the DataFrame and passing “string” to it, you can easily change all columns of the DataFrame to string.
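To confirm the change, the datatypes can be checked as in the following sketch. The name “all_string” is a hypothetical helper variable introduced only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Student": ["Jack", "Tony", "Marty", "Alex", "Rob"],
    "Age": [16, 15, 18, 17, 17],
    "fee": [7000.0, 6500.0, 7100.0, 7000.0, 6900.0],
    "Subject": ["English", "Statistics", "Maths", "English", "Science"],
}).astype("string")

print(df.dtypes)

# True only if every column uses the dedicated string datatype.
all_string = all(isinstance(t, pd.StringDtype) for t in df.dtypes)
print(all_string)
```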


The datatype of every column is changed to string.

We can also use the applymap() function or the .values.astype() to convert the datatypes but they will return the “object” datatype instead of “string”.
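Both alternatives are sketched below. Since Pandas 2.1, DataFrame.map() is the newer name for applymap(), so the sketch picks whichever one the installed version provides. The names “elementwise”, “via_map”, and “via_values” are hypothetical. The printed datatypes depend on the Pandas version, so no expected output is shown.

```python
import pandas as pd

df = pd.DataFrame({
    "Student": ["Jack", "Tony", "Marty", "Alex", "Rob"],
    "Age": [16, 15, 18, 17, 17],
    "fee": [7000.0, 6500.0, 7100.0, 7000.0, 6900.0],
    "Subject": ["English", "Statistics", "Maths", "English", "Science"],
})

# Use DataFrame.map() where it exists, otherwise fall back to applymap().
elementwise = getattr(df, "map", None) or df.applymap
via_map = elementwise(str)

# .values gives a NumPy array; astype(str) converts every cell to text.
via_values = pd.DataFrame(
    df.values.astype(str), columns=df.columns, index=df.index
)

print(via_map.dtypes)
print(via_values.dtypes)
```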

Conclusion

In this tutorial, we discussed what the string datatype in Pandas is and how you can change a Pandas column to string. We learned the syntax of the astype() function to understand how it works. We implemented different examples to show how the astype(), map(), and apply() methods can be used to change Pandas columns to the string datatype. After going through this tutorial, you should be able to change columns to string by yourself.

About the author

Aqsa Yasin

I am a self-motivated information technology professional with a passion for writing. I am a technical writer and love to write for all Linux flavors and Windows.