Pandas Convert Column to Int

Pandas is a Python package for data processing and analysis. A data type (dtype) tells a language or library how to store and manipulate a value. When a pandas DataFrame is built from external data, numeric columns are often read in with the object dtype rather than int or float, which makes numerical computation on them difficult.

This article shows how to convert a pandas column to int and discusses two methods for doing so.

Example 1: Convert Pandas Column to Int Utilizing the Pandas astype() Function

The first approach to converting a pandas DataFrame column to int uses the astype() method from the pandas library. We can pass any Python, NumPy, or pandas data type to change the type of every column in a DataFrame, or pass a dictionary with column names as keys and data types as values to change only selected columns.

The astype() method lets us specify the data type we require. It is flexible: it attempts the conversion from the current type to whichever type you request and raises an error if the values cannot be converted.

The syntax of the astype() function, which enables converting a column's type to int, is DataFrame.astype(dtype, copy, errors).

Let us understand the parameters of this function one by one.

The first parameter is “dtype”, the target data type. Passing a single NumPy dtype or Python type converts the entire pandas object to that type. Alternatively, you can convert one or more columns of a DataFrame by passing a dictionary of the form {col: dtype, …}, where col is a column name and dtype is a NumPy dtype or Python type. The second parameter is “copy”. It takes a Boolean and has traditionally defaulted to True, in which case astype() returns a copy. The last parameter is “errors”, which accepts “raise” or “ignore”; “raise” is the default, so an invalid conversion throws an exception.
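The parameters described above can be sketched as follows. This is a minimal illustration, not the article's original code: the DataFrame "df" and its column names "a", "b", and "c" are hypothetical, and the values are stored as strings on purpose.

```python
import pandas as pd

# Hypothetical sample data: every value is stored as a string.
df = pd.DataFrame({"a": ["1", "2"], "b": ["3", "4"], "c": ["x", "y"]})

# A single type converts every column of the selected object.
as_float = df[["a", "b"]].astype(float)

# A dictionary converts only the listed columns; "c" is left untouched.
partly_int = df.astype({"a": int, "b": int})
print(partly_int.dtypes)

# With the default errors="raise", an impossible conversion raises.
try:
    df.astype({"c": int})
    failed = False
except ValueError:
    failed = True
print("conversion of 'c' failed:", failed)
```

Note that astype() returns a new object here, so "df" itself still holds strings after these calls.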

Conversion of a Single Column of a Pandas Dataframe Into an Int Utilizing astype() Method

In this illustration, we will change the data type of a single DataFrame column to int. Let’s look at how it works.

First, import the pandas library and give it the alias pd. Next, create a DataFrame object named “frame” by calling pd.DataFrame(), which builds a pandas DataFrame. The call initializes three columns, “Student”, “Marks”, and “Points”, each with the same number of values. Finally, the print() function displays the DataFrame.
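A minimal sketch of this step might look like the following. The column names come from the text, but the values are assumptions chosen for illustration; the numbers are written as strings so that the columns are not numeric to begin with.

```python
import pandas as pd

# Hypothetical values; the numbers are deliberately written as strings.
frame = pd.DataFrame({
    "Student": ["Ali", "Sara", "Tom", "Nina"],
    "Marks": ["85", "92", "78", "64"],
    "Points": ["10", "12", "9", "7"],
})

print(frame)
```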

Running the script prints the DataFrame with its three columns.

Once the dataframe has been successfully created, we will then check the data types for all the columns.

The data types of a pandas DataFrame are exposed through the “dtypes” property. To use it, append “.dtypes” to the name of the DataFrame object created above; in our example, that is “frame.dtypes”. To view the result, pass “frame.dtypes” inside the parentheses of the print() function.

This prints the data type of every column of the “frame” DataFrame.

The output lists three columns, all with the data type “object”.
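The check can be sketched like this, again with hypothetical values. The helper is_numeric_dtype from pandas.api.types is not part of the original walkthrough; it is used here only to confirm programmatically that no column is numeric yet. The exact dtype label that pandas prints for string columns may vary by version.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical values, stored as strings.
frame = pd.DataFrame({
    "Student": ["Ali", "Sara", "Tom", "Nina"],
    "Marks": ["85", "92", "78", "64"],
    "Points": ["10", "12", "9", "7"],
})

# One dtype per column.
print(frame.dtypes)

# Collect the columns pandas already treats as numeric (none at this point).
numeric_columns = [col for col in frame.columns if is_numeric_dtype(frame[col])]
print(numeric_columns)
```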

We will now change a column’s data type to int using the “astype()” method. Between its parentheses, we provide the target data type. Here we convert only one column.

To use this method, write the name of the DataFrame object followed by the column’s name in square brackets “[]”. Then put the assignment operator “=”, followed by the same column selection with “.astype()” called on it, and pass the required data type inside its parentheses. In our case, the right-hand side is “frame[‘Marks’].astype(int)”, which converts the “Marks” column from “object” to “int”. Lastly, we display the updated data types of the “frame” DataFrame by passing its “dtypes” property to print().
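Put together, the single-column conversion might look like the sketch below. The sample values are assumptions; only the column names and the astype(int) call come from the text.

```python
import pandas as pd

# Hypothetical values, stored as strings.
frame = pd.DataFrame({
    "Student": ["Ali", "Sara", "Tom", "Nina"],
    "Marks": ["85", "92", "78", "64"],
    "Points": ["10", "12", "9", "7"],
})

# Convert only the "Marks" column and assign the result back to it.
frame["Marks"] = frame["Marks"].astype(int)

# "Marks" now has an integer dtype; the other columns are unchanged.
print(frame.dtypes)
```

Assigning the result back is what makes the change stick, since astype() returns a new Series instead of modifying the column in place.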

After the conversion, “Marks” is reported with an integer dtype, while the other two columns remain “object”.

Conversion of More Than One Column of a Pandas Dataframe Into an Int Utilizing astype() Method

Having converted a single DataFrame column to int, we will now convert the data type of multiple columns to int.

We reuse the DataFrame prepared in the first example, and again use the dtypes property to check its data types. In the previous example we converted one column; here we need to change the data type of more than one. The columns we have chosen are “Marks” and “Points”.

We select both columns by name on the DataFrame object and assign to them the output of calling “astype()” on that same selection, with the data type set to int. You can select three or more columns in the same way. Finally, we pass the DataFrame’s dtypes property to print() to display the new data types of the columns of “frame”.

The terminal shows the DataFrame, the original data type of each column, and then the updated data types of the “Marks” and “Points” columns.
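A sketch of the multi-column version, under the same assumptions about the sample values, could be written as:

```python
import pandas as pd

# Hypothetical values, stored as strings.
frame = pd.DataFrame({
    "Student": ["Ali", "Sara", "Tom", "Nina"],
    "Marks": ["85", "92", "78", "64"],
    "Points": ["10", "12", "9", "7"],
})

# Data types before the conversion.
print(frame.dtypes)

# Select both columns with a list of names and convert them together.
cols = ["Marks", "Points"]
frame[cols] = frame[cols].astype(int)

# Data types after the conversion.
print(frame.dtypes)
```

An equivalent alternative is to pass a dictionary to the whole DataFrame, as in frame.astype({"Marks": int, "Points": int}), and assign the result back to frame.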

Example 2: Convert Pandas Column to Int Utilizing the Pandas to_numeric() Function

The pandas to_numeric() function is another good way to convert a DataFrame column to numeric values. It works on one column (a Series) at a time and attempts to convert strings or other non-numeric items into appropriate integer or floating-point values.

Let’s see its practical implementation.

For this demonstration, we first create a dictionary “data” with three keys, “Name”, “Score”, and “Attempt”. We use the pd.DataFrame() function to convert this dict into a DataFrame and store it in the object “demo”. We then check the DataFrame’s data types by passing its dtypes property to print(). The column whose data type we want to change to int is “Score”. We call the pandas to_numeric() function, passing the column “demo[‘Score’]” inside its parentheses, and assign the result back to that column. Finally, a print statement with the dtypes property shows the updated data types of the DataFrame.
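The steps above can be sketched as follows. The dictionary keys and the names “data” and “demo” come from the text; the values are assumptions, and the final float example is an extra illustration not present in the original.

```python
import pandas as pd

# Hypothetical values; "Score" holds whole numbers written as strings.
data = {
    "Name": ["Ali", "Sara", "Tom"],
    "Score": ["88", "95", "71"],
    "Attempt": ["1", "2", "1"],
}
demo = pd.DataFrame(data)

# Data types before the conversion.
print(demo.dtypes)

# to_numeric() takes a single column and infers a numeric dtype.
demo["Score"] = pd.to_numeric(demo["Score"])

# Data types after the conversion.
print(demo.dtypes)

# Extra illustration: strings with decimals are converted to floats instead.
mixed = pd.to_numeric(pd.Series(["1.5", "2"]))
print(mixed.dtype)
```

Because to_numeric() picks the result type from the values, a column only ends up as int when every value is a whole number.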

The output shows the “Score” column with a numeric dtype in place of “object”.

Conclusion

In this article, we introduced how to convert the data type of a DataFrame column to int using two pandas functions. We worked through two practical examples for the first approach and one for the second, all run in the Spyder tool. Practicing these examples will strengthen your understanding.

About the author

John Otieno

My name is John, and I am a fellow geek like you. I am passionate about all things computers, from hardware and operating systems to programming. My dream is to share my knowledge with the world and help out fellow geeks.