
pandas Astype

The astype() method casts a pandas object to a specified dtype. It can also convert any eligible column to a categorical type. We use it when we want to convert a column from one datatype to another. By passing a Python dictionary, we can change the types of multiple columns at once: the keys are the column names and the values are the new datatypes for those columns. The astype() method has several applications.

This tool allows us to modify the datatype of:

  1. A pandas Series
  2. Each column in a pandas dataframe
  3. Several columns in a dataframe

How To Use the astype() Function in pandas

To use the astype() function, we must first understand its syntax. Both series objects and dataframes can be used with the astype method. The syntax for astype on the series and astype on the columns of the dataframe are as follows:

Syntax of astype() Function on Series

To invoke the method on a series, type the series name followed by .astype() using “dot syntax”.

Syntax:

Series.astype(dtype, copy=True, errors='raise', **kwargs)
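As a minimal sketch of this dot syntax, a call on a series might look like the following. The series name prices and its values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical series of numeric strings, used only for illustration.
prices = pd.Series(["10", "20", "30"])

# Dot syntax: <series name>.astype(<target dtype>)
prices_int = prices.astype("int64")

print(prices_int.dtype)  # int64
```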

Syntax of astype() Function on DataFrame Columns

Type the dataframe’s name, then call the astype() method on it using “dot syntax”.

Syntax:

DataFrame.astype(dtype, copy=True, errors='raise', **kwargs)

dtype: The Python type or numpy.dtype to which the entire pandas object is cast. To cast one or more of a DataFrame’s columns to column-specific types, pass a dictionary of the form {col: dtype, ...} instead, where col is a column name and dtype is a NumPy dtype or Python type.

copy: If copy=True, astype() returns a copy. Be careful with copy=False, since changes to the values may then propagate to other pandas objects.

errors: Controls whether an exception is raised when the requested datatype cannot be applied to the object. It accepts two values:

raise (if there is a problem, an exception is raised)

ignore (if there is a problem, the exception is suppressed and the original object is returned)

The default in astype() is errors='raise'.

kwargs: Additional keyword arguments that are passed on to the constructor. Note that newer pandas releases may not accept this parameter.
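The following sketch illustrates the dtype and errors parameters on a small, made-up dataframe. The frame df and its values are assumptions for illustration and are not part of the original examples.

```python
import pandas as pd

# Hypothetical frame of strings; column "b" contains a value that is not a number.
df = pd.DataFrame({"a": ["1", "2", "3"], "b": ["4", "5", "x"]})

# A {column: dtype} dictionary casts only the listed columns.
only_a = df.astype({"a": "int64"})
print(only_a.dtypes)

# With the default errors="raise", an impossible cast raises an exception.
try:
    df.astype({"b": "int64"})
except ValueError as exc:
    print("Cast failed:", exc)
```

Column "a" is converted, while the attempt on column "b" ends in a ValueError because the value "x" cannot be interpreted as an integer.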

Now that we have seen the syntax, the following examples show how to use the astype() function.

Example 1: Changing Datatype of Series of pandas

First, we will import the pandas module. Then we will use the DataFrame() function to create the dataset, passing it a dictionary whose keys are the column names and whose values are lists containing the data for each variable:
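A sketch of this step could look like the code below. Only the dataframe name s_data and the four column names come from this article; the values are placeholders, and passing dtype=object is an assumption made so that every column reports the “object” datatype described later, even if a newer pandas version would infer a dedicated string dtype.

```python
import pandas as pd

# Placeholder values: only the column names are taken from the article.
# dtype=object is an assumption so that every column is stored as "object".
s_data = pd.DataFrame(
    {
        "name": ["Ali", "Sara", "John", "Mei"],
        "country": ["Pakistan", "Canada", "USA", "China"],
        "sales": ["2500", "3100", "1800", "4200"],
        "expenses": ["1200", "1500", "900", "2000"],
    },
    dtype=object,
)
```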

Additionally, we’ll create a series that solely holds the dataframe’s expenses variable:
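Selecting one column with square brackets is one way to obtain such a series. In this sketch, the dataframe is rebuilt with placeholder values so the snippet runs on its own.

```python
import pandas as pd

# Compact stand-in for the s_data dataframe (placeholder values).
s_data = pd.DataFrame(
    {
        "name": ["Ali", "Sara", "John", "Mei"],
        "country": ["Pakistan", "Canada", "USA", "China"],
        "sales": ["2500", "3100", "1800", "4200"],
        "expenses": ["1200", "1500", "900", "2000"],
    },
    dtype=object,
)

# Selecting a single column returns a pandas Series.
e_variable = s_data["expenses"]
print(e_variable)
```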

To illustrate our dataframe, we will use the print() function.

There are four variables: name, country, sales, and expenses. As a result, the dataframe includes sample or dummy sales and expenses information for some individuals from different countries.

Let’s check the datatype of our series called e_variable using dtype.
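The check might look like this. The series is built directly from placeholder strings here (an assumption for brevity) instead of being taken from the dataframe.

```python
import pandas as pd

# Stand-in for the e_variable series (placeholder values stored as strings).
e_variable = pd.Series(
    ["1200", "1500", "900", "2000"], name="expenses", dtype=object
)

# .dtype is an attribute, so it is accessed without parentheses.
print(e_variable.dtype)  # object
```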

This indicates that our e_variable Series has the datatype “object”. We will now change the series datatype to int64 using Pandas astype.
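A sketch of the conversion, again using a placeholder series. The name converted is introduced only for illustration.

```python
import pandas as pd

# Stand-in for the e_variable series (placeholder values stored as strings).
e_variable = pd.Series(
    ["1200", "1500", "900", "2000"], name="expenses", dtype=object
)

# astype() returns a new series; e_variable itself is left untouched.
converted = e_variable.astype("int64")
print(converted)
```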

If you look at the bottom of the output, you can see that its datatype is dtype: int64. Remember that this has not modified e_variable itself. The astype() method returns a new series, which we only printed to the console, so e_variable still has the datatype “object”. To make the change permanent, we have to reassign the output to the original variable name with the following code:
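The reassignment can be sketched as follows, with placeholder values standing in for the original data.

```python
import pandas as pd

# Stand-in for the e_variable series (placeholder values stored as strings).
e_variable = pd.Series(
    ["1200", "1500", "900", "2000"], name="expenses", dtype=object
)

# Reassigning the result makes the name e_variable refer to the converted series.
e_variable = e_variable.astype("int64")
print(e_variable.dtype)  # int64
```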

Now, the datatype of our series is altered permanently.

Example 2: Changing DataFrame Column’s Datatype

We will now operate on a dataframe’s column. This differs slightly from example 1, where we worked with a pandas Series, so the syntax will differ a little. First, we will check the current datatypes of our dataframe s_data using the .dtypes attribute.
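On a dataframe, the attribute is .dtypes (plural), and it returns one dtype per column. A sketch with placeholder values:

```python
import pandas as pd

# Stand-in for the s_data dataframe (placeholder values, stored as objects).
s_data = pd.DataFrame(
    {
        "name": ["Ali", "Sara", "John", "Mei"],
        "country": ["Pakistan", "Canada", "USA", "China"],
        "sales": ["2500", "3100", "1800", "4200"],
        "expenses": ["1200", "1500", "900", "2000"],
    },
    dtype=object,
)

# One entry per column, each reporting the column's dtype.
print(s_data.dtypes)
```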

Next, we will change the datatype of the sales column to int64. To examine the data types of the output, we will additionally call the .dtypes attribute on the result.
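A sketch of casting the sales column and inspecting the resulting dtypes. The values are placeholders, and new_dtypes is a name introduced only for this illustration.

```python
import pandas as pd

# Stand-in for the s_data dataframe (placeholder values, stored as objects).
s_data = pd.DataFrame(
    {
        "name": ["Ali", "Sara", "John", "Mei"],
        "country": ["Pakistan", "Canada", "USA", "China"],
        "sales": ["2500", "3100", "1800", "4200"],
        "expenses": ["1200", "1500", "900", "2000"],
    },
    dtype=object,
)

# Key = column name, value = new dtype; .dtypes is chained onto the result.
new_dtypes = s_data.astype({"sales": "int64"}).dtypes
print(new_dtypes)
```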

You’ll see in the output that the datatype of the sales column has been changed to int64. To accomplish this, we called the astype() function with a dictionary as its argument. The column’s name is the dictionary key (on the left), and the new datatype is the value (on the right). Let’s change the datatype of another column of our dataframe.
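The same pattern applies to the country column. This sketch assumes the target is the pandas "string" dtype and uses placeholder values.

```python
import pandas as pd

# Stand-in for the s_data dataframe (placeholder values, stored as objects).
s_data = pd.DataFrame(
    {
        "name": ["Ali", "Sara", "John", "Mei"],
        "country": ["Pakistan", "Canada", "USA", "China"],
        "sales": ["2500", "3100", "1800", "4200"],
        "expenses": ["1200", "1500", "900", "2000"],
    },
    dtype=object,
)

# Cast only the country column to the pandas string dtype.
new_dtypes = s_data.astype({"country": "string"}).dtypes
print(new_dtypes)
```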

As you can see, the datatype of the country column has been changed to “string”.

Example 3: Changing Multiple Column Datatype in a Dataframe

Let’s now change the datatype of a number of our dataframe’s columns. This may be done in a manner that is quite similar to how we changed the column in example 2. We will employ a different dictionary in this example.

Once more, let’s examine the original datatypes using the .dtypes attribute before performing the operation:

Once more, note that object is the datatype of every column in the dataframe.

We will now modify the datatype of several columns. To accomplish this, we’ll create a dictionary with each variable’s name and new datatype as its “key” and “value” pairs. After calling astype(), we’ll additionally call the .dtypes attribute so we can observe the new datatypes.
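A sketch of this multi-column cast, using the three target datatypes named in this example and placeholder values for the data:

```python
import pandas as pd

# Stand-in for the s_data dataframe (placeholder values, stored as objects).
s_data = pd.DataFrame(
    {
        "name": ["Ali", "Sara", "John", "Mei"],
        "country": ["Pakistan", "Canada", "USA", "China"],
        "sales": ["2500", "3100", "1800", "4200"],
        "expenses": ["1200", "1500", "900", "2000"],
    },
    dtype=object,
)

# One dictionary entry per column that should change; "name" is left as it is.
new_dtypes = s_data.astype(
    {"country": "string", "sales": "int64", "expenses": "int32"}
).dtypes
print(new_dtypes)
```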

You’ll see that three columns’ datatypes have changed in the output. The datatype of the country column is changed to string, the datatype of sales is changed to int64, and the datatype of expenses is changed to int32.

We called the astype() method with a dictionary inside the parentheses. The dictionary contained key/value pairs with the format “column”: “datatype”. Only the column names and their new data types need to be provided in the dictionary.

Example 4: Casting Datatypes of All Columns in a DataFrame

When given a single type, the pandas astype() method tries by default to cast every dataframe column to the supplied Python type (such as int, str, or float) or numpy.dtype. If any column cannot be cast because of invalid data or NaN values, the operation fails and raises an error such as “ValueError: invalid literal”. For this example, let’s create a new dataframe from dictionaries.
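The sketch below builds such a dataframe from a list of dictionaries and shows a whole-frame cast failing. The records, the column names, and the choice of a list of dictionaries are all assumptions for illustration, since the original data is not reproduced here.

```python
import pandas as pd

# Hypothetical records; each dictionary becomes one row.
records = [
    {"id": "1", "item": "pen", "price": "2.5"},
    {"id": "2", "item": "book", "price": "12.0"},
    {"id": "3", "item": "bag", "price": "30.0"},
]
# dtype=object is an assumption so that every column is stored as "object".
df = pd.DataFrame(records, dtype=object)
print(df.dtypes)

# Casting the whole frame to int fails because "item" holds non-numeric text.
try:
    df.astype("int64")
except ValueError as exc:
    print("Cast failed:", exc)
```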

The DataFrame.dtypes attribute returns the names and dtypes of all the dataframe’s columns. Remember that each column in the dataframe shown above has the object type. Now, we will cast the data type to string.
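Passing a single dtype instead of a dictionary applies it to every column. This sketch assumes the pandas "string" dtype as the target (the original may have used str) and reuses hypothetical records.

```python
import pandas as pd

# Hypothetical records, stored as objects (an assumption for illustration).
records = [
    {"id": "1", "item": "pen", "price": "2.5"},
    {"id": "2", "item": "book", "price": "12.0"},
    {"id": "3", "item": "bag", "price": "30.0"},
]
df = pd.DataFrame(records, dtype=object)

# A single dtype argument is applied to all columns at once.
df = df.astype("string")
print(df.dtypes)
```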

As previously seen, the dtype is updated for all columns of the df dataframe.

Conclusion

In this tutorial, we showed how to use the pandas DataFrame.astype() method in Python. We changed the datatypes of a series, of single and multiple dataframe columns, and of an entire dataframe, and examined the results. With the examples in this article, you should be able to change the datatype of a series or of dataframe columns using the pandas astype() method.

About the author

Aqsa Yasin

I am a self-motivated information technology professional with a passion for writing. I am a technical writer and love to write for all Linux flavors and Windows.