Method 1: Using to_numpy()
When we call the to_numpy() method on a Pandas DataFrame, it returns a NumPy ndarray. For a DataFrame, the result is a 2-dimensional array with one row per DataFrame row and one column per DataFrame column. Let’s look at the method’s syntax before seeing how it works in the following examples.
Syntax:
DataFrame.to_numpy(dtype=None, copy=False, na_value=<no_default>)
Parameters:
- dtype: str or numpy.dtype, optional. The data type to pass to numpy.asarray().
- copy: bool, False by default. Whether to ensure that the returned array is not a view on another array. Note that copy=False does not guarantee that to_numpy() avoids copying. Rather, copy=True guarantees that a copy is made, even if it is not strictly necessary.
- na_value: Any, optional. The value to use in place of missing values. The default depends on dtype and on the dtypes of the columns in the DataFrame.
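To make the three parameters concrete, here is a minimal sketch. The DataFrame scores and its column names are made up for illustration and are not part of the examples that follow.

```python
import numpy
import pandas

# Hypothetical DataFrame with one missing value (not from the article's examples)
scores = pandas.DataFrame({"math": [90.0, None, 75.0],
                           "art": [60.0, 82.0, 71.0]})

# dtype: request a specific NumPy dtype for the result
as_float32 = scores.to_numpy(dtype="float32")

# na_value: replace missing values during the conversion
filled = scores.to_numpy(na_value=0.0)

# copy=True: the result does not share memory with the DataFrame
independent = scores.to_numpy(copy=True)
independent[0, 0] = -1.0  # changes the array only, not scores

print(as_float32.dtype)
print(filled)
print(scores.loc[0, "math"])
```

Because copy=True was used, writing to independent leaves the original DataFrame unchanged.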
Example 1:
Let’s have a DataFrame having 5 rows and 3 columns and convert it to a NumPy array using the to_numpy() method.
import pandas
# Consider the pandas DataFrame
actual=pandas.DataFrame([[1,"cooking",200],
[2,"music",3004],
[3,"hand loom",1000],
[4,"hand loom",2000],
[5,"dressing",3000]],
columns = ['id','work','wages'],
index=['person 1','person 2','person 3','person 4','person 5'])
# Display the original DataFrame
print(actual,"\n")
# Convert to Numpy array
converted=actual.to_numpy()
# Display the type of numpy array
print(type(converted),"\n")
print(converted)
Output:
Explanation:
After converting the DataFrame, we use the type() function to confirm that the result is a numpy.ndarray. All 5 rows and 3 columns are stored in the array, while the index labels ('person 1' to 'person 5') and the column names are not part of it.
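One detail worth checking is the dtype of the result. Because the columns mix integers and strings, the array has to fall back to a common dtype, which is object here. The sketch below uses a shortened, two-row version of the DataFrame to keep it brief; the variable names mixed and numeric are chosen for illustration.

```python
import pandas

# Shortened version of the article's DataFrame (two rows, for brevity)
actual = pandas.DataFrame([[1, "cooking", 200],
                           [2, "music", 3004]],
                          columns=["id", "work", "wages"],
                          index=["person 1", "person 2"])

mixed = actual.to_numpy()                     # integers and strings together
numeric = actual[["id", "wages"]].to_numpy()  # integer columns only

print(mixed.dtype)    # object, since strings and integers are mixed
print(numeric.dtype)  # an integer dtype, since both columns hold integers
print(mixed.shape)    # (2, 3): rows by columns, index labels not included
```

Selecting only the numeric columns before converting is one way to get an array that supports fast arithmetic.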
Example 2:
Convert only two columns of the DataFrame to a NumPy array using the to_numpy() method. To do this, select the columns by passing a list of column names to the DataFrame, then call to_numpy() on the result.
import pandas
# Consider the pandas DataFrame
actual=pandas.DataFrame([[1,"cooking",200],
[2,"music",3004],
[3,"hand loom",1000],
[4,"hand loom",2000],
[5,"dressing",3000]],
columns = ['id','work','wages'],
index=['person 1','person 2','person 3','person 4','person 5'])
# Convert only 'work' and 'wages' columns to numpy array
print(actual[['work','wages']].to_numpy())
Output:
Explanation:
We can see that only the two columns ['work', 'wages'] are converted to the NumPy array.
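The shape of the result depends on how the columns are selected. As a small sketch (again with a shortened DataFrame, and with the names two_d and one_d chosen for illustration), selecting with a list of names gives a 2-dimensional array, while selecting a single name gives a Series and therefore a 1-dimensional array.

```python
import pandas

# Shortened version of the article's DataFrame (two rows, for brevity)
actual = pandas.DataFrame([[1, "cooking", 200],
                           [2, "music", 3004]],
                          columns=["id", "work", "wages"])

two_d = actual[["wages"]].to_numpy()  # list of names -> DataFrame -> 2-D array
one_d = actual["wages"].to_numpy()    # single name -> Series -> 1-D array

print(two_d.shape)  # (2, 1)
print(one_d.shape)  # (2,)
```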
Method 2: Using the Values Attribute
The “values” attribute returns the data of the Pandas DataFrame as a NumPy array directly. Since it is an attribute and not a method, it is accessed without parentheses.
Syntax:
DataFrame.values
Example 1: Convert the Entire DataFrame to NumPy Array
Consider the previous DataFrame and convert it to a NumPy array using the values attribute.
import pandas
# Consider the pandas DataFrame
actual=pandas.DataFrame([[1,"cooking",200],
[2,"music",3004],
[3,"hand loom",1000],
[4,"hand loom",2000],
[5,"dressing",3000]],
columns = ['id','work','wages'],
index=['person 1','person 2','person 3','person 4','person 5'])
# Use values attribute to convert the above DataFrame to numpy array.
print(actual.values)
print(type(actual.values))
Output:
Explanation:
You can see that all the columns in the DataFrame are converted to the NumPy array.
Example 2: Convert Some Columns to NumPy Array
Convert only two columns of the DataFrame to a NumPy array using the values attribute. Here, we specify the names of the columns to be converted in a list.
import pandas
# Consider the pandas DataFrame
actual=pandas.DataFrame([[1,"cooking",200],
[2,"music",3004],
[3,"hand loom",1000],
[4,"hand loom",2000],
[5,"dressing",3000]],
columns = ['id','work','wages'],
index=['person 1','person 2','person 3','person 4','person 5'])
print(actual[['work','wages']].values)
Output:
Explanation:
We can see that only the two columns ['work', 'wages'] are converted to the NumPy array.
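For a DataFrame like this one, the values attribute and the to_numpy() method produce the same array. The following sketch checks this on a shortened version of the DataFrame; the names from_values and from_method are chosen for illustration. The Pandas documentation recommends to_numpy() over values for new code.

```python
import numpy
import pandas

# Shortened version of the article's DataFrame (two rows, for brevity)
actual = pandas.DataFrame([[1, "cooking", 200],
                           [2, "music", 3004]],
                          columns=["id", "work", "wages"],
                          index=["person 1", "person 2"])

from_values = actual.values      # attribute access, no parentheses
from_method = actual.to_numpy()  # method call

# Both results are ndarrays with identical contents
print(type(from_values) is type(from_method))
print(numpy.array_equal(from_values, from_method))
```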
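Method 3: Using to_records()
The to_records() method converts the existing DataFrame to a NumPy record array (numpy.recarray), in which each row becomes one record and each column becomes a named field. The advantage of this method is that, by default, the index of each row is also included in its record.
Syntax:
DataFrame.to_records(index=True, column_dtypes=None, index_dtypes=None)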
Example 1: Convert the Entire DataFrame to NumPy Array
Consider the previous DataFrame and convert it to a NumPy array using the to_records() method.
import pandas
# Consider the pandas DataFrame
actual=pandas.DataFrame([[1,"cooking",200],
[2,"music",3004],
[3,"hand loom",1000],
[4,"hand loom",2000],
[5,"dressing",3000]],
columns = ['id','work','wages'],
index=['person 1','person 2','person 3','person 4','person 5'])
# Use to_records() to convert the above DataFrame to numpy array.
print(actual.to_records(),"\n")
# Get the data type
print(type(actual.to_records()))
Output:
Explanation:
You can see that all the columns in the DataFrame are converted and that the returned array is a record array. Each record also contains the index of its row.
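The fields of a record array can be read by name, and the index can be left out by passing index=False. The sketch below shows both on a shortened version of the DataFrame; the names records and without_index are chosen for illustration.

```python
import pandas

# Shortened version of the article's DataFrame (two rows, for brevity)
actual = pandas.DataFrame([[1, "cooking", 200],
                           [2, "music", 3004]],
                          columns=["id", "work", "wages"],
                          index=["person 1", "person 2"])

records = actual.to_records()
print(records.dtype.names)  # field names; the unnamed index becomes 'index'
print(records["wages"])     # read one field across all records
print(records[0]["work"])   # read one field of the first record

# index=False drops the index field from each record
without_index = actual.to_records(index=False)
print(without_index.dtype.names)
```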
Example 2: Convert Some Columns to NumPy Array
Use the to_records() method to convert the first 2 columns in the DataFrame to a NumPy array.
import pandas
# Consider the pandas DataFrame
actual=pandas.DataFrame([[1,"cooking",200],
[2,"music",3004],
[3,"hand loom",1000],
[4,"hand loom",2000],
[5,"dressing",3000]],
columns = ['id','work','wages'],
index=['person 1','person 2','person 3','person 4','person 5'])
# Use to_records() to convert the first 2 columns in the DataFrame to a numpy array.
print(actual[['id','work']].to_records(),"\n")
Output:
Explanation:
The first two columns, 'id' and 'work', are converted to a NumPy record array, and each record also includes the index of its row.
Conclusion
We discussed how DataFrames in Pandas can be converted to NumPy arrays, using three approaches. In the examples of this article, we showed how to convert specific columns or the entire DataFrame into a NumPy array using the to_numpy() method. We also used the values attribute and the to_records() method, the latter of which returns a record array that keeps the index.