Python Read CSV Into 2D Array

In Python, a 2D array usually means a NumPy array. NumPy is widely used in scientific computing and machine learning to process large volumes of tabular data, much of which is stored in CSV files. Python offers several ways to read CSV data into a NumPy array, and this article covers five of them:

  1. Using the numpy loadtxt() method
  2. Using the numpy genfromtxt() method
  3. Using a pandas DataFrame
  4. Using the list data structure
  5. Using the pandas DataFrame values attribute

What is a CSV File?

A CSV (comma-separated values) file stores data in tabular form as plain text, with one record per line and the fields of each record separated by commas. Its file extension is .csv. CSV files are used heavily in data analytics and are also common in e-commerce applications, because they are easy to handle in almost any programming language.
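To make the definition concrete, the following minimal sketch writes the five-row table used in the rest of this article to a CSV file and reads the raw text back. The temporary directory is an assumption made here so that nothing in the working directory is overwritten; the later examples simply expect a file named sampleCSV.csv.

```python
import csv
import os
import tempfile

# The five rows used as sample data throughout this article.
rows = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]

# Illustrative location: a fresh temporary directory.
csv_path = os.path.join(tempfile.mkdtemp(), "sampleCSV.csv")

# newline="" lets the csv module control line endings itself.
with open(csv_path, "w", newline="") as f:
    csv.writer(f).writerows(rows)

# Read the raw text back: one record per line, fields separated by commas.
with open(csv_path) as f:
    raw_text = f.read()

print(raw_text)
```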

Method 1: Using the numpy loadtxt() Method

In this method, we use the numpy.loadtxt() function, which reads the CSV data into a 2D array. Below is the sample CSV file used in this program.

1,2
3,4
5,6
7,8
9,10

Python code:

import numpy as np

CSVData = open("sampleCSV.csv")
Array2d_result = np.loadtxt(CSVData, delimiter=",")

print(Array2d_result)

Output:

[[ 1.  2.]
 [ 3.  4.]
 [ 5.  6.]
 [ 7.  8.]
 [ 9. 10.]]

Line 1: We import the NumPy library.

Lines 3-4: We open the sampleCSV file and pass both CSVData and the delimiter to the np.loadtxt() function, which returns the data as a 2D array. The values are floats because loadtxt uses a floating-point dtype by default.

Line 6: Finally, we print the result, which shows that our CSV data has been converted into a 2D array.
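If the file holds whole numbers, the dtype parameter of loadtxt can request integers instead of floats. The sketch below is a minimal illustration, not part of the original program; it reads from an in-memory string (an assumption made so that the example runs without a file on disk), but a file name such as "sampleCSV.csv" can be passed in the same position.

```python
import io

import numpy as np

# Stand-in for the contents of sampleCSV.csv.
csv_text = "1,2\n3,4\n5,6\n7,8\n9,10\n"

# loadtxt accepts a file name or a file-like object.
# dtype=int requests integers instead of the default floats.
int_array = np.loadtxt(io.StringIO(csv_text), delimiter=",", dtype=int)

print(int_array.shape)
print(int_array.dtype)
```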

Method 2: Using the numpy genfromtxt() Method

In this method, we use the numpy.genfromtxt() function, which also reads the CSV data into a 2D array. Below is the sample CSV file used in this program.

1,2
3,4
5,6
7,8
9,10

Python code:

import numpy as np

CSVData = open("sampleCSV.csv")
Array2d_result = np.genfromtxt(CSVData, delimiter=",")

print(Array2d_result)

Output:

[[ 1.  2.]
 [ 3.  4.]
 [ 5.  6.]
 [ 7.  8.]
 [ 9. 10.]]

Line 1: We import the NumPy library.

Lines 3-4: We open the sampleCSV file and pass both CSVData and the delimiter to the np.genfromtxt() function, which returns the data as a 2D array.

Line 6: Finally, we print the result, which shows that our CSV data has been converted into a 2D array.
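On the clean sample file, genfromtxt and loadtxt give the same result. The practical difference is that genfromtxt is designed to cope with missing values. The sketch below uses a hypothetical variant of the sample data with one empty field (not part of the original file) to show the default behaviour and the filling_values parameter.

```python
import io

import numpy as np

# Hypothetical variant of the sample data: the second row has an empty field.
csv_text = "1,2\n3,\n5,6\n"

# By default, genfromtxt turns the empty field into nan.
with_nan = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

# filling_values substitutes a chosen value for missing fields.
with_fill = np.genfromtxt(io.StringIO(csv_text), delimiter=",", filling_values=0)

print(with_nan)
print(with_fill)
```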

Method 3: Using the Pandas Dataframe

In this method, we use pandas to read the CSV data into a DataFrame and then convert that DataFrame into a 2D array. Below is the sample CSV file used in this program.

1,2
3,4
5,6
7,8
9,10

Python code:

import pandas as pd
df = pd.read_csv('sampleCSV.csv', header=None)  # header=None: the file has no header row
print(df)
Array2d_result = df.to_numpy()
print(Array2d_result)

Output:

   0   1
0  1   2
1  3   4
2  5   6
3  7   8
4  9  10
[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]]

Line 1: We import the pandas library as pd.

Lines 2-3: We read the CSV file using the pandas read_csv method and then print the newly created DataFrame (df), as shown in the output above. Note that read_csv treats the first line of the file as column names unless header=None is passed.

Lines 4-5: We then call the DataFrame's to_numpy method, which converts the DataFrame's values into a 2D array, as shown in the output.
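Because the sample file has no header row, the header argument of read_csv decides whether the first line ends up in the array. The following sketch compares the two behaviours; it reads from an in-memory string (an assumption made so that the example is self-contained) rather than from sampleCSV.csv.

```python
import io

import pandas as pd

# Stand-in for the contents of sampleCSV.csv.
csv_text = "1,2\n3,4\n5,6\n7,8\n9,10\n"

# Default: the first line ("1,2") is consumed as column names.
with_header = pd.read_csv(io.StringIO(csv_text)).to_numpy()

# header=None: every line is treated as data.
without_header = pd.read_csv(io.StringIO(csv_text), header=None).to_numpy()

print(with_header.shape)     # (4, 2)
print(without_header.shape)  # (5, 2)
```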

Method 4: Using the List Data Structure

In this method, we use the list data structure. A list of lists can also be used to get the CSV data into a 2D array. The program below demonstrates this method.

import csv
import numpy
with open("sampleCSV.csv", newline='') as file:
    result_list = list(csv.reader(file))
print(result_list)
result_2D = numpy.array(result_list)

print(result_2D)

Output:

[['1', '2'], ['3', '4'], ['5', '6'], ['7', '8'], ['9', '10']]
[['1' '2']
 ['3' '4']
 ['5' '6']
 ['7' '8']
 ['9' '10']]

Lines 1-2: We import the csv and numpy libraries.

Lines 3-5: We open the sampleCSV file, read its rows using the csv.reader() function, convert the result into a list of lists, and print it.

Lines 6-8: We use the numpy.array function to convert the list of lists into a 2D array and print it. The output shows that the CSV data has been converted into a 2D array. Note that the elements are strings, because csv.reader returns every field as text.
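When numeric values are needed, the strings can be converted while the array is built. The sketch below is one possible approach, not part of the original program: it passes dtype=float to numpy.array, and it reads from an in-memory string (an assumption) so that it runs without the sample file.

```python
import csv
import io

import numpy

# Stand-in for the contents of sampleCSV.csv.
csv_text = "1,2\n3,4\n5,6\n7,8\n9,10\n"

rows = list(csv.reader(io.StringIO(csv_text)))

# Without a dtype, NumPy keeps the fields as strings.
as_text = numpy.array(rows)

# dtype=float asks NumPy to parse each field as a number.
as_numbers = numpy.array(rows, dtype=float)

print(as_text.dtype.kind)  # 'U' means Unicode string
print(as_numbers.dtype)
```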

Method 5: Using Pandas Dataframe Values

In this method, we use a very basic way to convert the CSV data into a NumPy array: the DataFrame values attribute. The program below demonstrates this.

import pandas as pd
df = pd.read_csv('sampleCSV.csv', header=None)  # header=None: the file has no header row

print(df)
Array2d_result = df.values
print(Array2d_result)

Output:

   0   1
0  1   2
1  3   4
2  5   6
3  7   8
4  9  10
[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]]

Line 1: We import the pandas library as pd.

Lines 2-4: We read the CSV file using the pandas read_csv method and then print the newly created DataFrame (df), as shown in the output above.

Lines 5-6: We then use the DataFrame values attribute, which returns the contents of the DataFrame as a NumPy 2D array, as shown in the output. Note that values is an attribute, not a method, so it is written without parentheses.
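For a simple DataFrame like this one, values and to_numpy() return the same array; the pandas documentation recommends to_numpy() for new code. The short sketch below checks that equivalence, again reading from an in-memory string (an assumption) instead of the sample file.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for the contents of sampleCSV.csv.
csv_text = "1,2\n3,4\n5,6\n7,8\n9,10\n"

df = pd.read_csv(io.StringIO(csv_text), header=None)

from_values = df.values        # attribute access, no parentheses
from_to_numpy = df.to_numpy()  # method call

# True when both arrays have the same shape and elements.
same = np.array_equal(from_values, from_to_numpy)
print(same)
```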

Conclusion

In this article, we have seen several methods to read CSV data into a 2D array. These are the approaches most commonly used by programmers and computer scientists. Some of them are built-in functions, and others are created by combining methods from different libraries. You can choose whichever method suits your requirements. Once you know how to read a CSV file, you can also create methods of your own.

About the author

Shekhar Pandey