Python csv skip header row

In this article, we will learn how to skip the header row of a CSV file while reading it, because sometimes we only need the data rows. We are going to cover the four methods given below:

  1. Using the next() function
  2. Using csv.DictReader
  3. Pandas skiprows based on a number of rows
  4. Pandas skiprows based on an index position

Let’s explain each of the above methods in detail.

Method 1: Using the next() function

In this method, we will use the built-in next() function and see how it discards the header row before we print all the other csv data.

CSV File: We will be using the csv file below (test.csv) throughout this blog.

Month,1958,1959,1960
JAN,340,360,417
FEB,318,342,391
MAR,362,406,419
APR,348,396,461
JAN,340,360,417
FEB,318,342,391

import csv

with open("test.csv", "r") as record:
    # Create a csv reader object
    csvreader_object = csv.reader(record)
    # Skip the first row of the csv file (the header row)
    next(csvreader_object)

    # Print every row except the first row of the csv
    for row in csvreader_object:
        print(row)

Output:

['JAN', '340', '360', '417']
['FEB', '318', '342', '391']
['MAR', '362', '406', '419']
['APR', '348', '396', '461']
['JAN', '340', '360', '417']
['FEB', '318', '342', '391']

Line 1: We import the csv module.

Lines 3–7: We open the test.csv file in read mode ('r') as record, and then we create a csv.reader object from it. The reader is an iterator, so calling next() on it reads and returns the first row. Because we do not store that return value, the header row is discarded and the reader is left positioned at the first data row.

Lines 10–11: Now we iterate over the csv reader object and print each remaining row. The above output shows that there is no header row.
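If the header is still needed later, or the file might be empty, a small variation of the same idea can help. The sketch below is only an illustration: it uses an in-memory string (the hypothetical name sample) in place of test.csv, and passes a default value to next() so that an empty input does not raise StopIteration.

```python
import csv
import io

# Hypothetical in-memory stand-in for test.csv, so the sketch runs without a file
sample = io.StringIO(
    "Month,1958,1959,1960\n"
    "JAN,340,360,417\n"
    "FEB,318,342,391\n"
)

reader = csv.reader(sample)

# Keep the header instead of throwing it away; the second argument (None)
# is returned if the input has no rows at all
header = next(reader, None)

# The reader now starts at the first data row
rows = list(reader)

print(header)  # ['Month', '1958', '1959', '1960']
print(rows)    # [['JAN', '340', '360', '417'], ['FEB', '318', '342', '391']]
```

Storing the header this way costs nothing extra and lets later code look up a column position by name.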

Method 2: Using csv.DictReader

Now we are going to see how we can read the csv in dictionary format. DictReader uses the first row of the file as the keys and returns every following row as a dictionary. If we print only the values and not the keys, we get all the data without the header row. We are using the same test.csv file as before. An example of this method is given below:

import csv

with open("test.csv", "r") as record:
    # Create a csv DictReader object
    csvreader_object = csv.DictReader(record)
    # DictReader consumes the first row (the header row) as the dict keys,
    # so the loop only sees data rows; we print the values, not the keys
    for row in csvreader_object:
        print(row["Month"], row["1958"], row["1959"], row["1960"])

Output:

JAN 340 360 417
FEB 318 342 391
MAR 362 406 419
APR 348 396 461
JAN 340 360 417
FEB 318 342 391

Line 1: We import the csv module.

Lines 3–5: We open the test.csv file in read mode ('r') as record, and then we create a csv.DictReader object from it.

Lines 8–9: Now we iterate over the DictReader object and print each row. The header row never appears in the loop, because DictReader reads it once to get the keys and then returns each remaining row as a dict (key and value). Since we print only the values and not the keys, the output shows only the data, which was our primary objective.
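To make the role of the header row more concrete, the following sketch inspects the fieldnames attribute of the DictReader and collects the values of each row without naming every column by hand. It is only an illustration and uses an in-memory string (the hypothetical name sample) instead of test.csv.

```python
import csv
import io

# Hypothetical in-memory stand-in for test.csv
sample = io.StringIO(
    "Month,1958,1959,1960\n"
    "JAN,340,360,417\n"
    "FEB,318,342,391\n"
)

reader = csv.DictReader(sample)

# The keys come from the first line of the input
print(reader.fieldnames)  # ['Month', '1958', '1959', '1960']

# Collect only the values of each remaining row, in column order
data_rows = [list(row.values()) for row in reader]
print(data_rows)  # [['JAN', '340', '360', '417'], ['FEB', '318', '342', '391']]
```

Using row.values() avoids repeating the column names, which is handy when the header differs from file to file.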

Method 3: Using the Pandas read_csv skiprows parameter

In this method, we are going to use the skiprows parameter of the Pandas read_csv function. An integer value tells Pandas how many lines to skip from the start of the file, so we set skiprows to 1 to skip the header row, as shown in the program below. Note that Pandas then treats the next line as the header, which is why the first JAN row appears as the column labels in the output; if every remaining line should be kept as data, header=None has to be passed as well.

import pandas as pd
skipHeaderDf = pd.read_csv('test.csv', skiprows=1)

print(skipHeaderDf)

Output:

   JAN  340  360  417
0  FEB  318  342  391
1  MAR  362  406  419
2  APR  348  396  461
3  JAN  340  360  417
4  FEB  318  342  391

Line 1: We import the Pandas library as pd.

Line 2: We read the csv file using the Pandas read_csv function and pass skiprows=1, which means the first line is skipped while reading the csv file data.

Line 4: Now we print the resulting dataframe. The above output shows that the original header row (Month,1958,1959,1960) is gone and that the first data row now serves as the column labels.
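The difference between skipping the header and keeping every data row can be seen by combining skiprows with header=None. The sketch below is a minimal illustration that reads from an in-memory string (the hypothetical name sample) rather than test.csv.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for test.csv
sample = (
    "Month,1958,1959,1960\n"
    "JAN,340,360,417\n"
    "FEB,318,342,391\n"
    "MAR,362,406,419\n"
)

# skiprows=1 alone: the JAN line is promoted to column labels
promoted = pd.read_csv(io.StringIO(sample), skiprows=1)
print(list(promoted.columns))  # ['JAN', '340', '360', '417']
print(len(promoted))           # 2

# skiprows=1 with header=None: every remaining line is kept as data
kept = pd.read_csv(io.StringIO(sample), skiprows=1, header=None)
print(list(kept.columns))      # [0, 1, 2, 3]
print(len(kept))               # 3
```

With header=None the columns get default integer labels; the names parameter of read_csv can supply custom labels if needed.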

Method 4: Using Pandas to remove the header of the csv by index position

In this method, we again use the skiprows parameter of the Pandas read_csv function, but this time we pass a list of zero-based line indices. The header row is at index position 0, so we set skiprows to [0], as shown in the program below. This way, the header row is ignored while reading the data, and Pandas again uses the next line as the column labels.

import pandas as pd
skipHeaderDf = pd.read_csv('test.csv', skiprows=[0])

print(skipHeaderDf)

Output:

   JAN  340  360  417
0  FEB  318  342  391
1  MAR  362  406  419
2  APR  348  396  461
3  JAN  340  360  417
4  FEB  318  342  391

Line 1: We import the Pandas library as pd.

Line 2: We read the csv file using the Pandas read_csv function and pass skiprows=[0], which means the line at index 0 (the first line) is skipped while reading the csv file data.

Line 4: Now we print the resulting dataframe. The above output shows that the original header row is gone and that the first data row now serves as the column labels.
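Because skiprows accepts a list, the same call can drop several lines at once, and it also accepts a function that decides line by line. The sketch below illustrates both forms on an in-memory string (the hypothetical name sample); header=None is added so that the remaining lines are all kept as data.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for test.csv
sample = (
    "Month,1958,1959,1960\n"   # line index 0
    "JAN,340,360,417\n"        # line index 1
    "FEB,318,342,391\n"        # line index 2
    "MAR,362,406,419\n"        # line index 3
    "APR,348,396,461\n"        # line index 4
)

# List form: skip the header (index 0) and the FEB line (index 2)
picked = pd.read_csv(io.StringIO(sample), skiprows=[0, 2], header=None)
print(picked[0].tolist())    # ['JAN', 'MAR', 'APR']

# Callable form: skip every line whose index is even (0, 2 and 4)
odd_only = pd.read_csv(
    io.StringIO(sample), skiprows=lambda index: index % 2 == 0, header=None
)
print(odd_only[0].tolist())  # ['JAN', 'MAR']
```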

Conclusion:

In this article, we have seen four different methods to skip the header row while reading a csv file. All of them work and are used by Python programmers for this purpose. The Pandas skiprows parameter not only skips the header of the CSV file but can also skip other rows if we pass their count or index positions, so it is the most convenient choice when other rows have to be removed as well. Keep in mind that Pandas uses the first remaining line as the column labels unless header=None is passed.

The csv.reader and csv.DictReader methods shown here only deal with the header row, so if we want to remove some other rows, we have to write some additional code.
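As an illustration of what that additional code could look like, the sketch below filters rows by their position using enumerate. The set skip_positions and the in-memory string sample are hypothetical names chosen for this example; they stand in for whatever rows and file a real program would use.

```python
import csv
import io

# Hypothetical in-memory stand-in for test.csv
sample = io.StringIO(
    "Month,1958,1959,1960\n"
    "JAN,340,360,417\n"
    "FEB,318,342,391\n"
    "MAR,362,406,419\n"
)

# Hypothetical zero-based row positions to drop: the header and the FEB row
skip_positions = {0, 2}

# enumerate() pairs each row with its position so unwanted rows can be filtered
kept_rows = [
    row
    for position, row in enumerate(csv.reader(sample))
    if position not in skip_positions
]

print(kept_rows)  # [['JAN', '340', '360', '417'], ['MAR', '362', '406', '419']]
```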

About the author

Shekhar Pandey