3 Ways to Read Multiple CSV Files: For-Loop, Map, List Comprehension

Written by Matt Dancho on September 21, 2021



Reading many CSV files is a common task for a data scientist. In this free tutorial, we show you 3 ways to streamline reading CSV files in Python. You’ll read and combine 15 CSV files using 3 common methods for iteration.

Python Tips Weekly

This article is part of Python-Tips Weekly, a bi-weekly video tutorial that shows you step-by-step how to do common Python coding tasks.

Here are the links to get set up. 👇

Video Tutorial
Follow along with our Full YouTube Video Tutorial.

This 5-minute video covers reading multiple CSV files in Python.

Read 15 CSV Files [Tutorial]

This FREE tutorial showcases the awesome power of Python for reading CSV files. We’ll read 15 CSV files in this tutorial.

Before we get started, get the Python Cheat Sheet

The Python Ecosystem is LARGE. To help, I’ve curated many of the 80/20 Python Packages, those I use most frequently to get results. Simply Download the Ultimate Python Cheat Sheet to access the entire Python Ecosystem at your fingertips via hyperlinked documentation and cheat sheets.


Onto the tutorial.

Project Setup

First, load the libraries. We’ll import pandas and glob.

  • Pandas: The main data wrangling library in Python

  • glob: A standard-library module for locating file paths by pattern matching with Unix shell-style wildcards (such as *.csv), which are simpler than full regular expressions

Second, use glob to extract a list of the file paths for each of the 15 CSV files we need to read in.
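Here is a minimal sketch of that setup. The folder and file names are assumptions, not the tutorial's actual data: the snippet writes three tiny stand-in CSV files to a temporary folder so it runs anywhere, and `csv_files` is a hypothetical variable name. In practice you would point the pattern at your own folder of 15 files.

```python
import glob
import os
import tempfile

import pandas as pd  # used in the methods that follow

# Hypothetical stand-in data: three tiny CSV files in a temporary folder.
data_dir = tempfile.mkdtemp()
for i in range(1, 4):
    with open(os.path.join(data_dir, f"file_{i}.csv"), "w") as f:
        f.write("id,value\n")
        f.write(f"{i},{i * 10}\n")

# "*" is a shell-style wildcard that matches any run of characters,
# so "*.csv" picks up every CSV file in the folder.
# sorted() gives a stable order, since glob does not guarantee one.
csv_files = sorted(glob.glob(os.path.join(data_dir, "*.csv")))

print(len(csv_files))
print(csv_files)
```

Sorting the paths is optional, but it keeps the combined rows in a predictable order.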

Libraries

Method 1: For-Loop

The most common way to read files repeatedly is with a for-loop. It’s a good approach for beginners, but it’s not the most concise. We’ll show this way first.

Python For-Loop for Reading CSV Files

We can see that this involves 3 steps:

  1. Instantiating an Empty List: We do this to store our results as we make them in the for-loop.

  2. For-Each filename, read and append: We read using pd.read_csv(), which returns a data frame for each path. Then we append each data frame to our list.

  3. Combine each Data Frame: We use pd.concat() to combine the list of data frames into one big data frame.

PRO-TIP: Combining data frames in lists is a common strategy. Use axis=0 to combine row-wise; it is the default for pd.concat(), but stating it makes the intent explicit.
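The three steps can be sketched as follows. This is an illustration, not the tutorial's original code: the stand-in CSV files, the column names, and the variable names (`csv_files`, `frames`, `combined`) are all assumptions. The `ignore_index=True` argument is an optional addition that renumbers the rows of the combined data frame.

```python
import glob
import os
import tempfile

import pandas as pd

# Hypothetical stand-in data: three tiny CSV files in a temporary folder.
data_dir = tempfile.mkdtemp()
for i in range(1, 4):
    with open(os.path.join(data_dir, f"file_{i}.csv"), "w") as f:
        f.write("id,value\n")
        f.write(f"{i},{i * 10}\n")
csv_files = sorted(glob.glob(os.path.join(data_dir, "*.csv")))

# Step 1: instantiate an empty list to hold the results.
frames = []

# Step 2: for each filename, read the file and append the data frame.
for filename in csv_files:
    df = pd.read_csv(filename)
    frames.append(df)

# Step 3: combine the list of data frames row-wise into one data frame.
combined = pd.concat(frames, axis=0, ignore_index=True)

print(combined)
```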

Method 2: Using Map

The map() function is a more concise way to iterate. The advantage is that we don’t have to instantiate a list. However, it can be more confusing to beginners.

How it works:

Map takes in two general arguments:

  1. func: A function to iteratively apply

  2. *iterables: One or more iterables whose elements are supplied to the function in the order of the function’s arguments.
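A small sketch of these two arguments, using plain Python and no CSV files. The `add` function and the sample values are made up for illustration.

```python
# A hypothetical two-argument function to apply.
def add(x, y):
    return x + y

# One iterable: each element is passed as the single argument of len().
lengths = list(map(len, ["a.csv", "bb.csv"]))
print(lengths)  # [5, 6]

# Two iterables: elements are paired up and passed as x and y, in order.
sums = list(map(add, [1, 2, 3], [10, 20, 30]))
print(sums)  # [11, 22, 33]

# map() itself is lazy: it returns a map object, not a list.
lazy = map(add, [1, 2], [3, 4])
print(type(lazy).__name__)  # map
```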

How Map Works (Python)

Ok, so let’s try map() on our CSV files.

Python Map for Reading CSV Files

We use 3 steps:

  1. Make a Lambda Function: This is an anonymous function that we create on the fly. Its first argument accepts each element of our iterable (each filename in our list of CSV file paths).

  2. Supply the iterable: In this case, we provide our list of csv files. The map function will then iteratively supply each element to the function in succession.

  3. Convert to List: The map() function returns a map object. We can then convert this to a list using the list() function.

PRO-TIP: Beginners can be confused by the “map object” that is returned. Simply use the list() function to extract the results of map() into a list.
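A minimal sketch of the map() approach, again using stand-in files. The data and the variable names (`csv_files`, `frames`, `combined`) are assumptions for illustration, not the tutorial's original code.

```python
import glob
import os
import tempfile

import pandas as pd

# Hypothetical stand-in data: three tiny CSV files in a temporary folder.
data_dir = tempfile.mkdtemp()
for i in range(1, 4):
    with open(os.path.join(data_dir, f"file_{i}.csv"), "w") as f:
        f.write("id,value\n")
        f.write(f"{i},{i * 10}\n")
csv_files = sorted(glob.glob(os.path.join(data_dir, "*.csv")))

# Steps 1 and 2: a lambda that reads one file, applied to each path.
mapped = map(lambda filename: pd.read_csv(filename), csv_files)

# Step 3: convert the lazy map object into a list of data frames.
frames = list(mapped)

# Combine row-wise, as in the for-loop method.
combined = pd.concat(frames, axis=0, ignore_index=True)

print(combined)
```

Since the lambda only forwards its argument, `map(pd.read_csv, csv_files)` should behave the same way here. The lambda form is mainly useful when you want to pass extra arguments to pd.read_csv().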

Method 3: List Comprehension

Because we are returning a list, we can use a list comprehension, which is even easier than map(). A list comprehension is a streamlined way of writing a for-loop that returns a list. Here’s how it works.

Python List Comprehension for Reading CSV Files

  1. Do this: Add the function that you want to apply on each iteration. The argument you pass to it must match your looping variable name (defined in the next part).

  2. For each of these: This is your looping variable name that you create inside of the list comprehension. Each of these is an element that gets passed to your function.

  3. In this: This is your iterable: the list containing each of our file paths.
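The three parts map onto a list comprehension as in the sketch below. The stand-in files and the variable names (`csv_files`, `filename`, `frames`, `combined`) are assumptions for illustration.

```python
import glob
import os
import tempfile

import pandas as pd

# Hypothetical stand-in data: three tiny CSV files in a temporary folder.
data_dir = tempfile.mkdtemp()
for i in range(1, 4):
    with open(os.path.join(data_dir, f"file_{i}.csv"), "w") as f:
        f.write("id,value\n")
        f.write(f"{i},{i * 10}\n")
csv_files = sorted(glob.glob(os.path.join(data_dir, "*.csv")))

# [ do this              for each of these   in this   ]
frames = [pd.read_csv(filename) for filename in csv_files]

# Combine row-wise, as in the other two methods.
combined = pd.concat(frames, axis=0, ignore_index=True)

print(combined)
```

All three methods produce the same list of data frames, so the choice between them is mostly about readability.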

Summary

There you have it. You now know how to read CSV files using 3 methods:

  1. For-Loops
  2. Map
  3. List Comprehension

But there’s a lot more to learning data science. And if you’re like me, you’re interested in a fast track system that will advance you without wasting time on information you don’t need.

The solution is my course, Data Science Automation with Python

Data Science Automation with Python Course

Tired of struggling to learn data science? Getting stuck in a sea of never-ending resources? Eliminate the confusion and speed up your learning in the process.

Businesses are transitioning manual processes to Python for automation. We teach you skills that organizations need right now.

Learn how in our new course, Python for Data Science Automation. Perform an end-to-end business forecast automation using pandas, sktime, and papermill, and learn Python in the process.