Pandas cheat sheet

Pandas is a Python library for working with tabular data from different sources. It lets us select and manipulate data, and is commonly used for working with CSV files.

Installing pandas:

pip install pandas

Then, we can import it. The standard is to import it as pd.

A data frame is the basis for everything. In short, a data frame (conventionally named df) stores the data within Pandas so that we can manipulate it.

import pandas as pd
df = pd.read_csv("homes.csv")
df.head()

df.head() returns the first five rows together with the column labels. It is often used as a quick check that the data was loaded and that Pandas is working.
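As a minimal sketch of the same idea without needing a file on disk, the snippet below reads CSV text from memory. The column names and values are made up for illustration and are not the actual contents of homes.csv.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for homes.csv (columns are assumptions)
csv_text = """name,price,rooms
Oak,250000,3
Elm,310000,4
Ash,199000,2
Fir,420000,5
Yew,275000,3
Bay,330000,4
"""

# read_csv accepts a file path or any file-like object
homes = pd.read_csv(io.StringIO(csv_text))

first_five = homes.head()   # default: the first 5 rows
first_two = homes.head(2)   # pass a number to change how many rows are returned

print(first_five)
print(homes.shape)          # (rows, columns)
```

Passing a number to head() is handy when five rows are too many or too few for a quick look.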

data = {
    "a": [1, 2, 3],
    "b": [4, 10, 2],
    "c": [6, 15, 13],
}
df = pd.DataFrame(data)

The provided dictionary generates the following data frame:

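a  b   c
1  4   6
2  10  15
3  2   13

# Select a single column (returned as a Series)
df["name"]
# Select all rows, and every column from "name" through the last one
df.loc[:, "name":]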
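The two selection forms above can be tried on a small, made-up data frame. This is only a sketch: the people frame and its columns are hypothetical and chosen so that a "name" column exists.

```python
import pandas as pd

# Hypothetical data frame with a "name" column (not from the original data)
people = pd.DataFrame({
    "id": [1, 2, 3],
    "name": ["Ann", "Bob", "Cy"],
    "city": ["Oslo", "Rome", "Lima"],
})

# Single column: square brackets with one label return a Series
name_col = people["name"]

# Label slice with .loc: all rows, columns from "name" to the end (inclusive)
from_name = people.loc[:, "name":]

# Several specific columns: pass a list of labels to get a DataFrame
id_and_city = people[["id", "city"]]

print(from_name)
```

Unlike ordinary Python slices, label slices with .loc include the end label, which matters when both a start and an end column are given.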

While Pandas is not primarily a plotting or statistics library, it has some capabilities for both. For example, we can retrieve information about the distribution of values in a data column.

# Share of rows for each distinct value in the "name" column (fractions sum to 1)
name_info = df["name"].value_counts(normalize=True)
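To make the effect of normalize=True concrete, here is a small sketch on invented data. The names frame is hypothetical and exists only for this example.

```python
import pandas as pd

# Hypothetical data: "Ann" appears twice, the other names once
names = pd.DataFrame({"name": ["Ann", "Bob", "Ann", "Cy"]})

# Absolute counts per distinct value
counts = names["name"].value_counts()

# Relative frequencies: each count divided by the number of rows
shares = names["name"].value_counts(normalize=True)

print(counts["Ann"])   # 2
print(shares["Ann"])   # 0.5
```

Without normalize=True the result holds raw counts; with it, the values are fractions of the total, which makes columns of different sizes easier to compare.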