How to use Pandas DataFrames

To use Pandas DataFrames, first install the pandas library by running pip install pandas in your terminal or command prompt. You can then import pandas into your script and use it to create, manipulate, and analyze data stored in DataFrames.
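
A quick way to confirm the installation worked is to import the library and print its version. This is only a minimal check; the alias pd is a widely used convention rather than a requirement.

```python
# Minimal installation check: import pandas and report the installed version.
import pandas as pd

version = pd.__version__  # version string of the installed release
print(f"pandas {version} is installed")
```

If the import raises ModuleNotFoundError, pandas is most likely not installed in the Python environment that is running the script.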

Here are some common operations you can perform using Pandas DataFrames:

  1. Creation: Create a DataFrame from a dictionary or list using pandas.DataFrame(), or from a CSV file using pandas.read_csv()
  2. Indexing/Selection: Select data using labels (loc) or positions (iloc)
  3. Sorting: Sort data by one or multiple columns
  4. Filtering: Filter data using boolean indexing
  5. Grouping: Group data and aggregate using functions such as sum() or mean()
  6. Transformation: Apply functions to data to transform it, using methods such as apply() or map()
  7. Cleaning: Clean and pre-process data by handling missing values, converting data types, etc.
  8. Visualization: Plot data using methods such as plot() or hist() (these require matplotlib to be installed)
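
The operations in the list can be sketched together on a small table. The sample data, the column names (name, team, score) and the variable names below are made up for illustration, and the steps are only one possible way to combine these calls. Plotting is left out so that the sketch does not depend on matplotlib.

```python
import io
import pandas as pd

# Hypothetical sample data, used only for illustration.
csv_text = """name,team,score
Ann,red,8.0
Bob,blue,
Cid,red,5.0
Dee,blue,7.0
"""

# 1. Creation: read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# 7. Cleaning: fill the missing score with 0, then convert the column to integers.
df["score"] = df["score"].fillna(0).astype(int)

# 2. Indexing/Selection: loc uses labels, iloc uses integer positions.
first_name = df.loc[0, "name"]   # row label 0, column "name"
last_score = df.iloc[-1, 2]      # last row, third column

# 3. Sorting: highest score first.
ranked = df.sort_values("score", ascending=False)

# 4. Filtering: keep only the rows where the boolean mask is True.
red_team = df[df["team"] == "red"]

# 5. Grouping: total score per team.
totals = df.groupby("team")["score"].sum()

# 6. Transformation: map applies a function to each value in a column.
df["name_upper"] = df["name"].map(str.upper)

print(ranked)
print(totals)
```

Cleaning runs before the other steps here so that sorting and grouping work on complete integer scores. Whether a missing value should become 0, be dropped, or be filled some other way depends on the data.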

Here's a simple example that creates a DataFrame from a dictionary and prints it:

import pandas as pd

# Each key becomes a column name; each list holds that column's values
data = {'Name': ['John', 'Jane', 'Jim', 'Joan'],
        'Age': [30, 29, 31, 33],
        'City': ['New York', 'London', 'Paris', 'Berlin']}

df = pd.DataFrame(data)
print(df)

This will output:

   Name  Age      City
0  John   30  New York
1  Jane   29    London
2   Jim   31     Paris
3  Joan   33    Berlin
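
Once the DataFrame exists, individual columns, rows, and cells can be read back from it. The following is a minimal sketch that rebuilds the same data so it can run on its own; the variable names (ages, jane_city, last_row, over_30) are chosen only for illustration.

```python
import pandas as pd

# Same data as the example above.
df = pd.DataFrame({
    "Name": ["John", "Jane", "Jim", "Joan"],
    "Age": [30, 29, 31, 33],
    "City": ["New York", "London", "Paris", "Berlin"],
})

ages = df["Age"]                 # one column, returned as a Series
jane_city = df.loc[1, "City"]    # by label: row 1, column "City"
last_row = df.iloc[-1]           # by position: the last row, as a Series
over_30 = df[df["Age"] > 30]     # boolean indexing: rows where Age > 30

print(jane_city)                 # London
print(over_30["Name"].tolist())  # ['Jim', 'Joan']
```

With the default index, row labels and row positions happen to match, so loc and iloc look interchangeable here. They diverge as soon as the DataFrame is sorted, filtered, or given a custom index.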

If you want to know more about DataFrames and their capabilities, you can find many resources here:

Learn about the best methods to read, clean, plot and save data.