How to sample a DataFrame using Pandas
Sometimes you might need to sample a dataset, for example to compute statistics over a sample of your population.
To do so, Pandas comes with a .sample() method.
The method takes the size of the sample you want, either as a number of rows (n) or as a fraction of the dataset (frac).
First, we load a dataset into a DataFrame to sample from.
import pandas as pd
# We read a sample dataset from the web.
df = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')
We draw a sample using size
We draw a sample of 20 rows. Note that the parameter is named n, not size.
df.sample(n=20)
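As a minimal sketch of the same idea without a network download, the snippet below builds a small stand-in DataFrame (the toy frame and its column names are made up for illustration) and draws 20 rows with the n parameter. Passing random_state is optional; it is used here only so the draw is repeatable.

```python
import pandas as pd

# Hypothetical stand-in data: 150 rows, similar to the iris dataset in size only.
toy = pd.DataFrame({"row_id": range(150), "value": [i * 0.5 for i in range(150)]})

# Draw 20 distinct rows at random; random_state makes the draw repeatable.
sample_n = toy.sample(n=20, random_state=42)

print(len(sample_n))  # number of rows in the sample
```

By default the rows are drawn without replacement, so no row appears twice in the sample.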
We draw a sample using a fraction
We draw a sample containing 20% of the rows in the dataset.
df.sample(frac=0.2)
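To check what a fractional sample returns, here is a small sketch on a made-up 150-row DataFrame (the toy name and its column are assumptions for illustration, not part of the original dataset). With frac=0.2 the sample holds 20% of the rows, which is 30 rows here.

```python
import pandas as pd

# Hypothetical stand-in data with 150 rows.
toy = pd.DataFrame({"row_id": range(150)})

# Draw 20% of the rows; random_state is optional and only makes the draw repeatable.
sample_frac = toy.sample(frac=0.2, random_state=0)

print(len(sample_frac))  # 30
```

Note that n and frac cannot be passed together; pick one of the two.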
There you go, you now know how to sample a DataFrame!