How to do statistics on a DataFrame
The pandas library provides a wide range of statistical methods that can be applied to a DataFrame.
Here are some common statistics you can compute on a DataFrame:
- mean
- median
- mode
- std
- var
- min, max
- sum
- count
- describe
Mean
df.mean()
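As a minimal sketch of how this behaves, the example below uses a small made-up DataFrame (the column names and values are hypothetical). In recent pandas versions, calling mean() on a DataFrame that contains text columns can raise a TypeError, so numeric_only=True is passed here to skip them.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "height": [150.0, 160.0, 170.0],
    "weight": [50.0, 60.0, 70.0],
})

# Column-wise mean; numeric_only=True skips the non-numeric "name" column.
col_means = df.mean(numeric_only=True)
print(col_means)

# axis=1 averages across the selected columns of each row instead.
row_means = df[["height", "weight"]].mean(axis=1)
print(row_means)
```

The same numeric_only and axis arguments are accepted by most of the other methods on this page, such as median(), std(), and sum().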
Median
df.median()
Mode
df.mode()
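Unlike mean() or median(), mode() returns a DataFrame rather than a Series, because a column can have more than one most frequent value. A small sketch with made-up data shows this:

```python
import pandas as pd

# Hypothetical data: "x" has two modes (1 and 2), "y" has one (5).
df = pd.DataFrame({"x": [1, 1, 2, 2, 3], "y": [5, 5, 5, 6, 7]})

modes = df.mode()
print(modes)
# Each row holds the next mode of every column; columns with fewer
# modes than the longest one are padded with NaN.
```

To get a single mode per column, one option is to take the first row with df.mode().iloc[0].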
Standard Deviation
df.std()
Variance
df.var()
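By default, std() and var() compute the sample statistics (they divide by n - 1, that is, ddof=1). If you need the population versions, pass ddof=0. The sketch below uses made-up values to show the difference:

```python
import pandas as pd

# Hypothetical data with mean 5 and squared deviations summing to 32.
df = pd.DataFrame({"x": [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]})

sample_var = df.var()            # divides by n - 1 (default ddof=1)
population_var = df.var(ddof=0)  # divides by n
population_std = df.std(ddof=0)  # square root of the population variance

print(sample_var["x"], population_var["x"], population_std["x"])
```

Note that NumPy's np.std and np.var default to ddof=0, so results can differ from pandas unless ddof is set explicitly.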
Minimum and Maximum
df.min()
df.max()
Sum
df.sum()
Count
df.count()
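count() returns the number of non-missing values in each column, which is not always the number of rows. A short sketch with made-up data that contains missing values:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": ["x", "y", None],
})

non_missing = df.count()  # NaN and None are not counted
n_rows = len(df)          # total number of rows, missing or not

print(non_missing)
print(n_rows)
```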
All at once
df.describe()
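describe() reports count, mean, std, min, the 25%, 50%, and 75% percentiles, and max. On a DataFrame with mixed column types it summarizes only the numeric columns by default; include="all" adds the others. A sketch with made-up data:

```python
import pandas as pd

# Hypothetical data with one numeric and one text column.
df = pd.DataFrame({
    "score": [1.0, 2.0, 3.0, 4.0],
    "group": ["a", "a", "b", "b"],
})

# Default: numeric columns only.
summary = df.describe()
print(summary)

# include="all" also summarizes non-numeric columns (unique, top, freq).
summary_all = df.describe(include="all")
print(summary_all)
```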
You can also use more advanced statistical methods, such as correlation and covariance, via .corr(), .cov(), and similar methods.
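Both methods return a square DataFrame with one row and one column per numeric column. Here is a minimal sketch with made-up data, where y rises with x and z falls with x:

```python
import pandas as pd

# Hypothetical data: y = 2 * x, z decreases linearly as x increases.
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [2.0, 4.0, 6.0, 8.0],
    "z": [4.0, 3.0, 2.0, 1.0],
})

corr = df.corr()  # pairwise Pearson correlation by default
cov = df.cov()    # pairwise sample covariance

print(corr)
print(cov)
```

corr() also accepts method="spearman" or method="kendall" if a rank-based correlation is a better fit for your data.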
Note that these are just a few examples of the statistical methods available in Pandas. The library offers many more methods for performing more complex statistics on DataFrames.