You will often need to compute the standard deviation of a DataFrame column.
In statistics, the standard deviation is commonly referred to as sigma.
A well-known analysis is to approximate the range a value is likely to fall in from its two-sigma interval.
For data that is roughly normally distributed, about 95% of the values lie between mu - 2 sigma and mu + 2 sigma, where mu is the mean.
So a new observation of the variable will most probably end up in this range.
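To make the 95% figure concrete, here is a minimal sketch that checks it on simulated data. The parameters `mu` and `sigma`, the sample size, and the seed are arbitrary choices made for illustration; they are not part of the iris example that follows.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the run is repeatable
mu, sigma = 10.0, 2.0                # arbitrary illustrative parameters
samples = rng.normal(loc=mu, scale=sigma, size=100_000)

# Boolean mask of the samples that fall inside [mu - 2 sigma, mu + 2 sigma].
inside = (samples >= mu - 2 * sigma) & (samples <= mu + 2 * sigma)

# The mean of a boolean array is the fraction of True values.
fraction_inside = inside.mean()
print(f"Fraction inside two sigma: {fraction_inside:.3f}")  # roughly 0.95 for normal data
```

The rule only holds this well when the data is close to normal, so it is worth checking the shape of a real column before relying on it.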
Let's have a look at a real-world example dataset.
Reading an example dataframe
import pandas as pd

# We read a sample dataset from the web.
df = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')
Here we have an example dataset about iris flowers.
Let's look at the sepal_length of versicolor irises and compute its mean and standard deviation.
This is how we compute the standard deviation using the DataFrame.std() method, which returns the sample standard deviation (ddof=1) by default.
Computing the standard deviation
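A minimal sketch of this step could look as follows. To keep it runnable on its own, it uses a small stand-in DataFrame named `sample` with a handful of illustrative values; the name and the values are assumptions for this example. With the dataset loaded earlier, you would use `df` in place of `sample`.

```python
import pandas as pd

# Stand-in for the iris DataFrame; the values are illustrative only.
sample = pd.DataFrame({
    "species": ["versicolor"] * 5 + ["setosa"] * 3,
    "sepal_length": [7.0, 6.4, 6.9, 5.5, 6.5, 5.1, 4.9, 4.7],
})

# Keep only the versicolor rows, then select the column of interest.
versicolor = sample.loc[sample["species"] == "versicolor", "sepal_length"]

mean = versicolor.mean()
std = versicolor.std()  # sample standard deviation (ddof=1 by default)

print(f"mean = {mean:.3f}, std = {std:.3f}")
```

Selecting a single column returns a Series, so the call here is Series.std(); calling DataFrame.std() on several numeric columns returns one value per column.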
Computing the two sigma range
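One way to sketch the two-sigma range is with a small helper. The function name `two_sigma_range` and the sample values are hypothetical and chosen only for illustration; in practice you would pass the versicolor sepal_length column instead.

```python
import pandas as pd

def two_sigma_range(values: pd.Series):
    """Return (mu - 2 * sigma, mu + 2 * sigma) for a numeric Series."""
    mu = values.mean()
    sigma = values.std()  # sample standard deviation (ddof=1 by default)
    return mu - 2 * sigma, mu + 2 * sigma

# Illustrative values only; replace with the real column.
lengths = pd.Series([7.0, 6.4, 6.9, 5.5, 6.5], name="sepal_length")

low, high = two_sigma_range(lengths)

# Fraction of observations that fall inside the range (bounds included).
share_inside = lengths.between(low, high).mean()

print(f"two-sigma range: [{low:.3f}, {high:.3f}]")
print(f"share of values inside: {share_inside:.2f}")
```

Comparing `share_inside` with the expected 95% gives a quick, informal check of how well the normal approximation fits the column.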
There you go: you now know how to compute the two-sigma range and can use it as a basis for statistical tests.