How to generate normally distributed data with NumPy using Python
Generating data is useful in many situations. It is often used when you want to:
- Run a Monte Carlo simulation (e.g. to test whether a model works)
- Build a statistical model that uses random noise (e.g. a random walk)
- Create synthetic data (e.g. based on an existing distribution)
- etc.
We call those processes "Data Generating Processes", or DGPs: the data is generated according to what we observe in the real world.
Enough talking, let's code
The formula
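For reference, the probability density of a normal distribution is usually written as follows. The symbols are assumptions chosen to match the variables used in the code: μ stands for the mean (mu) and σ for the standard deviation (std).

```latex
f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)
```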
The code
Let's generate data using NumPy's normal random number generator, numpy.random.normal().
We pass three arguments to this function (NumPy names them loc, scale and size):
- The mean of our distribution: mu
- The standard deviation of our distribution: std
- The number of random values we want to draw: n
import numpy as np
# We set our three arguments
mu = 0
std = 1
n = 50
# We generate a NumPy array of n normally distributed observations
randomly_generated_data = np.random.normal(mu, std, n)
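If you need the same numbers on every run, one option is NumPy's newer Generator interface with a fixed seed. The sketch below is only an illustration: the seed 42 and the names rng, sample, sample_mean and sample_std are assumptions, not part of the original example. It also checks that the sample mean and standard deviation land near mu and std.

```python
import numpy as np

# Same three arguments as in the example above
mu = 0
std = 1
n = 50

# A seeded Generator makes the draw reproducible (the seed 42 is arbitrary)
rng = np.random.default_rng(42)
sample = rng.normal(loc=mu, scale=std, size=n)

# Sanity check: the sample statistics should be close to mu and std,
# but not exactly equal, because n is small
sample_mean = sample.mean()
sample_std = sample.std(ddof=1)  # ddof=1 gives the sample standard deviation

print(sample.shape)
print(sample_mean, sample_std)
```

With only 50 observations the estimates will wander a little around 0 and 1. Increasing n should bring them closer.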
Here you are! You now know how to generate normally distributed data with NumPy using Python.
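One of the uses listed earlier is a random walk. As a minimal sketch, assuming each step is an independent draw from a normal distribution with mean 0 and standard deviation 1, the walk is the running sum of the steps. The names n_steps, steps and walk, and the seed 0, are illustrative choices.

```python
import numpy as np

# Seeded generator so the walk is reproducible (the seed 0 is arbitrary)
rng = np.random.default_rng(0)

n_steps = 100

# Each step is normally distributed noise with mean 0 and standard deviation 1
steps = rng.normal(loc=0, scale=1, size=n_steps)

# The position after each step is the cumulative sum of all steps so far
walk = np.cumsum(steps)

print(walk[:5])
```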