Generating data can be useful in many ways. It is often used when you want to:
- Run a Monte Carlo simulation (e.g. to test whether a model works)
- Build a statistical model that relies on random noise (e.g. a random walk)
- Create synthetic data (e.g. based on an existing distribution)
We call these "data-generating processes", or DGPs: processes in which data is generated according to what we observe in the real world.
Enough talking, let's code
Let's generate data using NumPy's normally distributed data generator numpy.random.normal().
We pass this function three arguments (each has a default value, but we set them explicitly here):
- The mean of our distribution: mu
- The standard deviation of our distribution: std
- The number of random values we want: n
    import numpy as np

    # We set our three arguments
    mu = 0
    std = 1
    n = 50

    # We generate a NumPy array of n normally distributed observations
    randomly_generated_data = np.random.normal(mu, std, n)
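If you want the same numbers on every run, one option is NumPy's newer Generator interface with a fixed seed. The sketch below is only an illustration: the seed value 42 and the larger sample size are arbitrary assumptions, not something the example above requires. It also prints the sample mean and standard deviation, which should land close to mu and std.

```python
import numpy as np

# Assumption: 42 is an arbitrary seed, chosen only to make the draws repeatable
rng = np.random.default_rng(seed=42)

# Same three arguments as before; n is larger here (an arbitrary choice)
# so the sample statistics settle close to mu and std
mu = 0
std = 1
n = 10_000

# Generator.normal takes the mean as loc, the standard deviation as scale,
# and the number of draws as size
samples = rng.normal(loc=mu, scale=std, size=n)

print(samples.shape)   # one-dimensional array with n entries
print(samples.mean())  # should be close to mu
print(samples.std())   # should be close to std
```

Re-running the script with the same seed gives the same array, which helps when you need to reproduce a simulation.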
Here you are! You now know how to generate normally distributed data with NumPy using Python.
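To connect this back to the random walk mentioned at the start, here is a minimal sketch of one built from normally distributed steps. The names n_steps, steps and walk, the seed, and the step parameters are illustrative assumptions; a random walk is taken here to mean the running sum of independent random steps.

```python
import numpy as np

# Assumption: arbitrary seed, used only for repeatability
rng = np.random.default_rng(seed=0)

# Hypothetical setup: 100 steps, each drawn from a normal distribution
# with mean 0 and standard deviation 1
n_steps = 100
steps = rng.normal(loc=0, scale=1, size=n_steps)

# The position after each step is the cumulative sum of all steps so far
walk = np.cumsum(steps)

print(walk[:5])
```

Changing the scale argument makes each step larger or smaller on average, which makes the walk more or less volatile.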