Plotting Distributions with Histograms

In [1]:
import numpy as np
import matplotlib.pyplot as plt

Draw $n$ realizations of a random variable that is uniformly distributed on $[0, 1)$:

In [25]:
n = 1000000
In [26]:
x = np.random.rand(n)

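Now plot a histogram of the values of $x$. Explore the density=True and bins=N arguments to hist(). (Older Matplotlib releases spelled the first argument normed=True; that spelling has since been removed.)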

In [35]:
plt.hist(x, density=True, bins=20)  # density=True replaces the removed normed=True
Out[35]:
(array([ 1.00050314,  0.99952314,  0.99558313,  1.00068314,  0.99910314,
         1.00654316,  0.99696313,  1.00544316,  1.00206315,  0.99656313,
         1.00090314,  0.99814313,  0.99738313,  1.00246315,  0.99800313,
         0.99722313,  1.00034314,  1.00536316,  1.00016314,  0.99710313]),
 array([  1.90800837e-06,   5.00017510e-02,   1.00001594e-01,
          1.50001437e-01,   2.00001280e-01,   2.50001123e-01,
          3.00000966e-01,   3.50000809e-01,   4.00000652e-01,
          4.50000495e-01,   5.00000338e-01,   5.50000181e-01,
          6.00000024e-01,   6.49999867e-01,   6.99999710e-01,
          7.49999553e-01,   7.99999396e-01,   8.49999239e-01,
          8.99999082e-01,   9.49998925e-01,   9.99998768e-01]),
 <a list of 20 Patch objects>)
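
To see what the density normalization does, here is a minimal sketch using np.histogram, which computes the same counts and bin edges without drawing anything. It assumes a seeded generator and a smaller sample than the cell above; the seed and the names counts, edges, widths and manual are illustrative only. Each bar height is the bin count divided by (total count × bin width), so the bar areas sum to 1.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
x = rng.random(100_000)         # uniform samples on [0, 1)

# Raw counts and the normalized heights for the same 20 bins
counts, edges = np.histogram(x, bins=20)
density, _ = np.histogram(x, bins=20, density=True)

# Reproduce the normalization by hand: count / (n * bin width)
widths = np.diff(edges)
manual = counts / (counts.sum() * widths)

print(np.allclose(manual, density))  # the two normalizations should agree
print(np.sum(density * widths))      # total area under the bars
```

Note that the bin edges run from the smallest to the largest sample, not from exactly 0 to 1, which is why the first and last edges in the output above are slightly inside the interval.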

Now plot the distributions of some expressions involving $x$:

In [47]:
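# Alternatives to try: uncomment one and change the flag below to False.
#expr = -1 + 2 * x
#expr = x**2
#expr = np.sqrt(np.abs(x))
#expr = np.sin(2*np.pi*x)
#expr = np.log(x)

if True:
    # Split the samples into two uniform streams of equal length.
    evens = x[::2]
    odds = x[1::2]

    # Radius computed from the first stream.
    r = np.sqrt(-2*np.log(evens))

    # Interleave the cosine and sine branches back into one array.
    expr = np.empty_like(x)
    expr[::2] = r*np.cos(2*np.pi*odds)
    expr[1::2] = r*np.sin(2*np.pi*odds)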
    
In [48]:
# density=True replaces the removed normed=True argument
plt.hist(x, label="$x$", density=True, bins=20)
plt.hist(expr, label="Expression", density=True, bins=20)
plt.legend(loc="best")
Out[48]:
<matplotlib.legend.Legend at 0x7f6a43fba080>
  • Any observations about the bin widths of the two histograms?
  • For the last expression (the one built from r): what is the range of r?
  • That last expression is called the Box-Muller transform. From two independent, uniformly distributed random variables, it yields two independent random variables with a standard normal distribution.
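
As a quick numerical check of the Box-Muller claim, the transform can be sketched as a standalone function. This is an illustration under stated assumptions, not the notebook's own cell: the function name box_muller, the seed, and the sample size are made up here, and the first uniform stream is shifted from $[0, 1)$ to $(0, 1]$ so that the logarithm stays finite.

```python
import numpy as np

def box_muller(u1, u2):
    """Map two arrays of uniform samples to two arrays of normal samples.

    u1 must lie in (0, 1] so that log(u1) is finite; u2 may lie in [0, 1).
    """
    r = np.sqrt(-2.0 * np.log(u1))  # radius from the first stream
    theta = 2.0 * np.pi * u2        # angle from the second stream
    return r * np.cos(theta), r * np.sin(theta)

rng = np.random.default_rng(0)      # arbitrary seed, only for reproducibility
n = 500_000
u1 = 1.0 - rng.random(n)            # shift [0, 1) to (0, 1]
u2 = rng.random(n)

z0, z1 = box_muller(u1, u2)
z = np.concatenate([z0, z1])

# For a standard normal, these should be close to 0 and 1 respectively.
print(z.mean(), z.std())
# The two outputs should be nearly uncorrelated.
print(np.corrcoef(z0, z1)[0, 1])
```

Plotting z with plt.hist(z, density=True, bins=50) should give the familiar bell shape, in contrast to the flat histogram of the uniform input.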