numpy: Introduction

A Difference in Speed

Let's import the numpy module.

import numpy as np

n = 10  # CHANGE ME
a1 = list(range(n))
a2 = np.arange(n)

if n <= 10:
    print(a1)
    print(a2)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[0 1 2 3 4 5 6 7 8 9]

%timeit [i**2 for i in a1]

100000 loops, best of 3: 2.41 µs per loop

%timeit a2**2

The slowest run took 34.86 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 400 ns per loop
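The %timeit lines above only work inside IPython or Jupyter. As a rough sketch, the same comparison can be run as a plain script with the standard-library timeit module. The size n and the repeat count below are arbitrary choices for illustration, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np

n = 10_000  # arbitrary size, chosen for illustration
a1 = list(range(n))
a2 = np.arange(n)

# Time 20 repetitions of each way of squaring every element.
list_time = timeit.timeit(lambda: [i**2 for i in a1], number=20)
array_time = timeit.timeit(lambda: a2**2, number=20)

print(f"list comprehension: {list_time:.4f} s")
print(f"numpy array:        {array_time:.4f} s")
```

The gap usually grows with n, because the per-element overhead of the Python loop is paid n times while the array operation runs in compiled code.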

NumPy arrays are much less flexible than Python lists (fixed size, a single element type), but they:

are much faster
use less memory
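The memory claim can be checked roughly as follows. This is only a sketch: sys.getsizeof reports the size of a single object, so the list total below adds the list's own pointer table to the integer objects it refers to, and the exact numbers depend on the Python build and platform.

```python
import sys

import numpy as np

n = 1000  # arbitrary size, chosen for illustration
values = list(range(n))
array = np.arange(n)

# A list holds pointers to separate Python int objects, so count both
# the list itself and the objects it points to.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# An array keeps its elements in one contiguous buffer.
array_bytes = array.nbytes

print("list: ", list_bytes, "bytes")
print("array:", array_bytes, "bytes")
```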

Why the difference?

# (This cell contains a bunch of voodoo that
# helps with the graphics below. You don't need to
# know what this does.)

%load_ext gvmagic
from objgraph_helper import dot_refs

The gvmagic extension is already loaded. To reload it, use:
  %reload_ext gvmagic

%dotstr dot_refs([a1])

%dotstr dot_refs([a2])

a2.strides

(8,)
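The stride (8,) means that moving one element forward in a2 moves 8 bytes forward in memory, the size of one 64-bit integer. A small sketch of how strides behave for other layouts, using an explicit float64 dtype so that the element size is 8 bytes on any platform (the names flat and grid are made up for this example):

```python
import numpy as np

flat = np.arange(12, dtype=np.float64)  # 8 bytes per element
grid = flat.reshape(3, 4)               # 3 rows, 4 columns, same buffer

print(flat.strides)       # (8,): next element is 8 bytes away
print(grid.strides)       # (32, 8): next row is 4 elements = 32 bytes away
print(flat[::2].strides)  # (16,): every second element, so 16 bytes apart
```

Because all elements sit at regular byte offsets in one buffer, NumPy can reach any element by arithmetic instead of following a pointer per element as a list does.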

Ways to create a numpy array

Casting from a list

np.array([1,2,3])

array([1, 2, 3])

linspace

np.linspace(-1, 1, 10)

array([-1.        , -0.77777778, -0.55555556, -0.33333333, -0.11111111,
        0.11111111,  0.33333333,  0.55555556,  0.77777778,  1.        ])
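As the output shows, np.linspace includes both end points, so 10 points span 9 equal intervals and the spacing is 2/9. A short check of this:

```python
import numpy as np

x = np.linspace(-1, 1, 10)

# Both end points are part of the result.
print(x[0], x[-1])

# 10 points leave 9 gaps, so each gap is (1 - (-1)) / 9.
spacing = np.diff(x)
print(np.allclose(spacing, 2 / 9))
```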

zeros

np.zeros((10,10), np.float64)

array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])

Operations on arrays

Arithmetic operations apply to every element:

a = np.array([1.2, 3, 4])
b = np.array([0.5, 0, 1])

Addition, multiplication, power, and so on are all elementwise:

a+b

array([ 1.7,  3. ,  5. ])

a*b

array([ 0.6,  0. ,  4. ])

a**b

array([ 1.09544512,  1.        ,  4.        ])

Matrix multiplication is np.dot(A, B) for two 2D arrays.
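To make the contrast with the elementwise * concrete, here is a small sketch with two 2x2 arrays (the values are made up for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])

elementwise = A * B    # multiplies matching entries
matrix = np.dot(A, B)  # rows of A times columns of B

print(elementwise)  # [[0. 2.], [3. 0.]]
print(matrix)       # [[2. 1.], [4. 3.]]
```

On Python 3.5 and later, A @ B gives the same result as np.dot(A, B) for 2D arrays.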

Important Attributes

NumPy arrays have two especially important attributes, .shape and .dtype:

a = np.random.rand(5, 4, 3)
a.shape

(5, 4, 3)

The .shape attribute contains the dimensions of the array as a tuple. So the tuple (5,4,3) means that we're dealing with a three-dimensional array of size $5 \times 4 \times 3$.

(numpy.random.rand just generates an array of random numbers of the given shape.)
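A few related quantities follow directly from the shape. The sketch below uses the .ndim and .size attributes, which are not discussed above but are part of every NumPy array:

```python
import numpy as np

a = np.random.rand(5, 4, 3)

print(a.shape)  # (5, 4, 3)
print(a.ndim)   # 3: the number of axes, same as len(a.shape)
print(a.size)   # 60: total number of elements, 5 * 4 * 3
print(len(a))   # 5: the length of the first axis only
```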

a.dtype

dtype('float64')

Other dtypes include np.complex64, np.int32, ...
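A dtype can be requested when an array is created and changed later with .astype, which returns a converted copy. A minimal sketch (the names ints and cplx are made up for this example):

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int32)
cplx = ints.astype(np.complex64)  # new array; ints is left unchanged

print(ints.dtype, ints.itemsize)  # int32 4
print(cplx.dtype, cplx.itemsize)  # complex64 8
```

The .itemsize attribute gives the number of bytes per element, which is what the dtype fixes for the whole array.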