# How are NumPy arrays stored?

In [1]:
import numpy as np


NumPy presents an n-dimensional array abstraction that has to be mapped onto one-dimensional computer memory.

Even in two dimensions (matrices), there is more than one way to do this, which leads to confusion: row-major or column-major order.

In [4]:
A = np.arange(9).reshape(3, 3)
print(A)

[[0 1 2]
 [3 4 5]
 [6 7 8]]


## Strides and in-memory representation

How is this represented in memory?

In [6]:
A.strides

Out[6]:
(24, 8)
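• strides stores, for each axis, how many bytes one needs to jump to get from one entry to the next along that axis.
• So how is the array above stored? The entries are 8 bytes each: one step along axis 1 skips 8 bytes (one entry), and one step along axis 0 skips 24 bytes (a full row of three entries). The rows are therefore laid out one after another.
• This captures row-major ("C" order) and column-major ("Fortran" order), but is actually much more general.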
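The rule in the bullets above can be written down directly: the byte offset of an entry is the sum of index times stride over all axes. The following is a minimal sketch, not part of the original notebook. The helper byte_offset and the name buf are made up for illustration, and int64 is requested explicitly so that each entry is 8 bytes on every platform.

```python
import numpy as np

# int64 is requested explicitly so that every entry is 8 bytes on all platforms.
buf = np.arange(9, dtype=np.int64)   # the flat, 1-dimensional storage
A = buf.reshape(3, 3)                # a row-major view of that storage


def byte_offset(arr, index):
    """Byte offset of arr[index], measured from the array's first entry."""
    return sum(i * s for i, s in zip(index, arr.strides))


for index in [(0, 0), (0, 1), (1, 0), (2, 2)]:
    offset = byte_offset(A, index)
    position = offset // A.itemsize   # position in the flat storage
    # The entry found in the flat storage matches the entry in the 2D view.
    print(index, offset, buf[position], A[index])
```

For example, A[1, 2] sits at offset 1 * 24 + 2 * 8 = 40 bytes, which is position 5 in the flat storage.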

We can also ask for Fortran order:

In [10]:
A2 = np.arange(9).reshape(3, 3, order="F")
A2

Out[10]:
array([[0, 3, 6],
       [1, 4, 7],
       [2, 5, 8]])

NumPy defaults to row-major ("C") order; Fortran order has to be requested explicitly, as above.

In [11]:
A2.strides

Out[11]:
(8, 24)
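One way to see that only the strides differ, and not the stored bytes, is to build both layouts on top of the same flat storage. This is a small sketch under the assumption of 8-byte integers (requested explicitly here); the names buf, A_c and A_f are made up for illustration.

```python
import numpy as np

buf = np.arange(9, dtype=np.int64)    # flat storage: 0, 1, ..., 8
A_c = buf.reshape(3, 3)               # row-major interpretation
A_f = buf.reshape(3, 3, order="F")    # column-major interpretation

print(A_c.strides, A_f.strides)       # (24, 8) (8, 24)
print(np.shares_memory(A_c, A_f))     # True: both are views of buf
print(A_f.flags.f_contiguous)         # True
print(np.array_equal(A_f, A_c.T))     # True: same data, axes swapped
```

Both arrays read the same nine numbers from the same memory; swapping the two strides is what turns the row-major view into its transpose.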

## Strides and contiguity

How is the stride model more general than just saying "row major" or "column major"?

In [15]:
A = np.arange(16).reshape(4, 4)
A

Out[15]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
In [18]:
A.strides

Out[18]:
(32, 8)
In [14]:
Asub = A[:3, :3]
Asub

Out[14]:
array([[ 0,  1,  2],
       [ 4,  5,  6],
       [ 8,  9, 10]])

Recall that Asub is a view of the original data in A: no entries were copied.

In [19]:
Asub.strides

Out[19]:
(32, 8)

Asub has the same strides as A but a smaller shape, so it is no longer a contiguous array!

In the linear memory layout (shown by the increasing numbers in A), the entries 3, 7, and 11 are skipped.
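The gaps can be made explicit by listing which positions of the underlying storage the view actually touches. This is an illustrative sketch, again assuming 8-byte integers (requested explicitly); the variable names touched and skipped are made up.

```python
import numpy as np

A = np.arange(16, dtype=np.int64).reshape(4, 4)
Asub = A[:3, :3]

rows, cols = Asub.shape
# Storage positions (in units of entries) reached through Asub's strides
touched = sorted(
    (i * Asub.strides[0] + j * Asub.strides[1]) // Asub.itemsize
    for i in range(rows)
    for j in range(cols)
)
print(touched)   # [0, 1, 2, 4, 5, 6, 8, 9, 10]

# Storage positions within the first three rows of A that Asub skips
skipped = sorted(set(range(rows * A.shape[1])) - set(touched))
print(skipped)   # [3, 7, 11]
```

Because A was filled with np.arange, each storage position equals the value stored there, so the skipped positions are exactly the missing numbers.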

Contiguity is easy to check via the array's flags:

In [20]:
Asub.flags

Out[20]:
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
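Individual flags can also be read as attributes, which makes it easy to compare a few different views of the same array. The sketch below is an illustration rather than part of the original notebook; it assumes 8-byte integers (requested explicitly) and uses np.ascontiguousarray to obtain a contiguous copy where one is needed.

```python
import numpy as np

A = np.arange(16, dtype=np.int64).reshape(4, 4)

views = {
    "A[:2]": A[:2],            # leading rows only: still contiguous
    "A[:3, :3]": A[:3, :3],    # drops a column: gaps between rows
    "A.T": A.T,                # transpose: column-major view
    "A[::2]": A[::2],          # every other row: doubled row stride
    "A[:, ::-1]": A[:, ::-1],  # reversed columns: negative stride
}
for name, v in views.items():
    print(name, v.strides, v.flags.c_contiguous, v.flags.f_contiguous)

# A contiguous copy gets its own memory and compact strides.
Acopy = np.ascontiguousarray(A[:3, :3])
print(Acopy.strides, Acopy.flags.c_contiguous)   # (24, 8) True
print(np.shares_memory(A, Acopy))                # False
```

All five views share memory with A; only the copy made by np.ascontiguousarray allocates new storage.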