Copyright (C) 2020 Andreas Kloeckner
import numpy as np
import matplotlib.pyplot as pt
import scipy.sparse as sps
COOrdinate format is typically convenient for building ("assembling") a sparse matrix:
data = [5, 6, 7]
rows = [1, 1, 2]
columns = [2, 4, 6]
A = sps.coo_matrix(
        (data, (rows, columns)),
        shape=(10, 10), dtype=np.float64)
A
<10x10 sparse matrix of type '<class 'numpy.float64'>' with 3 stored elements in COOrdinate format>
A.todense()
matrix([[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [ 0., 0., 5., 0., 6., 0., 0., 0., 0., 0.],
        [ 0., 0., 0., 0., 0., 0., 7., 0., 0., 0.],
        [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
A.nnz
3
pt.spy(A)
<matplotlib.lines.Line2D at 0x7fcfcdd38f28>
For a COO matrix, the juicy attributes are data, row, and col.
print("row:", A.row)
print("col:", A.col)
print("data:", A.data)
row: [1 1 2]
col: [2 4 6]
data: [ 5. 6. 7.]
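To make the role of the three arrays concrete, here is a minimal sketch that rebuilds the dense matrix by hand from the (row, col, data) triplets. It deliberately adds a fourth triplet that repeats the coordinate (1, 2), which is an assumption made for illustration and not part of the matrix above; the name A_dup is likewise hypothetical. Repeated coordinates are what makes COO handy for assembly: contributions to the same entry are simply appended and get summed on conversion.

```python
import numpy as np
import scipy.sparse as sps

# Same triplets as above, plus one extra contribution to entry (1, 2).
data = [5, 6, 7, 1]
rows = [1, 1, 2, 1]
columns = [2, 4, 6, 2]

A_dup = sps.coo_matrix(
        (data, (rows, columns)),
        shape=(10, 10), dtype=np.float64)

# Rebuild the dense matrix by hand: every triplet adds data[k]
# at position (row[k], col[k]). np.add.at accumulates repeated indices.
dense = np.zeros(A_dup.shape)
np.add.at(dense, (A_dup.row, A_dup.col), A_dup.data)

print("stored triplets in COO:", A_dup.nnz)
print("stored entries after conversion to CSR:", A_dup.tocsr().nnz)
print("entry (1, 2):", dense[1, 2])
```

The hand-built array should agree with A_dup.toarray(), since both sum the two contributions to entry (1, 2).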
COOrdinate format is not the only format.
There is also Compressed Sparse Row:
Acsr = A.tocsr()
Acsr
<10x10 sparse matrix of type '<class 'numpy.float64'>' with 3 stored elements in Compressed Sparse Row format>
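For Compressed Sparse Row, look in data, indptr, and indices. The entries of row i occupy positions indptr[i] up to (but not including) indptr[i+1] in both indices (the column numbers) and data (the values).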
print("indptr:", Acsr.indptr)
print("indices:", Acsr.indices)
print("data:", Acsr.data)
indptr: [0 0 2 3 3 3 3 3 3 3 3]
indices: [2 4 6]
data: [ 5. 6. 7.]
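One way to read these three arrays is to walk over the rows and use consecutive indptr values as slice bounds into indices and data. The sketch below does that to compute a matrix-vector product by hand. The function name csr_matvec and the test vector x are made up for this illustration, and the Python loop is meant to show the data layout, not to be fast.

```python
import numpy as np
import scipy.sparse as sps

A = sps.coo_matrix(
        ([5, 6, 7], ([1, 1, 2], [2, 4, 6])),
        shape=(10, 10), dtype=np.float64)
Acsr = A.tocsr()


def csr_matvec(mat, x):
    """Multiply a CSR matrix by a vector using only data/indices/indptr."""
    y = np.zeros(mat.shape[0])
    for i in range(mat.shape[0]):
        # Entries of row i live in the half-open range [start, stop).
        start, stop = mat.indptr[i], mat.indptr[i + 1]
        cols = mat.indices[start:stop]
        vals = mat.data[start:stop]
        y[i] = np.dot(vals, x[cols])
    return y


x = np.arange(10, dtype=np.float64)
y = csr_matvec(Acsr, x)
print(y)
```

Empty rows have indptr[i] == indptr[i+1], so their slices are empty and the corresponding result entry stays zero. The work done is proportional to the number of stored entries, not to the full matrix size.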
The following code randomly generates a sparse matrix in which roughly fill_percent percent of the entries are nonzero (randomly drawn coordinates may repeat, so the actual fill is slightly lower):
fill_percent = 5
size = 1000
nentries = size**2 * fill_percent // 100
data = np.random.randn(nentries)
rows = (np.random.rand(nentries)*size).astype(np.int32)
columns = (np.random.rand(nentries)*size).astype(np.int32)
B_coo = sps.coo_matrix(
        (data, (rows, columns)),
        shape=(size, size), dtype=np.float64)
B_csr = sps.csr_matrix(B_coo)
B_dense = B_coo.todense()
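As a quick sanity check on the generated matrix, the following sketch compares the number of requested triplets with the number of entries that remain after conversion to CSR, where repeated coordinates are summed. It swaps in a seeded np.random.default_rng generator for reproducibility, which is a choice made for this illustration and differs from the np.random calls above; the name actual_fill is also hypothetical.

```python
import numpy as np
import scipy.sparse as sps

rng = np.random.default_rng(0)  # seeded so the check is reproducible

fill_percent = 5
size = 1000
nentries = size**2 * fill_percent // 100

data = rng.standard_normal(nentries)
rows = rng.integers(0, size, nentries)
columns = rng.integers(0, size, nentries)

B_coo = sps.coo_matrix(
        (data, (rows, columns)),
        shape=(size, size), dtype=np.float64)
B_csr = sps.csr_matrix(B_coo)

# COO keeps every triplet; CSR conversion merges repeated coordinates.
actual_fill = 100 * B_csr.nnz / size**2
print("requested triplets:", B_coo.nnz)
print("distinct entries in CSR:", B_csr.nnz)
print("actual fill: %.2f%%" % actual_fill)
```

If an exact construction by hand is not needed, scipy.sparse.random also generates random sparse matrices and takes a density argument.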
Next, we time matrix-vector multiplication for different versions of B:
vec = np.random.randn(size)
from time import time
start = time()
for i in range(2000):
    B_dense.dot(vec)
print("time: %g" % (time() - start))
time: 1.96073
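To repeat this measurement for the other storage formats without copying the loop, the timing can be wrapped in a small helper. This is only a sketch under a few assumptions: the helper name time_matvec and its nreps parameter are made up here, time.perf_counter is used instead of time.time, the repetition count is lowered to keep the run short, and the dense version is a plain array obtained with toarray() instead of the matrix returned by todense(). No timings are quoted, since they depend on the machine.

```python
from time import perf_counter

import numpy as np
import scipy.sparse as sps


def time_matvec(mat, vec, nreps=200):
    """Return the wall-clock seconds taken by nreps products mat.dot(vec)."""
    start = perf_counter()
    for _ in range(nreps):
        mat.dot(vec)
    return perf_counter() - start


rng = np.random.default_rng(0)
size = 1000
nentries = size**2 * 5 // 100

B_coo = sps.coo_matrix(
        (rng.standard_normal(nentries),
         (rng.integers(0, size, nentries), rng.integers(0, size, nentries))),
        shape=(size, size), dtype=np.float64)
B_csr = sps.csr_matrix(B_coo)
B_dense = B_coo.toarray()
vec = rng.standard_normal(size)

# All three representations should give the same product.
result_dense = B_dense.dot(vec)
result_coo = B_coo.dot(vec)
result_csr = B_csr.dot(vec)

for name, mat in [("dense", B_dense), ("coo", B_coo), ("csr", B_csr)]:
    print("%s time: %g" % (name, time_matvec(mat, vec)))
```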