[SciPy-user] Iteration over scipy.sparse matrices?

Joseph Turian

May 1, 2008, 3:54:13 PM
to SciPy Users List
Is there a (storage-format agnostic) method for iterating over the elements of a sparse matrix?
I don't care what order they come in. I just want to make sure that I can iterate over the matrix in time linear in nnz, and have the (row, col) and data for each non-zero entry.

Thanks!

  Joseph

--
Academic: http://www-etud.iro.umontreal.ca/~turian/
Business: http://www.metaoptimize.com/

Nathan Bell

May 1, 2008, 5:32:11 PM
to SciPy Users List
On Thu, May 1, 2008 at 2:54 PM, Joseph Turian <tur...@gmail.com> wrote:
> Is there a (storage-format agnostic) method for iterating over the elements
> of a sparse matrix?
> I don't care what order they come in. I just want to make sure that I can
> iterate over the matrix in time linear in nnz, and have the (row, col) and
> data for each non-zero entry.

In the current SVN version of SciPy all sparse matrices may be
converted to the "coordinate" format using the .tocoo() member
function. Alternatively, one may pass any matrix (sparse or dense) to
the coo_matrix constructor.

Using the COO format makes iteration trivial:

from scipy.sparse import coo_matrix

M = ....  # sparse or dense matrix
A = coo_matrix(M)

for i, j, v in zip(A.row, A.col, A.data):
    print("row = %d, column = %d, value = %s" % (i, j, v))
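
For anyone who wants to try this directly, here is a minimal self-contained sketch of the same loop. The 3x3 matrix and the name `entries` are made up for illustration; the starting format is CSR only as an example, and any other sparse format should work the same way through .tocoo().

```python
import numpy as np
from scipy.sparse import csr_matrix

# Small illustrative matrix (values are invented for this sketch).
dense = np.array([[0, 2, 0],
                  [3, 0, 0],
                  [0, 0, 4]])
M = csr_matrix(dense)

# Convert once to COO, then walk the three parallel arrays.
A = M.tocoo()
entries = [(int(i), int(j), int(v)) for i, j, v in zip(A.row, A.col, A.data)]

for i, j, v in entries:
    print("row = %d, column = %d, value = %s" % (i, j, v))
```

The loop visits each stored entry exactly once, so the work is linear in nnz after the single conversion.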


Some sparse matrices support a rowcol() method that does something
similar without making a conversion. However, rowcol() will be
deprecated in the next release since it's much slower than doing a
single .tocoo().

--
Nathan Bell wnb...@gmail.com
http://graphics.cs.uiuc.edu/~wnbell/
_______________________________________________
SciPy-user mailing list
SciPy...@scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user

Nathan Bell

May 1, 2008, 5:40:05 PM
to SciPy Users List
On Thu, May 1, 2008 at 4:32 PM, Nathan Bell <wnb...@gmail.com> wrote:
> for i, j, v in zip(A.row, A.col, A.data):
>     print("row = %d, column = %d, value = %s" % (i, j, v))

I should mention that since A.row, A.col, and A.data are all numpy
arrays you can often vectorize sparse computations with them.

For instance, suppose we wanted to eliminate all entries of a
coo_matrix A that are less than 5 and store the result in a matrix B:

A = coo_matrix(....)
mask = A.data >= 5  # True for the entries we keep
B = coo_matrix( (A.data[mask],(A.row[mask],A.col[mask])), shape=A.shape)

As another example, extract all the entries above the diagonal:

A = coo_matrix(....)
mask = A.col > A.row
B = coo_matrix( (A.data[mask],(A.row[mask],A.col[mask])), shape=A.shape)
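
A runnable sketch of both filters on a small made-up matrix may help. The names `keep`, `upper`, and `U` and the matrix values are assumptions for illustration only. Note that the boolean mask selects the entries that are kept, so dropping everything below 5 means masking on `>= 5`.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Illustrative data only; any sparse or dense input would do.
dense = np.array([[1, 7, 0],
                  [0, 9, 2],
                  [6, 0, 8]])
A = coo_matrix(dense)

# Drop entries smaller than 5 by keeping those that are >= 5.
keep = A.data >= 5
B = coo_matrix((A.data[keep], (A.row[keep], A.col[keep])), shape=A.shape)

# Keep only the entries strictly above the diagonal.
upper = A.col > A.row
U = coo_matrix((A.data[upper], (A.row[upper], A.col[upper])), shape=A.shape)

print(B.toarray())
print(U.toarray())
```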

Since conversions to and from the COO format are quite fast, you can
use this approach to efficiently implement lots of computations on
sparse matrices.
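
As a rough sketch of that round trip, the helper below converts to COO, filters with a mask, and converts back. The function name `drop_small` and its `threshold` parameter are hypothetical, not part of SciPy, and returning CSR is just one choice of output format.

```python
import numpy as np
from scipy.sparse import coo_matrix, csr_matrix

def drop_small(M, threshold):
    """Hypothetical helper: return a CSR copy of M without entries below threshold."""
    A = M.tocoo()                    # one conversion into COO
    keep = A.data >= threshold       # vectorized test over the stored values
    B = coo_matrix((A.data[keep], (A.row[keep], A.col[keep])), shape=A.shape)
    return B.tocsr()                 # one conversion back out

M = csr_matrix(np.array([[1, 7],
                         [6, 0]]))
R = drop_small(M, 5)
print(R.toarray())
```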
