It is poorly documented, not really part of the public API, and rather
low-level, but you can use scipy.sparse.sparsetools. Since it is
implemented in C++, it should be efficient in both CPU time and memory.

I am using the following function to normalize each row of a CSR matrix:

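import numpy as np
# Private, undocumented module; newer SciPy releases may only provide it
# as scipy.sparse._sparsetools.
from scipy.sparse.sparsetools import csr_scale_rows


def normalize_pairs(pairs):
    """Normalize the rows of the pairs matrix so that sum(row) == 1 (or 0
    for empty rows).

    Note
    ----
    Does the modification in-place.
    """
    if pairs.format != "csr":
        raise ValueError("csr only")
    # Row sums as a flat 1-D array with the same dtype as pairs.data
    # (this assumes a floating-point matrix).
    factor = np.asarray(pairs.sum(axis=1)).ravel().astype(pairs.data.dtype)
    # Invert only the positive sums, so empty rows keep a factor of 0.
    nnzeros = factor > 0
    factor[nnzeros] = 1 / factor[nnzeros]
    # Multiply every stored entry of row i by factor[i], in place.
    csr_scale_rows(pairs.shape[0], pairs.shape[1], pairs.indptr,
                   pairs.indices, pairs.data, factor)
    return pairs
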
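
For comparison, here is a rough sketch of the same in-place row scaling
written against public attributes only (indptr and data). The name
normalize_rows_public is made up for this example, and the sketch assumes a
floating-point CSR matrix. It is not what csr_scale_rows does internally, and
it allocates a temporary array with one entry per stored value, so it should
be somewhat less memory efficient than the C++ routine.

```python
import numpy as np
from scipy.sparse import csr_matrix


def normalize_rows_public(mat):
    """Scale each row of a float CSR matrix in place so that it sums to 1.

    Hypothetical helper for illustration. Rows whose sum is not positive
    (including empty rows) are scaled by 0.
    """
    if mat.format != "csr":
        raise ValueError("csr only")
    # Row sums as a flat 1-D float array.
    row_sums = np.asarray(mat.sum(axis=1), dtype=float).ravel()
    # Per-row scale factor: 1 / sum for positive sums, 0 otherwise.
    scale = np.zeros_like(row_sums)
    positive = row_sums > 0
    scale[positive] = 1.0 / row_sums[positive]
    # np.diff(indptr) is the number of stored entries in each row, so
    # np.repeat expands the per-row factors to one factor per stored entry.
    mat.data *= np.repeat(scale, np.diff(mat.indptr))
    return mat


# Small example: the middle row is empty.
pairs = csr_matrix(np.array([[1.0, 3.0, 0.0],
                             [0.0, 0.0, 0.0],
                             [2.0, 0.0, 2.0]]))
normalize_rows_public(pairs)
row_totals = np.asarray(pairs.sum(axis=1)).ravel()
```

After the call, row_totals should hold 1 for the two non-empty rows and 0 for
the empty one.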
I don't advise using this function if reliability is a concern, since it
relies on an undocumented, non-public module, but it works well for
matrices bigger than the ones you are mentioning.

Cheers,
David