Julia has support for sparse vectors and sparse matrices in the SparseArrays stdlib module. Sparse arrays are arrays that contain enough zeros that storing them in a special data structure leads to savings in space and execution time, compared to dense arrays.
In Julia, sparse matrices are stored in the Compressed Sparse Column (CSC) format. Julia sparse matrices have the type SparseMatrixCSCTv,Ti, where Tv is the type of the stored values, and Ti is the integer type for storing column pointers and row indices. The internal representation of SparseMatrixCSC is as follows:
The compressed sparse column storage makes it easy and quick to access the elements in the column of a sparse matrix, whereas accessing the sparse matrix by rows is considerably slower. Operations such as insertion of previously unstored entries one at a time in the CSC structure tend to be slow. This is because all elements of the sparse matrix that are beyond the point of insertion have to be moved one place over.
If you have data in CSC format from a different application or library, and wish to import it in Julia, make sure that you use 1-based indexing. The row indices in every column need to be sorted, and if they are not, the matrix will display incorrectly. If your SparseMatrixCSC object contains unsorted row indices, one quick way to sort them is by doing a double transpose. Since the transpose operation is lazy, make a copy to materialize each transpose.
In some applications, it is convenient to store explicit zero values in a SparseMatrixCSC. These are accepted by functions in Base (but there is no guarantee that they will be preserved in mutating operations). Such explicitly stored zeros are treated as structural nonzeros by many routines. The nnz function returns the number of elements explicitly stored in the sparse data structure, including non-structural zeros. In order to count the exact number of numerical nonzeros, use count(!iszero, x), which inspects every stored element of a sparse matrix. dropzeros, and the in-place dropzeros!, can be used to remove stored zeros from the sparse matrix.
Sparse vectors are stored in a close analog to compressed sparse column format for sparse matrices. In Julia, sparse vectors have the type SparseVectorTv,Ti where Tv is the type of the stored values and Ti the integer type for the indices. The internal representation is as follows:
The simplest way to create a sparse array is to use a function equivalent to the zeros function that Julia provides for working with dense arrays. To produce a sparse array instead, you can use the same name with an sp prefix:
The sparse function is often a handy way to construct sparse arrays. For example, to construct a sparse matrix we can input a vector I of row indices, a vector J of column indices, and a vector V of stored values (this is also known as the COO (coordinate) format). sparse(I,J,V) then constructs a sparse matrix such that S[I[k], J[k]] = V[k]. The equivalent sparse vector constructor is sparsevec, which takes the (row) index vector I and the vector V with the stored values and constructs a sparse vector R such that R[I[k]] = V[k].
The inverse of the sparse and sparsevec functions is findnz, which retrieves the inputs used to create the sparse array. findall(!iszero, x) returns the Cartesian indices of non-zero entries in x (including stored entries equal to zero).
Arithmetic operations on sparse matrices also work as they do on dense matrices. Indexing of, assignment into, and concatenation of sparse matrices work in the same way as dense matrices. Indexing operations, especially assignment, are expensive, when carried out one element at a time. In many cases it may be better to convert the sparse matrix into (I,J,V) format using findnz, manipulate the values or the structure in the dense vectors (I,J,V), and then reconstruct the sparse matrix.
The following table gives a correspondence between built-in methods on sparse matrices and their corresponding methods on dense matrix types. In general, methods that generate sparse matrices differ from their dense counterparts in that the resulting matrix follows the same sparsity pattern as a given sparse matrix S, or that the resulting sparse matrix has density d, i.e. each matrix element has a probability d of being non-zero.
Create a sparse matrix S of dimensions m x n such that S[I[k], J[k]] = V[k]. The combine function is used to combine duplicates. If m and n are not specified, they are set to maximum(I) and maximum(J) respectively. If the combine function is not supplied, combine defaults to + unless the elements of V are Booleans in which case combine defaults to . All elements of I must satisfy 1
Parent of and expert driver for sparse; see sparse for basic usage. This method allows the user to provide preallocated storage for sparse's intermediate objects and result as described below. This capability enables more efficient successive construction of SparseMatrixCSCs from coordinate representations, and also enables extraction of an unsorted-column representation of the result's transpose at no additional cost.
This method consists of three major steps: (1) Counting-sort the provided coordinate representation into an unsorted-row CSR form including repeated entries. (2) Sweep through the CSR form, simultaneously calculating the desired CSC form's column-pointer array, detecting repeated entries, and repacking the CSR form with repeated entries combined; this stage yields an unsorted-row CSR form with no repeated entries. (3) Counting-sort the preceding CSR form into a fully-sorted CSC form with no repeated entries.
Note that some work buffers will still be allocated. Specifically, this method is a convenience wrapper around sparse!(I, J, V, m, n, combine, klasttouch, csrrowptr, csrcolval, csrnzval, csccolptr, cscrowval, cscnzval) where this method allocates klasttouch, csrrowptr, csrcolval, and csrnzval of appropriate size, but reuses I, J, and V for csccolptr, cscrowval, and cscnzval.
Create a sparse vector S of length m such that S[I[k]] = V[k]. Duplicates are combined using the combine function, which defaults to + if no combine argument is provided, unless the elements of V are Booleans in which case combine defaults to .
Create an uninitialized mutable array with the given element type, index type, and size, based upon the given source SparseMatrixCSC. The new sparse matrix maintains the structure of the original sparse matrix, except in the case where dimensions of the output matrix are different from the output.
Create a sparse vector of length m or sparse matrix of size m x n. This sparse array will not contain any nonzero values. No storage will be allocated for nonzero values during construction. The type defaults to Float64 if not specified.
Parent of and expert driver for spzeros(I, J) allowing user to provide preallocated storage for intermediate objects. This method is to spzeros what SparseArrays.sparse! is to sparse. See documentation for SparseArrays.sparse! for details and required buffer lengths.
Note that some work buffers will still be allocated. Specifically, this method is a convenience wrapper around spzeros!(Tv, I, J, m, n, klasttouch, csrrowptr, csrcolval, csccolptr, cscrowval) where this method allocates klasttouch, csrrowptr, and csrcolval of appropriate size, but reuses I and J for csccolptr and cscrowval.
Construct a sparse diagonal matrix from Pairs of vectors and diagonals. Each vector kv.second will be placed on the kv.first diagonal. By default, the matrix is square and its size is inferred from kv, but a non-square size mn (padded with zeros as needed) can be specified by passing m,n as the first arguments.
Construct a sparse matrix with elements of the vector as diagonal elements. By default (no given m and n), the matrix is square and its size is given by length(v), but a non-square size mn can be specified by passing m and n as the first arguments.
This method was added in Julia 1.8. It mimics previous concatenation behavior, where the concatenation with specialized "sparse" matrix types from LinearAlgebra.jl automatically yielded sparse output even in the absence of any SparseArray argument.
Create a random length m sparse vector or m by n sparse matrix, in which the probability of any element being nonzero is independently given by p (and hence the mean density of nonzeros is also exactly p). The optional rng argument specifies a random number generator, see Random Numbers. The optional T argument specifies the element type, which defaults to Float64.
By default, nonzero values are sampled from a uniform distribution using the rand function, i.e. by rand(T), or rand(rng, T) if rng is supplied; for the default T=Float64, this corresponds to nonzero values sampled uniformly in [0,1).
You can sample nonzero values from a different distribution by passing a custom rfn function instead of rand. This should be a function rfn(k) that returns an array of k random numbers sampled from the desired distribution; alternatively, if rng is supplied, it should instead be a function rfn(rng, k).
Create a random sparse vector of length m or sparse matrix of size m by n with the specified (independent) probability p of any entry being nonzero, where nonzero values are sampled from the normal distribution. The optional rng argument specifies a random number generator, see Random Numbers.
Return a vector of the structural nonzero values in sparse array A. This includes zeros that are explicitly stored in the sparse array. The returned vector points directly to the internal nonzero storage of A, and any modifications to the returned vector will mutate A as well. See rowvals and nzrange.
4a15465005