Merging two distributed matrices?

bcsj

Jul 27, 2023, 5:13:25 AM
to SLATE User
If I have two SLATE matrices A and B, is there a simple built-in method in SLATE to join them into a matrix C by horizontal or vertical block stacking?

Assuming of course that the matrices have similar block sizes in the relevant dimension.

Ideally without reallocation, but I'll take copies too if it saves me from writing a function for it myself.

Mark Gates

Jul 28, 2023, 11:51:46 AM
to bcsj, SLATE User
It's easier to make a single matrix C and then split it into 2 matrices A and B.

Matrix C ...
// C = [ A, B ]
auto A = C.sub( i0, i1, j0, j1 );  // by tile indices i0, i1, j0, j1
auto B = C.sub( i1+1, i2, j0, j1 );
or
auto A = C.slice( ii0, ii1, jj0, jj1 );  // by row/col indices ii0, ii1, jj0, jj1
auto B = C.slice( ii1+1, ii2, jj0, jj1 );

If A and B are declared separately, then their distribution may not match the distribution of C, making things complicated. E.g., consider the usual 2D block-cyclic distribution with a 1 x 2 MPI process grid, and A has 1 x 3 tiles, B has 1 x 3 tiles, C has 1 x 6 tiles.

A tiles are on MPI processes [ 0, 1, 0 ]
B tiles are on MPI processes [ 0, 1, 0 ]
C tiles are on MPI processes [ 0, 1, 0, 1, 0, 1 ]

So B would necessarily need to be redistributed from [ 0, 1, 0 ] to [ 1, 0, 1 ]. Using the C.sub above avoids that issue; B will have tiles on [ 1, 0, 1 ].
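
The tile-to-process mapping in this example can be checked with a small standalone sketch. It does not use SLATE; `tileRank` and `rowRanks` are hypothetical helper names, and the formula assumes a 2D block-cyclic layout with ranks numbered column-major within the p x q process grid.

```cpp
#include <cstdint>
#include <vector>

// Owner rank of tile (i, j) on a p x q process grid.
// Assumption: 2D block-cyclic layout, ranks numbered column-major in the grid.
int tileRank(int64_t i, int64_t j, int p, int q)
{
    return static_cast<int>(i % p) + static_cast<int>(j % q) * p;
}

// Ranks owning the tiles in tile row i, tile columns j0..j1 (inclusive).
std::vector<int> rowRanks(int64_t i, int64_t j0, int64_t j1, int p, int q)
{
    std::vector<int> ranks;
    for (int64_t j = j0; j <= j1; ++j)
        ranks.push_back(tileRank(i, j, p, q));
    return ranks;
}
```

With p = 1 and q = 2, `rowRanks(0, 0, 2, 1, 2)` gives the layout of a separately declared 1 x 3 matrix, `rowRanks(0, 0, 5, 1, 2)` gives the layout of C, and `rowRanks(0, 3, 5, 1, 2)` gives the layout of the right half of C, which is what a sub-matrix of C inherits.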

Mark

Mark Gates

Jul 28, 2023, 11:55:31 AM
to bcsj, SLATE User
Sorry, I specified horizontal stacking, C = [ A, B ], but wrote code for vertical stacking. I fixed it below, as noted.

On Fri, Jul 28, 2023 at 11:51 AM Mark Gates <mga...@icl.utk.edu> wrote:
It's easier to make a single matrix C and then split it into 2 matrices A and B.

Matrix C ...
// C = [ A, B ]
auto A = C.sub( i0, i1, j0, j1 );  // by tile indices i0, i1, j0, j1
auto B = C.sub( i0, i1, j1+1, j2 );  // fixed
or
auto A = C.slice( ii0, ii1, jj0, jj1 );  // by row/col indices ii0, ii1, jj0, jj1
auto B = C.slice( ii0, ii1, jj1+1, jj2 );  // fixed

--
Innovative Computing Laboratory
University of Tennessee, Knoxville

bcsj

Jul 31, 2023, 4:54:54 AM
to SLATE User, mga...@icl.utk.edu, SLATE User, bcsj
That makes a lot of sense, thank you!

If I slice the matrix, is there anything I should be aware of when writing data to the "bottom" tiles, which might be only part of a full tile?
You know, since the array must be getting split up into chunks in memory?
Like if I do

auto A = C.slice(...);
auto tile = A(<"last index">, 0);
auto tiledata = tile.data();

for (int jj; ...) {
  for (int ii; ...) {
    tiledata[ii + jj * tile.stride()] = <"some_data">;
  }
}

will indexing and stride and so on just work?
What if I do A.tileGetForReading(...) with a different slate::LayoutConvert::Row-/ColMajor?

Should I worry about anything, or can I just expect that it will work correctly under the hood?

On a related, but slightly different note: if I have the data for populating the matrix in a CSV file, does SLATE have any built-in method for populating matrices from files, CSV or other types?
Maybe that is a question better for another thread though? Let me know and I'll make a new one.


Mark Gates

Jul 31, 2023, 8:32:59 AM
to bcsj, SLATE User
Indexing should work fine with slicing. If the top-left tile is sliced, some SLATE algorithms will fail. Most algorithms on the CPU should work fine, although some may make an assumption that tiles are a fixed size. Most algorithms on the GPU won't currently work because we use fixed-size batch BLAS.
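
To see which tiles end up partial after slicing, here is a minimal standalone sketch (no SLATE calls). It assumes the slice keeps the parent's tile boundaries and that the parent uses a uniform tile height mb; `sliceTileHeights` is a hypothetical helper, not a SLATE function.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Heights of the tile rows covered by a slice of rows ii0..ii1 (inclusive),
// assuming the parent matrix has a uniform tile height mb and the slice
// keeps the parent's tile boundaries.
std::vector<int64_t> sliceTileHeights(int64_t ii0, int64_t ii1, int64_t mb)
{
    std::vector<int64_t> heights;
    int64_t first = ii0 / mb;  // parent tile row containing ii0
    int64_t last  = ii1 / mb;  // parent tile row containing ii1
    for (int64_t i = first; i <= last; ++i) {
        int64_t begin = std::max(ii0, i * mb);
        int64_t end   = std::min(ii1, (i + 1) * mb - 1);
        heights.push_back(end - begin + 1);
    }
    return heights;
}
```

For mb = 4, slicing rows 0..9 yields heights 4, 4, 2 (only the bottom tile is partial), while slicing rows 2..9 yields 2, 4, 2, so the top tile is sliced as well, which is the case noted above as problematic for some algorithms.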

We don't have any routines for reading data right now, though that would certainly be useful.

Mark

bcsj

Jul 31, 2023, 11:37:24 AM
to SLATE User, mga...@icl.utk.edu, SLATE User, bcsj
Great!

I wrote the following function which reads a text file (comma delimited by default) into a matrix A. In case it is of interest to anyone.
I'm pretty new to C++, relatively speaking, so perhaps there are much better ways to do this, but at least it seems to work in my initial small-scale testing.

/// header.hh
#include <string>
#include <slate/slate.hh>

enum class Status : int {
    Error = 0,
    Success = 1,
};

template <typename scalar_t>
Status matrixFromFile(slate::Matrix<scalar_t> A,
    const std::string filename, const char delimiter = ',');

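/// code.cc
// Note: as a template, this definition must be visible wherever it is used,
// or be explicitly instantiated in this file for each scalar_t needed.
#include <cstdint>
#include <fstream>
#include <sstream>
#include "header.hh"
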
template <typename scalar_t>
Status matrixFromFile(
    slate::Matrix<scalar_t> A,
    const std::string filename,
    const char delimiter)
{
 
    std::ifstream file(filename);
    std::stringstream ss("");

    std::string line, element;

    int64_t mt = A.mt();
    int64_t nt = A.nt();

    int64_t skips, skip_counter = 0;
    int64_t mb, nb, i, j, ii, jj;
    for (i = 0; i < mt; i++) {
        mb = A.tileMb(i);
        for (ii = 0; ii < mb; ii++) {

            if (!std::getline(file, line))
                return Status::Error;
           
            ss << line;

            skip_counter = 0;
            for (j = 0; j < nt; j++) {
                nb = A.tileNb(j);
                if (A.tileIsLocal(i, j)) {

                    // Skip elements for non-local tiles.
                    for (skips = 0; skips < skip_counter; skips++)
                        if(!std::getline(ss, element, delimiter))
                            return Status::Error;
                    skip_counter = 0;

                    // First time accessing tile, getForWriting in RowMajor;
                    if (ii == 0) A.tileGetForWriting(i, j, slate::LayoutConvert::RowMajor);

                    // Get tile data
                    auto tile = A(i, j);
                    auto tiledata = tile.data();

                    // Insert line of elements
                    for (jj = 0; jj < nb; jj++) {
                       
                        if (!std::getline(ss, element, delimiter))
                            return Status::Error;

                        tiledata[jj + ii * tile.stride()] = stod(element); // string-to-double

                    }

                } else { skip_counter += nb; }
            }
           
            ss.str("");
            ss.clear();
        }
       
    }
    return Status::Success;

}