Reusing an LU factorization in time-dependent problems, building on part of a matrix


Kyle Schwiebert

Nov 13, 2021, 5:20:29 PM
to deal.II User Group
Hello all,

I am coding up a large time-dependent flow problem with many variables coupled together. The matrix is composed of 16 blocks, some of which are zero. However, only one of the blocks will change between time steps. Thus it seems I am wasting a lot of time reassembling the whole matrix. Can someone provide a reference to how I can assemble just one block of a BlockSparseMatrix? I know I can do it the obvious way--write a separate assembly routine, make a new matrix, fe_system, dof_handler and so on. Is there a better way?

Additionally, due to that structure, where most of the matrix does not change, I could probably solve much more efficiently too. Currently I am using UMFPACK. Figuring out and implementing a nice preconditioner is another project for another time. However, what I'm hoping is possible is some sort of scheme like:

1) Solve the system directly once, store the LU decomposition and how long it took to solve.
2) Switch to GMRES (nonsymmetric problem) and precondition the next solve with the precomputed LU decomposition. Time the computation.
3) If that solve took less time than the direct one, go back to step 2 for the next solve. If it took longer, go back to step 1.

I understand every part of this except for how to store and reuse the LU. From what (little) I understand, UMFPACK does literally store the entire LU. How can I get access to it directly? If I can't for whatever reason, I was thinking about using a similar setup with an ILU decomposition. However, I have the same problem there and am back to hoping that someone can tell me how to get access to such a thing. Note that I would like to avoid making a copy of the whole system matrix--since many of the preconditioner types take the matrix by reference, I can't change it and then still use the old values.

Thank you for any help that you can give. A quick copy-paste to a relevant example and a couple words about where to find the information would be more than sufficient help! I was not using the correct search terms to find such examples.

Wolfgang Bangerth

Nov 15, 2021, 6:13:26 PM
to dea...@googlegroups.com

Kyle:

> Additionally, due to that structure, where most of the matrix does not
> change, I could probably solve much more efficiently too. Currently I am
> using UMFPACK. Figuring out and implementing a nice preconditioner is
> another project for another time. However, what I'm hoping is possible
> is some sort of scheme like:
>
> 1) Solve the system directly once, store the LU decomposition and how
> long it took to solve.
> 2) Switch to GMRES (nonsymmetric problem) and precondition the next
> solve with the precomputed LU decomposition. Time the computation.
> 3) If that solve took less time than the direct one, go back to
> step 2 for the next solve. If it took longer, go back to step 1.

Yes, that scheme makes sense. The SparseDirectUMFPACK::factorize() and
SparseDirectUMFPACK::solve() functions make that possible: after
factorizing once, you call the solve() overload that takes only the
vector(s) as argument, which reuses the stored decomposition.
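
A minimal (untested) sketch of what this looks like; `system_matrix`, `solution`, and `rhs` are assumed to already exist, and the tolerances are illustrative:

```cpp
#include <deal.II/lac/solver_control.h>
#include <deal.II/lac/solver_gmres.h>
#include <deal.II/lac/sparse_direct.h>
#include <deal.II/lac/sparse_matrix.h>
#include <deal.II/lac/vector.h>

using namespace dealii;

// Factorize once, then reuse the stored LU: directly via solve(), or as a
// preconditioner for GMRES on a later, slightly changed matrix. Because
// SparseDirectUMFPACK provides vmult(), the object itself can be handed
// to the iterative solver as the preconditioner.
void solve_example(SparseMatrix<double> &system_matrix,
                   Vector<double>       &solution,
                   const Vector<double> &rhs)
{
  SparseDirectUMFPACK umfpack;
  umfpack.factorize(system_matrix); // expensive: computes and stores the LU

  // Direct solve: only a vector is passed; the stored LU is reused.
  solution = rhs;
  umfpack.solve(solution);

  // Later time steps: GMRES on the updated matrix, preconditioned with
  // the LU of the old matrix.
  SolverControl               control(1000, 1e-12 * rhs.l2_norm());
  SolverGMRES<Vector<double>> gmres(control);
  gmres.solve(system_matrix, solution, rhs, umfpack);
}
```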


> I understand every part of this except for how to store and reuse the
> LU. From what (little) I understand, UMFPACK does literally store the
> entire LU. How can I get access to it directly.

You don't. UMFPACK stores it internally, and solve() uses it.


> If I can't for whatever
> reason, I was thinking about using a similar setup with an ILU
> decomposition. However, I have the same problem there and am back to
> hoping that someone can tell me how to get access to such a thing. Note
> that I would like to avoid making a copy of the whole system matrix--as
> I notice many of the preconditioner types are pass by reference, I can't
> change it and then use the old values.

The situation is actually the same with ILUs: There are separate
functions that *compute* a factorization and that *use* a factorization.
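
For illustration, an untested sketch of the same pattern with SparseILU; again `system_matrix`, `solution`, and `rhs` are assumed to exist:

```cpp
#include <deal.II/lac/solver_control.h>
#include <deal.II/lac/solver_gmres.h>
#include <deal.II/lac/sparse_ilu.h>
#include <deal.II/lac/sparse_matrix.h>
#include <deal.II/lac/vector.h>

using namespace dealii;

// SparseILU<double>::initialize() *computes* the incomplete factorization;
// vmult() *applies* it, which is what GMRES calls on the preconditioner.
// So the ILU, too, is computed once and reused across solves.
void ilu_example(SparseMatrix<double> &system_matrix,
                 Vector<double>       &solution,
                 const Vector<double> &rhs)
{
  SparseILU<double> ilu;
  ilu.initialize(system_matrix, SparseILU<double>::AdditionalData());

  SolverControl               control(1000, 1e-12 * rhs.l2_norm());
  SolverGMRES<Vector<double>> gmres(control);
  gmres.solve(system_matrix, solution, rhs, ilu);
}
```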

Best
W.


--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@colostate.edu
www: http://www.math.colostate.edu/~bangerth/

Kyle Schwiebert

Nov 25, 2021, 11:03:32 PM
to deal.II User Group
Hello all,

I wasn't sure if I should start a new thread as this is a very related and basic question. Earlier I got the impression that switching from an exact LU (UMFPACK) to an ILU would be a good first try once UMFPACK started requiring too much memory for my problem. However, it appears that the SparseILU class does not play nicely with BlockSparseMatrix. I know there are probably better options out there, but I am not yet at the point of needing a highly optimized code with an optimal preconditioner. Are there any simple workarounds that would allow me to compute an incomplete LU decomposition of a BlockSparseMatrix interpreted as one big matrix? I hope I'm not missing anything too obvious: it doesn't look like BlockSparseMatrix is a derived class of SparseMatrix, and there doesn't seem to be an easy way to convert a BlockSparseMatrix to a SparseMatrix. At first I thought I could just apply SparseILU to each block (make a BlockSparseMatrix with each block being of type SparseILU), but I do not think that is equivalent, and it may not even be very helpful at all.

Thank you.

Wolfgang Bangerth

Nov 29, 2021, 11:46:32 PM
to dea...@googlegroups.com

> I wasn't sure if I should start a new thread as this is a very related and
> basic question. Before I got the impression that switching from an exact LU
> (UMFPACK) to an ILU would be a good first try once UMFPACK started requiring
> too much memory for my problem. However, it appears that the SparseILU class
> does not play nicely with BlockSparseMatrix.

That is correct. But if you want to just build a preconditioner for the entire
matrix, then there is also no reason to substructure the matrix into
individual blocks -- you might as well work with one matrix and one vector
that correspond to all variables at once.

Block matrices/vectors are a tool to build preconditioners that work on
specific blocks of a matrix. If that's not your goal, there is no need to use
this tool.

Kyle Schwiebert

Dec 4, 2021, 12:14:00 PM
to deal.II User Group
Hello,

Thank you for the reply. I totally get what you are saying about the block structure existing purely as a tool for solvers such as Schur complement methods. However, after some thought, I don't know how I would do some of the "tricks" I'm performing during matrix assembly outside of a block context.

Before I said that my system is a 4 by 4 block system (some blocks having dim components, others being scalar). However, only one block changes with time. Thus I can save lots of time in assembly using 

system_matrix.block(0,0) = 0;

followed by assembling only the corresponding block and skipping all other calculations with an if statement. I think the assembly function would still work if system_matrix were not a block matrix, but I haven't found an alternative for setting just one part of the matrix to zero. Unfortunately, forming only part of the matrix is such a massive speedup that it would be painful to drop that optimization. Could you suggest an alternative?
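
Concretely, my assembly loop looks roughly like this (a sketch, assuming the (0,0) block couples component 0 with itself; all names here -- dof_handler, fe, fe_values, constraints, n_q_points -- are placeholders for my actual variables, and the integrand is elided):

```cpp
// Zero only the time-dependent block, then recompute only the cell-matrix
// entries whose row and column both live in that block's component.
system_matrix.block(0, 0) = 0;

FullMatrix<double>                   cell_matrix(fe.dofs_per_cell,
                                                 fe.dofs_per_cell);
std::vector<types::global_dof_index> local_dof_indices(fe.dofs_per_cell);

for (const auto &cell : dof_handler.active_cell_iterators())
  {
    fe_values.reinit(cell);
    cell_matrix = 0;
    for (unsigned int i = 0; i < fe.dofs_per_cell; ++i)
      for (unsigned int j = 0; j < fe.dofs_per_cell; ++j)
        {
          // Skip entries belonging to blocks that do not change in time.
          if (fe.system_to_component_index(i).first != 0 ||
              fe.system_to_component_index(j).first != 0)
            continue;
          for (unsigned int q = 0; q < n_q_points; ++q)
            cell_matrix(i, j) += 0.; /* time-dependent integrand here */
        }
    cell->get_dof_indices(local_dof_indices);
    constraints.distribute_local_to_global(cell_matrix,
                                           local_dof_indices,
                                           system_matrix);
  }
```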

Regards,
Kyle

Wolfgang Bangerth

Dec 5, 2021, 11:35:54 PM
to dea...@googlegroups.com
On 12/4/21 10:14 AM, Kyle Schwiebert wrote:
>
> Followed by only assembling the corresponding block and skipping all other
> calculations with an if statement. I think the assembly function would still
> work if system_matrix were not a block matrix, but I haven't been able to
> discover an alternative for setting just one part of the matrix to zero.
> Unfortunately, forming only part of the matrix is such a massive speedup that
> it would be painful to have to drop that optimization. Could you suggest an
> alternative?

Not really. You've discovered the limits of the separate use cases for block
and non-block matrices :-(

The best I can offer is to use DoFTools::extract_dofs() to find the DoFs that
correspond to a specific vector component, and then loop over pairs i,j of
these DoFs to set the entries (i,j) in the matrix to zero. That will
ultimately not scale well to large problems (it's quadratic in the number of
DoFs that correspond to this vector component), but it may be good enough for
you. If that's too inefficient, you can do the same on a cell-by-cell basis,
based on calling fe.component_to_system_index().
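
The cell-by-cell variant would look something like the following (untested; `system_matrix`, `dof_handler`, and `fe` are whatever you already have, and `comp` is the vector component whose entries you want to rebuild):

```cpp
// Zero all entries of a non-block SparseMatrix that couple DoFs of vector
// component `comp` with each other, one cell at a time. Per-cell this is
// only quadratic in the number of DoFs on a cell, not in the global count.
const unsigned int                   comp = 0;
std::vector<types::global_dof_index> local_dof_indices(fe.dofs_per_cell);

for (const auto &cell : dof_handler.active_cell_iterators())
  {
    cell->get_dof_indices(local_dof_indices);
    for (unsigned int i = 0; i < fe.dofs_per_cell; ++i)
      for (unsigned int j = 0; j < fe.dofs_per_cell; ++j)
        if (fe.system_to_component_index(i).first == comp &&
            fe.system_to_component_index(j).first == comp)
          system_matrix.set(local_dof_indices[i], local_dof_indices[j], 0.);
  }
```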