Pencil Matrices of IVP and LBVP

Paul Michael Mannix

Feb 22, 2022, 11:14:11 AM
to Dedalus Users
Hi all, 

When I access the matrices of either the IVP or LBVP solver routines using:

inst = getattr(self.pencils[p], 'L')

for example, I notice that the left preconditioner 'pre_left' matrix is non-square, with shape (Nz, 2*Nz), while all the other matrices, including 'L' and 'L_exp' (which are already left preconditioned and fully left-and-right preconditioned, respectively?), are square, with shape (Nz, Nz).

Correspondingly, the evaluator output for the right-hand-side terms

self.F.get_pencil(self.pencils[pp]).shape

returns a vector of length (2*Nz).
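
For reference, here is roughly how I'm inspecting all of these shapes (a minimal sketch; solver stands for the built IVP/LBVP solver object):

# Print per-pencil matrix and right-hand-side shapes
for p, pencil in enumerate(solver.pencils):
    for name in ('pre_left', 'L', 'L_exp'):
        print(p, name, getattr(pencil, name).shape)
    print(p, 'F', solver.F.get_pencil(pencil).shape)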

Could you explain why these dimensions are different, and whether it is possible to return/reshape all vectors and matrices to shapes (Nz) and (Nz, Nz) respectively?

My question is motivated by the need to compute the discrete adjoint of the IVP and LBVP routines. When taking the Hermitian conjugate of the preconditioned operators, the order of matrix multiplication reverses, which results in problems of the form A x = b with non-square matrices.

Thank you in advance for your help; I'm happy to give more details about the problem, of course.

Best,
Paul

Keaton Burns

Feb 22, 2022, 11:27:04 AM
to dedalu...@googlegroups.com
Hi Paul,

The matrix names are a bit confusing, but putting that aside for a second, the linear system being solved for the LBVP (and similarly for the IVP, but with M and dt’s) can be written as:

L . X = F

If we left- and right-precondition this with sparse preconditioners PL and PR, we get:

PL . L . PR . PR^{-1} . X = PL . F

The preconditioners incorporate a number of things, including changes of basis, reordering of the entries, and dropping the equations for certain modes due to the equation “conditions”. This is why the preconditioners are not square in general. We can rewrite this as

(PL . L . PR) . Y = (PL . F)

This is the system that is efficient to solve. So the solution stages are:

Y = (PL . L . PR) \ (PL . F)
X = PR . Y
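
In plain scipy terms, the two stages look like this (a sketch with trivial toy matrices standing in for the per-pencil operators, just to show the two stages):

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

Nz = 8
PL = sp.identity(Nz, format='csr')                  # toy left preconditioner
PR = sp.identity(Nz, format='csr')                  # toy right preconditioner
L = sp.diags(np.arange(1.0, Nz + 1), format='csc')  # invertible toy operator
F = np.ones(Nz)

L_exp = (PL @ L @ PR).tocsc()      # fully preconditioned matrix
Y = spla.spsolve(L_exp, PL @ F)    # stage 1: solve (PL . L . PR) . Y = PL . F
X = PR @ Y                         # stage 2: recover X = PR . Y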

However, in some stages of the algorithm we need to explicitly compute (PL . L . X) for a given X, so we store both (PL . L) as “pencil.L” and (PL . L . PR) as “pencil.L_exp”. There are technical reasons for this notation, but they're not worth getting into here.

Hopefully that helps. I’d suggest working with the fully preconditioned version if possible. The only caveat is that you never want to actually form PR^{-1}, since it will be dense in general. We get away without doing that because we just solve for Y and then apply PR to recover X.

Best,
-Keaton

Paul Michael Mannix

Feb 23, 2022, 4:42:26 AM
to Dedalus Users
Hi Keaton,

Thanks for your reply; that definitely clarifies the matrix names associated with each pencil.

I appreciate your suggestion to work with the preconditioned system; however, for the optimal control problem I'm working on, this isn't possible. The equation I must solve is derived from Dedalus's preconditioned implementation of the LBVP and IVP solvers. I'm using these solvers to compute an objective function, e.g. the time-integrated kinetic energy of plane Couette flow, and in order to compute the gradient of this objective function I must then derive an adjoint equation, which is essentially the conjugate transpose of the Jacobian of the preconditioned IVP.

Deriving this equation results, for each pencil, in a system of the form (with ^H denoting the conjugate transpose)

(PL . L . PR)^H . PL . X = PR^H . F

where the matrices/vectors have shapes

(Nz x Nz) . (Nz x 2*Nz) . (Nz) = (Nz x Nz) . (2*Nz)

when accessed as per my previous message. Fortunately, forming PR^{-1} or other inverses doesn't arise, but there is the issue that this linear system doesn't make sense.
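
For concreteness, here is the per-pencil shape bookkeeping that produces the mismatch (a minimal sketch, with solver the built solver object):

for pen in solver.pencils:
    print(pen.L_exp.getH().shape)            # (Nz, Nz), the (PL . L . PR)^H factor
    print(pen.pre_left.shape)                # (Nz, 2*Nz)
    print(solver.F.get_pencil(pen).shape)    # (2*Nz,)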

It's for this reason that I'm wondering if it is possible to access/reshape PL and F to be (Nz x Nz) and (Nz) respectively, or perhaps there's an alternative solution?

In essence, while preconditioning makes the direct system banded and efficient to solve, there's a trade-off for these types of optimal control problems, as the adjoint operators get a bit uglier.

Thanks again for your help,

Best,
Paul

Paul Michael Mannix

Feb 25, 2022, 6:05:36 AM
to Dedalus Users
Hi, 

Could anyone offer any further guidance on this problem? Perhaps, in addition to the methods paper, there's documentation detailing the steps in constructing these non-square matrices and the right-hand-side evaluator?

I will naturally share any code which makes progress on this problem in due course, so that optimal control problems become more accessible for others in Dedalus. Furthermore, I think the above issue is closely related to a previous convergence issue (large gradient norm) flagged in this post: https://groups.google.com/g/dedalus-users/c/JjszhGB2usQ/m/4REB7zSFBAAJ

Thanks again in advance for your help.

Best,
Paul

Daniel Lecoanet

Feb 25, 2022, 9:54:32 AM
to Dedalus Users
Hi Paul,

I think the current documentation of this is limited to what we describe in the methods paper (probably Section IX is most relevant). Otherwise, you can take a look at the code and see how PL and PR are generated and what they look like.

Daniel

Calum Skene

Mar 1, 2022, 7:28:45 AM
to Dedalus Users
Hi
I have some thoughts about this.

As Paul points out, PL is non-square, of size (Nz x 2*Nz). Since PR is square, of size (Nz x Nz), for the matrix multiplication (PL . L . PR) to make sense, L must be of size (2*Nz x Nz), which means the original unpreconditioned system L . X = F is not square.

I believe this is not the case, and that there is instead a missing matrix. F should have the same size as X, i.e. (Nz), but self.F.get_pencil(self.pencils[pp]) has size (2*Nz). So I think that, in essence, self.F.get_pencil(self.pencils[pp]) embodies a linear operator of size (2*Nz x Nz), which we can call T. The original system L . X = F is then square (everything of size Nz), but in the Dedalus LBVP we see (T . L) . X = T . F, which is then preconditioned and solved. Therefore, to take the adjoint of the system we need to take T into account, and then all the matrix dimensions should work out. Is there any documentation on what T is? We can access its action via self.F.get_pencil(self.pencils[pp]), but for adjoint purposes we will require a transpose operation.

I hope this makes sense,
Best,
Calum

Keaton Burns

Mar 1, 2022, 9:35:39 AM
to dedalu...@googlegroups.com
Hi everyone,

It’s actually correct that the “original” system L . X = F is not square. The dimension of X is the total number of variable degrees of freedom (Nz * number of fields), while the dimension of F is the total number of equation degrees of freedom (Nz * number of equations). This is made into a square system via the equation conditions and a few other operations (dropping the last row for differential equations, and dropping all but the first row for boundary conditions). When you do “F.get_pencil”, that's fetching and concatenating the corresponding pencil data for the RHS of every equation. But not all of these equations apply to each pencil, and some modes we want to drop/exclude, as described above. This is done by the left preconditioner.

This setup is all meant for maximum efficiency and vectorization when copying between the fields and the pencil data, etc. So I think for most cases it really makes the most sense to just work with the fully preconditioned system, since that is always square and invertible (it's what the sparse linear solvers are actually solving), i.e.:

L_exp . Y = F_exp

where L_exp = PL . L . PR, X = PL . Y, and F_exp = PL . F.

Best,
-Keaton



Keaton Burns

Mar 1, 2022, 9:37:42 AM
to dedalu...@googlegroups.com
Whoops, typo at the end: it should be X = PR . Y.
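
Schematically, the full per-pencil solve is then (a sketch with scipy; pre_right is assumed to be the stored right preconditioner, by analogy with the pre_left attribute named earlier):

import scipy.sparse.linalg as spla

F_exp = pen.pre_left @ solver.F.get_pencil(pen)   # F_exp = PL . F
Y = spla.spsolve(pen.L_exp, F_exp)                # solve (PL . L . PR) . Y = F_exp
X = pen.pre_right @ Y                             # X = PR . Y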

Calum Skene

Mar 1, 2022, 11:37:32 AM
to Dedalus Users
Ah, my mistake. Thanks for the detailed explanation. 
So I now think that the adjoint system is completely fine, but the sizes of X and F are swapped for the transposed solve: i.e. the original system is L . X = F, and the transposed system is L^H . X_adj = F_adj, where X_adj has the size of F and F_adj has the size of X. Then all matrix operations agree.
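
A toy shape check of this bookkeeping (numpy only, with made-up sizes):

import numpy as np

n_var, n_eq = 64, 128                    # Nz * num_fields, Nz * num_equations
L = np.zeros((n_eq, n_var))              # rectangular "original" operator
X, F = np.zeros(n_var), np.zeros(n_eq)
assert (L @ X).shape == F.shape          # forward: L . X = F
X_adj, F_adj = np.zeros(n_eq), np.zeros(n_var)
assert (L.conj().T @ X_adj).shape == F_adj.shape   # adjoint: L^H . X_adj = F_adj
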
Best,
Calum

Paul Michael Mannix

Mar 3, 2022, 10:26:17 AM
to Dedalus Users
Hi All,

Thanks, that is really helpful. Apologies for my slow response; I just wanted to be certain that I've understood correctly. I understand your point about working with the preconditioned system, Keaton, but as I'm concerned with taking the adjoint of Dedalus's IVP implementation, I'm constrained to take the adjoint of the preconditioned IVP as it's implemented, and whatever this works out to be is what I must solve.

My current understanding is that while you write:

L . X = F

as

PL . L . PR . PR^{-1} . X = PL . F

the PL on the left- and right-hand sides of this equation are really not the same; instead, akin to what Calum suggested, we have something like

PL . L . PR . PR^{-1} . X = PL . T . F

where PL has dimension (Nz * number of fields, Nz * number of fields), and T, of dimension (Nz * number of fields, Nz * number of equations), maps the vector F, of size (Nz * number of equations), to a vector of the size of X, (Nz * number of fields). If this is the case (?), then the system I want to solve, when I work through the algebra for the problem, becomes

(PL . L . PR)^H . PL . X = PR^H . T . F

which makes sense, as the matrix dimensions are now compatible. In order to solve this, however, I need to split what Dedalus calls "pre_left", which is non-square, into a square PL and a non-square T, as I've termed them above.

Following Daniel's suggestion, I noticed that these are defined in the _build_coupled_matrices() method of the Pencil class in pencil.py. Do you foresee any problems with constructing these within this class for each pencil and adding them to the namespace? Or perhaps there's an easier way to extract them?
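
As a side note, one way to see which of the 2*Nz equation entries pre_left keeps is to inspect its columns directly (a sketch):

import numpy as np

pen = solver.pencils[0]
PL_full = pen.pre_left.toarray()
used = np.nonzero(PL_full.any(axis=0))[0]    # columns pre_left actually selects
print(PL_full.shape, len(used), 'columns used')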

Thanks again for your help and suggestions,
Paul

Calum Skene

Mar 21, 2022, 8:02:18 AM
to Dedalus Users
Hi
I think that the PL matrices are the same on both sides. I'm not 100% sure about the IVP, as I was looking into the LBVP part, but I think it will be similar. There is no T in the way that I originally stated: as Keaton clarified, for L . X = F, F has size Nz * number of equations and X has size Nz * number of fields in the state. So F is potentially larger because of the boundary conditions and substitutions, and L must have size (Nz * number of equations, Nz * number of fields). The transposed system can then be solved by the method you proposed, except that, as we are now solving L^T . X_adj = F_adj, F_adj has size Nz * number of fields and X_adj has size Nz * number of equations. So, taking account of the new sizes, the method you propose for the transposed system should work.

I think there is a sort of T that can come in for the IVP, as I imagine there is a function that takes the state X to the non-linear RHS + bcs, stacked before the linear system is solved. So, in a simplified manner, going forward one time step looks like: X_n --(form RHS)--> F --(solve a linear system of the type L . X_(n+1) = F)--> X_(n+1). Adjointing this would then be: X_adj(n+1) --(solve the adjoint linear system L^T . F_adj = X_adj(n+1))--> F_adj --(multiply by the Jacobian of the RHS formation)--> X_adj(n). In this way, the Jacobian of the non-linear RHS-forming operator takes the larger state F_adj back to the state X_adj(n).
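
In pseudo-Python, one adjoint step would then look roughly like this (a sketch only; L_exp is the square preconditioned operator Keaton described, and rhs_jacobian_H is a hypothetical placeholder for the transposed Jacobian of the RHS-forming map, which Dedalus does not provide):

import scipy.sparse.linalg as spla

F_adj = spla.spsolve(L_exp.getH().tocsc(), X_adj_next)   # solve L_exp^H . F_adj = X_adj(n+1)
X_adj = rhs_jacobian_H(F_adj)                            # pull back through the RHS map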

Does this make sense for what you are encountering in your problem?
Calum

Paul Michael Mannix

Mar 22, 2022, 12:03:03 PM
to Dedalus Users
Hi Calum,

The problem is similar for both the IVP and the LBVP, as the first always boils down to the second.

Unfortunately, PL is (Nz * number of fields, 2 * Nz * number of fields) and F is (2 * Nz * number of fields) in all the examples (using Chebyshev) I've checked (per pencil, of course), while all other vectors and matrices are (Nz * number of fields) and (Nz * number of fields, Nz * number of fields) respectively. While it makes sense that F has size Nz * number of equations, I don't see where this comes in.

As the equation I must solve is 

(PL . L . PR)^H . PL . X = PR^H . F
 
the problem doesn't make sense as it's stated, unless of course PL can be made square and F given the size (Nz * number of fields). Also, I'd imagine it's best not to change the dimension of X.

I've attached a script for the example Poisson problem, which shows that the sizes are as I've stated above.

Best,
Paul


Attachment: poisson.py

Calum Skene

Mar 22, 2022, 1:30:47 PM
to dedalu...@googlegroups.com
Hi
I think the bcs count as equations, so 2 equations + 2 bcs = 4 overall equations.
Best,
Calum
