Hello,
To explain my question, I first describe the background.
I am working on a bi-level optimization problem in which one of the variables is constrained to be orthogonal. The problem can be formulated as:
min_{W, U} f(V*; W, U)
s.t. U'U = I
V* = argmin_{V} g(V; X, U)
where U' is the transpose of matrix U, and I is the identity matrix.
To optimize this problem, I use the following algorithm:
-----------------------------
Step 1: Given the initial U, solve the lower-level problem and obtain V*;
Step 2: Given V*, compute the derivatives of the upper-level objective w.r.t. W and U;
Step 3: Update W and U with a gradient-based method using these derivatives;
Step 4: Repeat these steps until convergence, then output the optimal W and U.
-------------------------------
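For concreteness, the loop above might look like the sketch below, with hypothetical quadratic choices of f and g (the objectives, dimensions, and fixed step size here are my own illustrative assumptions, not part of the original problem). It also shows how a plain gradient update on U drifts away from orthogonality:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 20, 10, 3
X = rng.standard_normal((n, d)) / np.sqrt(n)      # toy data

# Hypothetical instances of the two objectives, only to make the loop concrete:
#   lower level: g(V; X, U) = ||X - V U'||_F^2   ->  V* = X U   when U'U = I
#   upper level: f(V*; W, U) = ||V* W U' - X||_F^2
U, _ = np.linalg.qr(rng.standard_normal((d, k)))  # initial U, orthonormal
W = np.eye(k)
eta = 0.1                                         # fixed step size

for _ in range(20):
    V = X @ U                      # Step 1: closed-form lower-level solution (exact while U'U = I)
    E = V @ W @ U.T - X            # upper-level residual
    grad_W = 2 * V.T @ E @ U       # Step 2: Euclidean derivatives w.r.t. W and U
    grad_U = 2 * E.T @ V @ W
    W = W - eta * grad_W           # Step 3: plain gradient updates
    U = U - eta * grad_U           # note: this update does NOT preserve U'U = I

# Step 4 repeats until convergence; after these updates U has left the
# constraint set:
print(np.linalg.norm(U.T @ U - np.eye(k)))  # nonzero
```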
However, as I mentioned in my last post, I failed to keep U orthogonal during its update.
Recently, I have had an idea: use the core tools in Manopt to enforce this orthogonality manually. I hope someone can advise me on whether this idea can work.
My idea is as follows:
-------------------------------------------
Step 1: Given the t-th iterate U_t and the Euclidean derivative d(f; U_t);
Step 2: Treating U as a point on the Stiefel manifold, use the related function 'egrad2rgrad' to obtain the Riemannian form of d(f; U_t);
Step 3: Use the 'line search' function to obtain the step size alpha_t;
Step 4: Use 'retraction_qr' to generate a new iterate U_{t+1} that is orthonormal, and output U_{t+1}.
------------------------------------------
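The update in Steps 2-4 can be sketched in plain NumPy (a minimal sketch of the standard Stiefel tangent-space projection and QR retraction, not Manopt's actual MATLAB code; the gradient G is a random stand-in for d(f; U_t), and a fixed alpha replaces the line search):

```python
import numpy as np

def egrad2rgrad(U, G):
    # Project the Euclidean gradient G onto the tangent space of the
    # Stiefel manifold at U:  G - U * sym(U' G)
    UtG = U.T @ G
    return G - U @ ((UtG + UtG.T) / 2)

def retract_qr(U, xi):
    # QR-based retraction: orthonormalize U + xi via a thin QR
    # factorization, fixing signs on diag(R) so the result is unique.
    Q, R = np.linalg.qr(U + xi)
    return Q * np.sign(np.sign(np.diag(R)) + 0.5)

# one Riemannian gradient step with a fixed step size
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((8, 3)))  # orthonormal U_t
G = rng.standard_normal((8, 3))                   # stand-in for d(f; U_t)
xi = egrad2rgrad(U, G)
U_next = retract_qr(U, -0.1 * xi)
print(np.allclose(U_next.T @ U_next, np.eye(3)))  # True: U_{t+1} is orthonormal
```

Whatever step size the line search returns (or any fixed alpha), the retraction guarantees that U_{t+1} stays exactly orthonormal.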
Besides, I also have a question about the step size alpha: could I set it to a fixed value, and does this choice affect the performance a lot?
Best wishes,
Shi