solomo...@gmail.com
May 21, 2015, 12:19:44 PM
to manopt...@googlegroups.com
Hello,
I'm working on a project that seems like it would lend itself well to implementation with Manopt, but my familiarity with optimizing over a manifold with a complicated objective function is limited. If there are any experts who could assist, I would greatly appreciate it.
A. Objective Data and Parameters
Data: a panel of descriptor data, x, where the number of observations per date is unbalanced, and y_x, the prediction target for each observation on each date.
Parameters to learn: 1. A feature dictionary, D (matrix), that is fixed (neither time- nor observation-varying, though this may be relaxed in future iterations); its columns are individually constrained to unit length. 2. y_d, a time series of vectors that are unconstrained and live in Euclidean space.
B. Objective Function (Cost)
Within each date, for simplicity of explanation, the objective function is characterized as:
argmin_{D, y_d(t)} 0.5 * (1/n(t)) * sum_i [ y_x,i(t) - x_i(t) * D * y_d(t) ]^2
Aggregating over all dates, the objective becomes the simple average over all periods (I won't write it out because it looks messy outside of LaTeX-style print).
C. Gradient
One of the issues with this formulation is that the standard matrix-derivative identities do not apply directly to x(t) * D * y_d(t) when the forms are matrix/matrix/vector. They do apply, however, when x(t) is a single row (a vector). Hence, for any given date slice, I believe I can average the per-observation derivatives in explicit form to get the derivative I care about for each date, and similarly average over all dates to get the derivative for the full panel.
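In matrix form the per-date averaging collapses into a closed-form expression, so no per-observation loop is needed. A sketch, assuming X_t is the n(t)-by-p matrix of observations for date t, y_x_t is n(t)-by-1, D is p-by-k, and y_d is k-by-1 (the variable names are mine, not from the original post):

```matlab
% Euclidean gradients of f_t = 0.5/n * sum_i (y_x_i - x_i * D * y_d)^2
r = y_x_t - X_t * D * y_d;               % n-by-1 vector of residuals
egrad_D  = -(1/n) * (X_t' * r) * y_d';   % p-by-k, same shape as D
egrad_yd = -(1/n) * D' * (X_t' * r);     % k-by-1, same shape as y_d
```

Averaging egrad_D over all dates then gives the full-panel gradient for D.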
Roadblocks:
1. Specifying Cost and Gradient.
I believe specifying the manifold can be done as easily as:
manifold = productmanifold(struct('D', obliquefactory(x_cols,20), 'y_d', euclideanfactory(x_rows,1)));
But specifying the cost and gradient is more complicated. Is it OK to write a cost function with a loop in it and then reference the handle? Similarly for the gradient: I believe I need to specify the updates for the two variables differently, since they are being optimized over different manifolds, but both are similarly complicated and I do not quite understand how to combine them.
One vexing issue is that the y_d vector is date-specific and can be updated with a simpler cost/gradient structure, but the update for D needs the data for all dates. For this reason I think that specifying the updates for D and y_d(t) on each date, and then averaging over all t, is the preferred way to get the correct update for D.
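For what it's worth, a loop inside the cost handle is fine, and on a product manifold the gradient is returned as a struct with the same field names as the point. A sketch under simplifying assumptions: cell arrays Xs{t} (n_t-by-p) and yxs{t} (n_t-by-1) hold the per-date data, k = 20 atoms, and y_d is shared across dates (the time-varying case would enlarge the product manifold); the helper names mycost/myegrad are hypothetical:

```matlab
% Hypothetical sketch of the cost/gradient on the product manifold.
manifold = productmanifold(struct( ...
    'D',   obliquefactory(p, k), ...        % columns of D have unit norm
    'y_d', euclideanfactory(k, 1)));
problem.M = manifold;
problem.cost  = @(Z) mycost(Z, Xs, yxs);
problem.egrad = @(Z) myegrad(Z, Xs, yxs);   % Euclidean grad; Manopt projects it

function f = mycost(Z, Xs, yxs)
    T = numel(Xs); f = 0;
    for t = 1:T
        r = yxs{t} - Xs{t} * Z.D * Z.y_d;   % residuals for date t
        f = f + 0.5 * (r' * r) / numel(r);
    end
    f = f / T;                              % average over all dates
end

function g = myegrad(Z, Xs, yxs)
    T = numel(Xs);
    g.D = zeros(size(Z.D)); g.y_d = zeros(size(Z.y_d));
    for t = 1:T
        r = yxs{t} - Xs{t} * Z.D * Z.y_d;
        n = numel(r);
        g.D   = g.D   - (Xs{t}' * r) * Z.y_d' / n;
        g.y_d = g.y_d - Z.D' * (Xs{t}' * r) / n;
    end
    g.D = g.D / T; g.y_d = g.y_d / T;       % average the per-date gradients
end
```

Running checkgradient(problem) is a good way to confirm the loop-and-average gradient matches the cost before solving.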
2. Solving.
Once I have the problem specified, I would like to solve it by gradient descent: get back the Euclidean update at each step, define a learning rate, take the 'step', and continue to iterate. Is it possible to do this, or is there a more efficient way?
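Manopt's built-in steepestdescent solver does this with an automatic line search, but a fixed-learning-rate loop can also be sketched by hand using the manifold's retraction (alpha and maxiter below are hypothetical choices, and Z is a point on the product manifold from the problem structure above):

```matlab
% Hand-rolled Riemannian gradient descent with a fixed step size.
alpha = 1e-2; maxiter = 500;               % assumed hyperparameters
Z = problem.M.rand();                      % random initial point
for iter = 1:maxiter
    grad = getGradient(problem, Z);        % Riemannian gradient (from egrad)
    Z = problem.M.retr(Z, grad, -alpha);   % step of -alpha along grad, then
end                                        % retract back onto the manifold

% Or let the built-in solver pick the step size:
% [Z, fval] = steepestdescent(problem);
```

The retraction is what keeps the columns of Z.D at unit length after each step, which is why a plain Euclidean update would not work here.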