Manopt on GPU?

stum...@gmail.com

Apr 27, 2018, 6:54:22 PM
to Manopt
Would it be feasible to execute the Manopt algorithms on a GPU card for large optimization problems? MATLAB is able to perform the computation of many functions on the GPU: https://www.mathworks.com/discovery/matlab-gpu.html

BM

Apr 27, 2018, 11:26:33 PM
to Manopt
Hello,

It should work. We tried it some time ago and it worked then.

Regards,
BM

stum...@gmail.com

Apr 27, 2018, 11:33:08 PM
to Manopt
How does one modify Manopt to execute the computations on the GPU? Does one just use gpuArray to declare the cost function parameters and then use gather on the output from the Manopt solver? Is there sample code demonstrating this?

BM

Apr 27, 2018, 11:49:17 PM
to Manopt
We don't have the code anymore. It was only an initial trial, a very basic test on the data input side, and it might not have been a full Manopt program.

Nicolas Boumal

Apr 28, 2018, 2:31:59 PM
to Manopt
Hello,

what did you have in mind specifically? A particular manifold?

In general, you can code the cost function and its derivatives however you like. But I imagine you'd like to avoid having to call gpuArray(...) and gather(...) on the point on the manifold at every call. In that case, I suspect that one may need to create a new factory (a function that creates a manifold structure), which would look very similar to the "standard" factory, except that it would explicitly work with gpuArrays all the time, both for points and for tangent vectors. That shouldn't be too complicated, and we'll be happy to help.

Best,
Nicolas

stum...@gmail.com

Apr 29, 2018, 1:16:11 PM
to Manopt
I'm interested in a cost function defined on n copies of the complex circle.

Nicolas Boumal

Apr 30, 2018, 10:28:52 AM
to manopt...@googlegroups.com
Looking at complexcirclefactory.m (link to github):

I don't have access to a GPU right now -- can you please try the attached factory and let me know if that works out?

It assumes every point on the manifold, every tangent vector and every "ambient vector" (for example, the classical gradient) is a gpu array. So:

 - in the cost function, you should gather(...) the value of the cost function before returning it; your input point z will already be on the GPU.
 - in the gradient (egrad or grad, or whatever you use), your input z is on the GPU, and your output should also be on the GPU.
 - if you supply an initial guess z0, you need that to be on the GPU as well.
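To make the three points above concrete, here is a minimal, untested sketch. It assumes the attached factory is called as complexcirclefactory_gpu(n), mirroring complexcirclefactory(n); the matrix C, the size n and the choice of solver are made up for the illustration. It needs Manopt on the path and a GPU that MATLAB supports.

```matlab
% Hypothetical problem: minimize -real(z'*C*z) over n copies of the complex circle.
n = 500;
B = randn(n) + 1i*randn(n);
C = gpuArray((B + B')/2);            % Hermitian data, moved to the GPU once

% Assumed call signature for the factory attached to this thread.
problem.M = complexcirclefactory_gpu(n);

% The input z is already a gpuArray; gather only the scalar cost value.
problem.cost = @(z) gather(-real(z'*(C*z)));

% The Euclidean gradient receives a gpuArray and returns a gpuArray.
problem.egrad = @(z) -2*(C*z);

% The initial guess must live on the GPU as well.
z0 = randn(n, 1, 'gpuArray') + 1i*randn(n, 1, 'gpuArray');
z0 = z0 ./ abs(z0);                  % unit-modulus entries

% Any Manopt solver could be used here; conjugategradient is just one choice.
z = conjugategradient(problem, z0);

% Bring the solution back to the CPU once, at the end.
z_cpu = gather(z);
```

The data C is transferred once, outside the function handles, so the only CPU/GPU traffic per iteration is the scalar returned by the cost.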

I didn't have a chance to test it, so don't hesitate to hack into it and to ask questions here until we figure it out.

Best,
Nicolas
complexcirclefactory_gpu.m

stum...@gmail.com

May 1, 2018, 7:54:16 PM
to Manopt
Your GPU modification of complexcirclefactory seems to work on my GPU. It is possible to create certain kinds of arrays directly on the GPU, which may be more efficient than creating them on the CPU and transferring them to the GPU using gpuArray: https://www.mathworks.com/help/distcomp/establish-arrays-on-a-gpu.html#bspvmhe-1
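For illustration, the two routes look roughly like this. This is a small sketch with a made-up size n; it requires a GPU that MATLAB supports, and whether the direct route is actually faster would have to be measured.

```matlab
n = 1000;

% Route 1: create the array on the CPU, then transfer it to the GPU.
a = gpuArray(randn(n, 1));

% Route 2: create the array directly on the GPU via the 'gpuArray' option.
b = randn(n, 1, 'gpuArray');
o = zeros(n, 1, 'gpuArray');

% A random point on n copies of the complex circle, built entirely on the GPU.
z = randn(n, 1, 'gpuArray') + 1i*randn(n, 1, 'gpuArray');
z = z ./ abs(z);
```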

I modified your script to create the arrays directly on the GPU. How do I attach a file to the post?

Stuart Rogers

May 1, 2018, 7:59:13 PM
to Manopt
Attached is the modification of your script that creates the arrays directly on the GPU.

complexcirclefactory_gpu.m

Nicolas Boumal

May 1, 2018, 8:50:32 PM
to Manopt
Nice, thanks for the feedback!

Is execution faster as a result?

Best,
Nicolas

stum...@gmail.com

May 1, 2018, 8:58:32 PM
to Manopt
Yes, I have a specific example that takes 1349 seconds on the CPU but only 146 seconds on the GPU, a speedup of roughly 9x. The cost function for this example is described at https://groups.google.com/forum/#!topic/manopttoolbox/GCGzFMp9gH8, for which you provided the gradient and Hessian. In this example, the matrix A has dimensions 3200 x 4000.

Nicolas Boumal

May 1, 2018, 9:14:23 PM
to Manopt
That's great!

Now I wonder if we could have a tool that "GPU's" a factory.

Functions such as inner, norm, dist are easy: just need to compose them with a gather().

The tricky part is the functions that create arrays, as this is rather specific. Currently, I don't see a way to automate it without touching the original factories.

Perhaps the easiest way will be to allow for an extra input in the factory call that specifies whether we want the GPU version, and have the code work for both cases. It shouldn't be terrible. We can for example have:

if gpuflag
   collect = @gather;
else
   collect = @(x) x;
end

Then, everywhere you would normally gather(), you just collect() instead. It will do nothing if we're not in the GPU version. That's nice to limit code changes, but it involves an extra "dummy" call to "@(x) x" for the CPU version, which is still going to be the main use of Manopt. So perhaps it's not worth it.

Another way is to have the different versions of the functions in there, as dist() and dist_gpu() and just assign M.dist accordingly. This way, once the structure M is created, there is no branching of any sort anymore.
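A minimal sketch of this second option, using a toy distance on n copies of the complex circle rather than the actual Manopt code. The names dist_cpu, dist_gpu and gpuflag are placeholders. With gpuflag = false, the snippet runs without a GPU.

```matlab
gpuflag = false;   % set to true on a machine with a supported GPU

% CPU version: per-entry angle between x and y, aggregated with the 2-norm.
% The clamp to [-1, 1] guards acos against round-off.
dist_cpu = @(x, y) norm(acos(max(-1, min(1, real(conj(x) .* y)))));

% GPU version: same computation on gpuArray inputs, result brought to the CPU.
dist_gpu = @(x, y) gather(dist_cpu(x, y));

% The branch is taken once, when the structure is built;
% later calls to M.dist involve no test on gpuflag.
M = struct();
if gpuflag
    M.dist = dist_gpu;
else
    M.dist = dist_cpu;
end
```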

If you have any thoughts on this, they are very welcome.

Best,
Nicolas

BM

May 2, 2018, 12:30:58 AM
to Manopt
Hello Nicolas, 

I like your second option, which is to have *_gpu functions. Should we include them in the same factory file, or keep them in a separate file devoted to GPU functionality and have a mergeOptions-style mechanism at a high level for factories?

Regards,
Bamdev

Nicolas Boumal

May 2, 2018, 8:03:03 AM
to Manopt
Hello Bamdev,

I didn't mean to have separate files. I think this would create a lot of code duplication. If you look at the example worked out here, changes to the existing factory are really minor.

I'm thinking we could simply have an extra input that indicates if we want the gpu version or not.

Inside the code, we can have

if gpuflag
    gpustring = 'gpuArray';
else
    gpustring = 'double';
end

And we pass this string to any array-creating function such as zeros, eye, randn, etc. (need to check what's needed and available.)
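As a rough sketch of how such a string could be threaded through the array-creating fields of a factory: the fields below are simplified stand-ins for the complex circle, not the actual Manopt code, and n is made up. With gpuflag = false, this runs on the CPU only.

```matlab
n = 3;
gpuflag = false;   % set to true on a machine with a supported GPU

if gpuflag
    gpustring = 'gpuArray';
else
    gpustring = 'double';
end

% Helper that scales each entry to unit modulus.
normalize = @(z) z ./ abs(z);

M = struct();
% Zero tangent vector, created directly in the requested array type.
M.zerovec = @(x) zeros(n, 1, gpustring);
% Random point on n copies of the complex circle, same array type.
M.rand = @() normalize(randn(n, 1, gpustring) + 1i*randn(n, 1, gpustring));
```

The same function handles serve both cases; only the string changes.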

For the collects, we can just do this actually:

Have a generic tool that takes M and a list of field names, and returns the same M but with indicated fields composed with gather. That's a one liner and it's flexible. We only call it if gpuflag is true.

What do you think? This would require very small changes, it seems to me, and also would introduce no difference at all if gpuflag is false.
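One possible shape for that tool, as a sketch only: the name compose and the toy fields are placeholders, not existing Manopt functions. The fields are wrapped only when gpuflag is true, so the CPU structure is left untouched.

```matlab
gpuflag = false;   % set to true on a machine with a supported GPU

% Toy structure standing in for the output of a factory.
M = struct();
M.inner = @(x, u, v) real(u(:)'*v(:));
M.norm  = @(x, u) norm(u(:));

% Generic composition: returns a handle that computes post(f(...)).
compose = @(f, post) @(varargin) post(f(varargin{:}));

if gpuflag
    % Fields whose (scalar) outputs should be brought back to the CPU.
    names = {'inner', 'norm'};
    for k = 1:numel(names)
        M.(names{k}) = compose(M.(names{k}), @gather);
    end
end
```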

stum...@gmail.com

May 2, 2018, 3:33:02 PM
to Manopt
You also need to handle u_vec for the GPU:
M.mat = @(x, u_vec) gpuArray(u_vec(1:n) + 1i*u_vec((n+1):end));

Nicolas Boumal

May 2, 2018, 3:58:48 PM
to Manopt
I'm not sure what you mean? The documentation states u_vec is a real, column vector. By default it wouldn't be on the GPU (and I'm not sure it should).

stum...@gmail.com

May 2, 2018, 4:00:45 PM
to Manopt
On Wednesday, May 2, 2018 at 2:33:02 PM UTC-5, stum...@gmail.com wrote:
> You also need to handle u_vec for the GPU:
> M.mat = @(x, u_vec) gpuArray(u_vec(1:n) + 1i*u_vec((n+1):end));

You put that line of code in your original GPU modification of complexcirclefactory.

Nicolas Boumal

May 2, 2018, 4:06:15 PM
to Manopt
Right, so we'll need to gpuArray that function for other factories as well, I agree.

Is it fine to have structures / cells whose fields / elements are on the GPU? I imagine so. (I'm thinking about productmanifold and powermanifold.)

stum...@gmail.com

May 2, 2018, 4:08:40 PM
to Manopt
Yes, the fields of a structure can be on the GPU.

stum...@gmail.com

May 3, 2018, 12:11:54 AM
to Manopt
Is it correct to gather u_mat but gpuArray u_vec?

M.vec = @(x, u_mat) gather([real(u_mat) ; imag(u_mat)]);

Nicolas Boumal

May 3, 2018, 9:44:08 AM
to Manopt
It is correct in that this is what every tool that uses M.vec / M.mat expects:

M.vec takes a tangent vector as input (that one sits on the GPU, it could be in any format) and outputs a real column-vector; tools using this function do not expect the result to be on the GPU.

M.mat is the inverse of M.vec, so it should take as input a column-vector that's not on the GPU, and output a tangent vector (on the GPU, in this case.)
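A small round-trip check of this convention, as a sketch: to_gpu and to_cpu are placeholder handles that reduce to the identity when gpuflag is false, so the snippet also runs without a GPU. The vector u is an arbitrary complex vector standing in for a tangent vector.

```matlab
n = 4;
gpuflag = false;   % set to true on a machine with a supported GPU

if gpuflag
    to_gpu = @gpuArray;
    to_cpu = @gather;
else
    to_gpu = @(a) a;
    to_cpu = @(a) a;
end

M = struct();
% vec: tangent vector (on the GPU) -> real column vector on the CPU.
M.vec = @(x, u_mat) to_cpu([real(u_mat); imag(u_mat)]);
% mat: real column vector on the CPU -> tangent vector (on the GPU).
M.mat = @(x, u_vec) to_gpu(u_vec(1:n) + 1i*u_vec((n+1):end));

u = to_gpu(randn(n, 1) + 1i*randn(n, 1));
v = M.vec([], u);            % real, length 2n, on the CPU
u_back = M.mat([], v);       % same array type as u
err = norm(to_cpu(u_back - u));
```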

This is not too important though. It seems that the only tool that uses M.vec / M.mat is hessianspectrum. It is used to create a function handle for a linear operator (called hess_vec) that has the same spectrum as the Riemannian Hessian (possibly with a few extra zero eigenvalues.) This is typically used for development / research, not to solve a problem. If a need emerges for mat / vec on the GPU, we can think about what's the right approach here.

Nicolas

Nicolas Boumal

Aug 3, 2018, 6:27:04 PM
to Manopt
Update:

On the latest GitHub version of Manopt, support for GPU has been added for a few factories, namely, spherefactory, stiefelfactory and complexcirclefactory.

It's quite trivial to use; an example is available in the examples folder: using_gpu.m

Adapting factories (manifolds) to make them work on GPU is quite easy: let us know here if there are factories you would like us to adapt, it should be quick.

(There have been a lot of updates recently; this will all soon be released into a new numbered version.)

Best,
Nicolas

Nicolas Boumal

Aug 3, 2018, 6:59:27 PM
to Manopt
(And now grassmannfactory too)