Sift on GPU

Jérôme Kieffer

Jun 12, 2013, 4:51:02 PM
to scikit...@googlegroups.com
Dear Pythonistas,

We are porting the SIFT keypoints extraction algorithm (available from IPOL)
to GPU using PyOpenCL. For the moment, the keypoint location works and
shows a speed-up of 5 to 10x (without tuning so far, vs C++).

A lot of work remains, especially:
* limit the memory footprint (currently 700 MB for a 10 Mpix image)
* calculate the descriptor for each keypoint
* keypoint matching and image alignment
* best interleaving of IO/CPU/GPU
but we managed to port the trickiest part to OpenCL (without using
textures, so it also runs on multi-core CPUs).
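
To put the memory figure above in perspective, here is a quick back-of-the-envelope calculation. It assumes decimal megabytes and 4-byte float32 buffers; neither is stated in this post, so treat the result as a rough indication only.

```python
# Rough estimate based on the 700 MB / 10 Mpix figure quoted above.
footprint_bytes = 700e6   # ~700 MB as reported (decimal MB is an assumption)
n_pixels = 10e6           # 10 Mpix image
bytes_per_float32 = 4     # assumption: intermediate buffers hold float32 values

bytes_per_pixel = footprint_bytes / n_pixels
buffers_per_pixel = bytes_per_pixel / bytes_per_float32

print(f"{bytes_per_pixel:.0f} bytes per input pixel")
print(f"equivalent to ~{buffers_per_pixel:.1f} full-resolution float32 buffers")
```

Under those assumptions the footprint corresponds to about 70 bytes per input pixel, which is the number any memory optimisation would have to bring down.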

I would like to thank the people who published their algorithm on IPOL,
which makes unit testing possible.

Last but not least, the code is open source and should have a BSD
licence (even if there is a patent on the algorithm in the USA).
https://github.com/pierrepaleo/sift_pyocl

Cheers,

--
Jérôme Kieffer <goo...@terre-adelie.org>

Marc de Klerk

Jun 14, 2013, 5:36:39 AM
to scikit...@googlegroups.com
Hi Jérôme,

I cloned the repo and tried running test_all.py.
It seems there are a couple of bugs in test_image_functions.py that prevent it from executing properly.

Is there an example somewhere that I can play with?

Cheers,
Marc

Jérôme Kieffer

Jun 15, 2013, 7:21:53 AM
to scikit...@googlegroups.com
Dear Marc,


On Fri, 14 Jun 2013 02:36:39 -0700 (PDT)
Marc de Klerk <dekl...@gmail.com> wrote:

> I cloned the repo and tried running test_all.py,
> Seems there are a couple bugs in test_image_functions.py that prevent it
> from executing properly.

This is quite possible: we still have small differences in the
number of keypoints compared with the C++ implementation. Moreover, the keypoint
localization can vary by up to 1 pixel (to be multiplied by the number of octaves).
This looks like a rounding error, but we have not found it yet.
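
A tolerance-based comparison is one way to check two keypoint lists against each other when positions may differ by about a pixel. The sketch below is only an illustration: `match_keypoints` is a hypothetical helper, not part of sift_pyocl or its test suite, keypoints are reduced to plain (x, y) tuples, and the tolerance is left as a parameter so the caller can scale it per octave.

```python
import math

def match_keypoints(reference, candidate, tol=1.0):
    """Greedily pair each reference keypoint with the nearest unused candidate.

    Keypoints are (x, y) tuples; a pair is accepted only if the two points
    are at most `tol` pixels apart. Returns (matches, missing, extra).
    """
    unused = list(candidate)
    matches, missing = [], []
    for ref in reference:
        best = min(
            unused,
            key=lambda kp: math.hypot(kp[0] - ref[0], kp[1] - ref[1]),
            default=None,
        )
        if best is not None and math.hypot(best[0] - ref[0], best[1] - ref[1]) <= tol:
            matches.append((ref, best))
            unused.remove(best)   # each candidate can be matched only once
        else:
            missing.append(ref)
    return matches, missing, unused

# Made-up coordinates, purely for illustration.
cpp_keypoints = [(10.0, 20.0), (55.5, 40.0), (100.0, 7.0)]
gpu_keypoints = [(10.4, 20.3), (55.9, 40.2), (300.0, 300.0)]

matches, missing, extra = match_keypoints(cpp_keypoints, gpu_keypoints, tol=1.0)
print(len(matches), "matched,", len(missing), "missing,", len(extra), "extra")
```

The greedy nearest-neighbour pairing is the simplest choice; with dense keypoints a proper assignment method would be more robust.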

> Is there an example somewhere that I can play with?

Get the reference implementation:

git clone --branch numpy git://github.com/kif/imageAlignment.git
cd imageAlignment
python setup.py build
sudo python setup.py install #or modify your PYTHONPATH
cd ..

git clone git://github.com/kif/sift_pyocl.git
cd sift_pyocl/test
python test_all.py # I got (failures=2, errors=2, mainly because API changed faster than tests)
python crash.py

This should show you the keypoints (red and blue arrows represent the orientation and the scale; our errors are in green).

Tell me if you are making progress (or not).

Jerome Kieffer

Sep 5, 2013, 11:44:38 AM
to scikit...@googlegroups.com
On Fri, 14 Jun 2013 02:36:39 -0700 (PDT)
Marc de Klerk <dekl...@gmail.com> wrote:

> Hi Jérôme,
>
> I cloned the repo and tried running test_all.py,
> Seems there are a couple bugs in test_image_functions.py that prevent it
> from executing properly.
>
> Is there an example somewhere that I can play with?
>

Hi Marc,

We have fixed most tests under Linux + PyOpenCL + NVIDIA GPU (Fermi and Kepler).
* With the AMD/Intel CPU drivers, some tests don't pass (but the library is functional and working)
* With NVIDIA GT200, a few kernels crash (but the library is functional and working)
* With the NVIDIA 9600, many kernels crash, but the library is able to use CPU kernels
* With older NVIDIA cards where atomic operations do not exist at all, there is no way to get it working.

The problem we encounter is that kernels designed for GPU do not behave properly on CPU, and vice versa.

This is completely untested on other platforms such as Windows or Mac OS X, and
with ATI graphics cards, but feedback would be welcome.
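
For anyone reporting feedback from another platform, it helps to say which OpenCL platforms and devices were visible. The following is a minimal sketch using PyOpenCL's standard enumeration calls (`get_platforms`, `get_devices`, `device_type`). The two helper functions and the (platform index, device index) convention are assumptions made for illustration, not part of sift_pyocl.

```python
def list_opencl_devices():
    """Return [(platform_index, device_index, kind, name), ...].

    Returns an empty list if PyOpenCL is not installed or no platform is found.
    """
    try:
        import pyopencl as cl
    except ImportError:
        return []
    found = []
    try:
        for p_idx, platform in enumerate(cl.get_platforms()):
            for d_idx, device in enumerate(platform.get_devices()):
                if device.type & cl.device_type.GPU:
                    kind = "GPU"
                elif device.type & cl.device_type.CPU:
                    kind = "CPU"
                else:
                    kind = "OTHER"
                found.append((p_idx, d_idx, kind, device.name))
    except cl.Error:
        pass  # e.g. no OpenCL driver installed
    return found

def pick_device(devices, preferred="GPU"):
    """Return (platform_index, device_index) of the first device of the
    preferred kind, falling back to the first device of any kind, or None."""
    for p_idx, d_idx, kind, _name in devices:
        if kind == preferred:
            return (p_idx, d_idx)
    return devices[0][:2] if devices else None

# Hypothetical machine with one CPU driver and one GPU driver.
sample = [(0, 0, "CPU", "Hypothetical CPU"), (1, 0, "GPU", "Hypothetical GPU")]
print(pick_device(sample, "GPU"))
print(pick_device(sample, "CPU"))

if __name__ == "__main__":
    # On a real machine, print what PyOpenCL actually sees.
    for entry in list_opencl_devices():
        print(entry)
```

Separating the enumeration from the selection keeps the selection logic testable on machines without any OpenCL driver.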

To run the tests: run test/test_all.py
For a small demo: run test/demo_match.py

There is comprehensive Sphinx documentation. The repository should now be:
https://github.com/kif/sift_pyocl

Cheers,
--
Jerome Kieffer <goo...@terre-adelie.org>

Andreas Mueller

Sep 10, 2013, 3:43:24 AM
to scikit...@googlegroups.com
Hi Jerome.
Have you benchmarked against vl_feat?
They have pretty optimized C code, and I think it would be very
interesting to see if you are faster. Their code is also BSD, btw.

Cheers,
Andy

Jerome Kieffer

Sep 10, 2013, 8:23:27 AM
to scikit...@googlegroups.com, Andreas Mueller, pierre...@gmail.com, Jerome Kieffer
On Tue, 10 Sep 2013 09:43:24 +0200
Andreas Mueller <amue...@ais.uni-bonn.de> wrote:

> Hi Jerome.
> Have you benched against vl_feat.
> They have pretty optimized C code, and I think it would be very
> interesting to see
> if you are faster. Their code is also BSD, btw.

Hello Andreas,

Thanks for the link.
This is the first time I have used MATLAB, so the comparison is likely to be unfair:

I = imread(fullfile(vl_root,'data','roofs1.jpg')) ;
t=cputime;[f,d] = vl_sift(single(rgb2gray(I))) ;e=cputime-t
e=0.9600

Under Python:
In [1]: import scipy.misc,sift
In [2]: img = scipy.misc.imread("roofs1.jpg")
In [3]: sift_gpu = sift.SiftPlan(template=img,devicetype="GPU")
In [4]: %timeit kp = sift_gpu.keypoints(img)
10 loops, best of 3: 87 ms per loop
In [5]: sift_cpu = sift.SiftPlan(template=img,devicetype="CPU") #selects Intel driver
In [6]: %timeit kp = sift_cpu.keypoints(img)
1 loops, best of 3: 216 ms per loop
In [7]: sift_cpu_amd = sift.SiftPlan(template=img,device=(1,0)) #selects AMD driver, computer specific
In [8]: %timeit kp = sift_cpu_amd.keypoints(img)
1 loops, best of 3: 225 ms per loop

The computer is a dual Intel(R) Xeon(R) CPU E5-2643 0 @ 3.30GHz (fast), but with a moderate graphics card (Quadro 2000).
On my GeForce Titan (GK110) I got:
10 loops, best of 3: 38.7 ms per loop


A rough and unfair comparison would say our code is 25x faster, but the
test was made on a rather small image (640x478), which does not
allow the GPU to show its full speed. On the other hand, if the code
had run on my own computer it would have been worse (my CPU is only 2.2GHz).

You say vl_feat is optimized, but it looks slower than the one from IPOL wrapped under Python:
https://github.com/kif/imageAlignment

In [9]: import feature
In [10]: %timeit feature.sift_keypoints(img.max(axis=-1))
1 loops, best of 3: 687 ms per loop
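
The ratios behind the "25x" figure can be recomputed from the timings quoted in this message. The snippet below only does that arithmetic; the dictionary labels are mine, and the MATLAB figure comes from `cputime` while the Python figures come from `%timeit`, so the two clocks may not measure exactly the same thing.

```python
# Timings quoted in this message, in milliseconds, for roofs1.jpg.
vl_sift_ms = 960.0   # MATLAB vl_sift, measured with cputime (e=0.9600 s)

timings_ms = {
    "sift_pyocl, GPU Quadro 2000": 87.0,
    "sift_pyocl, GPU GeForce Titan": 38.7,
    "sift_pyocl, CPU (Intel driver)": 216.0,
    "sift_pyocl, CPU (AMD driver)": 225.0,
    "IPOL C++ wrapped in Python": 687.0,
}

# Speed-up of each implementation relative to the vl_sift measurement.
speedups = {name: vl_sift_ms / ms for name, ms in timings_ms.items()}

for name, factor in speedups.items():
    print(f"{name:32s} {factor:5.1f}x vs vl_sift")
```

The Titan measurement gives a factor of roughly 24.8, which is where the "25x" above comes from; the Quadro 2000 gives about 11x.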

Comments are welcome.