About feature localization and description


TheFrenchLeaf

Aug 23, 2009, 12:53:12 PM
to libmv-devel
Hello,

I have seen that libmv is trying to implement SURF itself.
But I think it could be better to use the OpenSURF project
"http://code.google.com/p/opensurf1/", which works quite well and can
be ported easily to the libmv image class (it has a dependency on
OpenCV that can easily be removed).
The code is clean and uses the CenSurE descriptor, which is computed
very fast and seems to be very good (and it has both the upright
descriptor and an oriented one ;) ).

I would be pleased to take charge of the task if you want.

i.e.:
- Convert the code to work with the native libmv image class.

I have also done some tests on integrating the "FAST corner
detector". It works well.
- I can add it to third_party and write the wrapper to use it with
the "Array3Du" image class.

The libmv image class lacks important basic image tools:
- a. Conversion from color to gray,
- b. Basic drawing operations (circle, segment, ...).

I can take care of points a and b.
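For point (a), here is a minimal sketch of the color-to-gray conversion, assuming an interleaved RGB buffer and the usual Rec. 601 luma weights (the `RGB8`, `ToGray`, and `ConvertToGray` names are invented for illustration, not libmv API):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical interleaved RGB pixel; libmv's actual storage is Array3Du.
struct RGB8 {
  unsigned char r, g, b;
};

// Rec. 601 luma weights, the common default for color-to-gray conversion.
inline unsigned char ToGray(const RGB8 &p) {
  return static_cast<unsigned char>(
      std::lround(0.299 * p.r + 0.587 * p.g + 0.114 * p.b));
}

// Convert a whole image stored as a flat row-major vector of pixels.
std::vector<unsigned char> ConvertToGray(const std::vector<RGB8> &rgb) {
  std::vector<unsigned char> gray(rgb.size());
  for (std::size_t i = 0; i < rgb.size(); ++i) {
    gray[i] = ToGray(rgb[i]);
  }
  return gray;
}
```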

Best regards
Pierre

Keir Mierle

Aug 23, 2009, 4:42:26 PM
to libmv...@googlegroups.com
On Sun, Aug 23, 2009 at 9:53 AM, TheFrenchLeaf <thefre...@gmail.com> wrote:

Hello,

I have seen that libmv is trying to implement SURF itself.
But I think it could be better to use the OpenSURF project
"http://code.google.com/p/opensurf1/", which works quite well and can
be ported easily to the libmv image class (it has a dependency on
OpenCV that can easily be removed).
The code is clean and uses the CenSurE descriptor, which is computed
very fast and seems to be very good (and it has both the upright
descriptor and an oriented one ;) ).

I think you will find that the code in libmv is better written and more modular. The notable exception is that we haven't added code for rotation invariance.
 
I would be pleased to take charge of the task if you want.

Please focus on improving our existing SURF implementation. Note that the author of OpenSURF used several of the ideas in the libmv code to make OpenSURF faster!

Regarding interest point detection and description: I found the following and downloaded all the data:


It's hundreds of thousands of image patches extracted from reconstructions, for the purpose of optimizing descriptors. 

I've been thinking about wide baseline matching lately, and here's what I want to do (though I probably won't in the short term):

1. Create an API for finding interest points. The API should use types for easy policy-based design (e.g. see the Kernel concept for RANSAC that was recently added. See two_view_kernel.h).
2. Create an API for extracting descriptors from an image.
3. Add an implementation for 1 and 2 that calls our SURF code.
4. Write a program that uses the patch data to extract descriptors and compare their performance.

It's a big job, but it would open the doors to designing our own descriptor and optimizing it in a similar way to the "Picking the best DAISY" paper. In any case, 1 and 2 open the doors to other descriptors such as FAST in a unified way.
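As a sketch of what steps 1-3 could look like, all names below are invented for illustration (this is not libmv's actual API); only the compile-time policy pairing mirrors the Kernel concept from two_view_kernel.h:

```cpp
#include <cassert>
#include <vector>

// Invented types standing in for libmv's real ones.
struct Feature {
  float x, y, scale, orientation;
};
typedef std::vector<float> Descriptor;
typedef std::vector<std::vector<unsigned char> > Image;  // rows of gray pixels

// Step 3: pair any detector with any describer at compile time,
// in the spirit of the RANSAC Kernel concept.
template <typename Detector, typename Describer>
struct FeaturePipeline {
  static void Run(const Image &image,
                  std::vector<Feature> *features,
                  std::vector<Descriptor> *descriptors) {
    Detector::Detect(image, features);  // step 1's API
    for (std::size_t i = 0; i < features->size(); ++i) {
      // Step 2's API: one descriptor per detected feature.
      descriptors->push_back(Describer::Describe(image, (*features)[i]));
    }
  }
};

// Toy stand-ins for the SURF code: detect bright pixels, describe each
// one by its normalized intensity.
struct ThresholdDetector {
  static void Detect(const Image &image, std::vector<Feature> *features) {
    for (std::size_t y = 0; y < image.size(); ++y) {
      for (std::size_t x = 0; x < image[y].size(); ++x) {
        if (image[y][x] > 128) {
          Feature f = { static_cast<float>(x), static_cast<float>(y),
                        1.0f, 0.0f };
          features->push_back(f);
        }
      }
    }
  }
};
struct IntensityDescriber {
  static Descriptor Describe(const Image &image, const Feature &f) {
    return Descriptor(
        1, image[static_cast<int>(f.y)][static_cast<int>(f.x)] / 255.0f);
  }
};
```

Swapping in a different detector or descriptor is then a one-line change at the instantiation site, which is what makes the comparison program in step 4 cheap to write.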

Regarding rotation invariance:

There are two requirements: 1. Extracting a stable orientation for interest points. 2. Extracting a descriptor rotated in accordance to the interest point's stable orientation. I didn't do this because the 'simple' implementation involves rotating the image (resampling) and we don't have code to do that easily. See below re: image class.

i.e.:
- Convert the code to work with the native libmv image class.

I have also done some tests on integrating the "FAST corner
detector". It works well.
- I can add it to third_party and write the wrapper to use it with
the "Array3Du" image class.

That would be great. I wish the FAST code wasn't such a black box. Essentially, the C code they provide is generated by scripts that are not provided.
 
The libmv image class lacks important basic image tools:
- a. Conversion from color to gray,
- b. Basic drawing operations (circle, segment, ...).

I can take care of points a and b.

Yes, image handling is a sore point for libmv. I would be happy to move to an external image library, but they all have flaws relating either to portability or licensing. The one I'm most interested in is vision workbench (also, I work with the author), because it has a beautiful design and API. Sadly, the license is truly weird and not usable for a library, or even GPL compatible.

I'm open to suggestions.

Thanks for looking at this stuff!
Keir
 

Best regards
Pierre


Pau Gargallo

Aug 23, 2009, 7:21:52 PM
to libmv...@googlegroups.com
Actually, it is quite simple (and faster) to do it without rotating
the image. You can compute the derivative in any direction by
rotating the derivatives in x and y. That is
dv = c dx + s dy
where c and s are the cosine and sine of the direction. If I remember
right, this is the way it is done in OpenSURF. Also, resampling is
one of the few image algorithms that we actually have ;-)
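That trick is a one-liner in code (the function name is invented): given the axis-aligned derivatives dx and dy at a pixel, the derivative in direction theta is their steered combination, with no resampling at all.

```cpp
#include <cassert>
#include <cmath>

const double kPi = 3.14159265358979323846;

// Derivative of the image in an arbitrary direction theta, obtained by
// "rotating" the axis-aligned derivatives:
//   dv = cos(theta) * dx + sin(theta) * dy
inline double DirectionalDerivative(double dx, double dy, double theta) {
  return std::cos(theta) * dx + std::sin(theta) * dy;
}
```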

>>  i.e.:
>>
>> - Convert the code to work with the native libmv image class.
>>
>> I have also done some tests on integrating the "FAST corner
>> detector". It works well.
>> - I can add it to third_party and write the wrapper to use it with
>> the "Array3Du" image class.
>
> That would be great. I wish the FAST code wasn't such a black box.
> Essentially, the C code they provide is generated by scripts that are not
> provided.
>
>>
>> The libmv image class lacks important basic image tools:
>> - a. Conversion from color to gray,
>>
>> - b. Basic drawing operations (circle, segment, ...).
>>
>> I can take care of points a and b.
>
> Yes, image handling is a sore point for libmv.

Agree.
Two basic things that I would change (even though I wrote them myself ;-)

1- I would use reference-counted views instead of image objects that
own their data. This way complete images, image crops, or image planes
can all be accessed as image views. Btw, when we first thought about
it, we dismissed reference counting because (i) the Google style guide
said so, and because (ii) boost's shared_array was the simplest way of
reference counting, but we didn't want to depend on boost. Do we still
care about these show stoppers?
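For point 1, here is a minimal sketch of a reference-counted view, using C++11's std::shared_ptr so no boost dependency is needed (the ImageView class and all its methods are invented for illustration, not an existing libmv or Vision Workbench API; it also uses the image(x, y) convention from point 2):

```cpp
#include <cassert>
#include <memory>
#include <vector>

// The pixel buffer is shared; a view records an offset and stride into
// it. Crops are new views over the same data, so no copy happens and
// the buffer lives as long as any view does.
class ImageView {
 public:
  ImageView(int width, int height)
      : buffer_(std::make_shared<std::vector<float> >(width * height, 0.0f)),
        offset_(0), stride_(width), width_(width), height_(height) {}

  // image(x, y) access into the shared buffer.
  float &operator()(int x, int y) {
    return (*buffer_)[offset_ + y * stride_ + x];
  }

  // A crop shares the buffer; only the bookkeeping changes.
  ImageView Crop(int x0, int y0, int w, int h) const {
    ImageView view(*this);
    view.offset_ = offset_ + y0 * stride_ + x0;
    view.width_ = w;
    view.height_ = h;
    return view;
  }

  int Width() const { return width_; }
  int Height() const { return height_; }
  long UseCount() const { return buffer_.use_count(); }

 private:
  std::shared_ptr<std::vector<float> > buffer_;
  int offset_, stride_, width_, height_;
};
```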

2- This is just a matter of taste, but I wish I had written
image(x, y) instead of image(y, x). Even if image(y, x) is standard in
Python or Matlab, I still find it to be a pain.

Also, I would forget about the ArrayNd stuff for a while, since we are
not using it.

> I would be happy to move to
> an external image library, but they all have flaws relating either to
> portability or licensing. The one I'm most interested in is vision workbench
> (also, I work with the author), because it has a beautiful design and API.
> Sadly, the license is truly weird and not usable for a library, or even GPL
> compatible.
> I'm open to suggestions.

If using Vision Workbench is not doable, we can simply clone its
ImageView type. They have certainly thought harder than us about
finding a good image API, so cloning their API (or its main features)
should save us time. Implementing the actual code should be much
easier than settling on a good API.

cheers,
pau

TheFrenchLeaf

Aug 24, 2009, 4:36:44 AM
to libmv-devel
Hello,

Glad to see that the post got answers and good advice ;)

I also agree that ArrayNd is not the best choice. I prefer the
standard image layout (a single dynamically allocated array).
I also think that we only have to handle 2D typed images.

A simple way to handle it could be something like the following:
libmv::CImageT<Typed> (a typed image class)

=> We can implement simple pixel types using classes or structs:
class Gray {
  uchar m_pix;
};
class RGB8 {
  uchar r, g, b;
};
...

I think Vision Workbench is designed in a similar way.

I think we have to design the algorithms in a very naive way and
assume that the image class provides 2D iteration or access. That way
our algorithms will work whatever library we use or design. I want to
call it GIAP: Generic Image Processing Algorithms (so our algorithms
can work with OpenCV, Vision Workbench, or whatever, by writing a
single wrapper over the image iterator).
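A toy version of that idea, with every name invented for illustration: the algorithm assumes nothing about the image type beyond Width(), Height(), and operator()(x, y), so a one-struct wrapper adapts any library's image to it.

```cpp
#include <cassert>
#include <vector>

// A "generic image processing algorithm": templated on the image type,
// it only uses the Width()/Height()/operator()(x, y) interface, so
// OpenCV, Vision Workbench, or libmv images can be adapted to it with
// a thin wrapper.
template <typename Image>
double MeanIntensity(const Image &image) {
  double sum = 0.0;
  for (int y = 0; y < image.Height(); ++y) {
    for (int x = 0; x < image.Width(); ++x) {
      sum += image(x, y);
    }
  }
  return sum / (image.Width() * image.Height());
}

// Minimal wrapper over a flat row-major buffer, standing in for a real
// library's image class.
struct FlatImage {
  std::vector<double> data;
  int width, height;
  int Width() const { return width; }
  int Height() const { return height; }
  double operator()(int x, int y) const { return data[y * width + x]; }
};
```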

As an answer to Keir's post on the "API for finding interest points":

I think the main dilemma is that the interest point detectors do not
all provide the same information:
some will only return an (x, y) position,
some will return (x, y), scale, orientation, ...
But we will find a way to manage it.
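One possible way to manage it, purely as a sketch with invented names: a common keypoint struct where scale and orientation are optional and flagged, so a FAST-style detector fills only the position while a SURF-style detector fills everything.

```cpp
#include <cassert>

// Common keypoint type; scale and orientation are only meaningful when
// the corresponding flag is set.
struct Keypoint {
  float x, y;
  float scale;        // valid only if has_scale
  float orientation;  // valid only if has_orientation
  bool has_scale;
  bool has_orientation;
};

// What a position-only detector (e.g. FAST-style) would return.
inline Keypoint MakePointKeypoint(float x, float y) {
  Keypoint k = { x, y, 0.0f, 0.0f, false, false };
  return k;
}

// What a scale- and orientation-aware detector (e.g. SURF-style) would
// return.
inline Keypoint MakeScaledOrientedKeypoint(float x, float y,
                                           float scale, float orientation) {
  Keypoint k = { x, y, scale, orientation, true, true };
  return k;
}
```

Downstream code can then fall back to a default scale, or skip orientation normalization, when the flags are unset.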

So as a future roadmap (I think we can imagine something like this):
-> Pau will think about a design for the image class,
-> I will start thinking about the API for extracting interest points
and descriptions.
Another keypoint localization method that would be good to have is
YAPE (Yet Another Keypoint Extractor, from Lepetit's group) (we need
to check its license).
It seems pretty simple, fast, and reliable.

Cheers,
Pierre.

PS: If we could plan a "forward-looking" roadmap (with no deadlines,
because of everyone's other work...) that would be cool.

