recommendation for 'human-like' weighted combination of features

Brian Murphy

Dec 9, 2009, 1:14:27 PM

to FIRE - Flexible Image Retrieval Engine

Hi,

I'm interested in using FIRE functions to extract pairwise measures of
image similarity. I'm a researcher in computational neuroscience, and
I want to get an objective measure of similarity that simulates human
judgements.

Looking at chapter 3 of your thesis, I've seen you've done really
interesting work on this, but not being an expert in computer vision
I'm not sure what the take-home message is. Also, I'm not sure to what
extent retrieval task evaluation should carry over to other tasks. For
example, I guess you could have a retrieval algorithm that reliably
pushes some similar images up to the top of the ranking and gets a
high F score, while still ignoring other similar images - this would
be a good retrieval algorithm, but less good at simulating human
similarity judgements in general.

So first of all, what is the 'best' combination of feature similarity
measures for these purposes?

Second, is this combination already programmed into FIRE subroutines,
or should I do the combination myself?

And third, what is the easiest way to get at these routines (I see
that there are both C++ and python interfaces, at least)?

best,

Brian

Thomas Deselaers

Dec 9, 2009, 1:28:00 PM

to fire...@googlegroups.com

Hi Brian,

1)

The work I have done on this is probably best read in the JASIST paper:

Abebe Rorissa, Paul D. Clough, and Thomas Deselaers. Exploring the
relationship between feature and perceptual visual spaces. In: Journal
of the American Society for Information Science and Technology 59.5
(Mar. 2008), pp. 770-784.

which you can find here:
http://thomas.deselaers.de/publications/all_publications.html

[Most of it is also in my phd thesis]

There we tried to find a combination of features that matches human perception.

All the methods described in this paper have used some parts of FIRE
as functions but have otherwise been implemented using octave (or
matlab, I don't quite remember).

For this, I
+ used the FIRE feature extractors,
+ computed distance files with FIRE (this is supported, but probably not
  well documented),
+ reformatted these distance files so they can be read in octave/matlab,
+ and performed the optimization (which is basically solving a system of
  linear equations) using SVD.
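
A minimal sketch of an SVD-based least-squares fit of this kind, written in Python with NumPy rather than the original octave/matlab scripts. The function name `fit_weights`, the array layout (one row per image pair, one column per feature distance) and the singular-value cutoff are assumptions for illustration; they are not taken from the paper or from FIRE.

```python
import numpy as np

def fit_weights(feature_dists, human_dists):
    """Least-squares weights w such that feature_dists @ w ~ human_dists.

    feature_dists: array of shape (n_pairs, n_features), hypothetical layout
    human_dists:   array of shape (n_pairs,)
    """
    A = np.asarray(feature_dists, dtype=float)
    b = np.asarray(human_dists, dtype=float)

    # Thin SVD: A = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    # Invert only singular values above a numerical cutoff (assumed choice),
    # so nearly collinear features do not blow up the solution.
    tol = max(A.shape) * np.finfo(float).eps * s.max()
    s_inv = np.zeros_like(s)
    mask = s > tol
    s_inv[mask] = 1.0 / s[mask]

    # Pseudo-inverse solution: w = V @ diag(1/s) @ U^T @ b
    return Vt.T @ (s_inv * (U.T @ b))
```

Calling `fit_weights(A, b)` returns one weight per feature column. `np.linalg.lstsq(A, b, rcond=None)` is also SVD-based and should give the same solution when `A` has full column rank.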

2)
No, these combinations are not hard-coded into FIRE.
To use them, you need to extract the appropriate features for your
dataset and then set the weights as I learned them.
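
As a rough illustration of applying learned weights outside FIRE, the sketch below combines per-feature distance matrices into a single weighted distance matrix in Python. The function name `combine_distances` and the assumption that each feature's pairwise distances are already loaded as a square NumPy array are hypothetical; FIRE's own file formats and weight settings are not shown here.

```python
import numpy as np

def combine_distances(dist_matrices, weights):
    """Weighted sum of per-feature distance matrices.

    dist_matrices: sequence of k arrays, each of shape (n_images, n_images)
    weights:       sequence of k floats, one per feature (e.g. learned weights)
    """
    stacked = np.stack([np.asarray(d, dtype=float) for d in dist_matrices])
    w = np.asarray(weights, dtype=float)
    if stacked.shape[0] != w.shape[0]:
        raise ValueError("need exactly one weight per distance matrix")
    # Contract the feature axis: result[i, j] = sum_k w[k] * stacked[k, i, j]
    return np.tensordot(w, stacked, axes=1)
```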

3)
I used FIRE as it is
+ a python script that makes FIRE save the distance files
+ a script to make these readable in octave
+ an octave script to compute the weights.

I hope this helps.

Cheers,
thomas

--
http://thomas.deselaers.de

Brian Murphy

Dec 11, 2009, 12:32:09 PM

to FIRE - Flexible Image Retrieval Engine

Hi Thomas,

thanks for the link. The paper is exactly what I was looking for -
really nice work. I was hoping for higher correlations between human
and machine measures of distance, but I suppose your paper
demonstrates that there is a lot more to perceived similarity than
just low-level visual features. Still, that leaves the riddle of why
these measures work so well for retrieval, despite having only a
modest correspondence with intuitions of similarity.

But anyway, I added the thumbnail measure into my classifier and now
I'm getting much better results, on a binary judgement of similar vs
dissimilar (relative to a shared reference picture).

Incidentally, if you or your co-authors are still working on this sort
of thing, it would be interesting to try some sort of reduction of the
linear model - to deal with all the partially correlated measures you
have as input variables. In R there is a function called step() which
strips out variables, one by one, to automatically arrive at an
optimal model.
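
For anyone without R to hand, the idea of stripping out variables one by one can be sketched in Python. This is only an illustration of backward elimination scored by AIC (the criterion step() uses by default), not a port of step(); the function names and the no-intercept linear model are assumptions made for the example.

```python
import numpy as np

def aic(X, y):
    """AIC (up to an additive constant) of a least-squares fit of y on X."""
    n = len(y)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ coef) ** 2))
    rss = max(rss, np.finfo(float).tiny)  # guard against log(0) on a perfect fit
    return n * np.log(rss / n) + 2 * X.shape[1]

def backward_eliminate(X, y):
    """Drop one column at a time while doing so lowers the AIC.

    Returns the indices of the columns that are kept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = list(range(X.shape[1]))
    best = aic(X[:, keep], y)
    while len(keep) > 1:
        # Score every model that omits exactly one remaining column.
        scores = [(aic(X[:, [c for c in keep if c != j]], y), j) for j in keep]
        candidate, drop = min(scores)
        if candidate >= best:
            break  # no single removal improves the criterion
        best = candidate
        keep.remove(drop)
    return keep
```

Here `X` would hold one column per machine distance measure and `y` the human judgements; the returned indices are the measures that survive the elimination.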

best of luck,

Brian

Thomas Deselaers

Dec 11, 2009, 12:48:25 PM

to fire...@googlegroups.com

Dear Brian,

I guess the good retrieval results vs. the relatively poor
correspondence with human perception can mostly be explained by the type
of images on which retrieval experiments are typically performed.

I am not working on this anymore, but I guess Abebe Rorissa might
still be working on it.

Cheers,
thomas
