Re: [bob-devel] Question about GMM statistics and UBM


Laurent El Shafey


Oct 6, 2013, 9:29:17 AM

to bob-...@googlegroups.com

Hi,

Thanks for your input, Manuel!
I've just realised that my previous answer was not posted to the list but only to Marta! Here it is, in case this is helpful for others as well.

Cheers,
Laurent

-------- Original Message --------

Subject:	Re: [bob-devel] Question about GMM statistics and UBM
Date:	Sun, 06 Oct 2013 13:12:52 +0200
From:	Laurent El Shafey <laurent....@idiap.ch>
To:	Marta Gomez-Barrero <mart...@gmail.com>

Hello,

The problem you are facing comes from the fact that the feature extractor in facereclib (in your script, gabor_graph_feature_extractor) returns an object (a NumPy array in this case) that is a class member of the extractor.
When you append this object to a list, only a reference to that class member is appended, not a copy. In practice, every entry in your list of feature vectors then refers to the same array, so they all show the exact same values. To avoid the problem, call the method copy() on the output of the feature extractor before appending it.
Attached is your script updated with this fix. Please let us know if it fixes your problem.
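
To make the aliasing concrete, here is a minimal, self-contained sketch. It does not use facereclib: ReusingExtractor is a hypothetical stand-in that, by assumption, writes its result into one internal NumPy array on every call and returns that array, which is the behaviour described above. The "feature" it computes (the image mean) is only there for illustration.

```python
import numpy as np


class ReusingExtractor(object):
    """Hypothetical stand-in for an extractor that reuses one internal array."""

    def __init__(self, size):
        # Class member that is overwritten on every call.
        self._buffer = np.zeros(size)

    def __call__(self, image):
        # Write the new "features" in place and return the member itself (no copy).
        self._buffer[:] = image.mean()
        return self._buffer


extractor = ReusingExtractor(3)
images = [np.full((2, 2), 1.0), np.full((2, 2), 5.0)]

# Without copy(): both list entries are references to the same array.
aliased = [extractor(img) for img in images]

# With copy(): each entry is an independent snapshot of the features.
copied = [extractor(img).copy() for img in images]

print(aliased[0] is aliased[1])  # the two entries are one and the same object
print(copied)
```

In your script, the corresponding change would be to write feat = gabor_graph_feature_extractor(image).copy() in both the enrollment loop and the probe loop.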

Cheers,
Laurent

On 06/10/13 12:16, Marta Gomez-Barrero wrote:

Hi Laurent and Günther,

In the end I'm using facereclib... I finally "saw the light" and I think I learned how to use it.

My problem now is that everything apparently goes well, with no errors or warnings... but when I evaluate the scores, I get EER = 50%!! So something is definitely wrong.

I'm running the tests on the graph matching algorithm (just because it's much faster for tests), and I get the same problem. So I guess the mistake is in reading the images. However, when I print the files and the ids, everything seems correct. But when I print the features, here's the problem: for all the images (regardless of the user) I obtain the same features, even though the images themselves are not identical. This leads to obtaining the same score for all the images against all the models, and thus the 50% EER.

Here's the code I implemented. I copied the initialization from an example of one of your packages:

# imports needed by the snippets below
import math
import bob
import xbob.db.atnt
import facereclib

atnt_db = facereclib.databases.DatabaseXBob(
    database = xbob.db.atnt.Database(),
    name = "gbu",
    original_directory = "/Users/martagomezbarrero/Downloads/orl_faces",
    original_extension = ".pgm",
)

# Gabor grid graphs for the Gabor graphs algorithm:
gabor_graph_feature_extractor = facereclib.features.GridGraph(
    # Gabor parameters
    gabor_sigma = math.sqrt(2.) * math.pi,

    # what kind of information to extract
    normalize_gabor_jets = True,
    extract_gabor_phases = True,

    # setup of the fixed grid
    first_node = (4, 4),
    #image_resolution = (CROPPED_IMAGE_HEIGHT, CROPPED_IMAGE_WIDTH),
    image_resolution = (112, 92),
    node_distance = (8, 8)
)

# Gabor graphs: Use the similarity function incorporating the Gabor phase difference and the Canberra distance
gabor_graph_tool = facereclib.tools.GaborJets(
    # Gabor jet comparison
    gabor_jet_similarity_type = bob.machine.gabor_jet_similarity_type.PHASE_DIFF_PLUS_CANBERRA,

    # Gabor wavelet setup; needs to be identical to the feature extractor
    gabor_sigma = math.sqrt(2.) * math.pi
)

I also adapted the functions for loading images to what I needed:

#######################################################################
### Functions for loading images and for load + extract

def load_images_enrol(db):
    """Reads the enrollment images of every model in the given database"""
    # get the model ids from the database
    model_ids = db.model_ids()

    # iterate through the models and load the enrollment files of each one
    images = []
    for k in model_ids:
        lst = []
        files = db.enroll_files(k)
        #print k
        #print files
        # use a separate loop variable so that the model id k is not shadowed
        for f in files:
            image = bob.io.load(f.make_path(db.original_directory, db.original_extension))
            #image = tan_triggs_preprocessor(image)
            lst.append(image)
        images.append(lst)

    return images, model_ids

def load_images_probe(db):
    """Reads all probe images and their client ids from the given database"""
    # get the file names from the database
    files = db.probe_files()
    #print files

    # iterate through the list of file names
    images = []
    ids = []
    for k in files:
        image = bob.io.load(k.make_path(db.original_directory, db.original_extension))
        #image = tan_triggs_preprocessor(image)
        images.append(image)
        ids.append(db.m_database.get_client_id_from_file_id(k.id))

    #print ids
    return images, ids

And finally, here's the code I'm running:

print "Extracting and enrolling models"
model_images, model_ids = load_images_enrol(atnt_db)
models = []
for user in model_images:
    lst = []
    for image in user:
        feat = gabor_graph_feature_extractor(image)
        lst.append(feat)
    model = gabor_graph_tool.enroll(lst)
    #print model
    models.append(model)

print "Extracting and scoring probes"
positive_scores = []
negative_scores = []
probe_images, probe_ids = load_images_probe(atnt_db)
probe_features = []
for image in probe_images:
    feat = gabor_graph_feature_extractor(image)
    probe_features.append(feat)

counter1 = 0
for model in models:
    k = model_ids[counter1]
    print k
    counter = 0
    for feat in probe_features:
        if probe_ids[counter] == k:
            positive_scores.append(gabor_graph_tool.score(model, feat))
        else:
            negative_scores.append(gabor_graph_tool.score(model, feat))
        counter += 1
    counter1 += 1

I'm sorry for all the trouble I'm causing, but I have no clue about where the error may be. The py file I'm using is also attached.

Thanks a lot and best regards,

Marta