Hello,
I am trying to use joint factor analysis (JFA) in spear on my own dataset, so I created my own database config file and passed it to the verify command. My training data is unlabelled, and since UBM training is unsupervised, my understanding is that a single column of file names in 'train_world.lst' should suffice. However, this gives me an error. On checking the code in bob/bio/base/database/filelist/models.py, I found that the 'train_world.lst' file is validated to have 2 columns, as shown below.
def _read_column_list(self, list_file, column_count):
    # read the list
    rows = self._read_multi_column_list(list_file)
    # extract the file from the first two columns
    file_list = []
    for row in rows:
        if column_count == 2:
            assert len(row) == 2
            # we expect: filename client_id
            file_list.append(FileListFile(file_name=row[0], client_id=row[1]))
        elif column_count == 3:
            assert len(row) in (2, 3)
            # we expect: filename, model_id, client_id
            file_list.append(FileListFile(file_name=row[0], client_id=row[2] if len(row) > 2 else row[1], model_id=row[1]))
        elif column_count == 4:
            assert len(row) in (3, 4)
            # we expect: filename, model_id, claimed_id, client_id
            file_list.append(FileListFile(file_name=row[0], client_id=row[3] if len(row) > 3 else row[1], model_id=row[1],
                                          claimed_id=row[2]))
        else:
            raise ValueError(
                "The given column count %d cannot be interpreted. This is a BUG, please report to the author." % column_count)
    return file_list
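One workaround I am considering is to pad each one-column row with a placeholder client id so that the two-column assertion passes. Below is a minimal sketch of that idea. The helper pad_with_placeholder_id and the placeholder value "unknown" are my own and are not part of bob. I am assuming the client id is ignored during UBM training, which I have not verified; JFA models speaker variability, so a placeholder id would probably not be meaningful for the JFA training step.

```python
def pad_with_placeholder_id(lines, placeholder="unknown"):
    """Return rows as 'filename client_id' strings.

    Hypothetical helper (not part of bob): rows that contain only a
    file name get the placeholder appended as the client id; rows that
    already have more than one column are kept unchanged.
    """
    padded = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) == 1:
            parts.append(placeholder)  # assumed dummy client id
        padded.append(" ".join(parts))
    return padded


if __name__ == "__main__":
    # Made-up file names, for illustration only.
    example = ["audio/file_001", "audio/file_002 spk7", ""]
    for row in pad_with_placeholder_id(example):
        print(row)
```

The padded rows could then be written back to 'train_world.lst' in place of the one-column version.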
Please let me know if I am missing some information about how the library handles unsupervised UBM-GMM training. I have tried both the jfa and gmm algorithms, and my command line call is as follows:
verify.py -d /home/prasanna/anaconda2/lib/python2.7/site-packages/bob/bio/spear/config/database/custom.py -p energy-2gauss -e mfcc-60 -a gmm -s gmm --groups {world,dev,eval}
Thanks,
Prasanna