Problem with creating a dataset


Soundar Thiagarajan

Mar 24, 2015, 10:01:34 PM
to tor...@googlegroups.com
I have my dataset images in a folder in .jpg format and the labels in a CSV file.

I load the images into a tensor with:

dname = paths.dirname('/home/soundar/Videos/Train/Train_Data/5ChildrenAndIt_audio_00014_non-human1.jpg')

count = 0
for a in paths.files(dname) do
    if a:find('jpg') then
        count = count + 1
    end
end

print(count)
train_set = torch.Tensor(count, 3, 96, 96):float()

count = 0
for a in paths.files(dname) do
    if a:find('jpg') then
        collectgarbage()
        count = count + 1
        a = dname .. "/" .. a
        train_set[count] = image.load(a)
        print(count)
    end
end

When I train, the true negative and false positive counts are always zero.

I suspect the images are not loaded in the same order as they appear in the folder, which would be a problem because the labels are read in order.

Can anyone tell me whether the images will be loaded in order, or suggest another way to load them in order?

soumith

Mar 24, 2015, 11:26:57 PM
to torch7 on behalf of Soundar Thiagarajan
Hey Soundar,

The iterator is NOT guaranteed to return files in alphabetical order.
What you should probably do is one of the following three things:

Solution 1:
Keep the images in separate folders according to their label:
cat/[jpegs of cat]
dog/[jpegs of dog]
etc.
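
A minimal sketch of how labels could be derived from such a layout, in plain Lua. The helper label_from_path, the example paths, and the choice to number classes alphabetically are all assumptions made for illustration, not part of Torch:

```lua
-- Hypothetical helper: return the name of the folder that directly
-- contains the file, e.g. "data/cat/c1.jpg" -> "cat".
local function label_from_path(path)
  return path:match("([^/]+)/[^/]+$")
end

-- Example paths (made up); in practice these would come from walking the folders.
local image_paths = { "data/dog/d1.jpg", "data/cat/c1.jpg", "data/cat/c2.jpg" }

-- Collect the distinct class names.
local class_names, seen = {}, {}
for _, p in ipairs(image_paths) do
  local name = label_from_path(p)
  if name and not seen[name] then
    seen[name] = true
    table.insert(class_names, name)
  end
end

-- Sort the names so each class gets the same index on every run.
table.sort(class_names)
local class_index = {}
for i, name in ipairs(class_names) do
  class_index[name] = i
end
```

With this layout the label travels with the file path, so the order in which files are visited no longer matters.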

Solution 2:
In the CSV file that has the labels, also keep the image filename:
test1.jpg,human
test2.jpg,dog
test3.jpg,cat

That way, the order of file loading is not related to the order of labels.
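
A rough sketch of that lookup in plain Lua, assuming the "filename,label" layout above with no header row and no commas inside filenames. The names csv_text, label_of, and files_in_any_order are made up for the example; a real script would read the CSV from disk instead of a string:

```lua
-- Hypothetical CSV contents in the "filename,label" layout.
local csv_text = [[
test1.jpg,human
test2.jpg,dog
test3.jpg,cat
]]

-- Build a filename -> label map.
local label_of = {}
for line in csv_text:gmatch("[^\r\n]+") do
  local filename, label = line:match("^%s*([^,]+),%s*(.-)%s*$")
  if filename then
    label_of[filename] = label
  end
end

-- Files can now arrive in any order; each label is looked up by name.
local files_in_any_order = { "test3.jpg", "test1.jpg", "test2.jpg" }
local labels = {}
for i, name in ipairs(files_in_any_order) do
  labels[i] = label_of[name]
end
```

Here labels[i] always belongs to files_in_any_order[i], whatever order the directory iterator happens to produce.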

Solution 3:
Load all filenames first, and then sort the table:

count = 0
filenames = {}
for a in paths.files(dname) do
    if a:find('jpg') then
        count = count + 1
        table.insert(filenames, a)
    end
end

-- sort the filenames so the loading order is deterministic
table.sort(filenames)
train_set = torch.Tensor(count, 3, 96, 96):float()
for i = 1, #filenames do
    collectgarbage()
    local path = dname .. "/" .. filenames[i]
    train_set[i] = image.load(path)
    print(i)
end
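
One caveat on the sorted order: table.sort without a comparator compares strings with <, which by default is typically a byte-wise comparison. Uppercase letters then sort before lowercase ones, and "img10.jpg" sorts before "img2.jpg", which may differ from the case-insensitive, number-aware order many file managers display. Below is a sketch of a comparator that approximates that kind of order, in plain Lua. The helper names and the 12-digit padding width are assumptions for illustration; whichever order is used, the labels have to be listed in that same order.

```lua
-- Hypothetical helper: lowercase the name and left-pad every digit run
-- with zeros so that numbers compare by value rather than by character.
local function natural_key(name)
  local key = name:lower():gsub("%d+", function(digits)
    return string.rep("0", 12 - #digits) .. digits
  end)
  return key
end

-- Comparator for table.sort; falls back to plain comparison on ties
-- so the result is deterministic.
local function natural_less(a, b)
  local ka, kb = natural_key(a), natural_key(b)
  if ka ~= kb then
    return ka < kb
  end
  return a < b
end

local by_default = { "img10.jpg", "img2.jpg", "Img1.jpg" }
table.sort(by_default)                -- default string comparison

local by_natural = { "img10.jpg", "img2.jpg", "Img1.jpg" }
table.sort(by_natural, natural_less)  -- Img1.jpg, img2.jpg, img10.jpg
```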


Soundar Thiagarajan

Mar 24, 2015, 11:56:40 PM
to tor...@googlegroups.com
How does this sorting work? When I tried it, the order did not seem to match the order shown in the folder in Ubuntu.

soumith

Mar 25, 2015, 12:06:49 AM
to torch7 on behalf of Soundar Thiagarajan

Soundar Thiagarajan

Mar 25, 2015, 2:24:56 AM
to tor...@googlegroups.com
Thanks.
