VGG19 validation accuracy remains low


facel...@gmail.com

Jul 21, 2016, 11:37:09 PM
to Keras-users
Hello guys, 

I'm using the VGG19 model to do some multiclass classification. After training, my training accuracy gets close to 0.98, but the validation accuracy stays low (about 0.3~0.4). Is that so-called overfitting? My batch size is 32, and due to GPU memory limits it cannot be any bigger. The main problem I want to solve now is raising my validation accuracy. Do you guys have any suggestions? I am new to CNNs, so any help would be much appreciated!!!

Big thanks!!!

Daπid

Jul 22, 2016, 2:44:50 AM
to facel...@gmail.com, Keras-users
On 22 July 2016 at 05:37, <facel...@gmail.com> wrote:
> After the training, my training accuracy can be close to 0.98 but the
> validation accuracy remains low (about 0.3~0.4). Is that so called
> overfitting?

Yes. Basically, your model isn't learning anything that generalises.

What can you do? Use regularisation techniques: dropout, batch
normalisation, L2 penalties... Further than that, it is difficult to
say without knowing the specific architecture and the amount and
diversity of your training data.
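
For instance, a minimal sketch of a regularised classifier head, in
Keras 1.x syntax (the layer sizes here are made up, not taken from
your model):

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers.normalization import BatchNormalization
from keras.regularizers import l2

# Hypothetical dense head: 4096 input features, 20 classes
model = Sequential()
model.add(Dense(256, activation='relu', W_regularizer=l2(0.01),
                input_dim=4096))
model.add(BatchNormalization())
model.add(Dropout(0.5))  # drop half the activations during training
model.add(Dense(20, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])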

facel...@gmail.com

Jul 23, 2016, 5:50:45 AM
to Keras-users, facel...@gmail.com
Hello guys,

I'm using VGG19 to train on about 150,000 images and classify them into 20 classes. I don't know whether VGG19 is the most appropriate model to use, because my amount of data is nowhere near the size of VGG19's original training set. Is there a better model for training 150,000 images into 20 classes?

On the other hand, I hit a memory error when I converted all the images into a numpy array. My machine has 64 GB of RAM, and the error suggests that is still not enough. Is that possible? The size of each image is 3*187*100.

Any suggestions or help? Thank you very much!!!


On Friday, July 22, 2016 at 11:37:09 AM UTC+8, facel...@gmail.com wrote:

Daπid

Jul 23, 2016, 8:29:06 AM
to facel...@gmail.com, Keras-users
On 23 July 2016 at 11:50, <facel...@gmail.com> wrote:
> I'm using vgg19 to train about 150,000 images and classify them into 20
> classes. I don't know if the vgg19 is the most appropriate model that I can
> use. Because my amount of data is not as much as vgg19 original training
> data. So I'm not sure if there is any better model can train 150,000 images
> to 20 classes?

If your images are natural images, similar to those VGG was designed
and trained on, you could take the pretrained VGG model and fine-tune
the top layers on your data. Here is a blog post that describes how
to do it:
http://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
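
As a rough sketch of the idea (this assumes a Keras version that
ships keras.applications.vgg19; otherwise the blog post shows how to
load the weights by hand; the input shape and head sizes below are
placeholders for your data):

from keras.applications.vgg19 import VGG19
from keras.layers import Dense, Flatten
from keras.models import Model

# Pretrained convolutional base, without the 1000-class ImageNet head
# (channels-last shape shown; use (3, 187, 100) with Theano ordering)
base = VGG19(weights='imagenet', include_top=False,
             input_shape=(187, 100, 3))
for layer in base.layers:
    layer.trainable = False  # freeze the pretrained filters at first

# New classifier head for your 20 classes
x = Flatten()(base.output)
x = Dense(256, activation='relu')(x)
out = Dense(20, activation='softmax')(x)

model = Model(input=base.input, output=out)  # inputs=/outputs= in Keras 2
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])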

A simpler alternative would be to train a simpler model from scratch
and see where it takes you.
The advantage there is that you can more easily tailor your
architecture to your data, for example using branched architectures
to get different perspectives.

> On the other hand, I met a memory error when I converted all images into
> numpy array. My CPU ram is 64G, and this error seems my ram is still not big
> enough. Is that possible? The size of my image is 3*187*100.

The total amount of data is:

np.empty((3, 187, 100), dtype=np.float32).nbytes * 150000 * 1e-9
33.66  # GB

So, it should fit. I can think of three reasons why it wouldn't:
- You are loading the images as float64, which doubles the size.
- You are loading all the images in memory and then stacking them,
so, at some point, you are storing them twice; that requires about 67 GB.
- Your Python is 32-bit, which is limited to about 4 GB of memory per process.

The second one can be solved by preallocating the full array and
loading each image into it.
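
Something along these lines (just a sketch: image_paths and
load_image stand in for your own file list and loader):

import numpy as np

n_images = 150000
# ~33.7 GB, allocated once up front
data = np.empty((n_images, 3, 187, 100), dtype=np.float32)

for i, path in enumerate(image_paths):  # image_paths: your file list
    data[i] = load_image(path)  # load_image: placeholder for your loader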

facel...@gmail.com

Jul 25, 2016, 6:08:16 AM
to Keras-users, facel...@gmail.com
Hello David,

Thank you so much! I think you are right about my memory error. You suggested that I preallocate the full array and load each image into it. Can I ask you some specific questions?

- Do I need to set up an empty array first? If I do, I need to know the exact amount of my data so that I can set the dimensions of the array, right?
- Should I use np.append to append my numpy arrays into the full array?

I tried some ways but could not preallocate the array successfully, so I hope you can give me some advice. Thank you so so much!!!

On Saturday, July 23, 2016 at 8:29:06 PM UTC+8, David Menéndez Hurtado wrote:

Daπid

Jul 25, 2016, 8:18:23 AM
to Julia Chang, Keras-users
On 25 July 2016 at 12:08, <facel...@gmail.com> wrote:
>
> - Do I need to set up an empty array first? If I do, I need to know the exact
> amount of my data so that I can set the dimensions of the array, right?

That would be ideal.

> - Should I use np.append to append my numpy arrays into the full array?

Avoid np.append: it copies the whole array every time you call it.
NumPy can sometimes extend an array in place with ndarray.resize, but
whether that works without a copy depends on your OS.

If you don't know the exact number of images, you can either
preallocate something that fits in memory but is bigger than your
whole dataset, fill it, and slice off the unused part:

import numpy as np

# Overallocate: room for 1000 images, even if there turn out to be fewer
images_container = np.empty((1000, 3, 50, 50), dtype=np.float32)

# You put them in, and discover that there are only 950:
all_images = images_container[:950, ...]


Or you can save them to a database and read them in bulk. For that, I
usually go with a PyTables EArray, if you can afford the HDF5
dependency.
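
For reference, a minimal EArray sketch (the file name and the batch
variable are placeholders):

import tables

f = tables.open_file('images.h5', mode='w')
images = f.create_earray(f.root, 'images',
                         atom=tables.Float32Atom(),
                         shape=(0, 3, 187, 100))  # first axis grows on append

images.append(batch)  # batch: an (n, 3, 187, 100) float32 array you loaded
f.close()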