Generating a dataset of images in RAM


Jean-Patrick Pommier

Sep 27, 2016, 6:27:08 AM
to scikit-image
Dear all,

I posted an image processing / supervised classification problem on Kaggle concerning the resolution of overlapping chromosomes. The aim is to produce a large dataset of examples. A first dataset was produced, but it seems to be too small to yield good results for supervised classification with a neural network. As explained in the first notebook, 8 GB of RAM is not enough to process the images, mainly to resize and crop them.
My question: how can a large batch of images (well over 100,000) be resized?

Thanks.

Jean-Patrick

PS
I won't hide that it would be great if someone were interested in the problem itself and could help with the resolution, or offer some advice on the proposed code.

Juan Nunez-Iglesias

Sep 27, 2016, 7:55:09 PM
to scikit...@googlegroups.com
Hi Jean-Patrick,

Why do you need to load everything into RAM to resize it? This is a perfect use case for streaming data processing. Have a look at my notebook from EuroSciPy 2015 for some examples:

Specifically, as you generate examples, you should be writing them to disk directly. Then you are limited by disk size, instead of RAM size.
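A minimal sketch of that idea, not taken from the notebook mentioned above: a generator yields one image at a time, each image is shrunk and saved straight away, and only one image is ever held in memory. The names `generate_examples`, `downsample` and `stream_to_disk` are hypothetical, the random arrays stand in for the real chromosome images, and the slicing-based `downsample` is only a placeholder for a proper resize such as `skimage.transform.resize`.

```python
import os
import tempfile

import numpy as np


def generate_examples(n, shape=(64, 64), seed=0):
    """Lazily yield (index, image) pairs, one at a time.

    Hypothetical stand-in for the real example generator: random arrays
    are used here in place of the overlapping-chromosome images.
    """
    rng = np.random.default_rng(seed)
    for i in range(n):
        yield i, rng.random(shape)


def downsample(image, factor=2):
    """Placeholder resize: keep every `factor`-th pixel along each axis.

    A real pipeline would more likely call a proper resize routine here.
    """
    return image[::factor, ::factor]


def stream_to_disk(examples, out_dir, factor=2):
    """Resize and save each example as soon as it is produced.

    Only the current image is held in memory, so the dataset size is
    bounded by disk space rather than by RAM. Returns the number of
    files written.
    """
    os.makedirs(out_dir, exist_ok=True)
    count = 0
    for i, image in examples:
        small = downsample(image, factor)
        np.save(os.path.join(out_dir, f"example_{i:06d}.npy"), small)
        count += 1
    return count
```

Calling `stream_to_disk(generate_examples(100000), "resized")` would then write one `.npy` file per example into a `resized` directory (the directory name is arbitrary), without ever building a list of all the images.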

I hope that helps!

Juan.

Jean-Patrick Pommier

Sep 28, 2016, 1:17:27 AM
to scikit-image
Thanks Juan,
I didn't know about toolz.