cropping a random part of an image

drewe...@gmail.com

Aug 9, 2016, 8:41:19 AM
Hi,

I'm new to Python and I have 30,000 pictures.
I need to crop, say, 100 random 256x256 parts out of every picture.

I can't find an answer on the net, so it would be nice if someone could help me out!

Thanks!

Steffen

Peter Otten

Aug 9, 2016, 10:30:48 AM
You can walk over the files with

https://docs.python.org/dev/library/os.html#os.walk

find out the image size and process the image with Pillow

http://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.size
http://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.Image.crop

using

https://docs.python.org/dev/library/random.html

to determine what part of an image you want to pick.

Now write some code and come back here if you run into problems you cannot
solve yourself.
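
The step of picking a random part can be sketched as follows. This is only a minimal sketch: the names random_box and crop_random_tile are made up for illustration, and it assumes the image is at least as large as the tile. Image.open(), Image.size and Image.crop() are the Pillow calls linked above.

```python
import random


def random_box(width, height, tile_w=256, tile_h=256, rng=random):
    """Return a (left, upper, right, lower) box for a random tile inside the image."""
    if tile_w > width or tile_h > height:
        raise ValueError("tile is larger than the image")
    # randint() includes both endpoints, so width - tile_w is the last valid left edge.
    left = rng.randint(0, width - tile_w)
    upper = rng.randint(0, height - tile_h)
    return (left, upper, left + tile_w, upper + tile_h)


def crop_random_tile(path, tile_w=256, tile_h=256):
    """Open the image at `path` with Pillow and return one random tile (hypothetical helper)."""
    from PIL import Image  # needs Pillow; imported here so random_box() works without it
    im = Image.open(path)
    width, height = im.size
    return im.crop(random_box(width, height, tile_w, tile_h))
```

The box is computed separately from the cropping so that the random part can be checked without touching any image files.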

Robin Koch

Aug 9, 2016, 4:30:23 PM
On 09.08.2016 at 14:40, drewe...@gmail.com wrote:

> I'm new to Python and I have 30,000 pictures.
> I need to crop, say, 100 random 256x256 parts out of every picture.

Interesting task. May I ask what the purpose of this is?

> I can't find an answer on the net, so it would be nice if someone could help me out!

Although I think you should follow Peter's advice, I have put something
together, out of curiosity, that should do the trick.

There are several ways to improve it.
First, I'm using os.listdir() instead of os.walk().
In my example scenario that's ok. If your files are spread over
different subfolders, os.walk() is the better way to do it.
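
For the subfolder case, a rough sketch of the os.walk() variant could look like this. The function name find_images and the list of extensions are assumptions for illustration only; adjust them to your data.

```python
import os

# Assumed set of image extensions; not taken from the original script.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def find_images(root):
    """Yield the path of every image file below `root`, including subfolders."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # str.endswith() accepts a tuple of suffixes.
            if name.lower().endswith(IMAGE_EXTENSIONS):
                yield os.path.join(dirpath, name)
```

Unlike os.listdir(), this yields full paths, so they can be passed to Image.open() without another os.path.join().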

You could add a counter (see: enumerate()) to get a better overview
of the progress.
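
A small sketch of such a counter, assuming a hypothetical process_all() wrapper and a reporting interval of 1000 images (both are illustrative choices, not part of the script):

```python
def process_all(files, process, every=1000):
    """Call process(name) for each file and print a progress line every `every` files."""
    total = len(files)
    # start=1 makes the counter read "1 of N" instead of "0 of N".
    for count, name in enumerate(files, start=1):
        process(name)
        if count % every == 0 or count == total:
            print("{} of {} images done".format(count, total))
    return total
```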

Also you might have other preferences for the location of the tiles.
(e.g. one folder per image).
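
If you want one folder per image, the tile path could be built roughly like this. The helper name tile_path is made up for illustration. It uses os.path.splitext(), which splits only at the last dot, whereas str.replace('.', ...) changes every dot in the file name.

```python
import os


def tile_path(outpath, filename, index):
    """Build <outpath>/<image name>/<image name>_<index><ext> for one tile."""
    stem, ext = os.path.splitext(filename)  # ("portrait", ".jpg")
    folder = os.path.join(outpath, stem)
    return os.path.join(folder, "{}_{:03d}{}".format(stem, index, ext))
```

The folder itself still has to be created (for example with os.makedirs()) before a tile is saved into it.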

# Cuts random tiles from pictures
#
# This version works with approx. 270 tiles per second on my machine
# (on 13.5MPx images). So 3 million tiles should take about 3 hours.

import random, os, time
from PIL import Image

INPATH = r".../images"
OUTPATH = r".../tiles"

dx = dy = 256
tilesPerImage = 100

files = os.listdir(INPATH)
numOfImages = len(files)

t = time.time()
for file in files:
    with Image.open(os.path.join(INPATH, file)) as im:
        w, h = im.size
        for i in range(1, tilesPerImage+1):
            newname = file.replace('.', '_{:03d}.'.format(i))
            # randint() includes both endpoints, so w-dx is the last valid left edge
            x = random.randint(0, w-dx)
            y = random.randint(0, h-dy)
            print("Cropping {}: {},{} -> {},{}".format(file, x,y, x+dx, y+dy))
            im.crop((x,y, x+dx, y+dy))\
              .save(os.path.join(OUTPATH, newname))

t = time.time()-t
print("Done {} images in {:.2f}s".format(numOfImages, t))
print("({:.1f} images per second)".format(numOfImages/t))
print("({:.1f} tiles per second)".format(tilesPerImage*numOfImages/t))



--
Robin Koch

drewe...@gmail.com

Aug 10, 2016, 5:27:18 AM
Hi Peter, Hi Robin,

thanks for your help, and especially for the code ;)

@Peter: thanks for the links, I know, it is always better to write the code than to copy and paste!

@Robin: the question I'm trying to answer is whether good pictures, in this case portraits, also have good "parts", i.e. whether the beauty of a picture can be found in its crops, too.
In the end it is for a deep CNN, and I hope it will work!

Thanks again!

Steffen

drewe...@gmail.com

Aug 10, 2016, 6:32:03 AM
Hi Robin,

I tried to understand and run your code, and I get this error:

  File "Rand_Crop.py", line 15, in <module>
    with Image.open(os.path.join(INPATH, file)) as im:
  File "/usr/lib/python2.7/dist-packages/PIL/Image.py", line 528, in __getattr__
    raise AttributeError(name)
AttributeError: __exit__

I looked at Image.py and don't know why the name could not be looked up.

If I print the names of the images before the first "for", everything is fine, but it doesn't work after line 15.

I'm trying to figure this out, but if someone has a hint, I would be happy!
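
A possible reading of the traceback: the "with" statement looks up __enter__ and __exit__ on the object, and the image object from this PIL installation apparently does not provide them. The sketch below illustrates that idea with a stand-in class, so it runs without PIL. LegacyImage, supports_with and use_image are hypothetical names, and the diagnosis itself is an assumption based only on the error message.

```python
class LegacyImage(object):
    """Hypothetical stand-in for an image object without __enter__/__exit__."""
    size = (640, 480)
    closed = False

    def close(self):
        self.closed = True


def supports_with(obj):
    """Return True if obj can be used directly in a with statement."""
    return hasattr(obj, "__enter__") and hasattr(obj, "__exit__")


def use_image(im):
    """Read the size and clean up afterwards, without relying on `with`."""
    try:
        return im.size
    finally:
        # Not every image object has close(), so look it up defensively.
        close = getattr(im, "close", None)
        if close is not None:
            close()
```

In the cropping script this would amount to a plain assignment, im = Image.open(...), in place of the with line.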


Steffen

drewe...@gmail.com

Aug 10, 2016, 8:17:08 AM
Ok, now it works for me!
Thanks again!

import random, os, time
from PIL import Image

INPATH = ('/home/.../Start/')
OUTPATH = ('/home/.../Ziel/')

dx = dy = 228
tilesPerImage = 25

files = os.listdir(INPATH)
numOfImages = len(files)
print(files)
t = time.time()
for file in files:
    im = Image.open(INPATH+file)