Problem with multiprocessing.Pool and pickle. Could someone give my logic a quick glance and tell me if I'm reasoning about this correctly?

Chris

Oct 11, 2013, 2:42:21 PM

to tamp...@googlegroups.com

I tried this question on Stack Overflow, but didn't get any responses. You guys are smart, so I thought I'd give it a go here! :) This is lengthy, so be warned.

I have two key questions.

(a) Am I reasoning about the bottleneck correctly?

(b) Is there a way to store state in functions in such a way that they can be safely pickled for multiprocessing?

The task in question is my (very terrible) stab at a template matching algorithm. It's extremely brute force, pretty terrible from an efficiency standpoint, and dear God is it slow -- but! It (technically) works.

The basic algorithm simply "slides" the sub picture the user wants to find across a screen shot of the desktop until a match on the pixel data is found (or not found). Don't judge! It's a first stab at this sort of thing.

Firstly, here is the core of the program without the multiprocessing:

    def build_match_box(x):
        # x packs everything one check needs: both matrices plus the position
        data, source, row, col, img_size = x
        # slice an img_size x img_size box out of the screen matrix at (row, col)
        match_box = [data[row + i][col : img_size + col] for i in xrange(img_size)]
        return match_box == source


    if __name__ == '__main__':
        for row in range(master_image.rows):
            # creates a generator which yields, for each pixel in the row,
            # the arguments needed to slice the pixel_matrix to the same
            # dimensions as the template image to find.

            columns = ((pix_mtrx_of_scrn, template, row, col, template.size.x) for col in range(row_length))

            # I pass the columns generator to a map function which
            # then builds the actual slice of the pixel_matrix and
            # checks if the pixel data of the sub_image matches the
            # pixel data for the template_image
            results = map(build_match_box, columns)
            if True in results:
                print 'Found it!'
                break

Now, my assumption as to why this works without exploding is that everything in this bit of code is passed by reference, right? So none of those giant matrices are actually being moved around -- just the reference to them. Which keeps me from wasting time/memory/CPU copying data around.
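A quick way to check the by-reference assumption is to compare object identities. This is only a minimal sketch in Python 3 syntax with made-up names (`pixel_matrix`, `receive`), not part of the original program:

```python
# Hypothetical stand-in for the screenshot's pixel matrix (small on purpose).
pixel_matrix = [[0] * 8 for _ in range(4)]


def receive(matrix):
    """Return the id of whatever object the function was handed."""
    return id(matrix)


# Within one process the argument is the same object, so nothing was copied.
same_object = receive(pixel_matrix) == id(pixel_matrix)

# Packing the matrix into a tuple (as the generator does) also stores a reference.
args = (pixel_matrix, 0, 0, 4)
still_same = args[0] is pixel_matrix

print(same_object, still_same)
```

Both checks hold inside a single process; they say nothing about what happens once the arguments have to cross a process boundary.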

Additionally, the only *unique* data that's actually being passed to the `build_match_box` function are the row and column indexes and the width of the template image -- all simple ints. And also, because there's no state change on the main matrix structure, it seems like this should be easy to run in parallel.

So, assuming I'm reasoning about the program correctly, I pushed on to setting up the multiprocessing.

Only two lines needed to change: setting up the Pool, and changing map() to pool.map().

    import multiprocessing

    if __name__ == '__main__':
        pool = multiprocessing.Pool(2)
        for row in range(master_image.rows):
            # creates a generator which yields, for each pixel in the row,
            # the arguments needed to slice the pixel_matrix to the same
            # dimensions as the template image to find.

            columns = ((pix_mtrx_of_scrn, template, row, col, template.size.x) for col in range(row_length))

            # I pass the columns generator to a map function which
            # then builds the actual slice of the pixel_matrix and
            # checks if the pixel data of the sub_image matches the
            # pixel data for the template_image
            results = pool.map(build_match_box, columns)
            if True in results:
                print 'Found it!'
                break

And this works, but it is ungodly slow. Well over an order of magnitude slower! My guess is that my reference passing is now destroyed. In addition to the time spent pickling/unpickling everything, the entire pixel_matrix (which is (3840, 1080) in this case) is getting deep copied on every function call. Terrifically inefficient!
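To get a feel for how much data that is, one can compare the pickled size of a task that carries the matrices with one that carries only the ints. A rough sketch in Python 3 syntax; the small `pixel_matrix` and `template` here are hypothetical stand-ins, and the real screenshot would be far larger:

```python
import pickle

# Hypothetical small stand-in for the screenshot (the real one is far larger).
pixel_matrix = [[(r + c) % 256 for c in range(200)] for r in range(100)]
template = [row[:10] for row in pixel_matrix[:10]]

# What the pool.map version ships with every task: both images plus three ints.
heavy_task = (pixel_matrix, template, 5, 7, 10)
# What would be shipped if the images already lived in the worker.
light_task = (5, 7, 10)

heavy_bytes = len(pickle.dumps(heavy_task))
light_bytes = len(pickle.dumps(light_task))

print(heavy_bytes, light_bytes)
```

The exact numbers depend on the pickle protocol, but the gap between the two sizes is the per-task overhead being guessed at above.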

So, it seems like the way around this would be to store the two big offenders as state in the build_match_box function. If I understand correctly, then it should only copy it once for each instance of a worker -- which in my case would be around 4. If that is indeed how things would work, then the only things that would need to be pickled and sent to the build_match_box function would be the three integers.

Pool Attempt 2:

I wrap up build_match_box in a closure passing in the two lists as stored state.

    def build_match_box(dataa, sourcee):
        def inner(x):
            data = dataa
            source = sourcee
            row, col, img_size = x
            match_box = [data[row + i][col : img_size + col] for i in xrange(img_size)]
            return match_box == source
        return inner

    if __name__ == '__main__':

        data = [comprehension which loads a ton of state]
        source = [comprehension which also loads a medium amount of state]

        modified_func = build_match_box(data, source)

        pool = multiprocessing.Pool(2)
        for row in range(master_image.rows):
            # only the three ints are sent now; the matrices live in the closure
            columns = ((row, col, template.size.x) for col in range(row_length))
            result = pool.map(modified_func, columns)

However, this returns the pickle error as it seems that you cannot call a function with multiprocessing from inside of the same scope. I base this on a Stack Overflow question I found here: http://stackoverflow.com/questions/11287455/how-do-i-avoid-this-pickling-error-and-what-is-the-best-way-to-parallelize-this

It looks like the function has to be looked up outside of the current scope..? This is the exact error:

Error:

PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed

I tried a whole range of options, such as storing the state in a class and passing the class's method to the pool.map function, but that also throws a pickle error. I also tried using an infinite generator to store the state, à la:

    def builder(p, t):
        p_matrix = p
        template = t
        result = None
        while True:
            # each send() delivers the next task and returns that task's result
            row, col, img_size = yield result
            match_box = [p_matrix[row + i][col : img_size + col] for i in xrange(img_size)]
            result = match_box == template

And then passing the send function in the map call:

    if __name__ == '__main__':

        processor = builder(p_matrix, template)
        processor.next()

        pool = multiprocessing.Pool(2)
        for row in range(master_image.rows):
            # creates a generator which yields the (row, col, size) ints
            # for each pixel in the row; the matrices themselves are held
            # by the processor generator.

            columns = ((row, col, template.size.x) for col in range(row_length))

            # I pass the columns generator to a map function which
            # then builds the actual slice of the pixel_matrix and
            # checks if the pixel data of the sub_image matches the
            # pixel data for the template_image
            results = pool.map(processor.send, columns)
            if True in results:
                print 'Found it!'
                break

But I just can't escape that stupid error! >:( It seems like the ONLY way to avoid the error is to call the class from outside of the current function -- and since that's the case, I don't know how to add state to that function without referencing it inside of the current scope! Frustrating! Am I missing something obvious?
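For anyone reproducing this, the failure can be seen without a Pool at all by handing the same objects straight to pickle. A small sketch in Python 3 syntax (the exception type and message differ from the Python 2 error quoted above); `make_matcher`, `echo` and `can_pickle` are made-up names:

```python
import pickle


def make_matcher(data, source):
    """Closure-style attempt: `inner` only exists inside this call."""
    def inner(task):
        row, col, size = task
        return [data[row + i][col:col + size] for i in range(size)] == source
    return inner


def echo():
    """Trivial generator whose .send method plays the role of processor.send."""
    received = None
    while True:
        received = yield received


def can_pickle(obj):
    """Return True if pickle accepts obj, False if it raises."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


matcher = make_matcher([[1, 2], [3, 4]], [[1]])
gen = echo()
next(gen)  # prime the generator, as processor.next() does in Python 2

closure_ok = can_pickle(matcher)
send_ok = can_pickle(gen.send)

print(closure_ok, send_ok)
```

The closure works fine when called directly in the same process; it is only the attempt to serialize it (or the generator behind `send`) that fails.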

I suppose that a workaround would be to simply subclass multiprocessing.Process, store the state that way, and use a Queue to dispatch everything to the workers, but at this point, that feels like defeat, which I shan't accept! There's gotta be a way to safely pickle an object that is referenced in the current scope, no?

Bruce Frederiksen

Oct 14, 2013, 3:06:22 PM

to TamPy Bay Python meetup google group

OK, I'll take a stab at this... You have many questions, so I may miss some.

Call by reference: yes! All arguments are passed by reference. In addition, all data is stored in variables and containers (lists, dicts, etc.) by reference. (This doesn't apply to multiprocessing, where arguments are pickled and copied into the worker process.)

Pickle: not _everything_ is pickle-able. When you pickle functions or classes, only the module name and function/class name are put in the pickle. The function/class is then looked up again in its module (automatically importing the module if that hasn't already been done), when the pickle is loaded. Thus, dynamically assigned attributes (i.e., attributes not assigned as part of the module initialization process) on the function/class are not going to be pickled and seen when later loaded. This means that only top-level functions/classes are pickle-able.
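The "only the names go into the pickle" point can be seen by pickling a function from an importable module and looking at the bytes. A minimal sketch in Python 3 syntax, using `json.dumps` purely as a convenient top-level function:

```python
import json
import pickle

# Protocol 0 is text-based, which makes the stored names easy to spot.
payload = pickle.dumps(json.dumps, protocol=0)
print(payload)

# The payload holds the module and function names, not the function's code.
has_names = b"json" in payload and b"dumps" in payload

# Loading imports the module and looks the name up again.
restored = pickle.loads(payload)
same_function = restored is json.dumps

print(has_names, same_function)
```

Because loading is just a lookup by name, a function that cannot be reached as `module.name` (a closure, a lambda, anything built at runtime) has nothing useful to put in the pickle.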

Multiprocessing: Neither of the two images change, right? So can these be passed once in the "multiprocessing.Process(...)" call? You might also explore the shared memory options that multiprocessing supports.
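If you want to keep using a Pool rather than hand-rolled Process objects, one possible way to pass the images once per worker is the `initializer`/`initargs` arguments of `multiprocessing.Pool`. This is a swapped-in alternative to the Process suggestion, sketched in Python 3 syntax; `init_worker`, `match_at` and `find_template` are made-up names, and a square template stored as a list of lists is assumed:

```python
import multiprocessing

# Per-process globals, filled in once by init_worker in each worker.
_screen = None
_template = None


def init_worker(screen, template):
    """Runs once in every worker: keep the two images as module-level state."""
    global _screen, _template
    _screen = screen
    _template = template


def match_at(task):
    """Top-level (so picklable) worker function; task is just two ints."""
    row, col = task
    size = len(_template)
    box = [_screen[row + i][col:col + size] for i in range(size)]
    return box == _template


def find_template(screen, template, processes=2):
    """Return the first (row, col) where template matches, or None."""
    size = len(template)
    tasks = [(r, c)
             for r in range(len(screen) - size + 1)
             for c in range(len(screen[0]) - size + 1)]
    with multiprocessing.Pool(processes, initializer=init_worker,
                              initargs=(screen, template)) as pool:
        results = pool.map(match_at, tasks, chunksize=256)
    hits = [task for task, hit in zip(tasks, results) if hit]
    return hits[0] if hits else None
```

`find_template(screen, template)` would be called from under an `if __name__ == '__main__':` guard. The images are still copied, but once per worker instead of once per task, and each task is reduced to a pair of ints.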

Large Numeric Arrays: I don't know what type your image data is. If you are storing lists of numbers, these numbers are stored by reference in the list, thereby consuming ~twice as much memory as storing the numbers directly. You could look at the array module (in the standard library) or the more capable numpy module. (If you are using image types from an image library, or storing pixels as bytes in a string or bytearray, then you shouldn't need these).
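A rough way to see the difference in container overhead is to compare a list with an `array.array` holding the same values. This is a sketch in Python 3 syntax with made-up pixel values; note that `sys.getsizeof` on a list counts only the list's own storage, not the int objects it points to:

```python
import array
import sys

# Hypothetical row of 8-bit pixel values.
values = [i % 256 for i in range(10000)]

as_list = list(values)                # stores references to int objects
as_array = array.array("B", values)   # stores raw unsigned bytes

list_bytes = sys.getsizeof(as_list)   # container only, not the int objects
array_bytes = sys.getsizeof(as_array)

print(list_bytes, array_bytes, as_array.itemsize)
```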

-Bruce


Chris

Oct 19, 2013, 8:36:21 PM

to tamp...@googlegroups.com

Ohhh... I didn't realize that's how Pickle worked. That actually makes a lot of sense now.

Shared memory was definitely the way to go! Once I dumped everything into a shared array as you suggested, I was able to bring the search time down significantly. Still not super great (my 'algorithm' is a bit on the terrible side). It takes about 10 seconds to search through a 1920x1080 image -- but! It's good enough for my purposes!
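A rough sketch of the shared-array idea follows. It is not the code actually used in this thread: it assumes the image is flattened row by row into a `multiprocessing.RawArray` of C ints, and `row_slice` is a made-up helper. Python 3 syntax:

```python
import multiprocessing

WIDTH, HEIGHT = 6, 4  # tiny hypothetical image; the real one is 1920x1080

# Ordinary nested-list image, one int per pixel.
image = [[r * WIDTH + c for c in range(WIDTH)] for r in range(HEIGHT)]

# Flatten row by row into shared memory; "i" means C int, and RawArray has no lock.
flat = [pixel for row in image for pixel in row]
shared = multiprocessing.RawArray("i", flat)


def row_slice(buf, width, row, col, size):
    """Read `size` pixels starting at (row, col) from the flat buffer."""
    start = row * width + col
    return buf[start:start + size]


segment = row_slice(shared, WIDTH, 2, 1, 3)
print(segment)
```

Since the image is never modified, a lock-free `RawArray` seems sufficient here. How the array reaches the workers (inherited when the processes are created, or handed over through a Pool initializer) depends on the platform and start method, so that part is left out.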