Something may be seriously wrong with my Custom Mapreduce - Python


Kaan Soral

Jan 1, 2012, 5:26:16 PM

to google-a...@googlegroups.com

1) First, I run a keys-only query with order("__scatter__") and get the scattered keys.

2) Then I get the first element of the query, start, and add it to the beginning of the array from (1).

3) Then I sort that array of keys (I use the regular .sort() method on lists).

So let's say I have keys k1 k2 k3 k4 k5 k6 in my array, k1 being the start key from step 2.
Now I can deploy workers (k1,k2) (k2,k3) ...

(k1,k2): starts from key k1 and stops when it sees key k2.
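
To make the splitting step concrete, here is a minimal sketch of how I pair up the sorted keys. It uses plain strings as stand-ins for datastore keys, and `make_ranges` is just a name made up for this post, not anything from the mapreduce library. One assumption worth stating loudly: the last worker gets an open-ended range (end of `None`), because without it nothing after k6 would be processed, and anything that sorts before the start key is not covered at all.

```python
def make_ranges(start_key, scatter_keys):
    """Return (start, end) pairs covering every key >= the smallest boundary.

    `end` is exclusive. The final pair has end=None, meaning
    "keep going until the query runs out of results".
    Keys are assumed to be mutually comparable (strings here, as stand-ins).
    """
    # Deduplicate and sort so adjacent boundaries form contiguous ranges.
    boundaries = sorted(set([start_key] + list(scatter_keys)))
    # Pair each boundary with the next one; the last one is open-ended.
    return list(zip(boundaries, boundaries[1:] + [None]))


ranges = make_ranges("k1", ["k4", "k2", "k6", "k3", "k5"])
for start, end in ranges:
    print(start, end)
```

Each worker would then query for keys >= start and < end (or with no upper bound when end is None), so every key belongs to exactly one range.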

This way I achieve parallelism. Of course this is now my own creation, I got the idea from gae's mapreduce

But strange things happen. I have been using this for nearly a year now, and I have started to think that I may have been doing something wrong all this time.
I don't see all the results that I should see after the mapreduce completes.

An initial concern: can datastore keys be sorted? To clarify: if a key k1 comes before a key k2 in the DB, is k1 < k2 in Python?
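
To show what I mean, here is a rough sketch of the order I would expect, assuming my reading of the datastore docs is right: keys are ordered by ancestor path, then kind, then identifier, with numeric IDs sorting before string names. The flat tuple representation and the `path_sort_key` helper are made up for illustration (they are not the SDK's Key class), and app and namespace are ignored.

```python
def path_sort_key(path):
    """Build a sort key for a flat key path like ("Parent", 7, "Child", "alice").

    Assumed ordering (my reading of the docs, not verified against the SDK):
    compare element by element on kind, then identifier, where numeric IDs
    come before string names. A parent sorts before its descendants.
    """
    parts = []
    for kind, ident in zip(path[0::2], path[1::2]):
        if isinstance(ident, int):
            parts.append((kind, 0, ident, ""))   # numeric ID: group 0
        else:
            parts.append((kind, 1, 0, ident))    # string name: group 1
    return tuple(parts)


paths = [("A", "bob"), ("A", 5), ("A", 5, "B", 1), ("A", 2)]
print(sorted(paths, key=path_sort_key))
```

If sorting the real key objects with .sort() gives a different order than this, then my range boundaries would not line up with what the queries return, which could explain the missing results.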

Kaan Soral

Jan 1, 2012, 5:27:31 PM

to google-a...@googlegroups.com

not* my own creation
