Something may be seriously wrong with my Custom Mapreduce - Python

Kaan Soral

Jan 1, 2012, 5:26:16 PM
to google-a...@googlegroups.com
1) First, I run a keys-only query with order("__scatter__") to get a scattered sample of keys

2) Then I get the first element of the query, start, and add it to the beginning of the array from (1)

3) Then I sort that array of keys (I use the regular list .sort() method)

So let's say I have keys k1 k2 k3 k4 k5 k6 in my array, k1 being the start key I got in (2).
Now I can deploy workers (k1,k2) (k2,k3) ...

(k1,k2): starts from key k1, ends when it sees key k2
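
A minimal sketch of the splitting in steps (1) to (3), using plain integers as stand-ins for datastore keys. The names make_ranges and in_range are hypothetical, and the open-ended last range is an assumption that is not stated above; it is there so that entities after the last split point are still visited.

```python
def make_ranges(start_key, scatter_keys):
    """Turn split points into (start, end) pairs.

    end is exclusive; None means "no upper bound".
    """
    # Steps (2) and (3): put the start key in with the scattered keys,
    # drop duplicates, and sort.
    points = sorted(set([start_key] + list(scatter_keys)))
    # Pair each split point with the next one. The last range gets no
    # upper bound (assumption) so the tail of the key space is covered.
    ends = points[1:] + [None]
    return list(zip(points, ends))


def in_range(key, key_range):
    """True if key belongs to the worker that owns key_range."""
    start, end = key_range
    return key >= start and (end is None or key < end)


# Integers stand in for real keys here.
ranges = make_ranges(1, [40, 10, 25])
for r in ranges:
    print(r)
```

With half-open ranges like these, every key at or after the start key belongs to exactly one worker, which is the property worth checking when results go missing.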

This way I achieve parallelism. Of course this is now my own creation; I got the idea from GAE's mapreduce.


But strange things happen. I have been using this for nearly a year now, and I have started to think that I may have been doing something wrong all this time.
I don't see all the results that I should see after the mapreduce completes.

An initial concern: can datastore keys be sorted? To clarify: if a key k1 comes before a key k2 in the datastore, is k1 < k2 in Python?
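
One way to test this empirically, as a sketch only: take a list of keys already in datastore order (assumed to come from a keys-only query ordered by key, fetched elsewhere) and look for the first adjacent pair that Python's < disagrees with. The helper first_order_mismatch is a hypothetical name, and integers stand in for real keys in the example.

```python
def first_order_mismatch(keys_in_datastore_order):
    """Return the first adjacent (earlier, later) pair for which
    earlier < later is false in Python, or None if the orders agree."""
    pairs = zip(keys_in_datastore_order, keys_in_datastore_order[1:])
    for earlier, later in pairs:
        if not earlier < later:
            return (earlier, later)
    return None


# Stand-in values; with real data, pass the keys as the datastore returned them.
print(first_order_mismatch([1, 2, 3]))
print(first_order_mismatch([1, 3, 2]))
```

If this ever returns a pair for real keys, the sorted array from step (3) would not match the order the workers scan in.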

Kaan Soral

Jan 1, 2012, 5:27:31 PM
to google-a...@googlegroups.com
not* my own creation