surprising behavior w/ memoized function
From: Andrew Xue <and...@lumoslabs.com>
Date: Sat, 28 Apr 2012 03:12:00 -0700 (PDT)
Subject: surprising behavior w/ memoized function
hi --

so i have a lookup function that runs a mapreduce job to read a small
dimension table from S3 and then puts the result into a hashmap. i
memoized the function so that the map stays in memory. the code looks
like this:

(defn- get-referral-dimension-map* [referral-dimension-path]
  ;; build a query that selects the fields !referral_key and !ref_name
  (let [rd-src (sdt/get-query referral-dimension-path
                              ["!referral_key" "!ref_name"])
        ;; ??- runs the query right away and returns a seq of result seqs
        tuples (??- rd-src)]
    (into {} (first tuples))))

(def get-referral-dimension-map (memoize get-referral-dimension-map*))

(defn get-referral-name [referral-key referral-dimension-path]
  (let [m (get-referral-dimension-map referral-dimension-path)]
    (m referral-key)))

this gets called in a "main" query, something like

(<- [?referral_name]
    (src ?referral_key)
    (get-referral-name ?referral_key :> ?referral_name))

oddly, the behavior I am observing is that a mapreduce job is launched
for the referral-dimension data for every map task in the "main" query.
it seems like once one map task has called the get-referral-name
function, the result should be memoized, and all subsequent map tasks
on that node that call the function should not need to re-do the
mapreduce job.
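
to make the expectation concrete, here is a minimal plain-Clojure
sketch (no Cascalog or Hadoop involved). load-dimension*, load-count
and the sample map are hypothetical stand-ins for the real lookup and
for "a mapreduce job was launched". the cache that memoize builds lives
in the memory of the process that created it, so it only helps repeated
calls inside the same JVM. if each map task runs in its own JVM (as far
as I know that is the usual Hadoop setup unless JVM reuse is turned on,
but treat that as an assumption), every task would start with an empty
cache:

```clojure
;; Hypothetical stand-in for the expensive lookup; the counter records
;; how many times the real work actually runs.
(def load-count (atom 0))

(defn- load-dimension* [path]
  (swap! load-count inc)
  {"k1" "google" "k2" "email"})

(def load-dimension (memoize load-dimension*))

(defn lookup-name [k path]
  ((load-dimension path) k))

;; Three lookups in the same process: the expensive load runs only once.
(def names (mapv #(lookup-name % "some/path") ["k1" "k2" "k1"]))

;; A fresh memoized wrapper starts with an empty cache, which is roughly
;; what a newly started JVM would see, so the load runs a second time.
(def load-dimension-in-new-jvm (memoize load-dimension*))
(load-dimension-in-new-jvm "some/path")
```

so memoization as such works; the question is only how many separate
processes end up holding their own copy of the cache.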


 
From: Andrew Xue <and...@lumoslabs.com>
Date: Sat, 28 Apr 2012 03:53:19 -0700 (PDT)
Subject: Re: surprising behavior w/ memoized function
so the fix for this was to do something like

(let [rd-map (get-referral-dimension-map* referral-dimension-path)]
  (<- [?referral_name]
      (src ?referral_key)
      (rd-map ?referral_key :> ?referral_name)))

the odd thing is that in this case there is only one mapreduce job for
the referral dimension data. that also seems odd to me: my intuition
says there should be at least as many mapreduce jobs as there are nodes
(i.e., copies of the app jar). what is actually going on under the hood?
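
my best guess, sketched in plain Clojure (again no Cascalog; the names
load-dimension*, load-count and apply-to-records are hypothetical): the
let around the query is evaluated once, in the process that builds the
query, before any per-record work happens. the per-record step then
only sees the finished map, not the code that built it. whether
Cascalog really ships the map to the workers this way is an assumption
on my part:

```clojure
;; Counter standing in for "a mapreduce job was launched".
(def load-count (atom 0))

(defn load-dimension* [path]
  (swap! load-count inc)
  {"k1" "google" "k2" "email"})

;; Stand-in for the per-record work a query does: apply f to every record.
(defn apply-to-records [f records]
  (mapv f records))

;; The let binding is evaluated once, up front; the function handed to the
;; per-record step is just the finished map (Clojure maps are functions
;; of their keys).
(def names
  (let [rd-map (load-dimension* "some/path")]
    (apply-to-records rd-map ["k1" "k2" "k1"])))
```

in this sketch the load runs once no matter how many records (or
workers) use rd-map, which would match the single job I am seeing.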



 