Should a Cursor loop exceed the soft private memory limit?


Mark Mandel

Jun 17, 2014, 12:05:07 AM
to google-appengine-go
So I'm wrestling with a data change to 50K entities (a totally separate issue), and I wanted to see how long it would take to simply loop through all of them using a Cursor. Based on certain conditions, I could then queue the actual update process from there.

However, I found that no matter how simple I made it, GAE would always fail with a: "Exceeded soft private memory limit with 158.586 MB after servicing 2 requests total"

What I don't get is that I'm using a Cursor, which is meant to be for exactly this situation (unless I missed something, which is possible), so why is this happening?

I'm using a Delayed Queue Task here, so I can get a longer request time, but I get the same issue if I run it directly in the Handler as well.

All I am doing is opening a query and stepping through each record. I wouldn't have thought this would cause memory to increase at all. A timeout, maybe, but I was quite surprised at the memory increase.

Here is the code in question:

func handleFoo(w http.ResponseWriter, r *http.Request) {
	ctx := appengine.NewContext(r)
	ctx.Infof("Starting Migration. V3")

	f := delay.Func("processFoo", processFoo)
	f.Call(ctx)

	fmt.Fprintln(w, "Ok. V1")
}

func processFoo(ctx appengine.Context) {
	q := datastore.NewQuery("Foo")
	i := q.Run(ctx)

	index := 0

	for {
		z := thingo.Foo{}
		_, err := i.Next(&z)

		if err == datastore.Done {
			break
		} else if err != nil {
			ctx.Warningf("Error with Next() %v", err)
		}

		index += 1
	}

	ctx.Infof("Run through all the data. %v", index)
}

Thanks in advance,

Matthew Zimmerman

Jun 17, 2014, 8:36:22 AM
to Mark Mandel, google-appengine-go
I've run into the memory limit with github.com/mzimmerman/sdzpinochle
(a card game AI) due to my algorithm needing to hold so many
permutations of legal outcomes in the game. Running under the memory
profiler really helped. I find it easiest to do that using go test to
set up the profiling. http://blog.golang.org/profiling-go-programs
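
For a rough idea of what capturing a memory profile looks like, here is a minimal sketch using the standard runtime/pprof package. The helper name heapProfile and the sample workload are made up for illustration; with go test, the -memprofile flag writes the same kind of file without any extra code, and go tool pprof reads it.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// heapProfile is a hypothetical helper: it runs work, forces a garbage
// collection so the heap statistics are up to date, and returns the
// heap profile as raw bytes.
func heapProfile(work func()) ([]byte, error) {
	work()
	runtime.GC()

	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// Illustrative workload: build a slice of small strings.
	var kept []string
	profile, err := heapProfile(func() {
		for i := 0; i < 10000; i++ {
			kept = append(kept, fmt.Sprintf("row-%d", i))
		}
	})
	if err != nil {
		fmt.Println("profile failed:", err)
		return
	}
	fmt.Println("rows kept:", len(kept))
	fmt.Println("profile bytes written:", len(profile) > 0)
}
```

In a real run the bytes would go to a file rather than a buffer, so that go tool pprof can open it afterwards.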

How big is a thingo.Foo{}? The first thing I see here is that you're
re-creating the Foo each time. As I understand it, it would be easier
on the garbage collector if instead of creating a new one you created
a zero() method instead which would reset it on each iteration. At
that point, I don't think your code would need to allocate any
additional memory.
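
A minimal sketch of that reuse idea, with no datastore involved. Foo here is a hypothetical stand-in for thingo.Foo (the field names are invented), and next is a made-up substitute for the iterator's Next call. Whether this makes a measurable difference in the real loop is an assumption that only profiling can confirm.

```go
package main

import "fmt"

// Foo is a hypothetical stand-in for thingo.Foo: four short string fields.
type Foo struct {
	Name, Kind, Owner, Note string
}

// Reset returns the receiver to its zero state so that no field from the
// previous row leaks into the next one.
func (f *Foo) Reset() { *f = Foo{} }

// next is a hypothetical substitute for the iterator's Next call: it copies
// the row at pos into dst and reports false once the rows are exhausted.
func next(rows []Foo, pos int, dst *Foo) bool {
	if pos >= len(rows) {
		return false
	}
	*dst = rows[pos]
	return true
}

// countRows walks every row while reusing one Foo value instead of
// declaring a new one on each iteration.
func countRows(rows []Foo) int {
	var z Foo // declared once, outside the loop
	n := 0
	for next(rows, n, &z) {
		n++
		z.Reset()
	}
	return n
}

func main() {
	rows := []Foo{{Name: "a"}, {Name: "b"}, {Name: "c"}}
	fmt.Println(countRows(rows))
}
```

Note that the struct itself is small here; the string contents loaded into it are still allocated separately on every row.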

Mark Mandel

Jun 17, 2014, 6:12:10 PM
to Matthew Zimmerman, google-appengine-go
Thanks for the feedback.

I ended up resolving my issue with a slightly different approach, but this brings up some interesting questions:


On Tue, Jun 17, 2014 at 10:36 PM, Matthew Zimmerman <mzimm...@gmail.com> wrote:
> I've run into the memory limit with github.com/mzimmerman/sdzpinochle
> (a card game AI) due to my algorithm needing to hold so many
> permutations of legal outcomes in the game. Running under the memory
> profiler really helped. I find it easiest to do that using go test to
> set up the profiling. http://blog.golang.org/profiling-go-programs

I've not gone into profiling yet, so thanks for that; I will have a good read of it shortly.

> How big is a thingo.Foo{}?

It has 4 properties on it, each one is a string that's < 50 characters.

> The first thing I see here is that you're
> re-creating the Foo each time. As I understand it, it would be easier
> on the garbage collector if instead of creating a new one you created
> a zero() method instead which would reset it on each iteration. At
> that point, I don't think your code would need to allocate any
> additional memory.

Interesting. So this seems more like a limitation of Go's GC: it can't clean up 100%, so you need to do things like resetting an object back to a zero state rather than just creating a whole new one. Coming from a JVM background, that's not how I'm used to thinking, but I can change ;)

Thanks for the pointers, good things to keep in mind.

Mark 

Matthew Zimmerman

Jun 18, 2014, 1:09:13 AM
to Mark Mandel, google-appengine-go

Just profile it; I could be completely wrong.  In the profiling post:
"Every time FindLoops is called, it allocates some sizable bookkeeping structures. Since the benchmark calls FindLoops 50 times, these add up to a significant amount of garbage, so a significant amount of work for the garbage collector.

Having a garbage-collected language doesn't mean you can ignore memory allocation issues. In this case, a simple solution is to introduce a cache so that each call to FindLoops reuses the previous call's storage when possible."

Strings are immutable, so you can't reuse their storage. I'd recommend storing your data as []byte and zeroing the length before loading data again.
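
Roughly what I mean, as a sketch: loadInto is a made-up helper, and the sample records are only there to show the capacity staying put. Truncating with buf[:0] keeps the backing array, so append only allocates when a record is larger than anything seen before.

```go
package main

import "fmt"

// loadInto is a hypothetical loader: it truncates buf to length zero and
// appends the new data, reusing the backing array when capacity allows.
func loadInto(buf []byte, data string) []byte {
	buf = buf[:0] // length becomes 0, capacity is unchanged
	return append(buf, data...)
}

func main() {
	buf := make([]byte, 0, 64) // one up-front allocation
	for _, s := range []string{"first record", "second", "third record here"} {
		buf = loadInto(buf, s)
		fmt.Printf("%q len=%d cap=%d\n", buf, len(buf), cap(buf))
	}
}
```

Each record overwrites the previous one in place, so anything that needs to outlive the iteration has to be copied out first.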
