Access to bulk data from DB possible?


Luca de Alfaro

Dec 14, 2011, 12:56:55 PM
to google-a...@googlegroups.com
We need to write an app/site whose general usage pattern matches AppEngine quite well, except for one thing: the site will be collecting data and logs, and every now and then we need to download bulk data from the database tables holding them, for offline analysis.
If we went with a "non-appengine" solution, we could obviously do this by generating e.g. CSV files from DB queries or dumps, and then processing the files.
Will it be possible to have the same kind of simple access to bulk data on AppEngine? We can of course build a (web) API that issues DB queries and produces CSV files for download, but my concerns are:
  1. I remember that there is a limit on how many records can be extracted from the DB with a single query (1000?), so we would have to implement the query with continuation parameters etc -- feasible, but it complicates the design.
  2. I worry about the execution time of the query (is it still true that processes taking over 1s are killed?).
I guess access to log files / bulk data must be a pretty common requirement for many apps, so I am hoping someone has good words of advice... 
Many thanks!!

Luca

Ikai Lan (Google)

Dec 14, 2011, 4:28:03 PM
to google-a...@googlegroups.com
1. This limit is gone, but obviously the time a query takes scales with the size of the result set. The design is actually quite easy: get back a cursor, then pass that cursor in on the next iteration. I don't say things are easy lightly; if anything, I err on the side of calling something too complex.

2. Nope, not anymore.
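
The cursor loop from point 1 can be sketched roughly as below. This is only an illustration under stated assumptions, not App Engine code: `LOG_ROWS`, `fetch_page`, and `export_csv` are hypothetical names, and the cursor here is a plain list offset standing in for the opaque cursor the datastore would hand back.

```python
import csv
import io

# Hypothetical stand-in for a datastore table of log records.
LOG_ROWS = [{"id": i, "message": "event %d" % i} for i in range(25)]


def fetch_page(cursor, page_size):
    """Return (rows, next_cursor); next_cursor is None when nothing is left.

    Hypothetical helper: the cursor is just an offset into LOG_ROWS.
    A real datastore query would return an opaque cursor instead.
    """
    start = cursor or 0
    rows = LOG_ROWS[start:start + page_size]
    end = start + len(rows)
    next_cursor = end if end < len(LOG_ROWS) else None
    return rows, next_cursor


def export_csv(out, page_size=10):
    """Write every row to `out` as CSV, one page at a time; return the row count."""
    writer = csv.writer(out)
    writer.writerow(["id", "message"])
    cursor, total = None, 0
    while True:
        # Get a page back together with a cursor, then feed the cursor
        # into the next iteration.
        rows, cursor = fetch_page(cursor, page_size)
        for row in rows:
            writer.writerow([row["id"], row["message"]])
        total += len(rows)
        if cursor is None:
            break
    return total


buf = io.StringIO()
row_count = export_csv(buf)
```

In this sketch the whole loop runs in one call. If a single request cannot finish the export, each iteration could instead be its own request, with the cursor passed along as a parameter.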

You might want to look into something like appengine-mapreduce. In theory you should be able to aggregate large numbers of entities into a few blobstore (App Engine files API) entities:


I say "in theory" because this tool is still experimental. We have every intention of making this tool a core part of the platform but it isn't there yet. Still, many developers are using this in production now for offline computation:


