How to share data files between versions

web...@macfh.co.uk

Apr 6, 2015, 8:01:11 AM
to google-a...@googlegroups.com
I have an experimental Google App ...
    www.macfhoslp.appspot.com
... which uses UK Ordnance Survey Landform Panorama terrain data to provide either spot heights or a terrain profile between two points, where the points are given in UK eastings and northings.  UK OS LP data totals 628MB.

Although the current version works fine, it is a little slow, so I want to upload a new version which should be a little faster.  However, this fails because the storage limit of 1GB is exceeded.  Is there any way to tell the system that the UK OS LP data is invariant and should be shared between all versions?

Come to that, if needed, how could I share it with other Google Apps?

TIA

Barry Hunter

Apr 6, 2015, 8:39:59 AM
to google-appengine
Where is the "data" actually stored? Just files in the 'filesystem'?

You might find it worthwhile to use Google Cloud Storage for the data files instead

... Particularly if you can make 'ranged' requests, i.e. seek directly and read part of a file rather than the whole file.
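To illustrate what a 'ranged' read means here, a minimal sketch follows. It assumes a tile stored as fixed-width binary samples (little-endian signed 16-bit, row-major), which is an assumption for illustration only and not the ASCII layout the Ordnance Survey data is supplied in. The function names are hypothetical. The same byte offset that drives `seek()` on a local file could be sent as a standard HTTP `Range` header when the file lives in a remote store.

```python
import os
import struct
import tempfile

SAMPLE_BYTES = 2  # assumed: one little-endian signed 16-bit height per grid point


def sample_offset(row, col, ncols):
    """Byte offset of one sample in a row-major grid of fixed-width samples."""
    return (row * ncols + col) * SAMPLE_BYTES


def read_sample(path, row, col, ncols):
    """Seek straight to one sample and read only its two bytes."""
    with open(path, "rb") as f:
        f.seek(sample_offset(row, col, ncols))
        return struct.unpack("<h", f.read(SAMPLE_BYTES))[0]


def range_header(offset, length):
    """HTTP Range header value for the same bytes (the end position is inclusive)."""
    return "bytes=%d-%d" % (offset, offset + length - 1)


if __name__ == "__main__":
    grid = [[10, 20, 30], [40, 50, 60]]  # made-up heights, not real terrain data
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            for row in grid:
                f.write(struct.pack("<%dh" % len(row), *row))
        print(read_sample(path, 1, 2, ncols=3))
        print(range_header(sample_offset(1, 2, 3), SAMPLE_BYTES))
    finally:
        os.remove(path)
```

This only pays off when every sample has a known, fixed size. With variable-width text, the offset of a given point cannot be computed without reading what comes before it.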

(I've found the 'filesystem' itself to often be slow in AppEngine)

web...@macfh.co.uk

Apr 6, 2015, 9:15:44 AM
to google-a...@googlegroups.com
Thanks, but see below ...


On Monday, April 6, 2015 at 1:39:59 PM UTC+1, barryhunter wrote:
Where is the "data" actually stored? Just files in the 'filesystem'?

Yes, it is stored in a directory off root, pretty much exactly as originally supplied by OS in 812 files over 56 sub-folders.
 
You might find it worthwhile to use Google Cloud Storage for the data files instead

As this is intended to be just a demo, I'd rather avoid such complications, if possible.
 
... Particularly if you can make 'ranged' requests, i.e. seek directly and read part of a file rather than the whole file.

Don't think that would apply here.
 
(I've found the 'filesystem' itself to often be slow in AppEngine)

But would it really be quicker to transfer the files over the web for the Google App to process?  Surely it must be quicker to store the necessary data locally to the app?

Barry Hunter

Apr 6, 2015, 9:29:44 AM
to google-appengine

 
 
(I've found the 'filesystem' itself to often be slow in AppEngine)

But would it really be quicker to transfer the files over the web for the Google App to process? 

It's not really 'over the web'. In general Google Cloud Storage is pretty 'close' to AppEngine, probably within the same data center.

 
Surely it must be quicker to store the necessary data locally to the app?

Well, AppEngine is itself a distributed system. The 'code' is stored on some sort of storage subsystem. When an instance is started for your application, the code has to be copied from that system to the local instance. In general it's copied as needed (rather than transferring up to 1GB on instance startup).

Copying over the network from that repository, vs copying from dedicated Google Cloud Storage, is probably not significantly different. It's been a while since I benchmarked it, though.


... anyway, it was only a suggestion for how to get around your "problem".


Vinny P

Apr 7, 2015, 5:10:58 PM
to google-a...@googlegroups.com
On Mon, Apr 6, 2015 at 8:28 AM, Barry Hunter <barryb...@gmail.com> wrote:
Copying over the network from that repository, vs copying from dedicated Google Storage, is probably not significantly different


+1.  

As Barry noted, you would want GCS to store your data files. PHP on GAE does a similar trick: instead of writing to a local filesystem, PHP/GAE uses gs:// file names to route reads and writes to a GCS bucket.

If you're really focused on speed, one alternative - and this would get costly - is to (ab)use dedicated memcache and store your dataset in it. You'd have to deal with cache eviction policies and you might exceed the ops/sec/GB limit, but it's doable given enough programming expertise.
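A rough sketch of what the memcache approach involves, under stated assumptions: a plain dict stands in for the cache client so the code runs anywhere, the chunk size is chosen to stay under a per-value ceiling of roughly 1 MB (check the current limit before relying on it), and the function and key names are hypothetical. The point it shows is that any chunk can be evicted at any time, so every read needs a fallback to the original source.

```python
CHUNK = 900 * 1000  # assumed safe size, below a roughly 1 MB per-value limit


def store_tile(cache, name, data):
    """Split one tile's bytes into chunks and store each under its own key."""
    chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
    cache[name + ":n"] = len(chunks)
    for i, chunk in enumerate(chunks):
        cache["%s:%d" % (name, i)] = chunk


def load_tile(cache, name, fallback):
    """Return the tile from the cache, or reload it if any chunk is missing."""
    count = cache.get(name + ":n")
    if count is not None:
        parts = [cache.get("%s:%d" % (name, i)) for i in range(count)]
        if all(part is not None for part in parts):
            return b"".join(parts)
    # Cache miss or partial eviction: go back to the slow source and re-cache.
    data = fallback(name)
    store_tile(cache, name, data)
    return data


if __name__ == "__main__":
    cache = {}  # stand-in for a real cache client
    tile = load_tile(cache, "demo-tile", lambda name: b"0" * 2500000)
    print(len(tile), sorted(cache))
```

With a real cache client the per-chunk lookups would normally be batched into one multi-key call, which also matters for the ops/sec limit mentioned above.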
 
 
-----------------
-Vinny P
Technology & Media Consultant
Chicago, IL

App Engine Code Samples: http://www.learntogoogleit.com
 
 


web...@macfh.co.uk

Apr 11, 2015, 4:29:27 AM
to google-a...@googlegroups.com
Thanks to both, but ...


On Tuesday, April 7, 2015 at 10:10:58 PM UTC+1, Vinny P wrote:

On Mon, Apr 6, 2015 at 8:28 AM, Barry Hunter <barryb...@gmail.com> wrote:
 
Copying over the network from that repository, vs copying from dedicated Google Storage, is probably not significantly different

+1

I gave it a try, and ...
     :-(    Google's self-contradictory documentation meant that GCS was a pain to set up
     :-(    GCS is about half the speed of using the local FS.  See the table below.

Averaged over five different test paths:
    Test        Processing Time    Travel Time    Total Time
    OSLP+FS     1.006              0.770          1.776
    OSLP+GCS    1.828              1.260          2.722
    OST50+FS    2.292              0.894          3.186
    SRTM+FS     0.177              0.947          1.124

For comparison, the SRTM and the more recent OS Terrain 50 datasets are also included, both using local FS lookups and averaged over the same five test paths.  SRTM is much the fastest, because it is already in short-integer binary format and has overlapping tiles.  Both the Landform Panorama and T50 datasets are in ASCII text format, so every point on the path must be converted from text to floating point.  T50 is much the slowest because it doesn't have overlapping tiles.
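The format difference described above can be sketched as follows. This is an illustration only, not the code used in the app: the function names are hypothetical, the sample heights are made up, and the binary layout is assumed to be big-endian signed 16-bit integers.

```python
import struct


def decode_text_row(line):
    """ASCII grid row: split on whitespace and convert every token to float."""
    return [float(token) for token in line.split()]


def decode_binary_row(buf, count):
    """Binary row: one struct call unpacks all samples, with no text parsing."""
    return list(struct.unpack(">%dh" % count, buf))


if __name__ == "__main__":
    heights = [12, 15, 230]  # made-up values
    text_row = " ".join(str(h) for h in heights)
    binary_row = struct.pack(">%dh" % len(heights), *heights)
    print(decode_text_row(text_row))
    print(decode_binary_row(binary_row, len(heights)))
```

The text path does a split and a string-to-float conversion per point, while the binary path reads fixed-width integers directly, which is consistent with the gap in processing time between the ASCII datasets and SRTM in the table.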

web...@macfh.co.uk

Apr 11, 2015, 5:38:15 AM
to google-a...@googlegroups.com


On Saturday, April 11, 2015 at 9:29:27 AM UTC+1, web...@macfh.co.uk wrote:


     :-(    GCS is about half the speed of using the local FS.  See the table below.

Actually it's even worse than the table implied, because I'd failed to notice that the test path that goes over open water between two islands (where part of the path therefore had no tiles) had thrown an error in the GCS version.  Once I corrected which exception was trapped, the processing time for this path was anywhere between about 35 and 55 seconds!  Perhaps it would be possible to reduce this significantly by fiddling about with the retry parameters, but there seems little point in experimenting further when the result is still going to be slower anyway.

Incidentally, I got around the original upload problem by deleting the OSLP folder from the previous version, updating it, then adding the OSLP folder to the new version, and updating again.  It was a tiresome bore, and you couldn't do it with a serious app that had to remain live at all times, but as this is only intended to be a demo, it was an acceptable workaround.