Re: [google-appengine] Best strategy


Joshua Smith

Mar 11, 2013, 1:59:11 PM
to google-a...@googlegroups.com
Don't use Java.

Use python (or perhaps Go, but definitely not Java).

What you are describing is a trivial program. Write it the most obvious way, and if you find that you need to optimize it, hire Brandon Wirtz to do it for you.

-Joshua

On Mar 11, 2013, at 1:44 PM, victo...@gmail.com wrote:


Hello, I was wondering if someone could help me choose the best strategy for my application in terms of design. I want to use as few resources as I can and optimize my code. My application is essentially written in HTML and JavaScript. I will use Java to access the datastore (unless there is another, more efficient way?). So essentially I will be storing two main things inside the datastore: a notification status (1 single integer; I am expecting about 200,000 of these to be stored) and content (5,000 ASCII characters each; I am expecting about 200,000 of these as well). They will be accessed quite frequently (most likely 50,000 notifications and content items will be read and updated every 30 mins). I have read much about optimizing from Google here, but I am wondering as well what people would suggest I do to use the least amount of resources, so I can compare.

Thanks!
Cheers


Joshua Smith

Mar 11, 2013, 2:16:08 PM
to google-a...@googlegroups.com
It is a very heavyweight solution to a very lightweight problem.

If you look through the archives here for people complaining about startup time, excessive billing, etc., you will find that they are almost all using Java.

IMHO, the only time it makes sense to use Java on GAE is if you have a massive existing code base, and cannot afford to port it; or you need to maintain it in Java for other reasons and don't want to fork. For a new project on GAE, Java is never the right answer.

-Joshua

On Mar 11, 2013, at 2:03 PM, victo...@gmail.com wrote:

Thanks for the reply Josh, but why not use Java?

Joshua Smith

Mar 11, 2013, 2:43:39 PM
to google-a...@googlegroups.com
GAE Java is nothing like the Java being profiled there. It is not the same VM, and it runs in a completely different manner.

Also, the described application is I/O bound, so the thing being measured there, even if it was something like GAE's VM, would not be relevant to the OP's question.

On Mar 11, 2013, at 2:33 PM, victo...@gmail.com wrote:

I was just looking at benchmarks for the efficiency of both languages, and Java seems to be much more efficient than Python. I looked up some benchmarks on different OSes and machines, and Java came out on top by a lot in every benchmark. For instance: http://benchmarksgame.alioth.debian.org/u64q/benchmark.php?test=all&lang=java&lang2=python3

Jim

Mar 11, 2013, 5:04:04 PM
to google-a...@googlegroups.com
what about a project where you have extensive UI requirements and want to use Google Web Toolkit?  

what about a project where you want to make use of the vast array of commercial and open source Java code that exists already?

what about a project where you may want to port to an in-house enterprise environment down the road?

these are just three scenarios where Java would most definitely be the right answer.  I'm sure there are many others.

Carl Schroeder

Mar 11, 2013, 6:08:35 PM
to google-a...@googlegroups.com
It doesn't matter how efficient Java is or is not at some arbitrary problem. Do not use Java on GAE. It is not ready for prime time. The instance startup scheduler will make your life hell. Use Python or Go.

I just got done porting all of my Java code to Go and Python. I would have others avoid this mistake I made.

However, I am also a connoisseur of schadenfreude. So if you ignore my advice and do go with GAE/Java, be sure to post here about your performance problems so I can feed upon your soul. ;)

On Monday, March 11, 2013 2:38:02 PM UTC-7, victo...@gmail.com wrote:
Thanks for the reply Jim. The things you mentioned below I am not doing, but in terms of reading strings from the data store very frequently and updating them, which is the sole purpose of my app, would you agree in this scenario python is more efficient than Java?

Jeff Schnitzer

Mar 11, 2013, 8:39:40 PM
to Google App Engine
I just can't resist this conversation, sigh.

What do you mean by "efficient"? In GAE-land there is only "more
expensive" and "less expensive". Depending on what you are doing,
Java _could_ be less expensive than Python; lots of algorithmic
manipulation (even just a lot of serialization) will burn
significantly more Python-hours than Java-hours.

But if you go Java, and you have a sophisticated app (i.e. lots of class
files), you will start to bump into long app startup times and
user-facing cold starts, which screw up the UX. Python is not immune
to this problem; it is just less severe there. Go apps appear to be
immune... but then you have to use Go.

The bigger problem is that it sounds like you want to iterate through
50k entities in some sort of aggregation pattern. You may end up
jumping through some hoops to make that perform.
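
To put rough numbers on that workload, here is a minimal Python sketch, not anything from the App Engine SDK. It splits the 50,000 keys from the original post into fixed-size batches and computes the sustained rate needed to finish one pass in 30 minutes. The names `batched` and `plan_cycle` and the batch size of 500 are assumptions made for illustration; real batch limits depend on the API you call.

```python
from itertools import islice

ENTITIES_PER_CYCLE = 50_000   # from the original post
CYCLE_SECONDS = 30 * 60       # "every 30 mins"
BATCH_SIZE = 500              # assumed value; tune to the limits of your API


def batched(keys, size):
    """Yield successive lists of at most `size` keys."""
    it = iter(keys)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def plan_cycle(keys, batch_size=BATCH_SIZE, cycle_seconds=CYCLE_SECONDS):
    """Return (number of batches, entities per second) for one full pass."""
    keys = list(keys)
    batch_count = sum(1 for _ in batched(keys, batch_size))
    rate = len(keys) / cycle_seconds
    return batch_count, rate


if __name__ == "__main__":
    count, rate = plan_cycle(range(ENTITIES_PER_CYCLE))
    print(f"{count} batches, about {rate:.1f} entities/second")
```

Spread evenly, the load works out to fewer than 30 entities per second, so the difficulty is more likely to be in how the pass is scheduled and batched than in raw language speed.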

Jeff


On Mon, Mar 11, 2013 at 5:38 PM, <victo...@gmail.com> wrote:
> Thanks for the reply Jim. The things you mentioned below I am not doing, but
> in terms of reading strings from the data store very frequently and updating
> them, which is the sole purpose of my app, would you agree in this scenario
> python is more efficient than Java?

Alpha Scorpii

Mar 11, 2013, 8:49:06 PM
to google-a...@googlegroups.com
Go is just very efficient, in my experience. Also make use of memcache wherever possible to cache datastore query results.
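
The caching advice can be sketched as a read-through cache. This is only an illustration of the pattern: plain dictionaries stand in for memcache and the datastore so that the example runs anywhere, and the class name `ReadThroughCache` and its methods are hypothetical, not part of any App Engine API.

```python
class ReadThroughCache:
    """Check the cache first; fall back to the store only on a miss."""

    def __init__(self, cache, store):
        self.cache = cache      # stand-in for memcache (a plain dict here)
        self.store = store      # stand-in for the datastore (a plain dict here)
        self.store_reads = 0    # counts how often the slow store is hit

    def get(self, key):
        value = self.cache.get(key)
        if value is None:
            # Cache miss: read from the store and remember the result.
            self.store_reads += 1
            value = self.store.get(key)
            if value is not None:
                self.cache[key] = value
        return value

    def put(self, key, value):
        # Write to the store, then refresh the cached copy so that
        # later reads do not return stale data.
        self.store[key] = value
        self.cache[key] = value


if __name__ == "__main__":
    db = {"note-1": "hello"}
    cached = ReadThroughCache(cache={}, store=db)
    cached.get("note-1")
    cached.get("note-1")
    print(cached.store_reads)
```

With a real cache, entries can be evicted at any time, so the store stays the source of truth and the cache only saves repeated reads.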

Joshua Smith

Mar 12, 2013, 9:30:15 AM
to google-a...@googlegroups.com
Those are all valid, and I put them under the umbrella "or you need to maintain it in Java for other reasons and don't want to fork."

The OP is writing a trivial new program and all it is going to do is read/write datastore. Writing that program in Java on GAE would be foolish.

Writing it in Python would be safe.

Writing it in Go would be interesting, scary, and probably not that big a deal to abandon and switch to Python if Go proves too weird.

-Joshua

Vinny P

Mar 12, 2013, 1:08:26 PM
to google-a...@googlegroups.com
There's a lot of anti-Java sentiment here, which is justifiable considering the problems Java has had and is having on AppEngine. With that said, Java is only a problem if you're loading in a bunch of libraries (classloading is slow on GAE). If all you're doing is manipulating the datastore, you can do that with the low-level API and skip using the Apache PersistenceManager and other datastore abstractions.

I would also recommend that you look into Cloud SQL ( https://developers.google.com/cloud-sql/docs/introduction ). While the datastore is a great place to store data, it can also be very spiky in terms of latency. Cloud SQL IMHO seems to be less spiky, and since your data has a clear schema, SQL can work for you. It might also be cheaper, depending on your use patterns.

-Vinny

Carl Schroeder

Mar 12, 2013, 1:18:24 PM
to google-a...@googlegroups.com
The sentiment is not anti-Java, it is anti GAE/Java. Writing "Hello world" to the datastore takes 4-5 seconds to cold start load on an F2 in GAE/Java. That is on good days, it could take twice as long on stormy days. The more the application grows, the worse this gets.

Why risk 5+ second response times? Wait until Google has fixed the Java instance loading time issue before you embark on writing Java apps for GAE.
FYI, I used the low-level API and no frameworks. Latency was untenable.

Vinny P

Mar 12, 2013, 2:05:21 PM
to google-a...@googlegroups.com

On Tuesday, March 12, 2013 12:18:24 PM UTC-5, Carl Schroeder wrote:
The sentiment is not anti-Java, it is anti GAE/Java. Writing "Hello world" to the datastore takes 4-5 seconds to cold start load on an F2 in GAE/Java. That is on good days, it could take twice as long on stormy days. The more the application grows, the worse this gets.

Why risk 5+ second response times? Wait until Google has fixed the Java instance loading time issue before you embark on writing Java apps for GAE.
FYI, I used the low-level API and no frameworks. Latency was untenable.


I can't say that I share your experiences with GAE/J, I find that I can routinely cold start instances + datastore requests in less than 5 seconds, although I'm recently having issues with memcache slowness...

True, GAE/J has problems. But so do the Python and Go runtimes. I'm subscribed to the google-appengine-go Google group, and I see as many complaints there, as I do on this group about Python/Java.  Giving the OP a balanced view of the pros/cons of each runtime is the point of this thread, and I think it's fair to let him know what he's looking at.

@OP: if you're interested in looking more about Java performance, may I recommend this thread (it's a bit of a long read, to give you fair warning): https://groups.google.com/d/msg/google-appengine/sA3o-PTAckc/okDqu5aE-78J 

Jim

Mar 12, 2013, 4:02:02 PM
to google-a...@googlegroups.com
I would have to defer to others with more recent experience using Python on GAE.  I wrote a very early GAE/Python app when GAE first launched, but then switched to Java when it became available, mostly for the reasons I cited in my earlier post.  My early experience with Python was that it also had significant cold start times and could be a pig, especially with the Django framework.  But like I said, that was over four years ago and I'm sure things have changed a lot since then.



On Monday, March 11, 2013 4:38:02 PM UTC-5, victo...@gmail.com wrote:
Thanks for the reply Jim. The things you mentioned below I am not doing, but in terms of reading strings from the data store very frequently and updating them, which is the sole purpose of my app, would you agree in this scenario python is more efficient than Java?


Carl Schroeder

Mar 12, 2013, 7:36:39 PM
to google-a...@googlegroups.com
This has been documented to death on these forums with resident instances for paid apps and posted logs with missing warmup requests. 

Also, "Routinely" is a very loose description. Does that mean that sometimes your app loads faster than 5 seconds and sometimes slower?
What instance size are you talking? F4s or F1?

If you have stumbled upon something magical that accesses the user & datastore apis and never takes longer than 5 seconds to service requests in an F1, I suggest you post some code here. You would close two open issues for Google and make the people who starred them really happy.

Renzo Nuccitelli

Mar 13, 2013, 2:22:50 PM
to google-a...@googlegroups.com
I am part of team anti "Java/GAE". There are a lot of reasons why I prefer Python, but the main one is the cold start problem already mentioned. The fact that the cold start time is more or less proportional to the number of classes you have in your project makes Java not a first-class citizen on GAE. Just to make it clear, it is not a Java problem; it is a GAE-Java interaction problem.

As for Python, it is not true that it has to have this problem. You can avoid it with the right approach. If you make your code load lazily, cold start is not a problem. I just wrote a small framework, Zenwarch, to do this, and the result is that my cold start is constant and small all the time, even when the number of script files grows.

And even webapp2 has LazyHandlers (http://webapp-improved.appspot.com/guide/routing.html?highlight=routing%20string), so you can avoid the cold start.
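
To illustrate the lazy-loading idea in general terms, here is a small self-contained Python sketch. It is not webapp2's or Zenwarch's implementation: the `LazyHandler` class and the `ROUTES` table are hypothetical names, and standard-library functions stand in for real request handlers. Each route stores only a dotted path, and the module behind it is imported the first time that route is called.

```python
import importlib


class LazyHandler:
    """Resolve a dotted path such as "package.module.func" on first call."""

    def __init__(self, dotted_path):
        self.dotted_path = dotted_path
        self._target = None     # filled in on first use

    @property
    def loaded(self):
        return self._target is not None

    def __call__(self, *args, **kwargs):
        if self._target is None:
            # Import only now, so startup does not pay for unused handlers.
            module_name, _, attr = self.dotted_path.rpartition(".")
            module = importlib.import_module(module_name)
            self._target = getattr(module, attr)
        return self._target(*args, **kwargs)


# Hypothetical routing table; stdlib functions stand in for handlers.
ROUTES = {
    "/sqrt": LazyHandler("math.sqrt"),
    "/dumps": LazyHandler("json.dumps"),
}

if __name__ == "__main__":
    print(ROUTES["/sqrt"].loaded)
    print(ROUTES["/sqrt"](9.0))
    print(ROUTES["/sqrt"].loaded)
```

The cost of importing a handler's module moves from instance startup to the first request that needs it, which keeps startup time roughly constant as the number of handlers grows.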

So, for small projects, use whatever language you like. But if you think your app can get bigger and you want to use Java, use AWS.

timh

Mar 14, 2013, 2:02:51 AM
to google-a...@googlegroups.com
I have also found (in Python) that startup time is much quicker if you are only loading core App Engine modules. Even one non-core module seemed to heavily affect startup times (especially on M/S when GAE was having a bad hair day ;-). I found early on that having a fast path to cached stuff that didn't involve any third-party modules usually meant a very quick startup time even on the worst M/S days. If the cache lookup failed, I would then start the full stack (and that was as lazy as possible, i.e. don't load CRUD form definitions unless the user was logged in).

All these strategies (which could be considered over-optimisations) tended to make even bad days bearable on python with M/S ;-)

I have since moved to HRD which has meant I haven't had to resort to some of these strategies.

T

Vinny P

Mar 14, 2013, 10:13:39 AM
to google-a...@googlegroups.com
Honestly? It really doesn't matter what language you write in. If you were doing something complicated like image processing or heavy mathematical analysis, then we could talk about which libraries are available for each language and decide based on that. If all you're doing is datastore manipulation, any language is more than capable of handling that.

IMO, consider using Go. I've been dabbling in it for simple side projects, and it's quite a fun language. It also seems to be the future of AppEngine.

-Vinny

On Tuesday, March 12, 2013 6:32:43 PM UTC-5, victo...@gmail.com wrote:
Thanks for the replies! I am now more confused than I was before about which language I should use :P I am using the datastore and serving static files, and that is it. Python seems to be better with regard to class loading, but Java seems to be potentially faster for other things and more portable. I think the best route for me would be to use Python because I will be loading the Datastore class MANY times.
