I am trying to get clarification on a statistic given in Brett
Slatkin's Google I/O talk on building scalable apps on App Engine.
This is the talk I am referencing:
http://sites.google.com/site/io/building-scalable-web-applications-with-google-app-engine
In the talk, Brett mentions that writes are limited to about 100 seeks
per second, with the size and shape of your data actually determining
the write throughput of an entity. He then goes on to say that "You can
think of this as being the maximum write throughput for a single
entity on disk."
To me, this statement means that a single entity can be written up to
a maximum of 100 times per second. I would like to confirm that my
interpretation of Brett's statement is correct, and I would
appreciate any input that people have.
Best,
Marc
A put() needs about 3 writes and 1 read, and each write needs at least
10 ms of seek time. Once you add network latency and disk write time,
most entities cannot be written more than 5 times per second.
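
The seek-time arithmetic behind these numbers can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the 10 ms seek time and the 3 writes per put() are the approximate figures quoted in this thread, and the function name is made up for the example. It covers seek time only, so the real ceiling (with network latency and data transfer) is lower still.

```python
# Rough figures quoted in this thread; both are approximations.
SEEK_MS = 10.0        # assumed seek time per disk write, in milliseconds
WRITES_PER_PUT = 3    # approximate number of disk writes per put()


def max_puts_per_second(seek_ms, writes_per_put):
    """Upper bound on put() calls per second from seek time alone."""
    return 1000.0 / (seek_ms * writes_per_put)


# Base case: one seek per write gives 1000 ms / 10 ms writes per second.
raw_writes_per_sec = 1000.0 / SEEK_MS

# A put() that needs several seeks lowers the ceiling accordingly.
puts_ceiling = max_puts_per_second(SEEK_MS, WRITES_PER_PUT)

print("raw disk writes/sec: %.0f" % raw_writes_per_sec)
print("put() ceiling from seeks alone: %.1f/sec" % puts_ceiling)
```

So even before latency is counted, 3 seeks per put() caps a single entity at roughly a third of the 100 writes/sec base case.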
2010/2/3 marcdmarc <marcd...@gmail.com>:
the language is a little tricky here. 风笑雪's response is closer to
reality. brett's argument was about pure disk operations at the lowest
level, i.e., he was speaking solely of max possible write operations
to disk (and not *entity* writes to disk). it's not even possible to
write 100 small entities to disk in a sec because of the overhead of
journaling, indexing, and verification.
i also received some clarification from brett to confirm:
"[My] point in saying that was to illustrate that *base-case* with a
10ms seek time you could do 100 writes/sec, and that doesn't even
include the data transfer time. With data larger than one disk block
and operating system overhead the potential write throughput for a
single entity is way less."
if you want a real measurement, you could call time.time() in 2
places (before and after the writes) to measure write throughput for
your app and get a rough idea. keep in mind that your app runs on
different machines and different disks, so an average over many runs
is the best rough estimate.
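
a minimal sketch of that kind of measurement, assuming you wrap whatever write you care about in a function. the names measure_write_rate and fake_put are made up for this example, and fake_put just sleeps as a stand-in; in a real app you would pass a function that does your actual put().

```python
import time


def measure_write_rate(write_once, n=20):
    """Call write_once() n times; return (writes_per_second, avg_seconds)."""
    start = time.time()              # first time.time() call: before the writes
    for _ in range(n):
        write_once()
    elapsed = time.time() - start    # second time.time() call: after the writes
    if elapsed <= 0:
        # Clock resolution too coarse to measure; report no usable average.
        return float("inf"), 0.0
    return n / elapsed, elapsed / n


def fake_put():
    """Hypothetical stand-in for a real write; just sleeps for 5 ms."""
    time.sleep(0.005)


rate, avg = measure_write_rate(fake_put, n=10)
print("%.1f writes/sec, %.1f ms per write" % (rate, avg * 1000))
```

run it several times (and across several requests) and average the results, since each request may land on a different machine.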
hope this helps!
-- wesley
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"Core Python Programming", Prentice Hall, (c)2007,2001
"Python Fundamentals", Prentice Hall, (c)2009
http://corepython.com
wesley.j.chun :: wesc...@google.com
developer relations :: google app engine