Thank you all for your interesting feedback.
The "JSON in datastore" solution seems quite well suited to my needs.
I will keep the mixed/combo approach in mind in case searching
becomes difficult with JSON storage.
In my case I think that instead of having all data in both formats,
JSON and datastore classes, it could be better to keep the main data
in JSON format and have a partial copy in a few individual records for
searching and reporting purposes.
Even if CPU usage is a little higher with a longer JSON string to
convert, I find it useful to reduce the datastore classes' complexity
without extra redundancy. And I don't really need to search that data
frequently. So this is the way to go.
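
Roughly, the idea could be sketched like this in Python. This is only an illustration under assumptions: plain dicts stand in for the datastore, and the names main_store, index_store, SEARCHABLE_FIELDS and the sample fields are made up for the example, not taken from any real API.

```python
import json

# Hypothetical in-memory stand-ins for two datastore kinds (assumption:
# a real app would use datastore entities instead of dicts).
main_store = {}    # key -> JSON string holding the full record
index_store = {}   # key -> small dict with only the searchable fields

SEARCHABLE_FIELDS = ("author", "year")  # hypothetical field names


def save_record(key, record):
    """Store the full record as JSON and copy a few fields for searching."""
    main_store[key] = json.dumps(record)
    index_store[key] = {f: record.get(f) for f in SEARCHABLE_FIELDS}


def load_record(key):
    """Normal read path: one fetch plus one JSON decode."""
    return json.loads(main_store[key])


def search(field, value):
    """Search only the partial copies, then load the matching full records."""
    keys = sorted(k for k, idx in index_store.items() if idx.get(field) == value)
    return [load_record(k) for k in keys]


save_record("r1", {"author": "david", "year": 2009, "body": "long text", "tags": ["a", "b"]})
save_record("r2", {"author": "robert", "year": 2009, "body": "other text", "tags": []})
```

Only the fields listed in SEARCHABLE_FIELDS are duplicated, so the redundancy stays small and the bulk of the data lives in the JSON string.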
Thanks again,
David
On 11 Nov, 00:06, Robert Kluin <robert.kl...@gmail.com> wrote:
> I am using a "mixed" approach. When I receive data I process it (by
> validating inputs and updating the related entity's statistics), then when I
> compute aggregates I create and store a JSON object containing individual
> records with them.
>
> Compared to individual entities, I have seen only a negligible decrease in
> write performance (a few extra CPU cycles used and no change in API time).
> I see a huge increase in read performance since 95% of the time I can
> directly use the entities with the JSON data. I only use the individual
> records for "advanced" searching and an infrequent report.
>
> Even when needing to display individual records I have found deserializing
> the JSON and writing fields out to be as quick as (or quicker than!) fetching
> the individual records from the datastore. I have about 5 individual
> records bundled into 1 JSON object. I tested both methods' performance using
> several hundred records of each format (identical data in the proper formats
> for each method).
>
> Robert
>
>
>
> On Tue, Nov 10, 2009 at 3:14 PM, Paul Kinlan <paul.kin...@gmail.com> wrote:
> > Hi,
>
> > I am doing something similar at the moment on http://www.ahoyo.com. We
> > parse feeds and aggregate them into a canonical JSON form that can be read
> > directly by our client applications. Pre-aggregating the data feed as soon
> > as we poll it or receive a PubSubHubbub notification, rather than computing it
> > when the client requests the data, allows us to have a very speedy HTTP
> > handler (this is important because it is the touch point for our users). We
> > aren't memcaching the data at the moment, but it is very simple to add in
> > and should save us a lot of datastore time on popular client applications.
>
> > There is very little processing effort required to give the data to our
> > clients, so the cost should be predictable per user of our system. If we
> > didn't precompute the data, the performance of the client applications would
> > depend on the datastore query, the quantity of data and the sort order (and
> > for popular apps it would end up costing us a lot of money).
>
> > There are downsides: none of the data that is JSON-formatted is
> > searchable. But if you can live with that, your solution is pretty much what
> > we do.
>
> > Paul
>
> > 2009/11/10 davidgm <david.guyonmar...@gmail.com>