Datastore Enhancement ideas in SDK


Ugorji Nwoke

Sep 28, 2011, 10:06:35 AM
to google-ap...@googlegroups.com

Go on App Engine currently provides us with a very nice ORM-like solution (it marshals/unmarshals data between the datastore and typed structs or regular maps for us). Working with it and Go has been very exciting - I feel like one of the first kids to discover the candy store ;)

There are some features which would be quite compelling for the community at large if they were built into the datastore feature set. I needed and built these while working on my Java App Engine application, and always envied Python App Engine folks because they had these provided out of the box (see ndb). I'm hoping that we can enjoy the same with the Go runtime (it's already close).

These include:
- "Optional" Integrated caching: L1 (request-scoped, in-process) and L2 (Memcache)
- Embedded Types (stored as . separated columns)
  Include support for storing maps of primitives to primitives (2 columns: fieldName and fieldName_)
- Alternate Datastore Column Names for fields
- Callbacks: preSave, postLoad. Also allow the app to reject a load/save request
- Functions for decisions: store/index this property? 
- Polymorphic Storage/Queries 

I've tried to make my case for each of these in-depth at http://blog.ugorji.net/2011/09/datastore-enhancement-for-go-language.html

Could you folks please take a look at it and discuss? I'm hoping that these could be built in, as most of the work to support them is already in place in the code. Doing it as a layer on top would lead to a lot of duplicated work.

Andrew Gerrand

Sep 28, 2011, 1:04:18 PM
to google-ap...@googlegroups.com
I would love to have some of these features, particularly automatic
caching. It would certainly save much boilerplate in typical apps.

The Go App Engine team is small. Right now our focus is on stability,
ease of use, improved API support, and bringing the Go runtime out of
"experimental" status. The features you've described are desirable but
are hardly core requirements, so it's unlikely we'll have the time to
work on such things for a while.

However, most of the stuff you describe here could be built on top of
the existing datastore API. If you're up for it, I would encourage you
to try implementing it yourself.

Andrew

Ugorji Nwoke

Sep 28, 2011, 1:36:23 PM
to google-ap...@googlegroups.com
Thanks Andrew.

I understand the need to focus on the core requirements now, and agree that these features are not core requirements (as they can be built atop the SDK). 

I didn't want to create a whole new API on top of the really good one provided by the SDK and possibly duplicate work being built into the SDK. Since I now know the team won't be working on it in the near future, I'll start working on it.



David Symonds

Sep 28, 2011, 1:54:48 PM
to google-ap...@googlegroups.com
On Wed, Sep 28, 2011 at 2:06 PM, Ugorji Nwoke <ugo...@gmail.com> wrote:

> - Alternate Datastore Column Names for fields

> - Functions for decisions: store/index this property?

These two are on our short-term roadmap already.


Dave.

Ugorji Nwoke

Sep 28, 2011, 2:21:28 PM
to google-ap...@googlegroups.com
Thanks David.

That's really great. I won't work on those then.

Can we have a compromise for the embedded types? I think many apps will have fields which are not a primitive (num, bool, string, []byte, key) or a slice of one. If a field is not one of these but is exported, can the SDK store it as a compressed gob? This way, the only limitation is that it cannot be queried on, but it can be stored/retrieved and used in code. It's hard to build just this feature by itself without rebuilding most of the functionality that the SDK already provides.

The preSave/postLoad should be easy to add. It will involve creating a mirror wrapper API over the datastore package functions (which I was hoping to avoid). It sounds like a simple interface with niladic methods which can be called on the struct. 
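
A minimal sketch of what such a callback wrapper might look like. The names `PreSaver`, `PostLoader`, `SaveWithHooks` and `LoadWithHooks` are hypothetical and not part of the SDK, and `storeFunc` merely stands in for the real datastore put/get calls:

```go
package main

import (
	"errors"
	"fmt"
)

// PreSaver and PostLoader are hypothetical callback interfaces (not SDK types).
// Returning an error lets the app reject the save or load.
type PreSaver interface {
	PreSave() error
}

type PostLoader interface {
	PostLoad() error
}

// storeFunc stands in for the real datastore call (an assumption for this sketch).
type storeFunc func(v interface{}) error

// SaveWithHooks runs PreSave if v implements it, and skips the put on error.
func SaveWithHooks(v interface{}, put storeFunc) error {
	if h, ok := v.(PreSaver); ok {
		if err := h.PreSave(); err != nil {
			return err
		}
	}
	return put(v)
}

// LoadWithHooks performs the get, then runs PostLoad if v implements it.
func LoadWithHooks(v interface{}, get storeFunc) error {
	if err := get(v); err != nil {
		return err
	}
	if h, ok := v.(PostLoader); ok {
		return h.PostLoad()
	}
	return nil
}

// User is an example entity used only for illustration.
type User struct {
	Name   string
	Loaded bool
}

func (u *User) PreSave() error {
	if u.Name == "" {
		return errors.New("user: empty name")
	}
	return nil
}

func (u *User) PostLoad() error {
	u.Loaded = true
	return nil
}

func main() {
	noop := func(v interface{}) error { return nil }
	fmt.Println(SaveWithHooks(&User{}, noop))            // rejected by PreSave
	fmt.Println(SaveWithHooks(&User{Name: "ada"}, noop)) // accepted
}
```

Structs that implement neither interface pass straight through, so the wrapper can be adopted one type at a time.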

The caching is the beast, but a side-cache can be built for it. I'll work on that first.
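
One way to picture the side-cache is a read-through lookup that tries L1, then L2, then the datastore. This is only a sketch: the `Cache` interface, `mapCache` and `SideCache` are assumed names, and a real L2 would wrap Memcache rather than a map.

```go
package main

import "fmt"

// Cache is a hypothetical minimal interface; an L2 implementation could wrap Memcache.
type Cache interface {
	Get(key string) ([]byte, bool)
	Set(key string, val []byte)
}

// mapCache is an in-process cache, usable as a request-scoped L1.
// It is not safe for concurrent use.
type mapCache map[string][]byte

func (m mapCache) Get(k string) ([]byte, bool) { v, ok := m[k]; return v, ok }
func (m mapCache) Set(k string, v []byte)      { m[k] = v }

// SideCache checks L1, then L2, and only then calls Load (the datastore fetch).
type SideCache struct {
	L1, L2 Cache
	Load   func(key string) ([]byte, error)
}

// Get returns the value for key, filling the faster levels on the way back.
func (c *SideCache) Get(key string) ([]byte, error) {
	if v, ok := c.L1.Get(key); ok {
		return v, nil
	}
	if v, ok := c.L2.Get(key); ok {
		c.L1.Set(key, v)
		return v, nil
	}
	v, err := c.Load(key)
	if err != nil {
		return nil, err
	}
	c.L2.Set(key, v)
	c.L1.Set(key, v)
	return v, nil
}

func main() {
	loads := 0
	c := &SideCache{
		L1: mapCache{},
		L2: mapCache{},
		Load: func(key string) ([]byte, error) {
			loads++
			return []byte("value for " + key), nil
		},
	}
	c.Get("k1")
	c.Get("k1") // served from L1, no second load
	fmt.Println("datastore loads:", loads)
}
```

Invalidation on writes and deletes is left out here; a real side-cache would need to clear or update both levels whenever an entity is saved.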

Thanks again for your quick responses. Much appreciated.

Kyle Lemons

Sep 28, 2011, 4:17:51 PM
to google-ap...@googlegroups.com
> Can we have a compromise for the embedded types? I think many apps will have fields which are not a primitive (num, bool, string, []byte, key) or a slice of one. If a field is not one of these but is exported, can the SDK store it as a compressed gob? This way, the only limitation is that it cannot be queried on, but it can be stored/retrieved and used in code. It's hard to build just this feature by itself without rebuilding most of the functionality that the SDK already provides.

One approach for a "translation" between a struct with non-primitive types and the datastore API which could be implemented by a client (as well as renaming of fields, though it sounds like this is on the roadmap) would be to utilize the ability of the datastore API to store/retrieve map[string]interface{}.  You'd reflect over the type and build the map, then use that for the actual datastore operation.  You could do gob, json, whatever sort of marshaling and compression you wanted on things that the datastore doesn't handle natively, and store them as []byte.  (I'm not saying this would be fast though.)  If you had this layer in place, you could also use that as a place to implement "transparent" memcache.
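
A rough sketch of that translation layer, using only the standard library. `toPropertyMap` and `Profile` are made-up names, the set of kinds treated as "native" is a simplification chosen for illustration, and JSON is picked arbitrarily as the fallback encoding:

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// toPropertyMap reflects over a struct (or pointer to struct) and builds a
// map[string]interface{}. Fields of a few simple kinds are copied as-is;
// everything else is marshaled to JSON and stored as []byte.
func toPropertyMap(v interface{}) (map[string]interface{}, error) {
	rv := reflect.Indirect(reflect.ValueOf(v))
	if rv.Kind() != reflect.Struct {
		return nil, fmt.Errorf("toPropertyMap: want struct, got %s", rv.Kind())
	}
	out := make(map[string]interface{})
	rt := rv.Type()
	for i := 0; i < rt.NumField(); i++ {
		f := rt.Field(i)
		if f.PkgPath != "" { // skip unexported fields
			continue
		}
		fv := rv.Field(i)
		switch fv.Kind() {
		case reflect.Bool, reflect.String, reflect.Int, reflect.Int64, reflect.Float64:
			out[f.Name] = fv.Interface()
		default:
			if b, ok := fv.Interface().([]byte); ok {
				out[f.Name] = b // already a byte slice, keep untouched
				continue
			}
			b, err := json.Marshal(fv.Interface())
			if err != nil {
				return nil, err
			}
			out[f.Name] = b
		}
	}
	return out, nil
}

// Profile is an example type used only for illustration.
type Profile struct {
	Name   string
	Age    int
	Tags   map[string]string // not handled natively, so stored as JSON bytes
	secret string
}

func main() {
	m, err := toPropertyMap(Profile{Name: "ada", Age: 36, Tags: map[string]string{"lang": "go"}})
	if err != nil {
		panic(err)
	}
	fmt.Printf("Name=%v Age=%v Tags=%s\n", m["Name"], m["Age"], m["Tags"])
}
```

The reverse direction (map back to struct) would mirror this loop, and the same layer is where a field-renaming table or a memcache lookup could be slotted in.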

~K

Ugorji Nwoke

Sep 28, 2011, 4:35:58 PM
to google-ap...@googlegroups.com
Hi Kyle,

Thanks for your suggestion. I totally agree. That is the approach I was going to take if the team wasn't looking to implement this in the short term. 

I looked at the Go SDK code, and a lot of the work has already been implemented there. There's code that goes from the Proto to a struct or a map. I didn't want to duplicate the work and build another layer unless I had to.

As I looked through the 6 features, building a layer/wrapper over the appengine package became less necessary.
- (short-term roadmap) Alternate Datastore Column Names for fields
- (short-term roadmap) Functions for decisions: store/index this property?
- (use side-cache) "Optional" Integrated caching: L1 (request-scoped, in-process) and L2 (Memcache)
- (can work around it) Callbacks: preSave, postLoad. Also allow the app to reject a load/save request
- (nice to have) Polymorphic Storage/Queries (can live without it)
- (really important) Embedded Types (stored as . separated columns)

The really important one that required a wrapper would be Embedded Types, but then I'd have to duplicate the work done by the SDK for alternate column names, functions for store/index, proto-to-struct codec, etc. On further thought, if I could live without indexing values in the non-primitive fields, then storing those fields to and retrieving them from a gob would suffice.

That's why I asked for the compromise of storing unsupported fields as a gob, instead of returning an error. It looks like an easy change in the SDK.
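
For illustration, the compressed-gob round trip for a single unsupported field could look roughly like this. `encodeGob`, `decodeGob` and the `Address` type are hypothetical, and zlib is just one possible choice of compression:

```go
package main

import (
	"bytes"
	"compress/zlib"
	"encoding/gob"
	"fmt"
)

// Address is a hypothetical non-primitive field type used only for illustration.
type Address struct {
	Street string
	City   string
}

// encodeGob serializes v with encoding/gob and compresses the result with zlib.
// The returned []byte is what would be stored as an unindexed property.
func encodeGob(v interface{}) ([]byte, error) {
	var buf bytes.Buffer
	zw := zlib.NewWriter(&buf)
	if err := gob.NewEncoder(zw).Encode(v); err != nil {
		return nil, err
	}
	// Close flushes the compressed stream; without it the data is truncated.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeGob reverses encodeGob, filling the value that out points to.
func decodeGob(data []byte, out interface{}) error {
	zr, err := zlib.NewReader(bytes.NewReader(data))
	if err != nil {
		return err
	}
	defer zr.Close()
	return gob.NewDecoder(zr).Decode(out)
}

func main() {
	blob, err := encodeGob(Address{Street: "1 Main St", City: "Lagos"})
	if err != nil {
		panic(err)
	}
	var got Address
	if err := decodeGob(blob, &got); err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes stored, decoded %+v\n", len(blob), got)
}
```

As noted above, the trade-off is that a value stored this way is opaque to the datastore and cannot be used in queries.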

Ugorji Nwoke

Oct 4, 2011, 5:05:32 PM
to google-ap...@googlegroups.com
Just wanted to give an update on this. I also added this update to the bottom of my blog post (http://blog.ugorji.net/2011/09/datastore-enhancement-for-go-language.html).

The App Engine team said that they already had two of these on the short-term roadmap:
- Alternate Datastore Column Names for fields
- Functions for decisions: store/index this property?

These are two of the more important ones. The others can be worked around using application-defined conventions and wrapper methods that expect the convention. I've built support for the other 4 features for my application in about 400 lines of Go code, which doesn't duplicate but builds upon and depends on support provided by the SDK (including the 2 features on the roadmap). This shows that the team made the right decision in picking these 2 to support from the jump.

Thanks folks for an awesome SDK. I really appreciate the steady (not rushed) way that APIs are introduced, ensuring they are well thought out, idiomatic and bloat-free, and hit all the right sweet spots. (Notwithstanding my seeming impatience sometimes ;)

Rodrigo Moraes

Oct 18, 2011, 12:15:59 PM
to google-appengine-go
On Sep 28, 12:06 pm, Ugorji Nwoke wrote:
> - Embedded Types (stored as . separated columns)

Hey,
Did you start working on this? I'd adopt the ndb approach literally.
For example:

type B struct {
    F1 string
    F2 int
    F3 bool
}

type A struct {
    F4 B
}

When saving struct A, the protocol buffer would store properties named
"F4.F1", "F4.F2" and "F4.F3": nested structs are just flattened on
encoding, and expanded on decoding. Is this what you had in mind?
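
The flattening half of this could be sketched with reflection as below. `Flatten` is a hypothetical helper rather than anything in the SDK or ndb; it assumes a struct or pointer to struct, and it would need special cases for struct types that should stay atomic (time.Time, for example):

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

type B struct {
	F1 string
	F2 int
	F3 bool
}

type A struct {
	F4 B
}

// flatten walks the struct value rv and records every exported leaf field
// under a dotted name such as "F4.F1". Nested structs are recursed into.
func flatten(prefix string, rv reflect.Value, out map[string]interface{}) {
	rt := rv.Type()
	for i := 0; i < rt.NumField(); i++ {
		f := rt.Field(i)
		if f.PkgPath != "" { // skip unexported fields
			continue
		}
		name := prefix + f.Name
		fv := rv.Field(i)
		if fv.Kind() == reflect.Struct {
			flatten(name+".", fv, out)
			continue
		}
		out[name] = fv.Interface()
	}
}

// Flatten returns the dotted property names and values for v, which must be
// a struct or a non-nil pointer to a struct.
func Flatten(v interface{}) map[string]interface{} {
	out := make(map[string]interface{})
	flatten("", reflect.Indirect(reflect.ValueOf(v)), out)
	return out
}

func main() {
	props := Flatten(A{F4: B{F1: "x", F2: 7, F3: true}})
	names := make([]string, 0, len(props))
	for n := range props {
		names = append(names, n)
	}
	sort.Strings(names) // map iteration order is random, so sort for display
	for _, n := range names {
		fmt.Printf("%s = %v\n", n, props[n])
	}
}
```

Decoding would do the reverse: split each property name on "." and walk down the struct fields to set the leaf value.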

I'd like to help; I have a fairly good understanding of ndb internals
and think we could port some of those ideas.

-- rodrigo

Ugorji Nwoke

Oct 18, 2011, 4:30:25 PM
to google-ap...@googlegroups.com
Hi Rodrigo,

Yes, I started working on it and have a working implementation for my application. I ended up doing more than just the embedded support, because I also needed support for storing some fields and ignoring others. I'd write up what I have so you see the general idea I took (within the next few days), and can try to tease out the implementation. I think though that we may want to see what the Go team comes up with in like a month, so we can just focus on the differences. We need to know the story with indexed fields, support for optionally storing and/or indexing some field values, etc.


Rodrigo Moraes

Oct 18, 2011, 5:29:20 PM
to google-appengine-go
On Oct 18, 6:30 pm, Ugorji Nwoke wrote:
> I'd write up what I have so you see the general idea I took (within the next few
> days), and can try to tease out the implementation.

Okay.

> I think though that we
> may want to see what the GO team comes up with in like a month, so we can
> just focus on the differences.

I'm in no hurry, but this puts us in an inglorious situation. I wish
that datastore-related development were more open, as happened
with ndb, so that we could follow along, test, report bugs, and
submit patches. The team is small, so maybe this is not possible,
but it leaves us (or me) with little motivation to start
developing a given feature. :-/

-- rodrigo