Objectify 4.0 - big changes coming

Jeff Schnitzer

Nov 13, 2011, 9:37:27 PM
to objectify...@googlegroups.com
It's coming up on two years since Objectify 2.0 was released, which was the last time the public API changed significantly.  Since then the datastore has added several major API features (most notably async operations) and I've learned a *lot* by building several complex applications on the platform.  As I started implementing automatic relationship fetches, I realized that we need to make major backwards-incompatible changes to the Objectify API.

You can read what I have in mind (and why) here:


Assuming this design remains more-or-less intact, this will mean a fair bit of work to upgrade existing applications.  I can provide an adapter that will help, but it won't be perfect - for example, the new Query object is immutable so a linear set of filter() calls won't work.  Hopefully you've already been using a fluent-style query.filter().filter().filter() pattern.
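
A minimal sketch of why a linear set of filter() calls stops working once the query is immutable. `ImmutableQuery` is a hypothetical stand-in written for this illustration, not the actual Objectify class; it only models the fact that filter() returns a new object and leaves the receiver untouched.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for an immutable query; not the real Objectify type.
final class ImmutableQuery {
    private final List<String> filters;

    ImmutableQuery() {
        this(Collections.<String>emptyList());
    }

    private ImmutableQuery(List<String> filters) {
        this.filters = Collections.unmodifiableList(filters);
    }

    // Returns a NEW query with the extra condition; the receiver is unchanged.
    ImmutableQuery filter(String condition, Object value) {
        List<String> copy = new ArrayList<String>(filters);
        copy.add(condition + " " + value);
        return new ImmutableQuery(copy);
    }

    List<String> filters() {
        return filters;
    }
}

class ImmutableQueryDemo {
    public static void main(String[] args) {
        ImmutableQuery q = new ImmutableQuery();

        // Linear style: the results are discarded, so q still has no filters.
        q.filter("age >", 21);
        q.filter("admin =", true);
        System.out.println(q.filters().size());        // 0

        // Fluent style: each call builds on the previous result.
        ImmutableQuery fluent = q.filter("age >", 21).filter("admin =", true);
        System.out.println(fluent.filters().size());   // 2

        // Optional conditions still work as long as the result is reassigned.
        String name = null;
        ImmutableQuery optional = fluent;
        if (name != null) {
            optional = optional.filter("name =", name);
        }
        System.out.println(optional.filters().size()); // 2
    }
}
```

The reassignment pattern at the end is probably the smallest change for code that adds filters conditionally.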

I'm committed to fixing any bugs that surface in v3.1 for the foreseeable future, so you don't need to be in a hurry to upgrade.  Unless you want the new @Fetch feature, of course.

Comments, criticisms, f-you's, etc are welcome.  This isn't carved in stone; I'd love to hear proposals for improvements both minor and major.  I'm halfway through the implementation now and expect to have a long beta period to make sure the API is satisfactory.  I'd like this new API to remain stable for at least another two years :-)

Jeff
P.S. for those paying attention, yes there's definitely some inspiration drawn from Twig 2.0.  I've asked John Patterson if he's interested in collaboration; it might be nice, for example, if we had a common set of annotations in a neutral package (com.googlecode.datastore.annotation?).  Dunno.

tfannon

Nov 13, 2011, 9:41:09 PM
to objectify...@googlegroups.com
I saw that there were some enhancements to JDO with latest releases of GAE, and I was beginning to wonder the other day what would happen if you decided to stop contributing to the library and Objectify either fell behind or stopped working with new releases.   That would really suck.  

That being said, very glad to hear you are continuing work on it!  I really hope Google does something to help make this the mainstream way to access the datastore.   Props to you.  Great work!

-TF

Jon Stevens

Nov 13, 2011, 10:19:33 PM
to objectify...@googlegroups.com, objectify...@googlegroups.com
Jeff and I are building our new business on top of Ofy. It isn't going anywhere anytime soon.

Jon

Missplet on my ipone

Daniel

Nov 14, 2011, 2:40:21 AM
to objectify-appengine
just wanted to say that Objectify is amazing!!!!

I think that changes are a good thing, because eventually they will be
reflected in the speed/ease of development of the application and will
add more features to the library...

keep it up :)

On Nov 14, 5:19 am, Jon Stevens <latch...@gmail.com> wrote:
> Jeff and I are building our new business on top of Ofy. It isn't going anywhere anytime soon.
>
> Jon
>
> Missplet on my ipone
>

Matthew Jaggard

Nov 14, 2011, 4:01:41 AM
to objectify...@googlegroups.com
Hi Jeff,
I have to say that I'm not very sure about the new API (but maybe
it's just in a style I'm not used to) - I really like the existing
one. Things that occur to me are:

1. using ofy.find() for gets doesn't make it clear when you're doing a
get and when you're doing a query - which is quite important to know
because of the underlying datastore implementation speeds / cost.

2. If you're coding in your IDE and think I'd like to do z, x and y -
is there anything to stop you getting the methods in the wrong order?
Is there a wrong order? For example can I do
ofy.find().id(123L).type(Thing.class);
instead of
ofy.find().type(Thing.class).id(123L);
or
ofy.now().put().entities(e1, e2, e3);
instead of
ofy.put().entities(e1, e2, e3).now();

and how do I know which re-orderings are valid and which are not?

3. What are get() and getSafe() and how are they different? Does get()
return a runtime exception instead of a checked exception for "not
found"? Is NotFoundException a checked exception now?

4. Will the new version continue to use a CachingDatastoreService that
can be used outside of Objectify - I think this is really useful and
also less likely to be updated (for improvements - I'm sure bug fixes
will still happen) if it's not used in the main code. Maybe it would
only have a async version now - that would be OK I guess?

I really like that properties will be unindexed by default and session
cache being enabled by default. And using the Map interface to hide
asynchronism is a great idea. I think I prefer "@Indexed" to "@Index"
on a field because to me it's a description of the field rather than
an instruction.

These are just my thoughts, some of them are probably just me not
understanding fully.

Thanks,
Mat.

Gerald Tan

Nov 14, 2011, 4:25:43 AM
to objectify...@googlegroups.com
Hi, my main concern is, will all these still work with gwt-rpc?

Stefano Ciccarelli

Nov 14, 2011, 4:59:43 AM
to objectify...@googlegroups.com
Hi Jeff,

I'm not very convinced of the direction you want to give Objectify.

I like its simplicity, its resemblance to the low-level API and the ability to create POJOs compatible with GWT-RPC. I like that the Keys are always visible because it is in this way that the datastore works.

In my opinion, what is lacking is more powerful lifecycle callbacks, which would probably allow you to create automatic relationship fetches without polluting the current syntax.

I chose to use Objectify, rather than Twig because I like how Objectify works, otherwise I would have chosen Twig.

About the Session Cache, I don't like it. Breaks the transaction isolation and introduces the possibility of side-effects, since it works with the same object instance.

This is my personal opinion.

A user of Objectify since version 1
Stefano

Matthew Jaggard

Nov 14, 2011, 5:33:37 AM
to objectify...@googlegroups.com
"About the Session Cache, I don't like it. Breaks the transaction
isolation and introduces the possibility of side-effects, since it
works with the same object instance."

I'm not sure this is entirely true. I thought that the session cache
was not used for transactions. Presumably it would be possible for
Objectify to create a new object for each call to get() regardless of
whether the session cache is used - although it does complicate the
cache a bit but I think it would then be possible to make it a
"per-JVM" cache instead of "per-Session" based?

jon

Nov 14, 2011, 7:48:40 AM
to objectify-appengine
* Putting Key<?> in the data model is a PITA sometimes

Not sure what this really means.

* Hard to create tidy data models for jsonification or serialization
to clients; tends to require ugly @Transient fields

Long ago I have decided never to mix "client" and "backend" code in
the same classes. I have 2 sets of classes, one for UI/JSON and the
other for persistence, each being very simple as it only serves one
purpose. People dislike this approach as it requires more code, but I
argue that the extra code is simple code, easy to maintain. Mixing
purposes, on the other hand, quickly leads to convoluted code, even if
it's more compact in terms of LoC.

* Asynchrony shoehorned in

That would be nice.

* Reliance on JPA annotations is problem for some

Is it because the required JPA JAR fattens up our apps?

* Automatic fetching of object graphs

This sounds useful, though I don't have any immediate use case for it.

* Session cache enabled by default

Not sure if I like this. Correct me if I'm wrong, but there's a low-ish
memory limit for each application instance, isn't there? The
quicker an app instance reaches this limit, the more frequently GAE
has to replace a bloated app instance with a fresh one, which would
lead to more users experiencing a cold start, I assume.

Thanks for your great work, Jeff.

On Nov 14, 9:33 pm, Matthew Jaggard <matt...@jaggard.org.uk> wrote:
> "About the Session Cache, I don't like it. Breaks the transaction
> isolation and introduces the possibility of side-effects, since it
> works with the same object instance."
>
> I'm not sure this is entirely true. I thought that the session cache
> was not used for transactions. Presumably it would be possible for
> Objectify to create a new object for each call to get() regardless of
> whether the session cache is used - although it does complicate the
> cache a bit but I think it would then be possible to make it a
> "per-JVM" cache instead of "per-Session" based?
>

Jeff Schnitzer

Nov 14, 2011, 9:04:14 AM
to objectify...@googlegroups.com
Thanks for all the feedback everyone - I will try to address everything point-by-point.

On Mon, Nov 14, 2011 at 5:01 AM, Matthew Jaggard <mat...@jaggard.org.uk> wrote:
> Hi Jeff,
> I have to say that I'm not very sure about the new API (but maybe
> it's just in a style I'm not used to) - I really like the existing
> one. Things that occur to me are:
>
> 1. using ofy.find() for gets doesn't make it clear when you're doing a
> get and when you're doing a query - which is quite important to know
> because of the underlying datastore implementation speeds / cost.

This is somewhat deliberate because the lines are getting a little blurry.  For example, Objectify may automatically convert from a regular query to a keys-only query + batch get when the entity has a @Cache annotation.  When @Fetch is involved, queries will involve some amount of batch gets.  I may add some (time-based) cache control to the query itself so your queries might never actually hit the datastore.
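
To make the "keys-only query + batch get" conversion concrete, here is a toy model of the idea. Everything in it (`HybridLoader`, the in-memory maps, the prefix "query") is invented for illustration and says nothing about how Objectify implements it; it only shows that the query yields keys and the entities are then resolved through a cache.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of "keys-only query + batch get through a cache".
class HybridLoader {
    final Map<Long, String> datastore = new LinkedHashMap<Long, String>();
    final Map<Long, String> cache = new HashMap<Long, String>();
    int entitiesReadFromDatastore = 0;

    // Step 1: the query returns only the matching keys.
    List<Long> keysOnlyQuery(String prefix) {
        List<Long> keys = new ArrayList<Long>();
        for (Map.Entry<Long, String> e : datastore.entrySet()) {
            if (e.getValue().startsWith(prefix)) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }

    // Step 2: resolve keys from the cache, fetching only the misses.
    Map<Long, String> batchGet(List<Long> keys) {
        Map<Long, String> result = new LinkedHashMap<Long, String>();
        List<Long> misses = new ArrayList<Long>();
        for (Long key : keys) {
            String hit = cache.get(key);
            result.put(key, hit);          // keeps query order; null marks a miss
            if (hit == null) {
                misses.add(key);
            }
        }
        for (Long key : misses) {
            String value = datastore.get(key);
            entitiesReadFromDatastore++;
            cache.put(key, value);
            result.put(key, value);
        }
        return result;
    }
}

class HybridQueryDemo {
    public static void main(String[] args) {
        HybridLoader loader = new HybridLoader();
        loader.datastore.put(1L, "apple");
        loader.datastore.put(2L, "avocado");
        loader.datastore.put(3L, "banana");

        // First run: both matches are cache misses and come from the datastore.
        System.out.println(loader.batchGet(loader.keysOnlyQuery("a")).values());
        System.out.println(loader.entitiesReadFromDatastore);   // 2

        // Second run: the keys-only query still executes, but the entities
        // themselves are served from the cache.
        loader.batchGet(loader.keysOnlyQuery("a"));
        System.out.println(loader.entitiesReadFromDatastore);   // still 2
    }
}
```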

Furthermore, I'm going to make it easier to switch between ReadPolicy.Consistency.EVENTUAL and ReadPolicy.Consistency.STRONG.  I don't think a lot of people realize this but you can make get() in the HRD weakly consistent and it speeds up the fetch by something like 3X.  You wouldn't want to do this for cached entities (at risk of pulling stale data into the cache) but for noncache data it could be relevant.  And for queries which involve @Fetch, well, the queries are weakly consistent so might as well make non-cached fetches weakly consistent too.  So it's possible for get()s to behave more like queries.
 
> 2. If you're coding in your IDE and think I'd like to do z, x and y -
> is there anything to stop you getting the methods in the wrong order?
> Is there a wrong order? For example can I do
> ofy.find().id(123L).type(Thing.class);
> instead of
> ofy.find().type(Thing.class).id(123L);
> or
> ofy.now().put().entities(e1, e2, e3);
> instead of
> ofy.put().entities(e1, e2, e3).now();
>
> and how do I know which re-orderings are valid and which are not?

The command structure takes care of all this for you.  At any step in the chain when you hit '.' your IDE will show you only the valid options for that point.  I have all of this implemented and it actually works really well.
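
A rough sketch of how a command structure can enforce ordering at compile time: each step returns a narrower interface, so the IDE can only offer calls that are valid at that point. The interface and class names below are assumptions made up for this example, and id() returns the entity directly here purely to keep the sketch short.

```java
import java.util.HashMap;
import java.util.Map;

// Each step of the chain exposes only the calls that make sense next.
interface FindCommand {
    <T> TypedFind<T> type(Class<T> type);   // id() is not offered here
}

interface TypedFind<T> {
    T id(long id);                          // type() is not offered here
}

class InMemoryFind implements FindCommand {
    private final Map<String, Object> store = new HashMap<String, Object>();

    void save(Class<?> type, long id, Object entity) {
        store.put(type.getName() + "#" + id, entity);
    }

    @Override
    public <T> TypedFind<T> type(final Class<T> type) {
        return new TypedFind<T>() {
            @Override
            public T id(long id) {
                return type.cast(store.get(type.getName() + "#" + id));
            }
        };
    }
}

class StagedBuilderDemo {
    public static void main(String[] args) {
        InMemoryFind find = new InMemoryFind();
        find.save(String.class, 123L, "a thing");

        String found = find.type(String.class).id(123L);
        System.out.println(found);

        // find.id(123L).type(String.class);  // does not compile: no id() at this step
    }
}
```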
 
> 3. What are get() and getSafe() and how are they different? Does get()
> return a runtime exception instead of a checked exception for "not
> found"? Is NotFoundException a checked exception now?

get() vs getSafe() is analogous to find() vs get() in the current API.  get() returns the value or null, getSafe() never returns null but throws NotFoundException (which remains a RuntimeException).
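
The difference can be sketched in a few lines. `SimpleRef` and this `NotFoundException` are illustrative miniatures defined here, not the Objectify classes; they only model the null-versus-exception contract described above.

```java
// Unchecked, so callers are not forced to catch it.
class NotFoundException extends RuntimeException {
    NotFoundException(String message) {
        super(message);
    }
}

// Hypothetical miniature of the Ref idea.
final class SimpleRef<T> {
    private final String key;
    private final T value;   // null means "no such entity"

    SimpleRef(String key, T value) {
        this.key = key;
        this.value = value;
    }

    // Returns the value, or null when nothing was found.
    T get() {
        return value;
    }

    // Never returns null; throws an unchecked exception instead.
    T getSafe() {
        if (value == null) {
            throw new NotFoundException("No entity for key " + key);
        }
        return value;
    }
}

class RefDemo {
    public static void main(String[] args) {
        SimpleRef<String> present = new SimpleRef<String>("Thing(1)", "hello");
        SimpleRef<String> missing = new SimpleRef<String>("Thing(2)", null);

        System.out.println(present.getSafe());   // hello
        System.out.println(missing.get());       // null

        try {
            missing.getSafe();
        } catch (NotFoundException e) {
            System.out.println(e.getMessage());  // No entity for key Thing(2)
        }
    }
}
```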

The advantage of doing this with Ref<?> is that it cuts down on the API explosion.  Part of the problem is that the old API requires a combinatorial explosion of methods; find-style vs get-style, sync vs async, etc.  I would need to add maybe a dozen methods to handle fetching of Refs and unfetched-entities-as-keys.  Let me explain.

Let's say you fetch an instance of an entity like this:

class Thing {
   @Id Long id;
   @Fetch("someGroup") OtherThing other;
}

If you aren't fetching someGroup, the 'other' field will be an entity object with id/parent fields set but otherwise uninitialized.  If you want to fetch it, we need methods on Objectify like this:

<T> T get(T entity) throws NotFoundException;
<T> T find(T entity);
<T> Map<Key<T>, T> get(T... entities);
<T> Map<Key<T>, T> get(Iterable<T> entities);

Plus the async versions: 

<T> Result<T> get(T entity);
<T> Result<T> find(T entity);
<T> Result<Map<Key<T>, T>> get(T... entities);
<T> Result<Map<Key<T>, T>> get(Iterable<T> entities);

 ...this is getting out of control.  The new "command builder" pattern keeps the API load short; you start with find(), put(), or delete() and go from there.  The IDE helps out at each step.

> 4. Will the new version continue to use a CachingDatastoreService that
> can be used outside of Objectify - I think this is really useful and
> also less likely to be updated (for improvements - I'm sure bug fixes
> will still happen) if it's not used in the main code. Maybe it would
> only have a async version now - that would be OK I guess?

Absolutely, this will not change.  Literally, this code is not affected by the changes at all.  As is currently the case, Objectify itself only uses the async version, but we'll continue to support the sync version.  Everyone should use caching; this is the easiest way to make your app cheap.
 
> I really like that properties will be unindexed by default and session
> cache being enabled by default. And using the Map interface to hide
> asynchronism is a great idea. I think I prefer "@Indexed" to "@Index"
> on a field because to me it's a description of the field rather than
> an instruction.

I have to admit that I'm somewhat on the fence about @Index vs @Indexed but overall it does seem like the annotations are instructions.  Look at this:

@Entity
class Thing {
   @Id Long id;
   @Index(IfTrue.class) boolean admin;
   @Embed Foo foo;
   @Fetch Bar bar;
   @AlsoLoad("old") String newer;
}

It also makes more linguistic sense; you @Embed the class; the class itself is embedded.  I have this problem right now where I have a base class which I literally want to call Embedded but instead call it Embed, which is somewhat backwards.

By comparison, this seems a little weird:

@Entity
class Thing {
   @Id Long id;
   @Indexed(IfTrue.class) boolean admin;
   @Embedded Foo foo;
   @Fetched Bar bar;
   @AlsoLoaded("old") String newer;
}

I'd be happy for more feedback.  Admittedly this is a bikeshed problem, but it seems important.  We already work with imperative-style annotations (@AlsoLoad is one, but @Inject is probably the canonical example) so I don't think this is forging new ground.

Jeff

Jeff Schnitzer

Nov 14, 2011, 9:11:26 AM
to objectify...@googlegroups.com
On Mon, Nov 14, 2011 at 5:25 AM, Gerald Tan <woeful...@gmail.com> wrote:
> Hi, my main concern is, will all these still work with gwt-rpc?

Yes.  Everything that works now will continue to work.  It actually gets better:

The ability to use "unfetched" entities as surrogate keys makes Objectify significantly more GWT-friendly.  You no longer need to worry about not being able to create Key<?> on the client-side.  And when you want to pass a whole object graph back, you can pass it as-is without loading up @Transient fields.

For example:

class Thing {
  @Id Long id;
  @Fetch("fat") List<Other> other;
}

Let's say you want to pass a Thing and its Other entities back to the GWT client.  It's now a one-liner:

return ofy.find().fetch("fat").type(Thing.class).id(thingId);

Jeff

Matthew Jaggard

Nov 14, 2011, 10:38:15 AM
to objectify...@googlegroups.com
Yep, that all makes sense and puts my mind at rest a bit that I'll want
to use the new version. I'm still not convinced about using find() for
everything - I want to be completely sure that I'm not running a query
for pretty much all of my application - I literally don't use query()
at all in my main interface at the moment, only in the admin interface
and I want to be sure that I don't break this without noticing. Maybe
the API could make it clear somehow that a query might be performed. I
see what you mean about the blur, but that only applies in one
direction - you don't ever run a query when the user is expecting a
get.

I like the idea of making changing the read policy easier - I did use
the eventual consistency before moving to the CachingDatastoreService -
I think caching will speed up my gets more for now.

I see what you mean about @Inject - it's pretty standard usage so I
guess it's better to go with the rest of the Java community!

Thanks again for all your work. If I ever get the confidence to quit
my job and start my business I'll have less work to do on the
datastore :-D

Mat.

Jeff Schnitzer

Nov 14, 2011, 10:42:15 AM
to objectify...@googlegroups.com
On Mon, Nov 14, 2011 at 5:59 AM, Stefano Ciccarelli <scicc...@gmail.com> wrote:
> Hi Jeff,
>
> I'm not very convinced of the direction you want to give Objectify.
>
> I like its simplicity, its resemblance to the low-level API and the ability to create POJOs compatible with GWT-RPC. I like that the Keys are always visible because it is in this way that the datastore works.

Nothing prevents you from continuing to work this way; Key<?> objects continue to work as before.  You may, however, find the Ref<?> to be more satisfying with GWT.  It wraps a Key<?> and a reference to the actual entity.  This eliminates the need for @Transient fields to send object graphs to the client.

For example:

class Thing {
  @Id Long id;
  Ref<Other> other;
}

Let's say you want to return a Thing + Other graph to the client.  You could put @Fetch (with or without a fetch group name) on 'other' to have Objectify populate the field for you.  Or you could do it manually:

Thing th = ofy.find().type(Thing.class).id(thingId).get();
ofy.find().ref(th.other);
return th;

Now in the GWT client you can simply call th.getOther().get().

> In my opinion, what is lacking is more powerful lifecycle callbacks, which would probably allow you to create automatic relationship fetches without polluting the current syntax.

Keep in mind that as far as entities are concerned, the current syntax remains intact (with the exception of possible annotation renaming).  You can still use Key<?> as you do now.

I understand the desire for more powerful lifecycle callbacks, but this isn't a good solution for automatic fetching.  You can't do automatic fetching entity-by-entity otherwise total RPC overhead will kill you; really you want to group fetches into a minimal number of successive batches.  To do this with lifecycle methods, you would need deep hooks into Objectify's processes much like Hibernate's internal events.  It's certainly possible, but it's not realistic to expect developers to figure out how to translate this into fetches.  There are probably a dozen people in the world who understand Hibernate's internal event model.
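
The RPC-overhead argument can be shown with a toy datastore that counts round trips. `CountingStore` is invented for this sketch; the only point is that one batch call costs a single round trip no matter how many keys it carries, while entity-by-entity loading costs one per entity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy datastore that counts round trips; purely illustrative.
class CountingStore {
    private final Map<Long, String> data = new LinkedHashMap<Long, String>();
    int roundTrips = 0;

    void put(long id, String value) {
        data.put(id, value);
    }

    // One round trip per call, regardless of how many ids are requested.
    Map<Long, String> getAll(List<Long> ids) {
        roundTrips++;
        Map<Long, String> result = new LinkedHashMap<Long, String>();
        for (Long id : ids) {
            if (data.containsKey(id)) {
                result.put(id, data.get(id));
            }
        }
        return result;
    }
}

class BatchFetchDemo {
    public static void main(String[] args) {
        CountingStore store = new CountingStore();
        for (long id = 1; id <= 5; id++) {
            store.put(id, "org-" + id);
        }
        List<Long> wanted = Arrays.asList(1L, 2L, 3L, 4L, 5L);

        // Entity-by-entity, as a per-entity lifecycle callback would do it.
        List<String> oneByOne = new ArrayList<String>();
        for (Long id : wanted) {
            oneByOne.addAll(store.getAll(Arrays.asList(id)).values());
        }
        System.out.println("one by one: " + store.roundTrips + " round trips");

        // Collect all referenced ids first, then fetch them in one batch.
        store.roundTrips = 0;
        Map<Long, String> batch = store.getAll(wanted);
        System.out.println("batched: " + store.roundTrips + " round trip, "
                + batch.size() + " entities");
    }
}
```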

Also, picking and choosing which pieces to load and then doing it in optimal chunks is a nontrivial problem.  I know because we do this right now in the app Jon and I are developing.  It's tedious to code and we've had more than one bug because we didn't populate the right field at the right time.  I really want to say:

ofy.find().fetch("listing").type(Product.class).filter("foo", foo).entities();

...and get back a list of products and their related entities, all set up for what I need when I display a product listing.

> I chose to use Objectify, rather than Twig because I like how Objectify works, otherwise I would have chosen Twig.

I appreciate this and I hope to preserve your existing architectural patterns.  Key<?> is not going away - it's still a critical aspect of Objectify.  Other than syntactic changes to the get/query api and possibly renaming some of the annotations, your app doesn't have to change.

On the other hand, many apps will benefit from being able to fetch object graphs wholesale and the ability to send object graphs across a wire without creating lots of DTOs or manually-populated @Transient fields.  There are things I don't like about Twig and there are things I do like about Twig.  This is one of its strongest features and worth appropriating.  It doesn't intrude on your design if you don't want to use it.
 
> About the Session Cache, I don't like it. Breaks the transaction isolation and introduces the possibility of side-effects, since it works with the same object instance.

I'll respond to Matthew's response.
 
> This is my personal opinion.

I very much appreciate this feedback.
 
> A user of Objectify since version 1

I hope you don't miss OKey<?>, OQuery<?>, and OPreparedQuery<?> ;-)

Jeff

Jeff Schnitzer

Nov 14, 2011, 11:01:05 AM
to objectify...@googlegroups.com
On Mon, Nov 14, 2011 at 6:33 AM, Matthew Jaggard <mat...@jaggard.org.uk> wrote:
> "About the Session Cache, I don't like it. Breaks the transaction
> isolation and introduces the possibility of side-effects, since it
> works with the same object instance."
>
> I'm not sure this is entirely true. I thought that the session cache
> was not used for transactions. Presumably it would be possible for
> Objectify to create a new object for each call to get() regardless of
> whether the session cache is used - although it does complicate the
> cache a bit but I think it would then be possible to make it a
> "per-JVM" cache instead of "per-Session" based?

A single Objectify instance contains, optionally:
  * A single transaction
  * A session cache

This means you can never have a session cache that crosses multiple transactions; the Objectify instance *is* the session and *is* the transaction.  The session cache can't break txn isolation.[1]

Having a session cache definitely opens up the opportunity for side effects.  That's actually part of the value; if you load an object in different parts of your code you often want to get the exact same object that you had before.
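
A small sketch of the "exact same object" behaviour and the side effects it implies. `ToySession` is a made-up class for this example: one instance owns one cache, and loading the same id twice through it yields the identical object, while a second session gets its own copy.

```java
import java.util.HashMap;
import java.util.Map;

// Toy session: one instance owns one cache. Names are illustrative only.
class ToySession {
    private final Map<Long, StringBuilder> backingStore;
    private final Map<Long, StringBuilder> cache = new HashMap<Long, StringBuilder>();

    ToySession(Map<Long, StringBuilder> backingStore) {
        this.backingStore = backingStore;
    }

    StringBuilder load(long id) {
        StringBuilder cached = cache.get(id);
        if (cached == null) {
            // Simulate deserialization: every store read yields a fresh copy.
            StringBuilder stored = backingStore.get(id);
            if (stored == null) {
                return null;
            }
            cached = new StringBuilder(stored);
            cache.put(id, cached);
        }
        return cached;
    }
}

class SessionCacheDemo {
    public static void main(String[] args) {
        Map<Long, StringBuilder> store = new HashMap<Long, StringBuilder>();
        store.put(1L, new StringBuilder("original"));

        ToySession session = new ToySession(store);
        StringBuilder first = session.load(1L);
        StringBuilder second = session.load(1L);

        // Same session: both loads return the identical object, so a change
        // made through one reference is visible through the other.
        first.append("-edited");
        System.out.println(first == second);   // true
        System.out.println(second);            // original-edited

        // A different session has its own cache and its own copy.
        ToySession other = new ToySession(store);
        System.out.println(other.load(1L));    // original
    }
}
```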

Should session cache be the default?  Should it be the default always or should it default to false when a transaction starts?  I have two data points:

 * Hibernate enables session cache by default, both for transactions and not.

 * My apps all turn on the session cache.  I only explicitly disable the session cache when iterating through large datasets which would otherwise overwhelm the java heap.

This suggests to me that session caching should be enabled by default.  But I'm not certain of this, and more opinions / data points help.

[1] Actually, there is a way, but it's somewhat convoluted.  If you load an entity instance out of one Objectify instance and put() it in another instance, you can get one object in multiple session caches.  This should be a rare case and looks like a code smell, especially if there are transactions involved.

Jeff

Jeff Schnitzer

Nov 14, 2011, 11:39:33 AM
to objectify...@googlegroups.com
On Mon, Nov 14, 2011 at 8:48 AM, jon <jonni...@gmail.com> wrote:
> * Putting Key<?> in the data model is a PITA sometimes
>
> Not sure what this really means.

There are legitimate problems with using keys as persistent fields.  If you are a purist (I'm not, but some people are), this does "contaminate" the data model with GAE-specific classes.  My bigger complaint is that when building an object graph you have to work around the Key<?> field.  For example, here's something from our app:

class Organization {
   @Id Long id;
   ...fields...
}

class Event {
   @Parent Key<Organization> org;
   @Id Long id;
   ...fields...
   @Transient Organization orgEntity;
}

I want to get an Event with its Organization.  This is awkward, especially because now I have getOrg() and getOrgEntity() methods.  Even without automatic fetching, client code is much cleaner with an interface like this:

class Event {
   @Parent Organization org;
   @Id Long id;
   ...fields...
}

> * Hard to create tidy data models for jsonification or serialization
> to clients; tends to require ugly @Transient fields
>
> Long ago I have decided never to mix "client" and "backend" code in
> the same classes. I have 2 sets of classes, one for UI/JSON and the
> other for persistence, each being very simple as it only serves one
> purpose. People dislike this approach as it requires more code, but I
> argue that the extra code is simple code, easy to maintain. Mixing
> purposes, on the other hand, quickly leads to convoluted code, even if
> it's more compact in terms of LoC.

I find this varies heavily from application to application.  If you use GWT or any other RMI-like java-to-java serialization system (eg Hessian), I tend to agree.  There's too much risk of inappropriate data being serialized to the client, and you end up with an anemic domain model because you're limited by what you can use on the client.

My Mobcast and Similarity (GWT apps) both follow this pattern; I was very conservative in what I would share with the client.  The client APIs use carefully-defined DTOs and it works great.

Voost (Jon's and my latest project) is quite different.  It uses a mix of traditional templating (Cambridge Templates) and javascript/ajax calling REST/JSON methods.  It works really well to use the entity objects directly in the "client":

  * In direct templates, there is no risk of data exposure.  Just don't call methods and render data you don't want exposed.

  * We use Jackson to render JSON.  Jackson has several facilities to render various JSON "views" of objects; using a combination of @JsonAutoDetect(JsonMethod.NONE) and @JsonFilter we turn our entity objects into precisely the JSON form we want with just a couple annotations.

In contrast, creating DTOs for Voost would be pure hell.  We have 25+ entities including several polymorphic hierarchies.  We expect this number to double or triple as the app evolves and we expand our core product into new niches with different data requirements.  Maintaining a parallel hierarchy of DTOs would be miserable.

> * Reliance on JPA annotations is problem for some
>
> Is it because the required JPA JAR fattens up our apps?

I'm not worried about jar sizes.  There are three problems, ordered from least important to most severe:

1) Lots of people think "I'll use Objectify and delete the JPA jar" and then complain when it doesn't work.

2) Some people try the JDO annotations by mistake.  It's not intuitively obvious that you should use JPA, after all Objectify isn't a JPA provider.

3) We use the JPA annotations until we discover that we need to alter it slightly (typically by adding an attribute).  But we can't change the JPA annotation so we add a local duplicate and now have to support both annotations.  This is ultimately just confusing for users and painful to document.
 
> * Session cache enabled by default
>
> Not sure if I like this. Correct me if I'm wrong, but there's a low-ish
> memory limit for each application instance, isn't there? The
> quicker an app instance reaches this limit, the more frequently GAE
> has to replace a bloated app instance with a fresh one, which would
> lead to more users experiencing a cold start, I assume.

The session cache lives inside each Objectify instance; when you throw away the instance and create a new one you throw away the session cache.  It's very similar to Hibernate in this regard.

As with Hibernate, there's really only a risk of hitting heap problems if you iterate through lots of data.  But this is a fair complaint.  Should Objectify, by default:

 1) Perform better for everyone, but possibly cause heap problems when iterating through large numbers of entities.  Or:

 2) Perform worse for everyone, but never have heap issues (well, at least not by iterating).

I don't know the answer.  We went with #2 because Objectify started out without a session cache and we didn't want to break any existing code.  Hibernate/JPA goes with #1, and most people probably come from that background.  This won't be an issue for anyone who reads the manual or reads the javadocs, because they will explicitly choose session cache or not session cache.

Jeff

Aidan O'Kelly

Nov 14, 2011, 6:08:13 PM
to objectify...@googlegroups.com
I like it, fetch-groups are a great way to get the exact object graph
you need for an operation.  Ref type still represents exactly what is
in the datastore, like Key.
The difference between using Ref<MyThing> and just MyThing isn't
really clear to me though:

   @Fetch({"bigGroup", "smallGroup"})
   SomeThing some;

   @Fetch
   Ref<OtherThing> refToOtherThing;

From design goals: Use Ref<?> when asynchrony cannot be hidden (ie,
returning concrete classes)
I don't get it?! What's the difference between SomeThing & OtherThing here?

Is it a big step further to do 'put-groups'? just to make life easier
when saving an object graph a few levels deep. Perhaps with a
lifecycle callback for dirty detection.

Also, a hook in entity creation to enable injection would be a welcome
feature! A lot of my entities have instance methods with logic, and
need to be injected with services at times.

Jeff Schnitzer

Nov 14, 2011, 7:38:59 PM
to objectify...@googlegroups.com
On Mon, Nov 14, 2011 at 7:08 PM, Aidan O'Kelly <aid...@gmail.com> wrote:
> The difference between using Ref<MyThing> and just MyThing isn't
> really clear to me though:
>
>    @Fetch({"bigGroup", "smallGroup"})
>    SomeThing some;
>
>    @Fetch
>    Ref<OtherThing> refToOtherThing;
>
> From design goals: Use Ref<?> when asynchrony cannot be hidden (ie,
> returning concrete classes)
> I don't get it?! What's the difference between SomeThing & OtherThing here?

In this example, not much.  Some people will prefer the first version and some people (the ones who like Key<?>) will probably prefer the latter.  Also, when you aren't automatically fetching, Refs are easier to update:  put all the refs you want in a List and call ofy.find().refs(list).  Since the Refs are already in place, you don't need to walk the object graph and call setThing() for each of them.

For an example, let's say you wanted to populate the refs of two Things:

ofy.find().refs(thing1.getRefToOtherThing(), thing2.getRefToOtherThing());  // that's it

Now if you wanted to populate the SomeThing field, it's quite a bit more complicated:

Map<Key<SomeThing>, SomeThing> someThings = ofy.find().keys(thing1.getSome(), thing2.getSome());
thing1.setSome(someThings.get(thing1.getSome().getKey()));
thing2.setSome(someThings.get(thing2.getSome().getKey()));

There's another reason for Ref, which is to future-proof code if we decide to enable optional "live" proxies to populate an object graph.  That is, you load an entity and then you can walk through the graph, fetching at each step.  For SomeThing, this could be hidden behind a cglib or javassist proxy - unless the entity is polymorphic.  The only way to have this kind of live proxy for polymorphic entities is to use something like Ref for indirection.

Also note that Ref<?> is what gets returned by the ofy.find().type(Foo.class).id(123) method.  Instead of returning a Foo object directly, it hides the asynchrony behind the Ref<?>.  It means slightly more typing for the common case:

Foo foo =  ofy.find().type(Foo.class).id(123).get();
vs
Foo foo =  ofy.get(Foo.class, 123);

But it makes all the other cases simpler, dramatically reduces the method count on Objectify, and stops me from having to maintain parallel methods on Objectify and AsyncObjectify.  Truth be told, you could easily add the above get(Class, long) method to a wrapper for your own convenience.
 
> Is it a big step further to do 'put-groups'? just to make life easier
> when saving an object graph a few levels deep. Perhaps with a
> lifecycle callback for dirty detection.

Yes, this would be a major step and one I don't feel comfortable with at this time.  Cascading put/delete brings up all kinds of issues in the world of GAE because of transaction constraints.  I don't envy the guys writing GAE's JDO adapter (Max?).  I also don't feel that cascading put/delete really brings a lot of value to the table.

Dirty change detection is interesting but that is also a massive step with major API implications.  The problem with dirty change detection is that it requires an explicit boundary - you have to open a session and close it, at which point changes are synced.  This ends up being a try/finally pattern which gets ugly.

On the other hand, there are definitely times I have wanted dirty change detection, and end up faking it with methods like this:

boolean changed = false;
changed = changed |= obj.setThing1(foo);
changed = changed |= obj.setThing2(bar);
if (changed)
    ofy.put(obj);

It's not elegant.  I'm not sure what the answer is but maybe sometime in the future we will come up with a way to address this that doesn't require everyone to define try/finally boundaries.  Objectify 5.0.

Also, a hook in entity creation to enable injection would be a welcome
feature! A lot of my entities have instance methods with logic, and
need to be injected with services at times.

This is an exceptionally good idea.  I will delegate object creation to an overridable method on ObjectifyFactory so you can delegate to Guice or whatnot.
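
A minimal sketch of what such a hook could look like. SketchFactory, construct(), Greeter and Person are made-up names for illustration, not the Objectify API; a real override would probably ask a Guice Injector for the instance instead of wiring the service by hand.

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-in for a factory that creates entity instances.
class SketchFactory {
    // Default behaviour: call the no-arg constructor reflectively.
    <T> T construct(Class<T> type) {
        try {
            Constructor<T> ctor = type.getDeclaredConstructor();
            ctor.setAccessible(true);
            return ctor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot construct " + type.getName(), e);
        }
    }
}

// Hypothetical service that an entity needs.
class Greeter {
    String greet(String name) { return "hello " + name; }
}

// Hypothetical entity with logic that depends on a service.
class Person {
    String name = "bob";
    transient Greeter greeter;   // not persisted, set by the factory
    String greeting() { return greeter.greet(name); }
}

// Override point: construct normally, then inject by hand.
class InjectingFactory extends SketchFactory {
    private final Greeter greeter = new Greeter();

    @Override
    <T> T construct(Class<T> type) {
        T instance = super.construct(type);
        if (instance instanceof Person) {
            ((Person) instance).greeter = greeter;
        }
        return instance;
    }
}
```

With this shape, new InjectingFactory().construct(Person.class) hands back a Person whose greeter is already set, while the base factory leaves it null.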

Jeff

Jeff Schnitzer

Nov 14, 2011, 7:44:38 PM11/14/11
to objectify...@googlegroups.com
On Mon, Nov 14, 2011 at 8:38 PM, Jeff Schnitzer <je...@infohazard.org> wrote:

boolean changed = false;
changed = changed |= obj.setThing1(foo);
changed = changed |= obj.setThing2(bar);
if (changed)
    ofy.put(obj);

sorry, that should read:

boolean changed = false;
changed |= obj.setThing1(foo);
changed |= obj.setThing2(bar);
if (changed)
    ofy.put(obj);
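
For this to work, each setter has to report whether it actually changed anything. A minimal sketch, with a made-up Thing class (the names are assumptions for illustration only):

```java
import java.util.Objects;

// Hypothetical entity whose setters report whether the value changed.
class Thing {
    private String thing1;
    private String thing2;

    /** @return true if the stored value differed from the new one */
    boolean setThing1(String value) {
        boolean changed = !Objects.equals(thing1, value);
        thing1 = value;
        return changed;
    }

    /** @return true if the stored value differed from the new one */
    boolean setThing2(String value) {
        boolean changed = !Objects.equals(thing2, value);
        thing2 = value;
        return changed;
    }
}

class DirtyCheck {
    // Applies both values and returns true when a put() would be needed.
    static boolean apply(Thing obj, String foo, String bar) {
        boolean changed = false;
        // |= does not short-circuit, so both setters always run
        changed |= obj.setThing1(foo);
        changed |= obj.setThing2(bar);
        return changed;
    }
}
```

Note that writing changed = changed || obj.setThing2(bar) would be a bug here: once changed is true, the second setter would never be called.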

Jeff

Chris

Nov 15, 2011, 9:42:58 AM11/15/11
to objectify-appengine
Hey Jeff,

Most of the changes I personally like.

The only question I have is why are you making Query immutable?

There are certain use cases (REST Query api with params that are not
required...) where having the query object mutable is a great asset.

- Chris

On Nov 14, 7:44 pm, Jeff Schnitzer <j...@infohazard.org> wrote:

Chris

Nov 15, 2011, 9:53:35 AM11/15/11
to objectify-appengine
Another thought: sharing annotations with Twig is great, but what about
Morphia annotation integration?

Morphia was heavily influenced by Objectify, so why not have that team
share the same set of annotations with Twig/Objectify.

com.google.code.nosql-annotations or something

Jeff Schnitzer

Nov 15, 2011, 10:10:51 AM11/15/11
to objectify...@googlegroups.com
I just re-looked at Morphia's annotations and the one major difference is to use @Reference instead of @Fetch.  I have mixed feelings about that... the nice thing about @Fetch is that it parallels the ofy().find().fetch("groupName") method.  Although it could be ofy().find().reference("groupName").

I would certainly be ok with keeping the @Indexed @Embedded etc annotations and using @Reference instead of @Fetch.  Anyone want to vote on this?

Option A:
@Index
@Unindex
@Ignore (maybe instead of @Transient?)
@IgnoreSave (maybe instead of @NotSaved?)
@Fetch (combined with ofy().find().fetch("groupName").etc...)
@AlsoLoad
@Cache
@Serialize
@Embed

Option B:
@Indexed
@Unindexed
@Ignored (maybe instead of @Transient?)
@NotSaved
@Reference (combined with ofy().find().reference("groupName").etc...) (or maybe use @Fetch)
@AlsoLoad (forget @AlsoLoaded, that sucks)
@Cached
@Serialized
@Embedded

Jeff

Jeff Schnitzer

Nov 15, 2011, 10:12:42 AM11/15/11
to objectify...@googlegroups.com
On Tue, Nov 15, 2011 at 10:42 AM, Chris <ritte...@gmail.com> wrote:
Hey Jeff,

Most of the changes I personally like.

The only question I have is why are you making Query immutable?

There are certain use cases (REST Query api with params that are not
required...) where having the query object mutable is a great asset.

Does this matter?  Seems like it would just be the difference between...

if (limitParam != null)
    query.limit(limitParam)

...and...

if (limitParam != null)
    query = query.limit(limitParam)
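
To spell out why the reassignment matters: with an immutable query, the first form silently throws the result away. A minimal sketch, using a made-up SketchQuery class rather than the real Objectify Query interface (all names here are assumptions):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable query: every modifier returns a new object.
final class SketchQuery {
    private final List<String> filters;
    private final Integer limit;   // null means "no limit"

    SketchQuery() {
        this(Collections.<String>emptyList(), null);
    }

    private SketchQuery(List<String> filters, Integer limit) {
        this.filters = filters;
        this.limit = limit;
    }

    SketchQuery filter(String condition) {
        List<String> copy = new ArrayList<>(filters);
        copy.add(condition);
        return new SketchQuery(Collections.unmodifiableList(copy), limit);
    }

    SketchQuery limit(int max) {
        return new SketchQuery(filters, max);
    }

    List<String> filters() { return filters; }
    Integer limit() { return limit; }
}

// Building a query from optional request parameters.
class OptionalParams {
    static SketchQuery build(String color, Integer limitParam) {
        SketchQuery query = new SketchQuery();
        if (color != null)
            query = query.filter("color = " + color);
        if (limitParam != null)
            query = query.limit(limitParam);   // the result must be reassigned
        return query;
    }
}
```

The optional-parameter case stays a handful of if statements; the only thing that changes is the assignment on each line.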

Jeff

Jeff Schnitzer

Nov 15, 2011, 10:29:43 AM11/15/11
to objectify...@googlegroups.com
One more thing to consider before you vote:

In the past, there have been multiple people posting to this list wondering why their code wasn't working, and the reason is because when they organized imports they ended up with a JDO annotation instead of a JPA annotation.  Who knows how many people had this issue and simply gave up without asking for help.

This is something that we need to avoid.  One option is to support the JPA and JDO equivalents (@Transient, @Embedded, @Entity).  This would create a dependency on the JPA and JDO jars (yuck).

Another option is to divorce our annotations from the JPA/JDO equivalents.  @Entity is probably ok but @Embedded -> @Embed and @Transient -> @Ignore would help.

Jeff

Matthew Jaggard

Nov 15, 2011, 10:33:45 AM11/15/11
to objectify...@googlegroups.com
Option A has my vote. The only issue with @Ignore is that it doesn't specify what will ignore it - Another developer? Objectify? The user? Some other persistence system? Standard java serialisation?

Jeff Schnitzer

Nov 15, 2011, 10:37:44 AM11/15/11
to objectify...@googlegroups.com
On Tue, Nov 15, 2011 at 11:33 AM, Matthew Jaggard <mat...@jaggard.org.uk> wrote:
Option A has my vote. The only issue with @Ignore is that it doesn't specify what will ignore it - Another developer? Objectify? The user? Some other persistence system? Standard java serialisation?

Fair enough criticism, but it seems like @Transient is even worse because it mimics the 'transient' keyword.  The only advantage of @Transient is that it will be familiar to JPA folks (and previous Objectify users, and Morphia users).  But then we have the name collision problem.

Jeff

Chris

Nov 15, 2011, 2:00:35 PM11/15/11
to objectify-appengine
Option A.

As for Morphia -- is there any hope of convincing those folks to use
Option A in a later release?

I have this dream that Objectify's Interfaces could turn into a JDBC
like wrapper for many cloud databases (not just appengine).

We just need to get Morphia and others to adopt it... maybe I'm still
dreaming... but how cool would it be if you could almost copy and
paste your database access code and run that code on App Engine,
HBase, or MongoDB? As we're all aware every cloud database is very
unique, so fine tuning would still be required.. but you see where I'm
going...

- Chris

On Nov 15, 10:37 am, Jeff Schnitzer <j...@infohazard.org> wrote:

Ruslan V

Nov 15, 2011, 10:10:09 PM11/15/11
to Jeff Schnitzer
Dear Jeff,

I prefer A.



Tuesday, November 15, 2011, 7:10:51 AM, you wrote:


I just re-looked at Morphia's annotations and the one major difference is to use @Reference instead of @Fetch.  I have mixed feelings about that... the nice thing about @Fetch is that it parallels the ofy().find().fetch("groupName") method.  Although it could be ofy().find().reference("groupName").

I would certainly be ok with keeping the @Indexed @Embedded etc annotations and using @Reference instead of @Fetch.  Anyone want to vote on this?

Option A:
@Index
@Unindex
@Ignore (maybe instead of @Transient?)
@IgnoreSave (maybe instead of @NotSaved?)
@Fetch (combined with ofy().find().fetch("groupName").etc...)
@AlsoLoad
@Cache
@Serialize
@Embed

Option B:
@Indexed
@Unindexed
@Ignored (maybe instead of @Transient?)
@NotSaved
@Reference (combined with ofy().find().reference("groupName").etc...) (or maybe use @Fetch)
@AlsoLoad (forget @AlsoLoaded, that sucks)
@Cached
@Serialized
@Embedded


/Ruslan

jon

Nov 15, 2011, 10:15:19 PM11/15/11
to objectify-appengine
> 2) Perform worse for everyone, but never have heap issues (well, at least
> not by iterating).

I suggest Objectify 4 default to option 2 above. I like Objectify
because it doesn't depart too much from the underlying datastore. It
trains us developers to work within GAE's constraints and take
advantage of its strengths. Right now, GAE imposes a low memory limit
and expensive datastore operations. Memcache (as far as I know) is
largely free. Option 2 would train us to be frugal with the former and
use more of the latter.

I imagine an instance having to handle many concurrent requests would
quickly reach the memory limit if each request handler was too casual
with throwing stuff into memory. So it's not just "iterations" that
can cause heap issues.

In terms of annotation names my vote is A. And don't depend on JPA/JDO
please.

On Nov 16, 2:37 am, Jeff Schnitzer <j...@infohazard.org> wrote:

Jeff Schnitzer

Nov 16, 2011, 3:10:10 PM11/16/11
to objectify...@googlegroups.com
New proposed method signatures at:


The change is that find() is now load() and what was the fetch()/load() is now group().  I think this is a lot clearer:

class Thing {
  @Id Long id;
  @Load AlwaysStuff always;
  @Load("extra") BigStuff big;
}

... = ofy.load().group("extra").type(Thing.class).id(thingId);
... = ofy.load().group("extra").entities(thingKey1, thingKey2);
... = ofy.load().group("extra").filter("foo", foo).entities();

I'm not 100% certain this is the right wording, but it seems right since we use the word Load a lot - @AlsoLoad, @OnLoad, @Load, etc.  Consistent.  Opinions are welcome.

Another alternative is to get rid of the group() method entirely and simply have two overloads for load():

load()
load(String... groups)

...but while this would be slightly less typing, it starts to produce overload explosion if we offer other ways to define what to load in the future (like numeric activation levels).

Jeff

Chris

Nov 16, 2011, 3:44:46 PM11/16/11
to objectify-appengine
Like it.

Where is caching fitting in with this API?

Obviously the id and ids results are cached within the global cache.
Will anything else be in the global cache?

I remember talk earlier of every query being keys only and then using
batch gets... is this still in the pipeline or have new ideas emerged?

Jeff Schnitzer

Nov 16, 2011, 4:18:51 PM11/16/11
to objectify...@googlegroups.com
On Wed, Nov 16, 2011 at 4:44 PM, Chris <ritte...@gmail.com> wrote:
Like it.

Where is caching fitting in with this API?

Obviously the id and ids results are cached within the global cache.
Will anything else be in the global cache?

I remember talk earlier of every query being keys only and then using
batch gets... is this still in the pipeline or have new ideas emerged?

Still part of the plan, although I'm not sure if it will be 4.0 or 4.1.  I don't think it's hard to do, but I'm not sure what the API should be:

Option A:  Just always do this.  Worth looking at the real cost... after all, the query is effectively doing the same thing under the covers.  I suspect this is a bad idea.

Option B:  Require an explicit instruction on the Query interface.  Probably something like:  ofy.load().filter("foo", foo).hybrid().entities()

Option C:  Automatically "do the right thing"; if an explicit type() is specified check to see if the entity is cached, and if so, perform the hybrid query.  This is not exclusive with Option B.

Jeff

Jeff Schnitzer

Nov 16, 2011, 4:25:29 PM11/16/11
to objectify...@googlegroups.com
On Wed, Nov 16, 2011 at 5:18 PM, Jeff Schnitzer <je...@infohazard.org> wrote:

Option B:  Require an explicit instruction on the Query interface.  Probably something like:  ofy.load().filter("foo", foo).hybrid().entities()

BTW this also makes it easy to do something like:

ofy.load().filter("foo", foo).cache(600).entities();  // 600 seconds

...which can cache the query part (keysonly) and then perform the normal hybrid fetch through the memcache.
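
Roughly, the second half of a hybrid query (resolving the keys through the cache) could look like the sketch below. Plain maps stand in for memcache and the datastore, and HybridFetch and getAll() are made-up names, so treat this as an illustration of the idea rather than the planned implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plain maps stand in for memcache and the datastore.
class HybridFetch {
    final Map<Long, String> memcache = new HashMap<>();
    final Map<Long, String> datastore = new HashMap<>();
    int datastoreGets = 0;   // counts entities actually read from the datastore

    // Resolves the keys returned by a keys-only query.
    Map<Long, String> getAll(List<Long> keys) {
        // Collect the keys the cache cannot answer.
        List<Long> misses = new ArrayList<>();
        for (Long key : keys) {
            if (!memcache.containsKey(key)) misses.add(key);
        }

        // Stand-in for one batch get covering all misses; results are cached.
        for (Long key : misses) {
            String value = datastore.get(key);
            if (value != null) {
                datastoreGets++;
                memcache.put(key, value);
            }
        }

        // Assemble the result in the order the query returned the keys.
        Map<Long, String> result = new LinkedHashMap<>();
        for (Long key : keys) {
            String value = memcache.get(key);
            if (value != null) result.put(key, value);
        }
        return result;
    }
}
```

Whether this beats a normal query depends on the cache hit rate, which is why profiling it per query makes sense.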

Jeff

Chris

Nov 16, 2011, 9:46:12 PM11/16/11
to objectify-appengine
The cache method in the chain is a great idea...

I think this actually makes a lot of sense. We just need to make sure
that we encourage developers to use App Stats to profile when doing
this is a good idea.

On Nov 16, 4:25 pm, Jeff Schnitzer <j...@infohazard.org> wrote:

dilbert

Nov 17, 2011, 4:42:09 AM11/17/11
to objectify...@googlegroups.com
Jeff, would you consider a programmatic API for entity configuration as an alternative to annotations for version 4.0? Something along the lines described here: http://code.google.com/p/objectify-appengine/issues/detail?id=49#c4
It would eliminate the need to use objectify libraries (because of the annotations) in android projects.

d

Jeff Schnitzer

Nov 17, 2011, 9:16:38 AM11/17/11
to objectify...@googlegroups.com
Hmmmm... that seems like a lot of work for both me and for people who want to use that approach.  I would hate to have to maintain those mixins.  I have another thought.

The problem:  Objectify annotations on domain classes force inclusion of objectify annotation class files in serialization-based clients (android, hessian, etc).

Current solution:  Provide objectify annotation jar.

Alternative solution:  Post-process the domain classes that are included in your client jar to strip out all objectify annotations.  Between asm, bcel, and javassist, there must be a fairly simple solution.

I just did a little googling and I can see how to do it with javassist, basically the key is here: http://www.csg.is.titech.ac.jp/~chiba/javassist/html/javassist/bytecode/AnnotationsAttribute.html

Might be easier with asm though.  This would be a straightforward ant task.

Jeff

Jeff Schnitzer

Nov 17, 2011, 9:27:39 AM11/17/11
to objectify...@googlegroups.com
For some reason I thought we were distributing an annotation-only jar.  I just realized you have to build it yourself.  We'll put this in the standard distro.

Jeff

Jeff Schnitzer

Nov 17, 2011, 9:28:14 AM11/17/11
to objectify...@googlegroups.com
...and by "build it yourself" I mean run "ant client-jar".

Jeff

dilbert

Nov 17, 2011, 10:23:03 AM11/17/11
to objectify...@googlegroups.com
Ah, I should have been more clear. I meant the builder type configuration not the mixins solution.
To quote:
ObjectifyService.register( YourEntity.class ).withId( "vin" );

d

Jeff Schnitzer

Nov 17, 2011, 11:21:22 AM11/17/11
to objectify...@googlegroups.com
It's still a crazy amount of work for both of us, and forces you to manually keep the metamodel in sync with the actual class structure.  That could introduce a lot of quirky errors (including data-destroying errors) very quickly.

If the problem is just that you want to use domain classes on the client without pulling in Objectify annotations, why not just edit them out in a bytecode postprocessing step?

Jeff

John Patterson

Nov 17, 2011, 10:53:51 AM11/17/11
to objectify...@googlegroups.com
Twig has a programmatic API for configuration and it turned out to be a real shit for adding new features.  Basically every new feature required an annotation and then also a new method in the Configuration layer.  An annotation is a convenient holder for configuration data (like a bean) and to have to pass that data via method parameters is a bit awkward.

It would be a lot easier to allow a pluggable "annotation extractor" which passes all annotations to the core of Objectify.  The default impl just reads annotations from the class unaltered.  You could create a list of annotation instances programmatically (by implementing the interface) and pass them to Objectify also.

Then your data model could be free from annotations and no complicated configuration layer is needed.

But perhaps there are other workarounds that make all this unnecessary...



dilbert

Nov 17, 2011, 12:25:28 PM11/17/11
to objectify...@googlegroups.com
Hello John.
First of all, congratulations on joining the Objectify team. I didn't quite understand your solution to the problem. Could you elaborate a bit? Perhaps a small code example would help me understand.
Thanks,
d

John Patterson

Nov 18, 2011, 8:10:37 AM11/18/11
to objectify...@googlegroups.com
Hi Guys, just some questions and points from reading through the current design.  I love the direction and the new innovation with features like fetch groups which should really help performance.

I am very used to doing similar things the Twig way - so just to throw the cat among the pigeons and hopefully create a bit of debate before the API gets too set in stone....

Using the word "entity" in the API makes me think of the datastore Entity class.  I prefer to use "persistent instance" or "instance" in the API to remove any confusion with an Entity stored in the datastore.  "Instances" are Java objects with fields and Entities are bags of properties or a representation in Big Table.  So:
ofy.load().entity(thingKey)
seems funny and also not consistent with this (which I like):
ofy.load().type(Thing.class).id(123L)
Perhaps the first should read
ofy.load().key(thingKey);
to be more consistent.  The other fluent methods are naming what is passed in or what state they are changing

The word "entity" is also doing double duty when used as a terminator to the chain:
Iterable<Thing> ths = ofy.load().group("group").type(Thing.class).filter("foo", foo).entities();
I would prefer ".instances()" or something other than "entities()"

I notice that the return type from queries is Iterable<T> which is a good default.  I assume it is backed by the "live" QueryResultIterator so results are fetched in batches.  Also, there is first() which is great.

In Twig, I also frequently use .unique() which throws an exception if there is more than one result (very useful sanity check!) and .all() which loads every result in a single datastore API call by setting the chunk size and fetch size to MAX_VALUE.  Very useful when you know you are not dealing with thousands of results.

all() returns a List rather than an Iterable because it has the complete result set in memory.


More to come... :)



John Patterson

Nov 18, 2011, 8:17:22 AM11/18/11
to objectify...@googlegroups.com
Cheers.  Perhaps Jeff's other suggestions would make this unnecessary but I was thinking the out-of-the-box behaviour could be like this:

class DefaultAnnotationReader implements AnnotationReader
{
    @Override
    public Annotation[] annotations(Method method)
    {
        return method.getDeclaredAnnotations();
    }
   
    @Override
    public Annotation[] annotations(Class<?> clazz) { ... }
   
    @Override
    public Annotation[] annotations(Field field) { ... }
}

then you could replace this with your own impl that had logic like
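if (clazz.equals(MyClass.class))
{
    // create annotations programmatically; the method returns an array
    return new Annotation[] {
        new SomeObjectifyAnnotation()
        {
            public int someMethod()
            {
                return 5;
            }

            // every hand-written annotation instance must implement this
            public Class<? extends Annotation> annotationType()
            {
                return SomeObjectifyAnnotation.class;
            }
        }
    };
}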

Actually there are probably easier ways to create annotations by defining them directly like

@SomeObjectifyAnnotation(first="hi", second="there")
class MyAnnotationReader extends DefaultAnnotationReader
{
    @Override
    public Annotation[] annotations(Class<?> clazz)
    {
        // getName() returns the fully qualified class name
        if (clazz.getName().equals("SomeClass"))
        {
            // lend this reader's own annotations to SomeClass
            return this.getClass().getAnnotations();
        }
        else return super.annotations(clazz);
    }
}
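
A fuller, self-contained version of that second idea, in case it helps. Everything here (SketchCached, SketchAnnotationReader, PlainModel and so on) is a made-up name for illustration; Objectify has no such interface today, and only the class-level method is shown.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Hypothetical annotation standing in for one of Objectify's.
@Retention(RetentionPolicy.RUNTIME)
@interface SketchCached {
    int seconds() default 0;
}

// Hypothetical extension point; only the class-level method is shown.
interface SketchAnnotationReader {
    Annotation[] annotations(Class<?> clazz);
}

// Default: hand back whatever is declared on the class itself.
class ReflectiveReader implements SketchAnnotationReader {
    @Override
    public Annotation[] annotations(Class<?> clazz) {
        return clazz.getDeclaredAnnotations();
    }
}

// A model class with no persistence annotations at all.
class PlainModel {
    String name;
}

// Carrier class: its annotations are "lent" to PlainModel.
@SketchCached(seconds = 600)
class PlainModelConfig {
}

class LendingReader extends ReflectiveReader {
    @Override
    public Annotation[] annotations(Class<?> clazz) {
        if (clazz == PlainModel.class) {
            return PlainModelConfig.class.getDeclaredAnnotations();
        }
        return super.annotations(clazz);
    }
}
```

The model class stays free of framework imports, and the configuration still lives in ordinary annotations, just on a different class.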

John Patterson

Nov 18, 2011, 8:32:28 AM11/18/11
to objectify...@googlegroups.com
On 14/11/2011 16:01, Matthew Jaggard wrote:
1. using ofy.find() for gets doesn't make it clear when you're doing a
get and when you're doing a query - which is quite important to know
because of the underlying datastore implementation speeds / cost.

I also like the difference to be clear and worry about the correct choice being made by the framework.

I don't understand yet, though, how this choice would be made.  Still, it really seems like something that shouldn't be hidden from the user?  Open to being convinced though.

2. If you're coding in your IDE and think I'd like to do z, x and y -
is there anything to stop you getting the methods in the wrong order?
Is there a wrong order? For example can I do
ofy.find().id(123L).type(Thing.class);
instead of
ofy.find().type(Thing.class).id(123L);
or
ofy.now().put().entities(e1, e2, e3);
instead of
ofy.put().entities(e1, e2, e3).now();

and how do I know which re-orderings are valid and which are not?

3. What are get() and getSafe() and how are they different? Does get()
return a runtime exception instead of a checked exception for "not
found"? Is NotFoundException a checked exception now?
Return null when an instance is not found?  To me it doesn't really seem like an "exceptional" circumstance if you check for an instance and it is not there.

I really like that properties will be unindexed by default 

Yes this was also a big mistake in Twig.  Unindexed is the only way to remind you to do things right from the start.
and session
cache being enabled by default. 

I'm still not really sure how direct references can work without the session cache being enabled.  How would circular references work? 

class Band
{
    List<Musician> members;
}

class Musician
{
    List<Band> bands;
}

without a session cache this circular reference would keep creating new instances?

I don't think it should even be an option to disable the cache.  Is it only memory requirements that make this an issue?  In Twig there is .disassociate(Object) which removes an instance from the cache if you are reading a lot of data and mem might be an issue.  There was something similar in Hibernate if I remember correctly.

Also, the cached instances are held by SoftReferences so if you do not keep a reference to the instance in your program then it can be automatically removed from the cache if memory is low.  Personally, with caching working like this, I have never seen a problem from memory usage.
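
To make the circular-reference point concrete, here is a minimal sketch of a session cache that holds instances through SoftReferences. The class names and the hard-coded keys are assumptions for illustration, not Twig or Objectify code.

```java
import java.lang.ref.SoftReference;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical session cache: at most one live instance per key.
class SketchSessionCache {
    private final Map<String, SoftReference<Object>> byKey = new HashMap<>();

    // Returns the cached instance, or null if absent or already collected.
    Object get(String key) {
        SoftReference<Object> ref = byKey.get(key);
        return ref == null ? null : ref.get();
    }

    void put(String key, Object instance) {
        byKey.put(key, new SoftReference<>(instance));
    }
}

class SketchBand {
    final List<SketchMusician> members = new ArrayList<>();
}

class SketchMusician {
    final List<SketchBand> bands = new ArrayList<>();
}

// Pretend loader: one band and one musician that reference each other.
class SketchLoader {
    final SketchSessionCache cache = new SketchSessionCache();

    SketchBand loadBand(String key) {
        SketchBand band = (SketchBand) cache.get(key);
        if (band != null) return band;      // already in the session: reuse it
        band = new SketchBand();
        cache.put(key, band);               // register before following references
        band.members.add(loadMusician("musician:1"));
        return band;
    }

    SketchMusician loadMusician(String key) {
        SketchMusician musician = (SketchMusician) cache.get(key);
        if (musician != null) return musician;
        musician = new SketchMusician();
        cache.put(key, musician);
        musician.bands.add(loadBand("band:1"));
        return musician;
    }
}
```

Registering the instance before following its references is what stops the recursion: the musician's band resolves to the band that is still being loaded, so the graph closes on itself instead of growing forever.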


John Patterson

Nov 18, 2011, 9:13:30 AM11/18/11
to objectify...@googlegroups.com
On 18/11/2011 20:10, John Patterson wrote:
  .unique() which throws an exception if there is more than one result (very useful sanity check!) and .all()
I'm slowly getting my head around using the intrinsic interfaces like List, Map, Iterable to hide the async stuff.  So the chain terminators should be able to return different collection types or Ref.

.entities() or .instances() is not consistent with .all() or .unique() or .first()

I was thinking perhaps it might be even more explicit to have terminators: .list() .iterable() .ref()

Thoughts?  Is that what Objectify 3 uses?  Sorry if I'm playing catch-up here!

BTW, I might be alone on this one but... why Ref instead of Reference?  It doesn't really save typing with auto-complete.  I've read similar discussions between Ceylon and Scala advocates and notice that Scala gets a bit of grief about having short forms of keywords that some people just don't like to look at.  I'm generally in that group - there is not much logic to it!  I just like to read full, complete words :)  At the end of the day it's no big deal.



John Patterson

Nov 18, 2011, 9:51:21 AM11/18/11
to objectify...@googlegroups.com
On 15/11/2011 07:38, Jeff Schnitzer wrote:
Foo foo =  ofy.find().type(Foo.class).id(123).get();
vs
Foo foo =  ofy.get(Foo.class, 123);

But it makes all the other cases simpler, dramatically reduces the method count on Objectify, and stops me from having to maintain parallel methods on Objectify and AsyncObjectify.  Truth be told, you could easily add the above get(Class, long) method to a wrapper for your own convenience.

Twig has such a "short cut" method .load(Class<?>, Object id) and I use it often.  Also, its counterparts .store(Object, String) and .store(Object, long).

Dirty change detection is interesting but that is also a massive step with major API implications.  The problem with dirty change detection is that it requires an explicit boundary - you have to open a session and close it, at which point changes are synced.  This ends up being a try/finally pattern which gets ugly.

On the other hand, there are definitely times I have wanted dirty change detection, and end up faking it with methods like this:

boolean changed = false;
changed = changed |= obj.setThing1(foo);
changed = changed |= obj.setThing2(bar);
if (changed)
    ofy.put(obj);

It's not elegant.  I'm not sure what the answer is but maybe sometime in the future we will come up with a way to address this that doesn't require everyone to define try/finally boundaries.  Objectify 5.0.

It could be good to store this "dirty" state in the Session Cache.  Also, it's good to store activation state for each instance.  Before Twig did this, it was very easy to accidentally put an unactivated instance in the datastore.  After introducing activation state tracking, this became impossible (exception thrown) and also you would not activate the same instance more than once.


Also, a hook in entity creation to enable injection would be a welcome
feature! A lot of my entities have instance methods with logic, and
need to be injected with services at times.

This is an exceptionally good idea.  I will delegate object creation to an overridable method on ObjectifyFactory so you can delegate to Guice or whatnot.

I've also used this override to create Guice interceptors that do auto-activation.  So as soon as you call certain methods on a model that has not been activated, it goes and gets the data for you.

John Patterson

Nov 18, 2011, 9:59:45 AM11/18/11
to objectify...@googlegroups.com
On 18/11/2011 21:51, John Patterson wrote:
Also, a hook in entity creation to enable injection would be a welcome
feature! A lot of my entities have instance methods with logic, and
need to be injected with services at times.

This is an exceptionally good idea.  I will delegate object creation to an overridable method on ObjectifyFactory so you can delegate to Guice or whatnot.

I've also used this override to create Guice interceptors that do auto-activation.  So as soon as you call certain methods on a model that has not been activated, it goes and gets the data for you.

For auto-activation it is also required to track activation state.  There is a method ObjectDatastore.isActivated() which is used by the proxy to check if it needs to activate or not.

I considered storing state in the instance itself (dirty state, activation state) using cglib or AspectJ but that would mess with serialization.

So the SessionCache ends up needing a bi-directional mapping - Key to Instance and Instance to Key (actually an Object holding Key and other state).
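
Something like the sketch below, where String stands in for the datastore Key and all class and method names are made up for illustration. The instance-to-state side uses an identity map so that state is tied to one exact object, regardless of how the model class implements equals().

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Per-instance bookkeeping kept outside the instance itself.
class InstanceState {
    final String key;
    boolean activated;
    boolean dirty;

    InstanceState(String key, boolean activated) {
        this.key = key;
        this.activated = activated;
    }
}

class BiDirectionalCache {
    private final Map<String, Object> keyToInstance = new HashMap<>();
    // Identity-based: state belongs to one exact object, not to equal ones.
    private final Map<Object, InstanceState> instanceToState = new IdentityHashMap<>();

    void associate(String key, Object instance, boolean activated) {
        keyToInstance.put(key, instance);
        instanceToState.put(instance, new InstanceState(key, activated));
    }

    Object instance(String key) {
        return keyToInstance.get(key);
    }

    String keyOf(Object instance) {
        InstanceState state = instanceToState.get(instance);
        return state == null ? null : state.key;
    }

    boolean isActivated(Object instance) {
        InstanceState state = instanceToState.get(instance);
        return state != null && state.activated;
    }

    void markActivated(Object instance) {
        InstanceState state = instanceToState.get(instance);
        if (state == null) throw new IllegalArgumentException("Instance is not associated");
        state.activated = true;
    }
}
```

Because the state lives in the cache and not in the model object, serialization of the model is unaffected.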

I'm making a lot of assumptions here about Objectify internals that I really don't know!

John Patterson

Nov 18, 2011, 10:11:08 AM11/18/11
to objectify...@googlegroups.com
On 17/11/2011 04:18, Jeff Schnitzer wrote:
>
> Option A: Just always do this. Worth looking at the real cost...
> after all, the query is effectively doing the same thing under the
> covers. I suspect this is a bad idea.
>
> Option B: Require an explicit instruction on the Query interface.
> Probably something like: ofy.load().filter("foo",
> foo).hybrid().entities()
>
> Option C: Automatically "do the right thing"; if an explicit type()
> is specified check to see if the entity is cached, and if so, perform
> the hybrid query. This is not exclusive with Option B.

Option D: leave it to the developer?
ofy.load().keys(ofy.find().filter("foo", foo).keys())

It is explicit and removes magic

John Patterson

Nov 18, 2011, 10:15:55 AM11/18/11
to objectify...@googlegroups.com
On 17/11/2011 04:18, Jeff Schnitzer wrote:
>
> Option A: Just always do this. Worth looking at the real cost...
> after all, the query is effectively doing the same thing under the
> covers. I suspect this is a bad idea.
>
> Option B: Require an explicit instruction on the Query interface.
> Probably something like: ofy.load().filter("foo",
> foo).hybrid().entities()
>
> Option C: Automatically "do the right thing"; if an explicit type()
> is specified check to see if the entity is cached, and if so, perform
> the hybrid query. This is not exclusive with Option B.

Option D: leave it to the developer?

Jeff Schnitzer

Nov 18, 2011, 10:59:30 AM11/18/11
to objectify...@googlegroups.com
On Fri, Nov 18, 2011 at 9:10 AM, John Patterson <jdpat...@gmail.com> wrote:
Hi Guys, just some questions and points from reading through the current design.  I love the direction and the new innovation with features like fetch groups which should really help performance.

I am very used to doing similar things the Twig way - so just to throw the cat among the pigeons and hopefully create a bit of debate before the API gets too set in stone....

This is excellent, it is definitely not set in stone.  More discussion please.

 
Using the word "entity" in the API makes me think of the datastore Entity class.  I prefer to use "persistent instance" or "instance" in the API to remove any confusion with an Entity stored in the datastore.  "Instances" are Java objects with fields and Entities are bags of properties or a representation in Big Table.  So:
ofy.load().entity(thingKey)
seems funny and also not consistent with this (which I like):
ofy.load().type(Thing.class).id(123L)
Perhaps the first should read
ofy.load().key(thingKey);

I fear that "persistent instance" is very JDO-ish, and "instance" is totally ambiguous (instance of what?).  For better or worse, we've used the term Entity everywhere in the documentation already and there is a direct 1-to-1 correspondence between an Objectify entity and a low-level entity.  Plus, I suspect most people who come to Objectify aren't intimately familiar with the low-level api and don't want to be.  So I'm not sure it makes sense to have a linguistic distinction between the kinds of entities any more than there is between the kinds of keys.

As far as ofy().load().entity(key) vs ofy().load().key(key), you're right this is a bit awkward.  In fact, a previous iteration of API proposal had methods like this (showing return and param values):

Ref ofy.load().key(Key<?>);
Map ofy.load().keys(Iterator<Key>);
Map ofy.load().keys(Key<?>...);

Ref ofy.load().rawKey(Key);
Map ofy.load().rawKeys(Iterator<Key>);
Map ofy.load().rawKeys(Key...);

Ref ofy.load().entity(Object);
Map ofy.load().entities(Iterator<Object>); 
Map ofy.load().entities(Object...); 

Map ofy.load().values(Iterator<Object>);  // can be heterogenous
Map ofy.load().values(Object...);

I am not fond of this:

1) It's a lot of methods, and there's a lot of redundancy.  For example, entities() and values() are basically the same method - you can pass in anything.  Is it really necessary to have separate keys() vs rawKeys()?  It turns out all cases are handled by having methods entity()/entities() (or value()/values()).

2) There's an inconsistency with the way queries work.  The query mechanism has methods keys() which means "return Key<?> objects" and entities() which means "return entity objects".  Whereas this API has method keys(...) which means "pass in a key" and entities() which means "pass in entity objects".

Of course, this is incongruent with type().id() as well.  And maybe the query API is the part that should be "fixed".  And maybe it is.  But it's hard to see the value of having 11 methods when 3 will suffice with the exact same syntactic structure.

Maybe the answer is that instead of condensing to entity()/entities(), we should condense to key()/keys() (which takes Object as a parameter so you can pass in an entity).  It's a little odd to pass in an entity as a key but that's what we're proposing anyways - entities are key-like structures now.

Maybe this is what you were proposing from the beginning and I'm just being retarded.  In which case, yeah, I think this is a better idea :-)  I'll change the proposal to key()/keys() for now.

to be more consistent.  The other fluent methods are naming what is passed in or what state they are changing

The word "entity" is also doing double duty when used as a terminator to the chain:
Iterable<Thing> ths = ofy.load().group("group").type(Thing.class).filter("foo", foo).entities();
I would prefer ".instances()" or something other than "entities()"

Or we could just drop it and stick with iterable().  It's present already:

Iterable<Thing> ths = ofy.load().group("group").type(Thing.class).filter("foo", foo).iterable();

Of course this makes it easy to do stuff like this, which I find I use now and then:

for (Thing th: ofy.load().type(Thing.class).filter("foo", foo)) {
   ...
}

More comments about the query API in a later response.
 
I notice that the return type from queries is Iterable<T> which is a good default.  I assume it is backed by the "live" QueryResultIterator so results are fetched in batches.  Also, there is first() which is great.

The actual return value is QueryResultIterable and QueryResultIterator so you can get the cursor().  I was just being a lazy typist in the design doc.  The current java code is correct in this regard.
 
In Twig, I also frequently use .unique() which throws an exception if there is more than one result (very useful sanity check!) and .all() which loads every result in a single datastore API call by setting the chunk size and fetch size to MAX_VALUE.  Very useful when you know you are not dealing with thousands of results.

Sure, we can definitely add .unique(), which adds the cost of an extra fetch (limit(2) instead of limit(1)).
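To make the limit(1) versus limit(2) difference concrete, here is a minimal sketch over a plain Iterator. The Terminators class and its method names are hypothetical, not part of Objectify or Twig; the point is only that first() needs one result while unique() has to look at a second one to detect duplicates.

```java
import java.util.Iterator;

/** Hypothetical helpers; not part of Objectify or Twig. */
final class Terminators {
    private Terminators() {}

    /** Reads at most one result: the equivalent of limit(1). */
    static <T> T first(Iterator<T> results) {
        return results.hasNext() ? results.next() : null;
    }

    /** Reads at most two results: the equivalent of limit(2). */
    static <T> T unique(Iterator<T> results) {
        if (!results.hasNext()) {
            return null;
        }
        T only = results.next();
        if (results.hasNext()) {
            // A second result exists, so the "unique" assumption is violated.
            throw new IllegalStateException("Expected at most one result");
        }
        return only;
    }
}
```

Whether a missing result should be null or an exception is a separate choice; this sketch returns null in both helpers.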

The all() method is analogous to list().  I don't have a strong opinion on the naming convention.
 
all() returns a List rather than an Iterable because it has the complete result set in memory.

There's actually some interesting behavior in the low-level api's List (which we don't currently use).  It will actually cursor through the results.  I don't think this is very good behavior because it tends to be slower than fetching the list all at once unless you're doing expensive operations at each iteration - in fact, you can speed up the app by calling size() on the list before iterating.

I haven't considered whether this is worth taking advantage of.  I suspect not.

Jeff

Jeff Schnitzer

Nov 18, 2011, 11:05:02 AM11/18/11
to objectify...@googlegroups.com
This is a good idea, with one caveat.  It needs to be a static so that Key<?> can construct itself without a reference to a factory (or whatever would hold the annotationreader).  Since Key<?> is just a holder for native Key, we need to be able to map from Class to kind when someone does this:

Key.create(Thing.class, id)

This requires static access to the @Entity annotation on Thing.  Which means the annotationreader would have to be static as well.  If that design constraint is ok, I can start weaving it in now.
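A rough sketch of the constraint, assuming a stand-in annotation called EntityKind (hypothetical; the real annotation is @Entity) and a simplified TypedKey in place of Key<?>. The lookup has to be reachable from a static factory method, which is why the annotation reader cannot hang off a factory instance.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical stand-in for the @Entity annotation. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface EntityKind {
    /** Explicit kind name; empty means "use the simple class name". */
    String name() default "";
}

/** Hypothetical static lookup from class to datastore kind. */
final class Kinds {
    private Kinds() {}

    static String kindOf(Class<?> type) {
        EntityKind ann = type.getAnnotation(EntityKind.class);
        if (ann == null) {
            throw new IllegalArgumentException(type.getName() + " is not an entity");
        }
        return ann.name().isEmpty() ? type.getSimpleName() : ann.name();
    }
}

/** Simplified typed key; the real Key<?> wraps a native datastore Key. */
final class TypedKey<T> {
    final String kind;
    final long id;

    private TypedKey(String kind, long id) {
        this.kind = kind;
        this.id = id;
    }

    /** Static factory: no reference to a factory object is available here. */
    static <T> TypedKey<T> create(Class<T> type, long id) {
        return new TypedKey<T>(Kinds.kindOf(type), id);
    }
}

@EntityKind
class Thing {}

@EntityKind(name = "Gadget")
class RenamedThing {}
```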

Jeff

Jeff Schnitzer

Nov 18, 2011, 12:10:23 PM11/18/11
to objectify...@googlegroups.com
On Fri, Nov 18, 2011 at 9:32 AM, John Patterson <jdpat...@gmail.com> wrote:
On 14/11/2011 16:01, Matthew Jaggard wrote:
1. using ofy.find() for gets doesn't make it clear when you're doing a
get and when you're doing a query - which is quite important to know
because of the underlying datastore implementation speeds / cost.

I also like the difference to be clear and worry about the correct choice being made by the framework.

I don't understand yet, though, how this choice would be made.  Still, it really seems like something that shouldn't be hidden from the user?  Open to being convinced though.

Well, some things you don't have a choice about.  For example, if you have @Load fields then your query is going to be followed by a batch get no matter what.  And some optimizations are obvious:  If you are querying on an entity with a @Cache annotation, convert to keysonly and batch fetch.

Also, consider a hypothetical future where we provide "unowned relationships".  Right now relationship graphs are defined by real keys in objects; if ThingA has field List<ThingB>, then this field corresponds to a real List<Key> in the datastore.  But what if we wanted to do something like this?

class ThingA {
  @Id Long id;
  @QueryFor("thingA") List<ThingB> thingBs;
}
class ThingB {
  @Id Long id;
  ThingA thingA;
}

In this case, what we want is ThingB to hold a Key to ThingA but not the other way 'round.  When loading ThingA, the thingBs field gets populated with a proxy that will query for the contents.
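One way such a proxy could behave, sketched with plain JDK types (Java 8+ assumed for Supplier). LazyQueryList is a hypothetical name, and the Supplier stands in for whatever query the framework would issue; nothing here is actual Objectify code.

```java
import java.util.AbstractList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical proxy: runs the backing query the first time the list is read. */
final class LazyQueryList<T> extends AbstractList<T> {
    private final Supplier<List<T>> query;
    private List<T> results;  // null until first access

    LazyQueryList(Supplier<List<T>> query) {
        this.query = query;
    }

    private List<T> results() {
        if (results == null) {
            results = query.get();  // the "query" happens here, once
        }
        return results;
    }

    /** True once the backing query has been executed. */
    boolean isLoaded() {
        return results != null;
    }

    @Override
    public T get(int index) {
        return results().get(index);
    }

    @Override
    public int size() {
        return results().size();
    }
}
```

Loading ThingA would then assign something like this to thingBs, so the cost of the query is only paid if the field is actually read.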

So all-in-all we have a world in which queries might involve batch gets and batch gets might involve queries.  It all mixes together so it doesn't seem right to try to put up a pretend wall between these modes.  Yes, the design of your data model and the way you issue requests can have an effect on cost & performance... but this is always the case.

3. What are get() and getSafe() and how are they different? Does get()
return a runtime exception instead of a checked exception for "not
found"? Is NotFoundException a checked exception now?
Return null when an instance is not found?  To me it doesn't really seem like an "exceptional" circumstance if you check for an instance and it is not there.

This mirrors the get() throws EntityNotFoundException in the low level api.  It's a facility I tend to use a *lot* in my code.  For example, let's say you have an API method:

/** This is a JAX-RS method, but could be any business API method (hessian, gwt-rpc, etc) */
@GET
@Path("/thing/{thingId}")
Thing getThing(@PathParam("thingId") long id) {
   Thing th = ofy.load().type(Thing.class).id(id).getSafe();
   checkForPermissionToAccess(th);
   return th;
}

This throws a NotFoundException rather than a NPE.  Client code can catch the NotFoundException and interpret it meaningfully.  In the JAX-RS case, an ExceptionMapper<NotFoundException> can display a pretty error message.

The NotFoundException really scratches an itch for me.
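For readers comparing the two styles, a minimal sketch of the get()/getSafe() split. SimpleRef is a hypothetical simplification, and the NotFoundException declared here is a local stand-in assumed to be unchecked; the real Ref type does more than this.

```java
/** Local stand-in for NotFoundException, assumed here to be unchecked. */
class NotFoundException extends RuntimeException {
    NotFoundException(String message) {
        super(message);
    }
}

/** Hypothetical, simplified reference to a possibly-missing entity. */
final class SimpleRef<T> {
    private final String key;
    private final T value;  // null when the entity does not exist

    SimpleRef(String key, T value) {
        this.key = key;
        this.value = value;
    }

    /** Returns the entity, or null if it was not found. */
    T get() {
        return value;
    }

    /** Returns the entity, or throws NotFoundException if it was not found. */
    T getSafe() {
        if (value == null) {
            throw new NotFoundException("No entity found for " + key);
        }
        return value;
    }
}
```

Callers that treat a missing entity as normal use get() and check for null; callers that treat it as an error use getSafe() and let the exception propagate to a handler.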

and session
cache being enabled by default. 

I'm still not really sure how direct references can work without the session cache being enabled.  How would circular references work? 

class Band
{
    List<Musician> members;
}

class Musician
{
    List<Band> bands;
}

without a session cache this circular reference would keep creating new instances?

Let's say session cache is turned off.  Different load() operations will return completely separate instances, however, within a single load() operation circular references will be maintained.  There is a "session" but it lives for the duration of the request and then gets thrown away.
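A sketch of that per-operation session, using an in-memory map as a stand-in datastore. Node and LoadSession are hypothetical names. The key detail is registering the instance before following its references, which is what keeps a Band/Musician cycle from recursing forever.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical entity with references to other entities. */
final class Node {
    final String key;
    final List<Node> refs = new ArrayList<Node>();

    Node(String key) {
        this.key = key;
    }
}

/** One instance per load() operation: the short-lived "session". */
final class LoadSession {
    private final Map<String, List<String>> datastore;  // key -> referenced keys
    private final Map<String, Node> loaded = new HashMap<String, Node>();

    LoadSession(Map<String, List<String>> datastore) {
        this.datastore = datastore;
    }

    Node load(String key) {
        Node existing = loaded.get(key);
        if (existing != null) {
            return existing;  // already materialized in this session
        }
        Node node = new Node(key);
        loaded.put(key, node);  // register before following references, to break cycles
        for (String refKey : datastore.get(key)) {
            node.refs.add(load(refKey));
        }
        return node;
    }
}
```

Two separate LoadSession objects return distinct instances for the same key, which matches the "session cache turned off" behavior described above.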

This does bring up the question of what to do when iterating through large datasets (eg ofy.load().iterable()), because that takes place in a single load().  In this case, each chunk would be loaded with a separate "session".  This is how it needs to work for batch fetching anyways; I presume twig does this - grab chunkSize entities from the query, resolve references in a batch fetch, then grab the next chunkSize entities from the query.
 
I don't think it should even be an option to disable the cache.  Is it only memory requirements that make this an issue?  In Twig there is .disassociate(Object) which removes an instance from the cache if you are reading a lot of data and mem might be an issue.  There was something similar in Hibernate if I remember correctly.

JPA's EntityManager has a clear() method.  But Hibernate also has a StatelessSession which is the equivalent of calling clear() after every operation.  Hibernate has an evict() method (equivalent to disassociate()) but JPA doesn't.  I'd love to try living without evict().

I agree though, I think the session cache is important and should be enabled by default.  Aside from efficiency gains, some people will probably be confused that loading the same entity doesn't give the same instance, especially when doing queries in chunks.

One caveat - starting a transaction must start a new session cache.  Otherwise it'll blow up your transaction isolation.

Also, the cached instances are held by SoftReferences so if you do not keep a reference to the instance in your program then it can be automatically removed from the cache if memory is low.  Personally, with caching working like this, I have never seen a problem from memory usage.

This is a good idea, and will probably prevent major catastrophe.  I think this seals the deal for enabling session cache by default.  But it's not a good thing to rely on when iterating through large datasets.  Sure, the garbage collector will prune those soft references eventually but in the meantime all those entities are going to be in older heap generations and you're going to need full-stop gc's.  Best to explicitly turn off the session cache.  But this is an optional optimization.
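For reference, a soft-referenced cache can be sketched in a few lines with java.lang.ref.SoftReference. SoftSessionCache is a hypothetical name and this is not how Twig or Objectify actually implement it; it only shows the shape of the idea, including an explicit evict() for the large-dataset case.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical session cache whose values can be reclaimed under memory pressure. */
final class SoftSessionCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<K, SoftReference<V>>();

    void put(K key, V value) {
        entries.put(key, new SoftReference<V>(value));
    }

    /** Returns the cached instance, or null if absent or already collected. */
    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            entries.remove(key);  // drop the stale entry
        }
        return value;
    }

    /** Explicitly removes one instance, similar in spirit to evict()/disassociate(). */
    void evict(K key) {
        entries.remove(key);
    }
}
```

As long as the application holds a strong reference to an entity, the soft reference cannot be cleared, so the same instance keeps coming back from the cache.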

Jeff

Jeff Schnitzer

Nov 18, 2011, 1:14:43 PM11/18/11
to objectify...@googlegroups.com
On Fri, Nov 18, 2011 at 10:13 AM, John Patterson <jdpat...@gmail.com> wrote:
I'm slowly getting my head around using the intrinsic interfaces like List, Map, Iterable to hide the async stuff.  So the chain terminators should be able to return different collection types or Ref.  

.entities or .instances() is not consistent with .all() or .unique() or .first()

I was thinking perhaps it might be even more explicit to have terminators: .list() .iterable() .ref()

Sure, although ref() doesn't distinguish the semantic difference between first() and unique().

Also - need better terminology for "unactivated entity" since I think we want to stick with @Load instead of @Activate.  "unfetched entity"?  "partial entity"?  "empty entity"? "unloaded entity"?  That last one makes the most sense if we are using the word load everywhere, but it's a little awkward.  "Partial" kinda makes sense, especially in a world with @Denormalize.  I'll use it here just for the hell of it.

Here are the cases that make the query API hard:

1) How to fetch just the Key<?>s of the entities
2) How to fetch just the key parts of a whole entity (ie, "partial entities")
3) How to fetch just the Key<?> of one entity (first vs unique)
4) How to fetch just the key parts of one entity (ie, a "partial entity", first vs unique)
5) How to fetch just one entity (first vs unique)

The current proposal (which I am not in love with either):

1) QueryResultIterable<Key<?>> things = ofy.load().keys();  // implicitly keysOnly().keys()
2) QueryResultIterable<Object> things = ofy.load().keysOnly().iterable();
3) Key<Object> thingKey = ofy.load().keysOnly().first().key();  // or unique().key();
4) Object thing = ofy.load().keysOnly().first().get();  // or unique().get();
5) Object thing = ofy.load().first().get();  // or unique().get();

Some things that can be done to make this simpler:

 * Eliminate unique(), keep first() since first is cheaper.  Forces users to perform unique checking themselves if they want it.  Personally, I make sure that unique things are actually stored uniquely - I don't worry about detecting illegal duplicates because they can't happen.
    - This allows us to use ref() instead of first() but honestly I think first() is more expressive.

 * Don't make it possible to query for partial entities.  Seems like this is missing some utility.

Here's an alternative:

1) QueryResultIterable<Key<?>> things = ofy.load().keys().iterableKeys();
2) QueryResultIterable<Object> things = ofy.load().keys().iterable();
3) Key<Object> thingKey = ofy.load().keys().first().key();  // or unique().key();
4) Object thing = ofy.load().keys().first().get();  // or unique().get();
5) Object thing = ofy.load().first().get();  // or unique().get();

Or maybe invert the first two:

1) QueryResultIterable<Key<?>> things = ofy.load().keys().iterable();
2) QueryResultIterable<Object> things = ofy.load().keys().iterableEntities(); // iterablePartials()?  entityIterable()?  entities()?
3) Key<Object> thingKey = ofy.load().keys().first().key();  // or unique().key();
4) Object thing = ofy.load().keys().first().get();  // or unique().get();
5) Object thing = ofy.load().first().get();  // or unique().get();

Thoughts?  Is that what Objectify 3 uses?  Sorry if I'm playing catch-up here!

FWIW, Objectify3 uses fetch()/iterable() (equivalent), fetchKeys(), list(), and listKeys().  I don't think this is a great approach, especially since there is now the question of how to return partial entities:

 
BTW, I might be alone on this one but... why Ref instead of Reference?  It doesn't really save typing with auto-complete.  I've read similar discussions between Ceylon and Scala advocates and notice that Scala gets a bit of grief about having short forms of keywords that some people just don't like to look at.  I'm generally in that group - there is not much logic to it!  I just like to read full, complete words :)  At the end of the day it's no big deal.

I tend to dislike abbreviations too but for things I type a lot, I like shorter names as long as they are not ambiguous.  Ref is almost a word by itself - people talk about refs, ref counting, etc in programming context all the time.  Also, a three-letter Ref<?> is very similar to Key<?>, which is good because they are closely related.

I've had many arguments with Gavin about wordiness from the days before Ceylon had a name.  Generally I like the verbose format but I've tried to convince him to shorten Integer to Int and value to val, just because they get typed so much they are special cases.  So far he remains unconvinced.


I tend to agree that one of the big problems with Java is verbosity.  Too much getThis() and setThat() and catch (SomeIdioticCheckedException idontcareabout).  I really enjoy working with python and javascript, but I've found that projects are too hard to scale up without static typing.  I'm hoping Ceylon will be a sweet spot, even if I have to type Integer a lot.

Jeff

Jeff Schnitzer

Nov 18, 2011, 2:19:36 PM11/18/11
to objectify...@googlegroups.com
On Fri, Nov 18, 2011 at 2:14 PM, Jeff Schnitzer <je...@infohazard.org> wrote:

I tend to dislike abbreviations too but for things I type a lot, I like shorter names as long as they are not ambiguous.  Ref is almost a word by itself - people talk about refs, ref counting, etc in programming context all the time.  Also, a three-letter Ref<?> is very similar to Key<?>, which is good because they are closely related.

Another reason to like Ref instead of Reference:  try command-shift-t in eclipse, and type both of them in.  Sooo...many...References...

Jeff

Jeff Schnitzer

Nov 18, 2011, 2:40:20 PM11/18/11
to objectify...@googlegroups.com
On Fri, Nov 18, 2011 at 10:51 AM, John Patterson <jdpat...@gmail.com> wrote:

Twig has such a "short cut" method .load(Class<?>, Object id) and I use it often.  Also, its counterparts .store(Object, String) and .store(Object, long)

I don't fundamentally object to a few convenience shortcuts on the Objectify interface - although it'll be interesting to see if we can come up with a short list.  Alternatively:  It's pretty easy to use ObjectifyWrapper to add your own.

The store(Object, String/long) methods do bring up an interesting point though.  I really dislike this idea of entities that don't have id/parent fields - it complicates the API, the documentation, everything.  I like that you can look at an Objectify entity and immediately know its identity, how it relates to its parent (if any), and what its structure looks like in the datastore.  And batch put()s are simple.
 
Dirty change detection is interesting but that is also a massive step with major API implications.  The problem with dirty change detection is that it requires an explicit boundary - you have to open a session and close it, at which point changes are synced.  This ends up being a try/finally pattern which gets ugly.

On the other hand, there are definitely times I have wanted dirty change detection, and end up faking it with methods like this:

boolean changed = false;
changed |= obj.setThing1(foo);
changed |= obj.setThing2(bar);
if (changed)
    ofy.put(obj);

It's not elegant.  I'm not sure what the answer is but maybe sometime in the future we will come up with a way to address this that doesn't require everyone to define try/finally boundaries.  Objectify 5.0.
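The pattern above relies on setters that report whether anything changed. A self-contained sketch of what that could look like, with a counter standing in for ofy.put(); Settings and CountingStore are hypothetical names used only for illustration.

```java
import java.util.Objects;

/** Hypothetical entity whose setters report whether the value changed. */
final class Settings {
    private String thing1;
    private String thing2;

    boolean setThing1(String value) {
        boolean changed = !Objects.equals(thing1, value);
        thing1 = value;
        return changed;
    }

    boolean setThing2(String value) {
        boolean changed = !Objects.equals(thing2, value);
        thing2 = value;
        return changed;
    }
}

/** Stand-in for the datastore that only counts writes. */
final class CountingStore {
    int puts = 0;

    /** Applies both values and writes only if something changed. */
    boolean update(Settings obj, String foo, String bar) {
        boolean changed = false;
        changed |= obj.setThing1(foo);  // |= does not short-circuit: both setters always run
        changed |= obj.setThing2(bar);
        if (changed) {
            puts++;  // the real code would call ofy.put(obj) here
        }
        return changed;
    }
}
```

Using |= rather than || matters: with || the second setter would be skipped whenever the first one reported a change.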

It could be good to store this "dirty" state in the Session Cache.  Also, it's good to store activation state for each instance.  Before Twig did this, it was very easy to accidentally put an unactivated instance in the datastore.  After introducing activation state tracking, this became impossible (exception thrown) and also you would not activate the same instance more than once.

Intriguing.  There are some issues around this.  I would hate to separate out store() vs update(); the simplicity of put() is nice.  But it should be possible to detect the partial entities and reject them from put() if there is a session cache.
 
I've also used this override to create Guice interceptors that do auto-activation.  So as soon as you call certain methods on a model that has not been activated, it goes and gets the data for you.

I've considered something like this (or cglib or whatnot) but the biggest issue is what do you do with polymorphic entities?  You can't swap out the class definition at runtime.

Jeff

Jeff Schnitzer

Nov 18, 2011, 2:53:32 PM11/18/11
to objectify...@googlegroups.com
On Fri, Nov 18, 2011 at 10:59 AM, John Patterson <jdpat...@gmail.com> wrote:

For auto-activation it is also required to track activation state.  There is a method ObjectDatastore.isActivated() which is used by the proxy to check if it needs to activate or not.

I considered storing state in the instance itself (dirty state, activation state) using cglib or AspectJ but that would mess with serialization.

So the SessionCache ends up needing a bi-directional mapping - Key to Instance and Instance to Key (actually an Object holding Key and other state).

I'm making a lot of assumptions here about Objectify internals that I really don't know!

The bidirectional mapping makes sense, if only to defend against put()ing partial entities.  Will have to be slightly complicated because while hashCode() will work, equals() won't - users might (and probably have) overridden equals().  You probably have already dealt with this issue :-)  Did you create an identity wrapper or did you modify the Map implementation?
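One JDK option for the instance-to-key direction is java.util.IdentityHashMap, which compares by reference and ignores user-defined equals()/hashCode(). A sketch under that assumption; BiSession and AlwaysEqual are hypothetical names, and this says nothing about what Twig actually does internally.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical bidirectional session map: key -> instance and instance -> key. */
final class BiSession<K> {
    private final Map<K, Object> byKey = new HashMap<K, Object>();
    // Identity-based: two instances that are equals() to each other stay distinct.
    private final Map<Object, K> byInstance = new IdentityHashMap<Object, K>();

    void associate(K key, Object instance) {
        byKey.put(key, instance);
        byInstance.put(instance, key);
    }

    Object instanceFor(K key) {
        return byKey.get(key);
    }

    /** Returns the key for this exact instance, or null if it is not in the session. */
    K keyFor(Object instance) {
        return byInstance.get(instance);
    }
}

/** Entity with a deliberately unhelpful equals(), to show why identity matters. */
final class AlwaysEqual {
    @Override
    public boolean equals(Object o) {
        return o instanceof AlwaysEqual;
    }

    @Override
    public int hashCode() {
        return 1;
    }
}
```

With a regular HashMap on the instance side, the two AlwaysEqual objects would collapse into a single entry; the identity map keeps them apart.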

The biggest problem with dirty state detection is not implementation[1], but from the API how do you define the commit boundary?  It would suck to force all user API code to work like JPA:

Objectify ofy = fact.begin();
try {
  ...do some work...
} finally {
   ofy.sync();
}

On the other hand, dirty state detection could be an optional feature on an Objectify-instance-by-instance basis.  Only use it when you need it.  And in transactions the API is easy since there's always a commit().  Hell, maybe it's something that should only work in a transaction.

About serialization:  It should be possible to make proxies serialize themselves out of existence.  Although I don't know about GWT serialization.  Hmmm.

[1] Actually, how do you really know when some method changed a field?  eg, if method doSomethingWithSideEffects() changes a field directly, how do we know the entity is dirty?  Hibernate stores a copy of all the fields and explicitly checks them.  Can proxies "proxy" field access?  Seems unlikely.  I'm curious to know what JDO does.
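The snapshot-and-compare approach mentioned in the footnote can be sketched with plain reflection. This is only an illustration of the general idea, not Hibernate's implementation; Snapshots and Counter are hypothetical names, and the comparison is shallow, so a collection mutated in place would not be noticed.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: copies field values at load time and compares them later. */
final class Snapshots {
    private Snapshots() {}

    /** Captures the current values of the entity's declared instance fields. */
    static Map<String, Object> snapshot(Object entity) {
        Map<String, Object> values = new HashMap<String, Object>();
        for (Field field : entity.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            field.setAccessible(true);
            try {
                values.put(field.getName(), field.get(entity));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return values;
    }

    /** True if any field value differs from the earlier snapshot (shallow comparison). */
    static boolean isDirty(Object entity, Map<String, Object> earlier) {
        return !snapshot(entity).equals(earlier);
    }
}

/** Example entity that changes a field directly, with no setter to intercept. */
final class Counter {
    private int value;

    void bump() {
        value++;
    }
}
```

Because the check reads fields rather than intercepting calls, it catches a method with side effects like bump() without any proxying.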

We probably shouldn't start a discussion of dirty state detection since that's a ways down the roadmap...

Jeff

John Patterson

Nov 18, 2011, 10:10:02 PM11/18/11
to objectify...@googlegroups.com
On 18/11/2011 22:59, Jeff Schnitzer wrote:
On Fri, Nov 18, 2011 at 9:10 AM, John Patterson <jdpat...@gmail.com> wrote:
I fear that "persistent instance" is very JDO-ish, and "instance" is totally ambiguous (instance of what?).  For better or worse, we've used the term Entity everywhere in the documentation already and there is a direct 1-to-1 correspondence between an Objectify entity and a low-level entity.  Plus, I suspect most people who come to Objectify aren't intimately familiar with the low-level api and don't want to be.  So I'm not sure it makes sense to have a linguistic distinction between the kinds of entities any more than there is between the kinds of keys.

Good points, you just get used to using words in a certain way.  "Entity" is also a lot more familiar to people coming from other persistence libraries.  A very early version of the Twig API sometimes returned low-level Entities so I needed to distinguish back then.


As far as ofy().load().entity(key) vs ofy().load().key(key), you're right this is a bit awkward.  In fact, a previous iteration of API proposal had methods like this (showing return and param values):

Ref ofy.load().key(Key<?>);
Map ofy.load().keys(Iterator<Key>);
Map ofy.load().keys(Key<?>...);

Ref ofy.load().rawKey(Key);
Map ofy.load().rawKeys(Iterator<Key>);
Map ofy.load().rawKeys(Key...);

Ref ofy.load().entity(Object);
Map ofy.load().entities(Iterator<Object>); 
Map ofy.load().entities(Object...); 

Map ofy.load().values(Iterator<Object>);  // can be heterogenous
Map ofy.load().values(Object...);

I am not fond of this:

1) It's a lot of methods, and there's a lot of redundancy.  For example, entities() and values() are basically the same method - you can pass in anything.  Is it really necessary to have separate keys() vs rawKeys()?  It turns out all cases are handled by having methods entity()/entities() (or value()/values()).

Question: why does the API need to support raw Keys and ofy Keys?  It seems to complicate the API and the discussion.  Now there are Ref, Key, and raw Key.... perhaps it's time to axe the raw Keys?  If there are good reasons for raw key support perhaps we can think of workarounds?


2) There's an inconsistency with the way queries work.  The query mechanism has methods keys() which means "return Key<?> objects" and entities() which means "return entity objects".  Whereas this API has method keys(...) which means "pass in a key" and entities() which means "pass in entity objects".

Of course, this is incongruent with both type().id() as well.  And maybe the query API is the part that should be "fixed".  And maybe it is.  But it's hard to see the value of having 11 methods when 3 will suffice with the exact same syntactic structure.

Maybe the answer is that instead of condensing to entity()/entities(), we should condense to key()/keys() (which takes Object as a parameter so you can pass in an entity).  It's a little odd to pass in an entity as a key but that's what we're proposing anyways - entities are key-like structures now.

Yeah I see what you're saying.  But I really don't like methods that take Object as a parameter but actually expect a certain type of object (or set of types).  It doesn't give users who browse the API any indication even though I know your jdocs are good :)   It also doesn't let your IDE make a sensible guess about what variable to use.


I would prefer ".instances()" or something other than "entities()"

Or we could just drop it and stick with iterable().  It's present already:

Iterable<Thing> ths = ofy.load().group("group").type(Thing.class).filter("foo", foo).iterable();

Of course this makes it easy to do stuff like this, which I find I use now and then:

for (Thing th: ofy.load().type(Thing.class).filter("foo", foo)) {
   ...
}

That's good.  Like.  So Query implements Iterable?  Prob no need then to have .iterable() as we already have .iterator().  That is handy.  I like the thought that your example above allows async but it's completely hidden in your short loop.
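A toy version of a query that is itself Iterable, so it can sit directly in a for-each loop. SimpleQuery is hypothetical and filters an in-memory list with Java 8 predicates rather than talking to the datastore; it also assumes filter() returns a new query object instead of mutating the current one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical in-memory query: immutable, and directly usable in for-each. */
final class SimpleQuery<T> implements Iterable<T> {
    private final List<T> source;
    private final List<Predicate<T>> filters;

    SimpleQuery(List<T> source) {
        this(source, Collections.<Predicate<T>>emptyList());
    }

    private SimpleQuery(List<T> source, List<Predicate<T>> filters) {
        this.source = source;
        this.filters = filters;
    }

    /** Returns a new query with one more filter; this query is left unchanged. */
    SimpleQuery<T> filter(Predicate<T> condition) {
        List<Predicate<T>> next = new ArrayList<Predicate<T>>(filters);
        next.add(condition);
        return new SimpleQuery<T>(source, next);
    }

    /** Evaluates the query when iteration starts, not when the chain is built. */
    @Override
    public Iterator<T> iterator() {
        List<T> matches = new ArrayList<T>();
        for (T item : source) {
            boolean accepted = true;
            for (Predicate<T> f : filters) {
                if (!f.test(item)) {
                    accepted = false;
                    break;
                }
            }
            if (accepted) {
                matches.add(item);
            }
        }
        return matches.iterator();
    }
}
```

Since nothing runs until iterator() is called, an implementation is free to start work asynchronously behind the loop, which is the effect described above.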


More comments about the query API in a later response.
 
I notice that the return type from queries is Iterable<T> which is a good default.  I assume it is backed by the "live" QueryResultIterator so results are fetched in batches.  Also, there is first() which is great.

The actual return value is QueryResultIterable and QueryResultIterator so you can get the cursor().  I was just being a lazy typist in the design doc.  The current java code is correct in this regard.
 
In Twig, I also frequently use .unique() which throws an exception if there is more than one result (very useful sanity check!) and .all() which loads every result in a single datastore API call by setting the chunk size and fetch size to MAX_VALUE.  Very useful when you know you are not dealing with thousands of results.

Sure, we can definitely add .unique(), which adds the cost of an extra fetch (limit(2) instead of limit(1)).

Yeah, it's really great though to assert the correctness of your data.  I've found a few consistency problems ("how the f*** did that get in there?") using .unique()


The all() method is analogous to list().  I don't have a strong opinion on the naming convention.

list() is good.  Didn't see that before.


all() returns a List rather than an Iterable because it has the complete result set in memory.

There's actually some interesting behavior in the low-level api's List (which we don't currently use).  It will actually cursor through the results.  I don't think this is very good behavior because it tends to be slower than fetching the list all at once unless you're doing expensive operations at each iteration - in fact, you can speed up the app by calling size() on the list before iterating.

Ah yes, Twig also does not use that.  It gets everything at once which is a much more common use case.

Jeff Schnitzer

Nov 19, 2011, 8:18:17 AM11/19/11
to objectify...@googlegroups.com
Just a random thought on the query api before I run out the door... what about using all() to signify the start of a query chain, kind of like the way gae/python does it?

Old proposal:

Iterator<Object> things = ofy.load().iterator();
Iterator<Object> things = ofy.load().filter("foo", foo).iterator();
Iterator<Thing> things = ofy.load().type(Thing.class).iterator();
Iterator<Thing> things = ofy.load().type(Thing.class).filter("foo", foo).iterator();

With all():

Iterator<Object> things = ofy.load().all().iterator();
Iterator<Object> things = ofy.load().all().filter("foo", foo).iterator();
Iterator<Thing> things = ofy.load().type(Thing.class).all().iterator();
Iterator<Thing> things = ofy.load().type(Thing.class).all().filter("foo", foo).iterator();

On the downside, this is slightly more typing.  On the upside, it reads a little better and it means there's less stuff in the eclipse completion when you type "ofy.load()." - you get key(), keys(), ref(), refs(), group(), all() instead of all of the query operations too.

Jeff

Matthew Jaggard

Nov 19, 2011, 10:04:58 AM11/19/11
to objectify...@googlegroups.com
It also makes a clear distinction (which you know I'm keen on!) between calls that will run a query and those that won't - at least for now.

John Patterson

Nov 19, 2011, 10:09:36 AM11/19/11
to objectify...@googlegroups.com
On 19/11/2011 22:04, Matthew Jaggard wrote:
> It also makes a clear distinction (which you know I'm keen
> on!) between calls that will run a query and those that won't - at
> least for now.

I made this distinction in Twig by naming all the "chain
terminators" with returnXXX. So .returnAll() .returnUnique() .returnCount()


John Patterson

Nov 19, 2011, 10:25:01 AM11/19/11
to objectify...@googlegroups.com
On 19/11/2011 02:53, Jeff Schnitzer wrote:
>
> [1] Actually, how do you really know when some method changed a field?
> eg, if method doSomethingWithSideEffects() changes a field directly,
> how do we know the entity is dirty? Hibernate stores a copy of all
> the fields and explicitly checks them. Can proxies "proxy" field
> access? Seems unlikely. I'm curious to know what JDO does.
>
> We probably shouldn't start a discussion of dirty state detection
> since that's a ways down the roadmap...

I *think* AspectJ can do field interception. CGLIB creates a subclass
so that does not help. You could certainly do it with ASM but I don't
think in bits and bytes... it is so low level.

There is a small project, Salve, by Igor of the Wicket project

http://code.google.com/p/salve/wiki/WhyNotAspectJ

Which does very clever stuff with ASM and I think it has its own utils
to make ASM easier to use.

Oops, let's not get into this!

John Patterson

Nov 19, 2011, 10:29:10 AM11/19/11
to objectify...@googlegroups.com

On 19/11/2011 20:18, Jeff Schnitzer wrote:
> Just a random thought on the query api before I run out the door...
> what about using all() to signify the start of a query chain, kind of
> like the way gae/python does it?

all() should work consistently with unique() first() count() yeah?
perhaps I'm too tired to see how that can work

John Patterson

Nov 19, 2011, 7:32:39 PM11/19/11
to objectify...@googlegroups.com

On 19/11/2011 20:18, Jeff Schnitzer wrote:

> Just a random thought on the query api before I run out the door...
> what about using all() to signify the start of a query chain, kind of
> like the way gae/python does it?

all() should work consistently with unique() first() count() yeah?

perhaps I'm too tired to see how that can work

Looking at these terminator methods together it is clear that all() fits
better than list()

Jeff Schnitzer

Nov 19, 2011, 7:50:21 PM11/19/11
to objectify...@googlegroups.com
No, totally different.  It would be the beginning of the chain, not a terminator. It would be a way of saying that you're looking at the entire result set (possibly filtered, possibly ancestored, etc).  It's similar to the GAE/Python API where you literally say stuff = Thing.all().filter("foo", foo).fetch(100).

Jeff

John Patterson

Nov 20, 2011, 12:19:56 AM11/20/11
to objectify...@googlegroups.com
On 14/11/2011 23:01, Jeff Schnitzer wrote:
>
> * My apps all turn on the session cache. I only explicitly disable
> the session cache when iterating through large datasets which would
> otherwise overwhelm the java heap.
>
> This suggests to me that session caching should be enabled by default.
> But I'm not certain of this, and more opinions / data points help.

I would go even further and suggest it should always be on (cannot
disable) and items have to be explicitly removed from the cache. If
half of users were doing things one way, and half the other way, many
user questions on this list would be responded to with "Do you have
session caching enabled or disabled?". Perhaps to enforce best
practices from the start.

John Patterson

Nov 20, 2011, 3:29:17 AM11/20/11
to objectify...@googlegroups.com
On 19/11/2011 20:18, Jeff Schnitzer wrote:
> Just a random thought on the query api before I run out the door...
> what about using all() to signify the start of a query chain, kind of
> like the way gae/python does it?
>
> Old proposal:
>
> Iterator<Object> things = ofy.load().iterator();
> Iterator<Object> things = ofy.load().filter("foo", foo).iterator();
> Iterator<Thing> things = ofy.load().type(Thing.class).iterator();
> Iterator<Thing> things = ofy.load().type(Thing.class).filter("foo",
> foo).iterator();
>
> With all():
>
> Iterator<Object> things = ofy.load().all().iterator();
> Iterator<Object> things = ofy.load().all().filter("foo", foo).iterator();
> Iterator<Thing> things = ofy.load().type(Thing.class).all().iterator();
> Iterator<Thing> things =
> ofy.load().type(Thing.class).all().filter("foo", foo).iterator();
>

Actually Jeff, would you even need the load() keyword then?

ofy.type(Thing.class).all().filter("foo", foo)
ofy.type(Thing.class).delete(thingId);
ofy.type(Thing.class).deleteAll(); //ow!

Jeff Schnitzer

Nov 20, 2011, 9:25:39 AM11/20/11
to objectify...@googlegroups.com
On Sun, Nov 20, 2011 at 4:29 AM, John Patterson <jdpat...@gmail.com> wrote:

Actually Jeff, would you even need the load() keyword then?

ofy.type(Thing.class).all().filter("foo", foo)
ofy.type(Thing.class).delete(thingId);
ofy.type(Thing.class).deleteAll();  //ow!


It could be done this way, but remember that there are typeless queries and also the group() method.  It would mean that the Objectify interface would extend Query<Object>, which would mean a long confusing list when eclipse tries to complete "ofy<dot>".  Also, group() doesn't apply to put() or delete() (although it might in the future).

One reason I want to keep the method count on Objectify down is that I'm making it really easy to derive your own MyObjectify which adds whatever convenience methods you need.  Just create a class like this:

public class MyObjectify extends ObjectifyWrapper<MyObjectify> {
    /** gives us a run() method that takes a MyObjectify as a parameter */
    public static class Work<R> extends TxnWork<MyObjectify, R> {}

    Thing getOrCreateThing() { ... }
    Iterable<Thing> someComplexQuery() { ... }
    ...other custom business methods...
}

Then subclass your own ObjectifyFactory:

public class MyFactory extends ObjectifyFactory {
    @Override
    public MyObjectify begin() { return new MyObjectify(super.begin()); }
}

This works with both the transact() method and the fluent interface without casting.  It replaces the old "make a DAO" advice.  It means you can add convenience methods (say, synchronous versions) yourself pretty easily... but not if there are already a zillion methods on Objectify to get in your way.
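
The "without casting" part relies on a self-typed generic parameter. Here is a minimal sketch of that idea with stand-in classes only: FluentBase and MyFluent are hypothetical names, not Objectify types, and the mutable group field is a simplification made for brevity.

```java
/** Hypothetical base class: H is the concrete subclass, so chained calls keep its type. */
abstract class FluentBase<H extends FluentBase<H>> {
    private String group = "default";

    /** Returns this instance typed as the subclass, which avoids casts at call sites. */
    @SuppressWarnings("unchecked")
    H group(String name) {
        this.group = name;
        return (H) this;
    }

    String currentGroup() { return group; }
}

/** Application-specific subclass that adds its own convenience method. */
final class MyFluent extends FluentBase<MyFluent> {
    String describe() { return "MyFluent using group " + currentGroup(); }
}
```

Because group() returns H rather than FluentBase, a call such as new MyFluent().group("bigGroup").describe() compiles without a cast.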

Jeff

John Patterson

unread,
Nov 20, 2011, 10:40:06 AM11/20/11
to objectify...@googlegroups.com
On 19/11/2011 01:14, Jeff Schnitzer wrote:
> Also - need better terminology for "unactivated entity" since I think
> we want to stick with @Load instead of @Activate. "unfetched entity"?
> "partial entity"? "empty entity"? "unloaded entity"? That last one
> makes the most sense if we are using the word load everywhere, but
> it's a little awkward. "Partial" kinda makes sense, especially in a
> world with @Denormalize. I'll use it here just for the hell of it.

I think of loading a new entity by key or id and adding it to the
session as different to filling in the field values of an existing
entity that already exists in memory (activating it). Activating an
entity does not need to return a value like load() does - the fields are
just filled in on the existing entity instance. So it's quite a
different beast.

> The current proposal (which I am not in love with either):
>
> 1) QueryResultIterable<Key<?>> things = ofy.load().keys(); //
> implicitly keysOnly().keys()
> 2) QueryResultIterable<Object> things = ofy.load().keysOnly().iterable();
> 3) Key<Object> thingKey = ofy.load().keysOnly().first().key(); // or
> unique().key();
> 4) Object thing = ofy.load().keysOnly().first().get(); // or
> unique().get();
> 5) Object thing = ofy.load().first().get(); // or unique().get();
>
> Some things that can be done to make this simpler:
>
> * Eliminate unique(), keep first() since first is cheaper. Forces
> users to perform unique checking themselves if they want it.
> Personally, I make sure that unique things are actually stored
> uniquely - I don't worry about detecting illegal duplicates because
> they can't happen.

It is still very reassuring to be certain that in your running system,
the data has not somehow become inconsistent.

We could share the implementation of unique() and first().

> - This allows us to use ref() instead of first() but honestly I
> think first() is more expressive.

yes

> * Don't make it possible to query for partial entities. Seems like
> this is missing some utility.

I use that a lot in Twig so would be good to keep. Also, it would allow
a keys only query for entities and then to selectively activate (load)
the unactivated instances. This could be really handy

> Here's an alternative:
>
> 1) QueryResultIterable<Key<?>> things = ofy.load().keys().iterableKeys();
> 2) QueryResultIterable<Object> things = ofy.load().keys().iterable();
> 3) Key<Object> thingKey = ofy.load().keys().first().key(); // or
> unique().key();
> 4) Object thing = ofy.load().keys().first().get(); // or unique().get();
> 5) Object thing = ofy.load().first().get(); // or unique().get();
>
> Or maybe invert the first two:
>
> 1) QueryResultIterable<Key<?>> things = ofy.load().keys().iterable();
> 2) QueryResultIterable<Object> things =
> ofy.load().keys().iterableEntities(); // iterablePartials()?
> entityIteratable()? entities()?
> 3) Key<Object> thingKey = ofy.load().keys().first().key(); // or
> unique().key();
> 4) Object thing = ofy.load().keys().first().get(); // or unique().get();
> 5) Object thing = ofy.load().first().get(); // or unique().get();

I want to spend some time to think about this a bit more. I guess you
are pushing on with the impl though.

> I tend to dislike abbreviations too but for things I type a lot, I
> like shorter names as long as they are not ambiguous. Ref is almost a
> word by itself - people talk about refs, ref counting, etc in
> programming context all the time. Also, a three-letter Ref<?> is very
> similar to Key<?>, which is good because they are closely related.
>
> I've had many arguments with Gavin about wordiness from the days
> before Ceylon had a name. Generally I like the verbose format but
> I've tried to convince him to shorten Integer to Int and value to val,
> just because they get typed so much they are special cases. So far he
> remains unconvinced.

Haha! I guess it's simply a matter of preference and really there is
probably no best way.

I would definitely be with Gavin on that (really looking forward to
Ceylon!) but it's 6 of one, half a dozen of the other. But let it be
duly noted that my preference is with Reference<T> and other full
words. Actually before Guice, I used PicoContainer which heavily used
ObjectReference<T>... you would despise that!

What do others think? Are many others still reading? Which do you
prefer to read/type? Anyway, it's your call at the end of the day.

> BTW, this is interesting reading:
> http://www.quora.com/Java-programming-language/Why-do-some-people-hate-Java
>
> I tend to agree that one of the big problems with Java is verbosity.
> Too much getThis() and setThat() and catch
> (SomeIdioticCheckedException idontcareabout). I really enjoy working
> with python and javascript, but I've found that projects are too hard
> to scale up without static typing. I'm hoping Ceylon will be a sweet
> spot, even if I have to type Integer a lot.

Amen. I am an auto-complete junky even when it is not necessary so I
don't really just type identifiers anyway.

Jeff Schnitzer

unread,
Nov 20, 2011, 11:25:21 AM11/20/11
to objectify...@googlegroups.com
On Sun, Nov 20, 2011 at 11:40 AM, John Patterson <jdpat...@gmail.com> wrote:
On 19/11/2011 01:14, Jeff Schnitzer wrote:
Also - need better terminology for "unactivated entity" since I think we want to stick with @Load instead of @Activate.  "unfetched entity"?  "partial entity"?  "empty entity"? "unloaded entity"?  That last one makes the most sense if we are using the word load everywhere, but it's a little awkward.  "Partial" kinda makes sense, especially in a world with @Denormalize.  I'll use it here just for the hell of it.

I think of loading a new entity by key or id and adding it to the session as different to filling in the field values of an existing entity that already exists in memory (activating it).  Activating an entity does not need to return a value like load() does - the fields are just filled in on the existing entity instance.  So it's quite a different beast.

What does Twig do for polymorphic entities?  You can't activate them :-(

I wasn't planning to have any kind of activation-of-existing-entities... I was thinking this would be what Ref<?> gives you.  I guess we could allow activation but just not for polymorphic entities but this asymmetry seems bad to me.  I like that you can add @Subclass entities to an existing data model right now and everything just works; this asymmetry means that adding a @Subclass could cause existing code to throw exceptions when you try to activate a now-polymorphic type.

I guess the argument could be made that polymorphism is rare, but I don't think that's going to be true.  It's rare in JPA-land because it's a PITA to configure and it makes the RDBMS structure messy.  Objectify polymorphism is really easy and once you start using it, you start using it everywhere... (well, I do at any rate).

Jeff

John Patterson

unread,
Nov 21, 2011, 12:32:09 AM11/21/11
to objectify...@googlegroups.com
On 20/11/2011 21:25, Jeff Schnitzer wrote:
On Sun, Nov 20, 2011 at 4:29 AM, John Patterson <jdpat...@gmail.com> wrote:

Actually Jeff, would you even need the load() keyword then?

ofy.type(Thing.class).all().filter("foo", foo)
ofy.type(Thing.class).delete(thingId);
ofy.type(Thing.class).deleteAll();  //ow!


It could be done this way, but remember that there are typeless queries and also the group() method.  It would mean that the Objectify interface would extend Query<Object>, which would mean a long confusing list when eclipse tries to complete "ofy<dot>".  Also, group() doesn't apply to put() or delete() (although it might in the future).

Firstly, as I have not actually tried to code any of this or read your code I'm likely to make stupid suggestions...

Objectify should not extend Query , for sure.

The above could be designed without group() applying to put() and delete(), Objectify not implementing Query and still keep your ObjectifyWrapper stuff.  I've left out the type parameters that refer to the implementation class.

put() would return PutCommand or something like that.

type(...) would return some kind of TypedCommand<T> which exposes all() and first() etc

all() would return QueryTypedCommand<List<T>>

first() would return QueryTypedCommand<Ref<T>>

count() would return QueryTypedCommand<Integer>
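
A rough sketch of that kind of staged chain, where each step exposes only the methods that make sense at that step. Everything here is a hypothetical in-memory stand-in: Store and TypedCommand are made-up names, filter() takes a Predicate instead of the property-name form discussed in this thread, and the terminators return plain values rather than QueryTypedCommand wrappers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** First step: the only query operation offered is choosing a type. */
final class Store {
    private final List<Object> rows = new ArrayList<>();

    void put(Object row) { rows.add(row); }

    /** Narrows to one type and hands back the next step of the chain. */
    <T> TypedCommand<T> type(Class<T> type) {
        List<T> matches = new ArrayList<>();
        for (Object row : rows) {
            if (type.isInstance(row)) matches.add(type.cast(row));
        }
        return new TypedCommand<>(matches);
    }
}

/** Second step: a typed source that can be narrowed or terminated, nothing else. */
final class TypedCommand<T> {
    private final List<T> candidates;

    TypedCommand(List<T> candidates) { this.candidates = candidates; }

    /** Returns a new command; this one is left unchanged, so chains are immutable. */
    TypedCommand<T> filter(Predicate<? super T> condition) {
        List<T> kept = new ArrayList<>();
        for (T candidate : candidates) {
            if (condition.test(candidate)) kept.add(candidate);
        }
        return new TypedCommand<>(kept);
    }

    /** Terminator: copies out every match. */
    List<T> all() { return new ArrayList<>(candidates); }

    /** Terminator: first match, or null when there is none. */
    T first() { return candidates.isEmpty() ? null : candidates.get(0); }

    /** Terminator: number of matches. */
    int count() { return candidates.size(); }
}
```

Since filter() returns a new object, calling it and discarding the result has no effect, which is why a linear series of filter() statements stops working once the query is immutable.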




Matthew Jaggard

unread,
Nov 21, 2011, 3:52:20 AM11/21/11
to objectify...@googlegroups.com

I would definitely be with Gavin on that (really looking forward to Ceylon!) but it's 6 of one, half a dozen of the other.  But let it be duly noted that my preference is with Reference<T> and other full words.  Actually before Guice, I used PicoContainer which heavily used ObjectReference<T>... you would despise that!

What do others think?  Are many others still reading?  Which do you prefer to read/type?  Anyway, it's your call at the end of the day.

I'm still reading!! I like readable code as first priority and shortness as second priority. In this case, I think "Ref" is just as readable as "Reference" because we use "ref" as a word in everyday life. So Ref wins for me on being just as readable (10 points each) and shorter (2 points to Ref).

Also, I don't particularly like the idea of "Activating" an instance - I'll fetch what I need to using fetch groups. I can't quite imagine the situation yet, but if I ever wanted to query for an entity and then query again for some more data I think I'd be happy having a new instance. I just can't see myself doing that because getting a big entity is not much more expensive than querying a small one - I guess that all changes when embedded objects mean doing multiple queries.

Mat.

Jeff Schnitzer

unread,
Nov 21, 2011, 8:59:32 AM11/21/11
to objectify...@googlegroups.com
This is almost exactly how the (new) code works... although there are some subtleties at the query terminator.  Little things like count() on the low-level api has no async option so it actually returns an int on our interface.  But the good thing is that at each step in the command chain there are only the methods that apply to that step... it works really well in eclipse.


Note that these interfaces are a little bit behind the discussion while I work on the type conversion & load process.  However, the test harness has been updated and the new builder interface passes, so it works :-)  It's pretty easy to modify the command chains at this point so let's keep the discussion rolling.

Jeff

On Mon, Nov 21, 2011 at 1:32 AM, John Patterson <jdpat...@gmail.com> wrote:

Firstly, as I have not actually tried to code any of this or read your code I'm likely to make stupid suggestions...

Objectify should not extend Query , for sure.

The above could be designed without group() applying to put() and delete(), Objectify not implementing Query and still keep your ObjectifyWrapper stuff.  I've left out the type parameters that refer to the implementation class.

put() would return PutCommand or something like that.

type(...) would return some kind of TypedCommand<T> which exposes all() and first() etc

all() would return QueryTypedCommand<List<T>>

first() would return QueryTypedCommand<Ref<T>>

count() would return QueryTypedCommand<Integer>







--
I am the 20%

Drew Spencer

unread,
Jan 24, 2012, 7:02:10 AM1/24/12
to objectify...@googlegroups.com
You know, just as you think you're getting the hang of this App Engine malarkey, you come across a thread like this and wend up crying into your bleeding hands.

No seriously, it's great to see this much work going into something that is just invaluable once it becomes your defacto way of access the DS.

"In Jeff We Trust"

Drew

Drew Spencer

unread,
Jan 24, 2012, 7:03:26 AM1/24/12
to objectify...@googlegroups.com
Sorry for the terribel typoos.

Dominik Mayer

unread,
Mar 2, 2012, 5:09:49 PM3/2/12
to objectify...@googlegroups.com
I have a question about the following line from the Wiki:

// Simple key fetch, always async
Thing th =      ofy.load().key(thingKey).get();

Does "always async" mean that Objectify does not access the datastore as long as I'm not invoking "th"?

Right now I use many keys in my code. In one part I get the id of an entity from the client and want to set it on the server (one-to-many relationship). I'm basically doing:

Key<Thing> thKey = Key.create(Thing.class, id);

If I switch to the new system I cannot do something like:

Thing th = Ref.create(Key.create(Thing.class, id));

This would return a Ref<Thing> instead of a Thing. So if I do

Key<Thing> key = Key.create(Thing.class, id);
Thing th = ofy.load().key(key).get();

does it access the datastore? Do I get billed for the operation?

Jeff Schnitzer

unread,
Mar 2, 2012, 5:20:08 PM3/2/12
to objectify...@googlegroups.com
That document is slightly out of date, but cutting out the context hurts the meaning:

// Simple key fetch, always async
Ref<Thing> th = ofy.load().key(thingKey);
Thing th =      ofy.load().key(thingKey).get();
Thing th =      ofy.load().key(thingKey).safeGet();     // throws NotFoundException

Fetching the Ref<Thing> kicks off the async operation.  Calling get() or safeGet() completes it.

Once you start the async operation, you will be charged whether or not you fetch the result.
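
To illustrate the start-now, complete-later behaviour, here is a small sketch built on java.util.concurrent rather than on Objectify itself. AsyncRef and its load() helper are hypothetical stand-ins, and the Supplier plays the role of the datastore call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical stand-in for a Ref: wraps a fetch that has already been started. */
final class AsyncRef<T> {
    private final CompletableFuture<T> pending;

    private AsyncRef(CompletableFuture<T> pending) { this.pending = pending; }

    /** Starts the fetch immediately on a background thread and returns at once. */
    static <T> AsyncRef<T> load(Supplier<T> fetch) {
        return new AsyncRef<>(CompletableFuture.supplyAsync(fetch));
    }

    /** Blocks until the fetch started by load() has finished, then returns its result. */
    T get() { return pending.join(); }
}

final class AsyncRefDemo {
    public static void main(String[] args) {
        AtomicInteger reads = new AtomicInteger();   // counts simulated datastore reads
        AsyncRef<String> ref = AsyncRef.load(() -> {
            reads.incrementAndGet();
            return "thing";
        });
        // The read was requested by load(); get() only waits for it to complete.
        System.out.println(ref.get() + " after " + reads.get() + " read(s)");
    }
}
```

In this sketch the work (and, by analogy, the cost) is incurred by load(), and get() adds no further read no matter how often it is called.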

Jeff

Dominik Mayer

unread,
Mar 2, 2012, 5:31:55 PM3/2/12
to objectify...@googlegroups.com
That document is slightly out of date, but cutting out the context hurts the meaning:


// Simple key fetch, always async
Ref<Thing> th = ofy.load().key(thingKey);
Thing th =      ofy.load().key(thingKey).get();
Thing th =      ofy.load().key(thingKey).safeGet();     // throws NotFoundException
Sorry. I thought the headline belonged to the whole group.

So I guess the best/cheapest way would be to continue using Keys or Refs as long as I create them with:

Ref.create(Key.create(Thing.class, id));

I don't quite understand the @Load part. Especially the first and last line of:

   @Load({"bigGroup", "smallGroup"})
   SomeThing some;

   @Load("bigGroup")
   List<OtherThing> others;

   @Load
   Ref<OtherThing> refToOtherThing;

   Ref<OtherThing> anotherRef;  // no @Load means never fetched automatically
What do I get if I have "SomeThing some" without "@Load"? Is that even possible? And what's the difference between a Ref with or without "@Load". The Ref with "@Load" already contains the OtherThing so I can get() it without contacting the database again?

Ruslan V

unread,
Mar 2, 2012, 5:37:32 PM3/2/12
to Dominik Mayer
Dear Dominik,


Friday, March 2, 2012, 2:31:55 PM, you wrote:


What do I get if I have "SomeThing some" without "@Load"? Is that even possible? And what's the difference between a Ref with or without "@Load". The Ref with "@Load" already contains the OtherThing so I can get() it without contacting the database again?


Without @Load entity won't be automatically loaded when you fetch it. But when you persist it key of "some" will still be saved.

/Ruslan

Dominik Mayer

unread,
Mar 2, 2012, 5:43:24 PM3/2/12
to objectify...@googlegroups.com
Without @Load entity won't be automatically loaded when you fetch it. But when you persist it key of "some" will still be saved.

But what is the difference for me? In both cases I have to call "get()" before I can use it? Is this about speed? Or cost?

Oh, and one other question: If I have a "Ref<Thing> thing" and I "get()" it several times, are these operations billed once or several times? Right now I have "@Ignore Thing cacheThing" fields to avoid subsequent database reads.

Jeff Schnitzer

unread,
Mar 2, 2012, 5:54:55 PM3/2/12
to objectify...@googlegroups.com
You can have a field "SomeThing some" without @Load; this object acts as what I've been casually calling a "partial entity" or what JohnP called an "unactivated entity" in Twig.  It's just an entity whose key fields have been set but no others.  The native datastore type is a simple Key.

Note that Ref<Thing> without a @Load annotation is not loaded from the datastore.  If you call get() on an uninitialized Ref you will get an exception.  (note that getValue() returns null so that json serializers don't freak out)

For any Ref you can call get() as many times as you like.  The value is cached.
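
The three behaviours above (exception on an unloaded Ref, null from getValue(), cached get()) can be sketched with a stand-in class. SketchRef and prime() are hypothetical names, and IllegalStateException is an assumed exception type; none of this is Objectify's actual implementation.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for the Ref behaviour described above. */
final class SketchRef<T> {
    private Supplier<T> loader;   // null until something "loads" this ref
    private boolean resolved;
    private T value;

    /** Simulates a load call: attaches a pending result to this instance. */
    void prime(Supplier<T> loader) {
        this.loader = loader;
        this.resolved = false;
    }

    /** Throws if the ref was never loaded; otherwise resolves once and caches. */
    T get() {
        if (loader == null) throw new IllegalStateException("Ref has not been loaded");
        if (!resolved) {
            value = loader.get();
            resolved = true;
        }
        return value;
    }

    /** Lenient accessor: returns null for an unloaded ref instead of throwing. */
    T getValue() {
        return loader == null ? null : get();
    }
}
```

Repeated get() calls on a primed SketchRef invoke the loader only once, which mirrors the "value is cached" statement.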

Jeff

Dominik Mayer

unread,
Mar 2, 2012, 6:29:18 PM3/2/12
to objectify...@googlegroups.com
You can have a field "SomeThing some" without @Load; this object acts as what I've been casually calling a "partial entity" or what JohnP called an "unactivated entity" in Twig.  It's just an entity whose key fields have been set but no others.  The native datastore type is a simple Key.

That's what I wanted to create without querying the datastore. Is that possible? If I have the id, I can create a key and that's all the datastore needs. But I can't find a way to create a partial entity similar to Key.create(...).

Note that Ref<Thing> without a @Load annotation is not loaded from the datastore.  If you call get() on an uninitialized Ref you will get an exception.  (note that getValue() returns null so that json serializers don't freak out)

So if I understand that correctly, I cannot call "get()" on a Ref I created via Ref.create(Key.create(...))?

Seems that I completely misunderstood the Refs... I thought they were the things that I used to create manually:

@Index Key<Thing> thingKey;
 
@Ignore Thing cacheThing; 
 
public Thing getThing() {
    if (cacheThing == null)
        cacheThing = ofy.load().key(thingKey).safeGet();
    return cacheThing;
}

I have a tree structure so I cannot preload the entities without getting the whole tree... I thought I could use the Refs and then just say "get()" whenever I need the entity. Hm... Maybe it's best to stick with the keys...

Jeff Schnitzer

unread,
Mar 2, 2012, 7:01:13 PM3/2/12
to objectify...@googlegroups.com
On Fri, Mar 2, 2012 at 6:29 PM, Dominik Mayer <domini...@gmail.com> wrote:
You can have a field "SomeThing some" without @Load; this object acts as what I've been casually calling a "partial entity" or what JohnP called an "unactivated entity" in Twig.  It's just an entity whose key fields have been set but no others.  The native datastore type is a simple Key.

That's what I wanted to create without querying the datastore. Is that possible? If I have the id, I can create a key and that's all the datastore needs. But I can't find a way to create a partial entity similar to Key.create(...).

You mean other than having a constructor for your SomeThing that accepts the key fields?

thing = new SomeThing(thingId);

However, I've been thinking of moving the key manipulation code out of ObjectifyFactory and into a static.  Since keys are defined statically via annotations, they never change, so this metadata is reasonable to place in a static context.  The main advantage of this is that the (new) Key/Ref.equivalent() methods could work with partial entities.  It would certainly be possible to have a createPartial() method but it seems a little silly.

Note that Ref<Thing> without a @Load annotation is not loaded from the datastore.  If you call get() on an uninitialized Ref you will get an exception.  (note that getValue() returns null so that json serializers don't freak out)

So if I understand that correctly, I cannot call "get()" on a Ref I created via Ref.create(Key.create(...))?

Correct. Unless you use the (recently checked in) Ref.create(key, entity) method.

The idea that you can create a bunch of Refs and then call get() to fetch them is tempting, but it would require a static ObjectifyFactory context.  And it makes transactions complicated.  

You can call Objectify.load().key(key) and get back a Ref that is primed to give you the results you want.

Seems that I completely misunderstood the Refs... I thought they were the things that I used to create manually:

@Index Key<Thing> thingKey;
 
@Ignore Thing cacheThing; 
 
public Thing getThing() {
    if (cacheThing == null)
        cacheThing = ofy.load().key(thingKey).safeGet();
    return cacheThing;
}

I have a tree structure so I cannot preload the entities without getting the whole tree... I thought I could use the Refs and then just say "get()" whenever I need the entity. Hm... Maybe it's best to stick with the keys...

The Ref is probably exactly what you want here, but you still need to fetch it explicitly.

@Index Ref<Thing> thing;

public Thing getThing() {
   if (thing.getValue() == null)
      ofy.load().ref(thing);
   return thing.getValue();
}

This is a little awkward because there's no way to determine if the Ref has been fetched, so using the getValue() method (and null response) is a surrogate.  It's not quite right because the actual value in the datastore might be null (ie, missing entity).  This is a case I haven't really thought about, but it wouldn't be too hard to create a more sophisticated handler.
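
One possible shape for such a handler is to track "has been loaded" separately from the loaded value, so that a missing entity and an unfetched Ref are distinguishable. This is only a sketch: TrackedRef, isLoaded() and setLoaded() are hypothetical names, not part of Objectify.

```java
/** Hypothetical ref that records whether a load happened, separately from its result. */
final class TrackedRef<T> {
    private boolean loaded;   // true once any load result has been recorded
    private T value;          // may legitimately stay null if the entity is missing

    boolean isLoaded() { return loaded; }

    /** Records a load result; pass null when the datastore had no such entity. */
    void setLoaded(T valueOrNull) {
        this.value = valueOrNull;
        this.loaded = true;
    }

    /** Returns the entity, or null if it is missing or has not been loaded yet. */
    T getValue() { return value; }
}
```

A getter could then test isLoaded() instead of comparing getValue() with null, and so avoid refetching an entity that is known to be missing.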

It would even be possible to create a SmartRef which will go to the datastore ad-hoc.  You could actually do this right now just by registering a special TranslatorFactory which understands SmartRef, based on the existing RefTranslator.  You have to be careful of transaction boundaries, however... an Objectify instance in a transactional context is no good after the end of the transaction.

Jeff

Dominik Mayer

unread,
Mar 2, 2012, 7:26:46 PM3/2/12
to objectify...@googlegroups.com
That's what I wanted to create without querying the datastore. Is that possible? If I have the id, I can create a key and that's all the datastore needs. But I can't find a way to create a partial entity similar to Key.create(...).

You mean other than having a constructor for your SomeThing that accepts the key fields?

thing = new SomeThing(thingId);

Yes, because I don't want to create a new Thing, I just want to set a relationship. Like:

@Index Thing parentThing; 

public void setParent(Long id) {
    parentThing = Objectify.createPartial(Thing.class, id);
}

Technically I just want to set the field parentThing in the database to Key.create(Thing.class, id).
 
However, I've been thinking of moving the key manipulation code out of ObjectifyFactory and into a static.  Since keys are defined statically via annotations, they never change, so this metadata is reasonable to place in a static context.  The main advantage of this is that the (new) Key/Ref.equivalent() methods could work with partial entities.  It would certainly be possible to have a createPartial() method but it seems a little silly.

Yes. My misunderstanding. I thought I could create the partial and Objectify would magically load it from the database once I first invoke one of its methods...
 
Correct. Unless you use the (recently checked in) Ref.create(key, entity) method.

I'll give it a try once 4.0a4 is out. (Just out of curiosity: Why is the key not automatically created from the entity id?)
 
The Ref is probably exactly what you want here, but you still need to fetch it explicitly.

@Index Ref<Thing> thing;

public Thing getThing() {
   if (thing.getValue() == null)
      ofy.load().ref(thing);
   return thing.getValue();
}

Don't I have to say:

thing = ofy.load().ref(thing); 
 
It would even be possible to create a SmartRef which will go to the datastore ad-hoc.  You could actually do this right now just by registering a special TranslatorFactory which understands SmartRef, based on the existing RefTranslator.  You have to be careful of transaction boundaries, however... an Objectify instance in a transactional context is no good after the end of the transaction.

Sounds interesting. I'll have a look at the docs.

Jeff Schnitzer

unread,
Mar 2, 2012, 7:44:13 PM3/2/12
to objectify...@googlegroups.com
On Fri, Mar 2, 2012 at 7:26 PM, Dominik Mayer <domini...@gmail.com> wrote:

Yes, because I don't want to create a new Thing, I just want to set a relationship. Like:
[...]
Yes. My misunderstanding. I thought I could create the partial and Objectify would magically load it from the database once I first invoke one of its methods...

Ah, you want Thing to be a dynamic proxy.  This kind of stuff is possible with the new Translator system in Ofy4, but I myself am unlikely to implement it anytime soon.  You would need to include cglib or some other bytecode generator to build these, and there are probably all kinds of consequences to doing this that need to be thought through.

I'll give it a try once 4.0a4 is out. (Just out of curiosity: Why is the key not automatically created from the entity id?)

You bring up another reason why the key metadata needs to be stored in a static context rather than inside the ObjectifyFactory.

Look for this in the next couple weeks, it has become a pressing need of my own.  We should be able to say:

Key.create(thing);
Ref.create(thing);
 
The Ref is probably exactly what you want here, but you still need to fetch it explicitly.

@Index Ref<Thing> thing;

public Thing getThing() {
   if (thing.getValue() == null)
      ofy.load().ref(thing);
   return thing.getValue();
}

Don't I have to say:

thing = ofy.load().ref(thing); 

Nope.  ofy.load().ref(thingRef) actually sets the Result<Thing> inside that Ref instance.  In this way it behaves a little differently from the key() and entity() methods.  But it is much more useful, especially when you have Ref objects that get passed around various places.
 
Sounds interesting. I'll have a look at the docs.

I hope you mean javadocs, because unfortunately this is the only place the documentation lives right now.  And I haven't yet gotten around to rewriting the big javadoc on Transmog that explains how the whole system works.

I've been working frantically on Voost (which, btw, has launched:  https://www.voo.st/) and getting a lot of practical experience with Ofy4.  There are some minor things which will change as a consequence.  But it hasn't left any time to write Objectify documentation :-(

Jeff

Dominik Mayer

unread,
Mar 2, 2012, 7:58:03 PM3/2/12
to objectify...@googlegroups.com

Yes, because I don't want to create a new Thing, I just want to set a relationship. Like:
[...]
Yes. My misunderstanding. I thought I could create the partial and Objectify would magically load it from the database once I first invoke one of its methods...

Ah, you want Thing to be a dynamic proxy.  This kind of stuff is possible with the new Translator system in Ofy4, but I myself am unlikely to implement it anytime soon.  You would need to include cglib or some other bytecode generator to build these, and there are probably all kinds of consequences to doing this that need to be thought through.

I just use Ref for now.

Look for this in the next couple weeks, it has become a pressing need of my own.  We should be able to say:

Key.create(thing);
Ref.create(thing);

That would be great.
 
Don't I have to say:

thing = ofy.load().ref(thing); 

Nope.  ofy.load().ref(thingRef) actually sets the Result<Thing> inside that Ref instance.  In this way it behaves a little differently from the key() and entity() methods.  But it is much more useful, especially when you have Ref objects that get passed around various places.

Good to know. Thanks!
 
Sounds interesting. I'll have a look at the docs.

I hope you mean javadocs, because unfortunately this is the only place the documentation lives right now.

Yes.
 
And I haven't yet gotten around to rewriting the big javadoc on Transmog that explains how the whole system works.

The memorable name Transmog came up while debugging ;-).
 
I've been working frantically on Voost (which, btw, has launched:  https://www.voo.st/) and getting a lot of practical experience with Ofy4.

Looks cool. But: The closest event to Germany is in Eyjafjallajökull, Iceland :-D.

Jeff Schnitzer

unread,
Mar 2, 2012, 8:12:19 PM3/2/12
to objectify...@googlegroups.com
On Fri, Mar 2, 2012 at 7:58 PM, Dominik Mayer <domini...@gmail.com> wrote:
I've been working frantically on Voost (which, btw, has launched:  https://www.voo.st/) and getting a lot of practical experience with Ofy4.

Looks cool. But: The closest event to Germany is in Eyjafjallajökull, Iceland :-D.

If you register for that event, make sure your life insurance is paid up ;-)

Jeff

Dominik Mayer

unread,
Mar 2, 2012, 8:12:53 PM3/2/12
to objectify...@googlegroups.com
:-D
--
Dominik Mayer

Dominik Mayer

unread,
Mar 3, 2012, 3:20:19 AM3/3/12
to objectify...@googlegroups.com
Look for this in the next couple weeks, it has become a pressing need of my own.  We should be able to say:

Key.create(thing);
Ref.create(thing);

I was just thinking: What would happen if "thing" does not yet have an Id? Would create() return null, throw an exception or save "thing" and return the key? In the latter case: What would happen if the Id is of type long/String and cannot be created?

Jeff Schnitzer

unread,
Mar 3, 2012, 9:15:58 AM3/3/12
to objectify...@googlegroups.com
It will throw an exception.  The same thing happens right now if you try Key.create(theClass, null).

Remember, there is no datastore access outside the context of an Objectify instance.  The Objectify instance controls transaction behavior, cache behavior, consistency behavior, etc.  There is no static context.

Jeff

Dominik Mayer

unread,
Mar 3, 2012, 10:47:13 AM3/3/12
to objectify...@googlegroups.com
I'm forgetting this all the time...

But I should be able to write my own Ofy method:

public <T extends Model> Ref<T> safeCreateRef(final T entity) {
    Ref<T> ref;
    try {
        ref = Ref.create(entity);
    } catch (IllegalArgumentException e) {
        save().entity(entity).now();
        ref = Ref.create(entity);
    }
    return ref;
}


Simon Knott

unread,
Mar 3, 2012, 2:47:09 PM3/3/12
to objectify...@googlegroups.com
Just checked out Voost and got the attached page when going to https://www.voo.st/clubs - thought you might like to know!

On Saturday, 3 March 2012 00:44:13 UTC, Jeff Schnitzer wrote:
voost.png

Jeff Schnitzer

unread,
Mar 3, 2012, 2:53:27 PM3/3/12
to objectify...@googlegroups.com
Oops!  thanks for the bug report :-)

It's some sort of geoip issue more sophisticated than the usual "no data" - I'll look into it today.  In the mean time you can click on the null, null and reset it to something else.

Jeff

Dominik Mayer

unread,
Mar 4, 2012, 9:02:52 PM3/4/12
to objectify...@googlegroups.com
Another thing about Objectify 4. I mentioned the trouble I had/have with the fact that Objectify is building entities on registration. My entities use a UserService that gets the current user which is an entity and needs the UserService... I get an endless loop if I inject either a UserService or an Ofy.

So I decided to use Guice Providers. Problem was: Whenever I got an Ofy from a provider it was a new one (which needed to be initialized). I then put the provider from Jeff's sample application into a new class that can be bound as singleton. Maybe this helps others with similar problems. (Or maybe someone knows a better solution.)

public class OfyProvider implements Provider<Ofy> {
    private OfyFactory ofyFactory;
    private Ofy ofy;

    @Inject
    public OfyProvider(final OfyFactory ofyFactory) {
        this.ofyFactory = ofyFactory;
    }

    @Override
    public Ofy get() {
        if (ofy == null)
            ofy = ofyFactory.begin();
        return ofy;
    }
}

And in the guice configure() method:

bind(Ofy.class).toProvider(OfyProvider.class).in(Singleton.class);

Matthew Jaggard

Mar 5, 2012, 1:36:09 AM
to objectify...@googlegroups.com

I think objectify with guice should be @RequestScoped so you get a new objectify per request.

Dominik Mayer

Mar 5, 2012, 6:50:53 AM
to objectify...@googlegroups.com

I think objectify with guice should be @RequestScoped so you get a new objectify per request.

Thanks, Matthew. I had deactivated @RequestScoped without fully understanding it because it was masking errors. Reading more Guice docs I found out that I can replace @RequestScoped with @Singleton and it does the same thing as my Provider.

But I should not share the same Ofy between requests because all instances would use the same session cache, right? So Request A and Request B would both get the exact same object and changes made by Request A would be reflected in the object of Request B. (From the old doc: "The session cache is not thread-safe. You should never share an Objectify instance between threads.")

Matthew Jaggard

Mar 5, 2012, 7:07:11 AM
to objectify...@googlegroups.com
That's correct. So you should be able to replace @Singleton with @RequestScoped now that you're using a Provider. The only difference is that on the first call to Provider.get in any request after the first one, @RequestScoped gives you a new object, whereas @Singleton gives you the re-used one.
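
The difference between the two scopes can be sketched in plain Java. This is only an illustration under stated assumptions: FakeSession, SingletonProvider and PerRequestProvider are hypothetical stand-ins invented for this example, not Guice or Objectify classes, and a request is modelled as a simple string id.

```java
import java.util.HashMap;
import java.util.Map;

class ScopeDemo {
    /** Hypothetical stand-in for an Ofy with a session cache (not the real class). */
    static class FakeSession {
        final Map<String, String> cache = new HashMap<>();
    }

    /** Singleton-like provider: lazily creates one session and reuses it for every caller. */
    static class SingletonProvider {
        private FakeSession session;

        FakeSession get() {
            if (session == null) {
                session = new FakeSession();
            }
            return session;
        }
    }

    /** Request-scoped-like provider: one lazily created session per request id. */
    static class PerRequestProvider {
        private final Map<String, FakeSession> sessions = new HashMap<>();

        FakeSession get(String requestId) {
            return sessions.computeIfAbsent(requestId, id -> new FakeSession());
        }
    }

    /** True if a later request sees what an earlier request cached (singleton case). */
    static boolean leaksWithSingleton() {
        SingletonProvider provider = new SingletonProvider();
        provider.get().cache.put("Person:peter", "modified by request A");
        // "Request B" asks the same provider and receives the same session.
        return provider.get().cache.containsKey("Person:peter");
    }

    /** True if request B sees what request A cached (per-request case). */
    static boolean leaksWithPerRequest() {
        PerRequestProvider provider = new PerRequestProvider();
        provider.get("A").cache.put("Person:peter", "modified by request A");
        return provider.get("B").cache.containsKey("Person:peter");
    }

    /** True if repeated lookups inside one request return the same session. */
    static boolean sameSessionWithinRequest() {
        PerRequestProvider provider = new PerRequestProvider();
        return provider.get("A") == provider.get("A");
    }

    public static void main(String[] args) {
        System.out.println("singleton leaks across requests: " + leaksWithSingleton());
        System.out.println("per-request leaks across requests: " + leaksWithPerRequest());
        System.out.println("same session within one request: " + sameSessionWithinRequest());
    }
}
```

In this toy model the singleton session carries cached state from one request into the next, while the per-request variant still hands back the same session for repeated lookups inside a single request.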

Dominik Mayer

Mar 5, 2012, 8:56:11 PM
to objectify...@googlegroups.com
How can I enable the session cache? The wiki says: "Session cache enabled by default" but the Javadoc contradicts it: "The default options are: [...] Do NOT use a session cache". ObjectifyImpl has a Session field.

Consider the following situation: "Entity" invokes a method that loads entities from the datastore, changes one setting and persists them. One of these entities is "Entity" itself. With a session cache this would be the exact same instance of "Entity" that has just been altered, right? So the changes would be reflected in "Entity". Right now I see the changes in the datastore, but then "Entity" is persisted and the changes are overwritten because "Entity" was still in the old state. So I suppose I'm not using any session cache? Or am I misunderstanding something?

Jeff Schnitzer

Mar 6, 2012, 9:23:23 AM
to objectify...@googlegroups.com
Objectify 4 has a session cache enabled by default.  The only place in the wiki I could find that text is the design doc for Ofy4 - is there somewhere else?  Note that it does not currently return the exact entity object instance - the cache only holds the raw data, which gets translated into a new object instance every load().  This is something that will change prior to a "real release"  -- after a lot of hands-on experience, plus a lot of discussion with Guido, we have decided that NDB and Objectify should both attempt to unify to single object instances in the session.
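
To illustrate the distinction between caching raw data and caching object instances, here is a minimal plain-Java sketch. RawDataSession and IdentitySession are hypothetical classes made up for this example and do not reflect how Objectify implements its session; the entity is reduced to a name and one boolean field.

```java
import java.util.HashMap;
import java.util.Map;

class SessionCacheDemo {
    /** Toy entity with a single mutable field. */
    static class Person {
        final String name;
        boolean livesWithPet;

        Person(String name, boolean livesWithPet) {
            this.name = name;
            this.livesWithPet = livesWithPet;
        }
    }

    /** Caches only the raw field value and builds a new Person on every load. */
    static class RawDataSession {
        private final Map<String, Boolean> raw = new HashMap<>();

        void save(Person p) {
            raw.put(p.name, p.livesWithPet);
        }

        // Assumes the person was saved before; a missing key would fail on unboxing.
        Person load(String name) {
            return new Person(name, raw.get(name));
        }
    }

    /** Caches the object instance itself, so every load returns the same object. */
    static class IdentitySession {
        private final Map<String, Person> instances = new HashMap<>();

        void save(Person p) {
            instances.put(p.name, p);
        }

        Person load(String name) {
            return instances.get(name);
        }
    }

    /** Two loads from the raw-data cache: are they the same instance? */
    static boolean rawLoadsAreSameInstance() {
        RawDataSession session = new RawDataSession();
        session.save(new Person("Peter", false));
        return session.load("Peter") == session.load("Peter");
    }

    /** Two loads from the instance cache: are they the same instance? */
    static boolean identityLoadsAreSameInstance() {
        IdentitySession session = new IdentitySession();
        session.save(new Person("Peter", false));
        return session.load("Peter") == session.load("Peter");
    }

    public static void main(String[] args) {
        System.out.println("raw-data cache, same instance: " + rawLoadsAreSameInstance());
        System.out.println("instance cache, same instance: " + identityLoadsAreSameInstance());
    }
}
```

With the raw-data variant, a field changed on one loaded object is invisible to any other loaded copy until it is saved and loaded again; with the instance variant there is only one object to change.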

Objectify 3 does not enable the session cache by default; you must turn it on.

What version of Ofy are you using?  Also:  It would help a lot if you posted a small snippet of code or pseudocode to describe what you are doing.  I'm having a hard time picturing it.

Jeff

Matthew Jaggard

Mar 6, 2012, 9:30:36 AM
to objectify...@googlegroups.com
Also, transactions affect the use of the session cache.

Dominik Mayer

Mar 6, 2012, 10:44:42 AM
to objectify...@googlegroups.com
Objectify 4 has a session cache enabled by default.  The only place in the wiki I could find that text is the design doc for Ofy4 - is there somewhere else?

 
Note that it does not currently return the exact entity object instance - the cache only holds the raw data, which gets translated into a new object instance every load().  This is something that will change prior to a "real release"  -- after a lot of hands-on experience, plus a lot of discussion with Guido, we have decided that NDB and Objectify should both attempt to unify to single object instances in the session.

That clarifies things. Thank you. 
 
What version of Ofy are you using?

Objectify 4, that's why I'm posting in this thread ;-). (To be exact: Objectify 4.0a3.)
 
Also:  It would help a lot if you posted a small snippet of code or pseudocode to describe what you are doing.  I'm having a hard time picturing it.

I made up a simple example just to demonstrate what I'm talking about:

@Entity
public class Address {

    [...]

    private boolean pet = false;

    // Also sets the fields of everyone living here.
    public void setPet(boolean pet) {

        this.pet = pet;
        List<Person> persons = ofy.load().type(Person.class).
                filter("address", this).list();

        for (Person p : persons) {
            p.setPet(pet);
            ofy.save(p);
        }
    }
}

@Entity
public class Person {

    [...]

    @Index private Address address;

    private boolean livesWithPet;

    /*package*/ void setPet(boolean pet) {
        livesWithPet = pet;
    }

    public void persist() {

        address.persist();
        ofy.save(this);
    }
}

On the client (using RequestFactory), the person (including the address) is loaded and some fields are changed. Then the pet (in the address) is set to "true" and the whole thing is persisted: 

request.persist().using(person).fire(someReceiver);

Now on the server the following happens:
  1. RequestFactory loads the Person ("Peter", for now). "livesWithPet" is "false".
  2. RequestFactory sets the pet in the address.
  3. The Address loads all the persons living at that address (Peter, Paul and Mary), sets the boolean to "true" and saves. Peter now lives with a pet. This can be verified in the Datastore Viewer.
  4. RequestFactory persists Peter (instance of step 1) but because in that instance "livesWithPet" is still "false", the field in the Datastore ("true") is overwritten with "false".
So my question was whether I understand that correctly: Once the session cache returns the exact same object, the change to "livesWithPet" would be reflected in every instance of "Peter" (provided by an Ofy) so it would be "true" in step four.
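
The four steps above can be replayed in a small plain-Java simulation. This is a sketch under assumptions, not Objectify code: ToySession is a hypothetical class whose "unify" flag decides whether load() hands back one shared instance per entity, and the datastore is just an in-memory map holding the livesWithPet value.

```java
import java.util.HashMap;
import java.util.Map;

class LostUpdateDemo {
    /** Toy entity standing in for the Person from the example. */
    static class Person {
        final String name;
        boolean livesWithPet;

        Person(String name, boolean livesWithPet) {
            this.name = name;
            this.livesWithPet = livesWithPet;
        }
    }

    /** Hypothetical session over an in-memory "datastore". */
    static class ToySession {
        private final Map<String, Boolean> datastore = new HashMap<>();
        private final Map<String, Person> instances = new HashMap<>();
        private final boolean unify;

        ToySession(boolean unify) {
            this.unify = unify;
        }

        Person load(String name) {
            if (unify && instances.containsKey(name)) {
                return instances.get(name); // same object as every earlier load
            }
            Person p = new Person(name, datastore.getOrDefault(name, false));
            if (unify) {
                instances.put(name, p);
            }
            return p;
        }

        void save(Person p) {
            datastore.put(p.name, p.livesWithPet);
        }

        boolean stored(String name) {
            return datastore.getOrDefault(name, false);
        }
    }

    /** Runs steps 1 to 4 and returns the value left in the datastore for Peter. */
    static boolean finalStoredValue(boolean unify) {
        ToySession session = new ToySession(unify);

        // Step 1: the framework loads Peter; livesWithPet is false.
        Person peter = session.load("Peter");

        // Steps 2 and 3: the address reloads its residents, sets the flag and saves.
        Person resident = session.load("Peter");
        resident.livesWithPet = true;
        session.save(resident);

        // Step 4: the instance from step 1 is persisted.
        session.save(peter);
        return session.stored("Peter");
    }

    public static void main(String[] args) {
        System.out.println("new instance per load, stored value: " + finalStoredValue(false));
        System.out.println("unified instances, stored value: " + finalStoredValue(true));
    }
}
```

In this model the separate-instance session ends with "false" in the datastore because step 4 writes the stale copy, while the unifying session ends with "true" because steps 1 and 3 operate on the same object.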