New ORM: backward compatibility issues

63 views
Skip to first unread message

Sam Minnée

unread,
Apr 4, 2011, 5:15:48 PM4/4/11
to SilverStripe Core Development
Hi everyone,

The new ORM has a few issues that break backward compatibility in ways
that aren't easily fixed:

- DataObject::get() doesn't return null if there are no records,
because the result is lazy loaded. You'd need to call
DataObject::get()->count() or DataObject::get()->empty()

- merge(), insertFirst(), shift(), replace(), push() don't work,
because the list of records is defined by a query.

- Pagination is an optional GoF Decorator (not an Extension); this
functionality isn't included by default.

My question is - how much of an issue is this, how much effort should
we put into maintaining compatibility with these methods?

There are a few options.

1. DataObject::get() returns a traditional DataObjectSet, where .
DataList::create() will access the new ORM. The disadvantage of this
approach is that it won't be consistently - it won't apply to relation
methods and other places where you get DataObjects without explicitly
calling DataObject::get().

2. Use dependency injection to replace all DataLists, with something
like LegacyHybridList, that acts as a DataList until merge() is called
for the first time, at which point the DataList is "realised" and the
merge() function called.

3. Provide an easy hack - for example, new
DataObjectSet(DataObject::get('SiteTree')) - that will turn a DataList
into a traditional DataObjectSet that you can use merge(), etc, on.

What do people think?

Sam Minnée

unread,
Apr 4, 2011, 6:15:43 PM4/4/11
to SilverStripe Core Development
Further to this...

In another post we've recommended renaming DataObjectSet to SS_List,
as a base generic list class. It's not clear, however, exactly what
SS_List should be:

- An abstract base class of all lists, implementing ArrayAccess,
IteratorAggregate, Countable, with no actual implementation itself.
(Presumably, getIterator and count() would be left abstract, an the
ArrayAccess methods have a default implementation with the use of
getRange()).

- A specific implementation of a list based on an internal array.
(like DataObjectSet is)

If SS_List were an abstract base class then we would need to create a
subclass of this called SS_ArrayList, SS_RealList, SS_SimpleList, or
something else, that implemented the current DataObjectSet functions.

The advantage of that is that we wouldn't need to stub out merge(),
insertFirst(), etc, on DataList, and we could instead leave them out
of the SS_List implementation. Instead, we would provide those
methods *only* on the SS_ArrayList class.

I recommend that we make SS_List an abstract base class, as described,
but I'd like to get others' feedback first.

Denis 'Alpheus' Čahuk

unread,
Apr 5, 2011, 5:07:04 AM4/5/11
to SilverStripe Core Development
Are there any plans to still have a Set-like atlernative to SS_List
(think Java here)? There are some problems that might surface on
Component relations of DataObjects, considering how the intent of
DataObjectSet was slightly misused in previous SS versions and will
probably be clarified with SS_List. However, storing a DataObject's
many-many relations in an SS_List might imply "order" (read: indexed)
on this list. Does this mean that by using a DataObjectManager you
would always need to make an object's *:many relation sortable?

To point #1 on the OP, is it necessary to change the original API? Old
vendors will still function properly with DataObject::get(), and all
post 3.0-era API's will use DataList::create or some other factory or
DI-approach to create a new data holder, or request data.

Sam Minnée

unread,
Apr 5, 2011, 5:57:22 AM4/5/11
to SilverStripe Core Development
> Are there any plans to still have a Set-like atlernative to SS_List
> (think Java here)?

The short answer is "no, not yet" although now would be an appropriate
time to consider it.

> There are some problems that might surface on
> Component relations of DataObjects, considering how the intent of
> DataObjectSet was slightly misused in previous SS versions and will
> probably be clarified with SS_List. However, storing a DataObject's
> many-many relations in an SS_List might imply "order" (read: indexed)
> on this list. Does this mean that by using a DataObjectManager you
> would always need to make an object's *:many relation sortable?

The main thing is that requests for data (via DataObject::get or
DataList::create) always have an implicit order, for example by
default_sort. It seems like the only advantage of allowing for a
SS_Set would be to allow backend implementations that didn't support
sorting, however it seems unlikely that such a backend implementation
would be of much use in a real-world application, so most back-ends
would need to support both SS_Set and SS_List.

It doesn't really seem like there's much benefit in supporting a
separate Set class; it would add to confusion when developer tried
putting SS_Sets into templates, and I don't see what operations would
be easier if we had a SS_Set distinct from SS_List.

Did you care to elaborate?

> To point #1 on the OP, is it necessary to change the original API? Old
> vendors will still function properly with DataObject::get(), and all
> post 3.0-era API's will use DataList::create or some other factory or
> DI-approach to create a new data holder, or request data.

My only concern is that this approach would mean that SilverStripe 3
developers would *have* to switch DataObject::get() in order to get
the full benefit of the new ORM. I have a feeling that relying on
DataObject::get() returning null is relatively rare.

Which is why I think that an optional backward-compatibility mode is
going to be the best approach here.

Denis 'Alpheus' Čahuk

unread,
Apr 5, 2011, 6:07:02 AM4/5/11
to SilverStripe Core Development
On Apr 5, 11:57 am, Sam Minnée <sam.min...@gmail.com> wrote:
> > Are there any plans to still have a Set-like atlernative to SS_List
> > (think Java here)?
>
> The short answer is "no, not yet" although now would be an appropriate
> time to consider it.
>
> > There are some problems that might surface on
> > Component relations of DataObjects, considering how the intent of
> > DataObjectSet was slightly misused in previous SS versions and will
> > probably be clarified with SS_List. However, storing a DataObject's
> > many-many relations in an SS_List might imply "order" (read: indexed)
> > on this list. Does this mean that by using a DataObjectManager you
> > would always need to make an object's *:many relation sortable?
>
> The main thing is that requests for data (via DataObject::get or
> DataList::create) always have an implicit order, for example by
> default_sort.  It seems like the only advantage of allowing for a
> SS_Set would be to allow backend implementations that didn't support
> sorting, however it seems unlikely that such a backend implementation
> would be of much use in a real-world application, so most back-ends
> would need to support both SS_Set and SS_List.
>
> It doesn't really seem like there's much benefit in supporting a
> separate Set class; it would add to confusion when developer tried
> putting SS_Sets into templates, and I don't see what operations would
> be easier if we had a SS_Set distinct from SS_List.
>
> Did you care to elaborate?

The main benefit I see here is with optimizing queries for inclusion-
checks for relations between entities, e.g. membership in a group,
people in a prize drawing, many-many relations to categories (a News'
categories do not need order), etc. Such queries can be of course done
the same way with a List implementation, but it would allow developers
to specify their intent and tag collections whose order may not be
relied upon.

Ingo Schommer

unread,
Apr 5, 2011, 4:24:02 PM4/5/11
to silverst...@googlegroups.com

On Tuesday, April 5, 2011 9:15:48 AM UTC+12, Sam Minnée wrote:

 1. DataObject::get() returns a traditional DataObjectSet, where .
DataList::create() will access the new ORM.  The disadvantage of this
approach is that it won't be consistently - it won't apply to relation
methods and other places where you get DataObjects without explicitly
calling DataObject::get().
Good point with implicit calls. We should try to avoid getting into situations like in 2.x,
where you sometimes get an empty ComponentSet vs. null/DataObject.
I know that I was the one to raise this issue originally, but now vote for 
changing the meaning of DataObject::get() to return a DataList.

 2. Use dependency injection to replace all DataLists, with something
like LegacyHybridList, that acts as a DataList until merge() is called
for the first time, at which point the DataList is "realised" and the
merge() function called.

 3. Provide an easy hack - for example, new
DataObjectSet(DataObject::get('SiteTree')) - that will turn a DataList
into a traditional DataObjectSet that you can use merge(), etc, on.
I assume that should read "DataObjectSet(DataList::create('SiteTree'))", right? 

On Tuesday, April 5, 2011 10:15:43 AM UTC+12, Sam Minnée wrote:
Further to this... 



 - An abstract base class of all lists, implementing ArrayAccess, 
IteratorAggregate, Countable, with no actual implementation itself. 
(Presumably, getIterator and count() would be left abstract, an the 
ArrayAccess methods have a default implementation with the use of 
getRange()). 

 - A specific implementation of a list based on an internal array. 
(like DataObjectSet is) 

If SS_List were an abstract base class then we would need to create a 
subclass of this called SS_ArrayList, SS_RealList, SS_SimpleList, or 
something else, that implemented the current DataObjectSet functions. 

The advantage of that is that we wouldn't need to stub out merge(), 
insertFirst(), etc, on DataList, and we could instead leave them out 
of the SS_List implementation.  Instead, we would provide those 
methods *only* on the SS_ArrayList class. 
+1 - lets keep it simple for now. I don't think we'll need to make $dataList1->merge($dataList2)
work for now if we allow $dataList1->toArray()->merge($dataList2->toArray()).

Sam Minnée

unread,
Apr 5, 2011, 8:07:24 PM4/5/11
to silverst...@googlegroups.com

> Good point with implicit calls. We should try to avoid getting into situations like in 2.x,
> where you sometimes get an empty ComponentSet vs. null/DataObject.
> I know that I was the one to raise this issue originally, but now vote for
> changing the meaning of DataObject::get() to return a DataList.

OK, cool.

> 3. Provide an easy hack - for example, new
> DataObjectSet(DataObject::get('SiteTree')) - that will turn a DataList
> into a traditional DataObjectSet that you can use merge(), etc, on.
> I assume that should read "DataObjectSet(DataList::create('SiteTree'))", right?

Well, DataObject::get('SiteTree') and DataList::create('SiteTree') do the same thing, given the above recommendation. But, do you think we should deprecate DataObject::get()?

Speaking of which, do we want to further refine the creation of DataLists?

We've seen these two:
- DataObject::get('SiteTree')
- DataList::create('SiteTree')

But we could make it a function (rather than a static method) for brevity:
- data('SiteTree')

We could also make it a method available on critical objects (to stop us from accessing a global)
- $this->data('SiteTree')

Or we could make it an object that had one property for each data type. This would make it easy to use DI to set $this->data.

- $this->data->SiteTree

Of all of those, I quite like the final one, but it's quite a shift. Here's an example:

function MenuItems() {
return $this->data->SiteTree->filter(array('ParentID' => 0, 'ShowInMenus' => true));
}

> +1 - lets keep it simple for now. I don't think we'll need to make $dataList1->merge($dataList2)
> work for now if we allow $dataList1->toArray()->merge($dataList2->toArray()).

It would probably be toArraylist() (because toArray() would return an actual array).

So, SS_ArrayList is a name that makes sense to everyone?

Marcus Nyeholt

unread,
Apr 5, 2011, 10:46:19 PM4/5/11
to silverst...@googlegroups.com
Without having had a chance to go through things underneath, my one main suggestion to this thread was to be using a property (eg ->data or ->dataLayer) to access things, so that it could be injected appropriately. 

The idea of a fluent interface to things is interesting - 

$this->data->list->SiteTree->filter(array('Field' => 'value:BeginsWith')->order(array('Created' => 'desc', 'ID' => 'desc'));

or

$this->data->Page->ID = 4;

To maintain backwards compatibility, static methods could still be available, but underneath utilise this type of fluidity. 

Cheers,

Marcus



--
You received this message because you are subscribed to the Google Groups "SilverStripe Core Development" group.
To post to this group, send email to silverst...@googlegroups.com.
To unsubscribe from this group, send email to silverstripe-d...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/silverstripe-dev?hl=en.


Sam Minnée

unread,
Apr 5, 2011, 11:38:30 PM4/5/11
to silverst...@googlegroups.com
> To maintain backwards compatibility, static methods could still be available, but underneath utilise this type of fluidity.

Yeah, I think that there would still need to be a global data access layer, injected where appropriate.

In effect, these two statements might be equivalent:

model()->SiteTree->filter("URLSegment", "about-us");
$this->model->SiteTree->filter("URLSegment", "about-us");

The object returned by model() / $this->model of these would represent the data model in its entirety, rather than a single DataObject class.

The simplest implementation of this object would be something like this:

class DataModel {
function __call($func, $args) {
$this->modelByName($func);
}

function __get($property) {
$this->modelByName($proprety);
}

function modelByName($name) {
return DataList::create($name);
}
}

Ingo Schommer

unread,
Apr 6, 2011, 5:52:25 AM4/6/11
to silverst...@googlegroups.com
I think we need to make this decision in context of PHP 5.3, which will at some point become a minimum requirement.
I don't want to go off-topic, but did we make a firm decision on that already or is it worth starting a new thread?

Specifically, late static binding could make for a lot less typing:
DataList::create('Member') becomes Member::all()
DataList::create('Member')->byId(5) becomes Member::by_id(5)
DataList::create('Member')->find('Name', 'Elmo') becomes Member::find('Name', 'Elmo') or even Member::find_by_name('Elmo') 

Yes, it looks an awful lot like Rails, but thats not a bad thing per se, right? :)

For a PHP 5.2 compat solution: I like the idea of making the model access injectable via $this->data,
but struggle to come up with a use case where you'd want to inject a different datamodel wrapper globally
(as opposed to using a different mapper based on model) - do you guys have an example?
Both $this->model and $this->data are very common property names, which isn't ideal,
and can lead to weird situations when redefining them unknowingly.

Definetly prefer it over global methods - yes, it is shorter, but it sends the wrong message for such
an integral part of an OOP framework.

Ingo

Sam Minnée

unread,
Apr 6, 2011, 6:28:12 AM4/6/11
to silverst...@googlegroups.com
On 6 April 2011 21:52, Ingo Schommer <in...@silverstripe.com> wrote:
I think we need to make this decision in context of PHP 5.3, which will at some point become a minimum requirement.
I don't want to go off-topic, but did we make a firm decision on that already or is it worth starting a new thread?

The decision we previously made on 5.3 was "not yet", and I think that still holds.  I we could potentially add 5.3-only features to the API that you could optionally use, but we couldn't use them in the CMS.
 
Specifically, late static binding could make for a lot less typing:
DataList::create('Member') becomes Member::all()
DataList::create('Member')->byId(5) becomes Member::by_id(5)
DataList::create('Member')->find('Name', 'Elmo') becomes Member::find('Name', 'Elmo') or even Member::find_by_name('Elmo') 

Yes, it looks an awful lot like Rails, but thats not a bad thing per se, right? :)

Looking like Rails isn't a bad thing, but all those static methods are... ;-)  Static methods share almost all the problems of functions.

I *do*, however, like the idea of being able to define custom filter methods like byName()...

Using the property name "find" rather than "model" gives you a fluenty kind of syntax, although I think the 'punctucation volume'

$this->find->SiteTree->byID(5);
$this->find->SiteTree->withURLSegment('about-us');
 
For a PHP 5.2 compat solution: I like the idea of making the model access injectable via $this->data,
but struggle to come up with a use case where you'd want to inject a different datamodel wrapper globally
(as opposed to using a different mapper based on model) - do you guys have an example?

 - It means that you can inject the full data model as a whole, rather than needing to inject each model class separately. 
 - If you consider switching to stage/live or changing translation as modifications to the model as a whole, these settings could be stored against the DataModel object and applied to each query that you make on it.
 - If you're switching to a test database you'd want to replace the datamodel globally (admittedly to one that runs the same queries on a different database, but that's beside the point)
 - If we're moving away from statics to instance data, we need to keep the list of all models *somewhere*.
 
Both $this->model and $this->data are very common property names, which isn't ideal,
and can lead to weird situations when redefining them unknowingly.

DI doesn't inject by property name, it injects by injection point, so people could use a different name if needs be.  But we should still have something that works most of the time.
But, remember that if we define it as an integral part of the ORM it's okay to expect people not to use it for their own devices - get() would be a common static method name if it wasn't already taken. ;-)
 
Definetly prefer it over global methods - yes, it is shorter, but it sends the wrong message for such
an integral part of an OOP framework.

Yeah I mostly agree, although I feel the same about static methods. ;-) 

Ingo Schommer

unread,
Apr 6, 2011, 7:15:05 AM4/6/11
to silverst...@googlegroups.com
On 6/04/2011, at 10:28 PM, Sam Minnée wrote:

On 6 April 2011 21:52, Ingo Schommer <in...@silverstripe.com> wrote:
I think we need to make this decision in context of PHP 5.3, which will at some point become a minimum requirement.
I don't want to go off-topic, but did we make a firm decision on that already or is it worth starting a new thread?

The decision we previously made on 5.3 was "not yet", and I think that still holds.  I we could potentially add 5.3-only features to the API that you could optionally use, but we couldn't use them in the CMS

Looking like Rails isn't a bad thing, but all those static methods are... ;-)  Static methods share almost all the problems of functions.
You can still inject the implementation with static getters though, right?
You'd lose the ability to customize injection points based on the class context,
but can't think of any use case for this, either (your examples below are all pointing towards "global injection").
Regardless of PHP5.3 in SS3 or not: Both Rails and Django use static entry points for this, I think its the most expressive of all presented options

I *do*, however, like the idea of being able to define custom filter methods like byName()...

Using the property name "find" rather than "model" gives you a fluenty kind of syntax, although I think the 'punctucation volume'

$this->find->SiteTree->byID(5);
$this->find->SiteTree->withURLSegment('about-us');
Heh, very neat. I think its problematic to tie it to a changeable instance variable though,
DataList::create('SiteTree')->withURLSegment(...) isn't as expressive.
I prefer findBy*() over with*(). For the sake of consistency, I'd also prefer findByID() over byID().
We could then also have the current filter() method renamed to find()?
Its semantically a bit odd when chained though:
$this->model->SiteTree->findByName('Elmo')->sort('Name', 'ASC')->findByLastname('Foo') vs.
$this->model->SiteTree->filterByName('Elmo')->sort('Name', 'ASC')->filterByLastname('Foo')
 
For a PHP 5.2 compat solution: I like the idea of making the model access injectable via $this->data,
but struggle to come up with a use case where you'd want to inject a different datamodel wrapper globally
(as opposed to using a different mapper based on model) - do you guys have an example?

 - It means that you can inject the full data model as a whole, rather than needing to inject each model class separately. 
I see per-model (not per-class-context) injection as desireable though - e.g., how would you see support for certain models in legacy databases (with different database drivers) otherwise?



---
Ingo Schommer | Senior Developer

Denis 'Alpheus' Čahuk

unread,
Apr 6, 2011, 9:30:53 AM4/6/11
to SilverStripe Core Development


On Apr 6, 12:28 pm, Sam Minnée <s...@silverstripe.com> wrote:
>  - If we're moving away from statics to instance data, we need to keep the
> list of all models *somewhere*.

That "*somewhere*" can be where ever you first created the instance
data (or model), be it on the Director-equivalent or at the routing/
dispatching level (just before giving processing over to Controllers/
RequestHandlers).

Considering no such "area" exists in Silverstripe right now due to
static access, it will have to be created. A ZF-esque approach might
be to use some kind of registry or injection container, e.g. assume an
Injector (think Guice here) is created in index.php which just makes a
$injector->getInstance('SSApplication') call. An implementation for
the said 'SS\Application' interface has an injection point for a
*model* or some other form of datalist factory which is created and
provided by the injector (naively assuming outside configuration here,
YAML, ini, etc.).

If you feel too of-topic about this, assume that the Director is an
implementation of the SSApplication interface.

Sam Minnée

unread,
Apr 6, 2011, 6:54:22 PM4/6/11
to silverst...@googlegroups.com

> Regardless of PHP5.3 in SS3 or not: Both Rails and Django use static entry points for this, I think its the most expressive of all presented options
> (see http://docs.djangoproject.com/en/1.3/topics/db/models/ and http://guides.rubyonrails.org/active_record_querying.html).

Yeah, I guess you're right. I might be getting a little carried away in pursuit of semantic purity... I think it's probably best to actually wait until someone (me?) has succeeded in implementing a project or 2 without reference to statics before pitching it as the Next Big Thing. I am keen to explore it, however.


> DataList::create('SiteTree')->withURLSegment(...) isn't as expressive.
> I prefer findBy*() over with*(). For the sake of consistency, I'd also prefer findByID() over byID().
> We could then also have the current filter() method renamed to find()?

Yeah, maybe. To be honest I don't really have strong opinions and I'd be happy to go with consensus on the exact naming.

However, it should be clear whether something is returning a DataList or a DataObject.


> I see per-model (not per-class-context) injection as desireable though - e.g., how would you see support for certain models in legacy databases (with different database drivers) otherwise?

Not *exactly* sure, but it would come down to how you construct the DataModel object. Or, more to the point, how you specify its construction in the DI config.

For example:

$model = new DataModel(array(
"Member" => "DataList('Member', array('database' => 'shared_auth_db')"),
"Group" => "DataList('Group', array('database' => 'shared_auth_db')"),
"Permission" => "DataList('Permission', array('database' => 'shared_auth_db')"),
));

It's an open question as to whether the arguments of this array should be a kind of parameter list or object construction strings.

For example, you could do this instead, which is simpler but less flexible.

$model = new DataModel(array(
"Member" => array('database' => 'shared_auth_db'),
"Group" => array('database' => 'shared_auth_db'),
"Permission" => array('database' => 'shared_auth_db'),
));

But it does open the door for a YAML config like this:

DataModel :
Member:
database: shared_auth_db
Group:
database: shared_auth_db
Permission:
database: shared_auth_db

Reply all
Reply to author
Forward
0 new messages