SS ORM and Database ideas.

46 views
Skip to first unread message

John Milmine

unread,
Feb 16, 2011, 11:13:19 PM2/16/11
to SilverStripe Core Development
Attended the meetup last night, thanks all.

Just some thoughts I'd had around database and ORM.

1) Query cache vs Object cache.
Currently SS uses a Query cache which is useful for multiple queries
doing the same query however not so much if you grab 10 objects and
then do a get_by_id. The object is already loaded, but since the
object dsoes't know it, it has to go back to the database to re-
receive the data.

One of the ideas, in a framework I've worked on was an ObjectMap which
is essentially a singleton, with a array of all objects loaded from
the database, keyed by id. This way if an object is already loaded you
don't have to access the database again.

The framework also stored IDs of objects which refer to it as a
foreign key. That way if someone did $car->wheels() the object could
check if all the ids it knew about were already loaded in the
ObjectMap and not touch the db. This however does have performance
issues, and requires parent objects to ALWAYS know about child objects
ids. So can be a bit tricky to implement.

2) Connection Pool
I mentioned this last night. On the same framework, objects statically
knew which database they were associated with. There was also a global
pool on connections strings for each database, that way an object
could connect to the database it was in.

3) After a first attempt of database intregration which worked OK We
re-thought how we would like to write the query and decided that
rather than SQL directly in code, we opted for chain-able queries
(similar to jQuery). We tried to limit the amount of strings and opted
for Objects and statics as much as possible. It also abstracts the
queries enough to remove database specific syntax. We then used a
series of adapters to turn the objects from expressions objects into
expressions the database type would understand.

$prototypeMember = new Member(null, true);

$query->select('ID')->select($prototypeMember, '*')->where('Phone',
Querty::EQUALS, 'asd')->orderby($prototypeMember, 'Firstname');

OR:

$query->select(
$prototypeMember, '*'
)->when(
new WhenExpression('Number', Querty::GT, 6),
new ThenExpression('Order', 2),
new ElseExpression('Order', 1)
)->join(
$prototypeMember,
new Role(null, true),
new Expression('ID', Querty::EQUALS, 'MemberID')
)->groupBy($prototypeMember, 'ID'
)->groupBy($prototypeMember, 'FirstName');

$results = $query->getObjects();

Modelling the way SQL is constructed into objects is difficult,
however we did have over 40 units test working which enabled us to
model queries independent of database type. This experiment was never
finished, although our first attempt allowed us to query MySQL and
MSSQL via PDO with transation support with no change in API. However
the first attempt had a custom query api for those queries that were
too hard to model. The second attempt was to try remove that need.
That however may be a un-realistic goal.

Our over-riding vision was trying to make an API that would adapt a
non-object based database into an Object base query structure.

If you would like to look at the objects created to model sql,
@spronkey will be making them available within the next 7 days. It may
or may not help, and at the very least may just be another way to look
at it and give you some ideas.

4) I mentioned removing MySQL from core in the meeting. Sam mentioned
you want out of the box functionality. I don't see this as a problem,
just because it's not in core, doesn't mean it's not part of the
download package?

Thoughts all?

Marcus Nyeholt

unread,
Feb 16, 2011, 11:45:09 PM2/16/11
to silverst...@googlegroups.com
I'm very keen to move to a situation where DB specific querying is abstracted behind some kind of a query object pattern. Even still, as you mentioned there's still going to be a small percentage of cases where this can't be abstracted and need SQL queries to be performed, but to have them done in a way that maintains DB agnosticism. One way I've seen of addressing this is to borrow from i18n and use a translation style approach to things - something like

_sql('QueryStrangely', 'SELECT count(hats), HoursPlayed FROM TF2 WHERE Player = ?', $arg1);

If an SQL translation has been provided for this query for the currently used DB engine, it can then be used. Parameterising queries makes this much easier to work with (and it's much better to be using prepared statements anyway right?). 

Marcus



--
You received this message because you are subscribed to the Google Groups "SilverStripe Core Development" group.
To post to this group, send email to silverst...@googlegroups.com.
To unsubscribe from this group, send email to silverstripe-d...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/silverstripe-dev?hl=en.


Dan Rye

unread,
Feb 17, 2011, 12:12:05 AM2/17/11
to silverst...@googlegroups.com
All great thoughts!  Sorry I couldn't make the trip, maybe next time.  I'm sure this isn't news, but I thought I'd bring up the pain I have using MSSQL for several months now on a large project.

I've had to tweak many modules with ` in them to use \" instead.  There have been a few other issues, but that is the most prominente one.  I've tried to stay on top of submitting patches to the maintainers, but will be happy to submit pull request if/when they move over to github.

-Dan

Marcus Nyeholt

unread,
Feb 17, 2011, 12:30:46 AM2/17/11
to silverst...@googlegroups.com
Things like field name quoting and value escaping should be handled lower down in the DB abstraction layer - especially when pitching the CMS as being simple enough for a designer to do some basic code things. So whatever changes are made should help this particular issue moving forwards!

Marcus

Artyom

unread,
Feb 17, 2011, 2:53:31 PM2/17/11
to silverst...@googlegroups.com

I also second that translation is a better approach.  I've written systems like what you describe (in java) and ultimately found it to be funny, but clunky.  I like the way the java query language works in J2EE.. it abstracts enough, and also allows the names of class types at the code level.  I don't see much disadvantage after years w it.  Thatnks for the direction .  The issue is quiyte valid. : - )

On Feb 16, 2011 9:30 PM, "Marcus Nyeholt" <nye...@gmail.com> wrote:

Things like field name quoting and value escaping should be handled lower down in the DB abstraction layer - especially when pitching the CMS as being simple enough for a designer to do some basic code things. So whatever changes are made should help this particular issue moving forwards!

Marcus



On Thu, Feb 17, 2011 at 4:12 PM, Dan Rye <dan...@gmail.com> wrote:
>

> All great thoughts!  Sorry...

--

You received this message because you are subscribed to the Google Groups "SilverStripe Core Develop...

John Milmine

unread,
Feb 17, 2011, 3:20:53 PM2/17/11
to SilverStripe Core Development
Hi all,

Marcus, totally agree that quoting should be done further down,
possibly handled by database adapters as each db may have it's own
style or strings vs table names vs column names etc.

Prepared statement is a good target to aim for, although I think
there's a bunch of work to be done, before that can happen.

The other thing which I didn't mention is pagination. Part of the
query design needs to take that into account. DB's have quite
different ways of doing this and returning things like total row
count. The framework I referred to have pagination working for both
MySQL and MSSQL however they were implemented in quite different ways.

The translation idea is a new one for me, so can't really comment as
I've never seen it in action, however if it works as well as you say
in J22 Artyom, then definately worth investigating.

My concern with translation is that it's still essentially SQL. You
still need to know SQL (of some type) in order to write it. In the
ideal world we'd have an Object Database, however a decent one of
those is not happening any time soon. It would be cool if you could
use different types of data sources like XML etc, however I think that
may be a bridge too far currently.

I quite like the idea of using the translation approach for custom
queries only, advanced SQL users can use it, but most users (and most
internal workings) use object queries.

John

Artyom

unread,
Feb 17, 2011, 5:35:05 PM2/17/11
to silverst...@googlegroups.com
Sorry, I meant "fun, but clunky" in my post.

Also, if anyone is curious about what I mean in J2EE, then search for "JQL"... it's basically a kind of SQL-like syntax that allows you to specify things in terms of the object names in the actually code, like "Employee", which would correspond to a DataObject at the php level... and secondarily, it also abstracts away differences in DB implementations, SQL dialects etc.

(Our current ORM relates to the php classes in a similar way actually)

I agree that people shouldn't be *required* to use it for simple things, as they don't need to now in many cases.  For that we should continue to offer simplified alternatives, like get_one etc... 


--

Marcus Nyeholt

unread,
Feb 17, 2011, 6:32:28 PM2/17/11
to silverst...@googlegroups.com
Yep, my suggestion for the _sql() translation type thing would be only to handle those 'custom' queries that are rarely needed - the 95% case would be to use query abstraction objects of some sort. Whether this would use a custom query language I'm not sure - personally, I'd rather stick with plain old SQL, a custom query language just adds complexity and runtime overhead. 




--

Sam Minnée

unread,
Feb 20, 2011, 6:48:43 PM2/20/11
to silverst...@googlegroups.com
I agree. If we added any non-database language, it would imply that any non-SQL data object systems would need to support it, which would be a royal pain in the arse.

The approach that I had been working on (more detailed write-up and code-so-far coming this week) was an map syntax where you could provide a number of different filters that were ANDed together, and a more verbose syntax for OR, since in practise it's used less frequently.

spronk

unread,
Feb 21, 2011, 5:09:20 PM2/21/11
to SilverStripe Core Development
There are a few things I've found thus far working looking through
various implementations and libraries, and while working with John
(above) on building a couple of these libraries (albeit from a
reasonably naive viewpoint):

1. Non-SQL query languages just add another layer of complexity for
the end user, and ultimately require a massive amount of work to
create and maintain if you want anything near the level of power SQL
affords.
2. Base-case SQL langauges in abstraction layers such as Pear::MDB2
don't provide enough power when you need to actually get your webapp
performing
3. The differences in 'everyday' functions (such as MySQL's LIMIT,
counting rows returned in results, utilising indices etc) between
RDBMSs is massive (can be good and bad)
4. Database optimisations are extremely difficult to perform without
reasonably intimate access to the SQL engine itself. Not a big issue
for CMS-only where you can use caching, but for any dynamic app that
requires constant queries this can make a massive difference.
5. You have to be very careful with assumptions about what is used
commonly and what isn't. In a CMS context, OR is a fringe case - but
if you have shallow inheritance hierarchies in your object layer, OR
becomes your best friend. Likewise, CASE doesn't get much love in
newbie tutorials, but made the biggest difference this side of complex
indexes in speeding up one of the webapps we built.

I also don't think you have to *sacrifice* good syntax for less
commonly used features. Look at SQL itself - the syntax for even
reasonably complex queries is just as straightforward as it is for
simple ones.

I'm a bit swamped at the moment but will hopefully get some time soon
to put up the code I wrote a long time back with John. It's coupled
closely to the framework that we wrote and it has some obvious
performance issues, so should very much be treated as "ideas only".

The slightly-further-down-the-line streamlined version of the example
query code John posted above is as follows. Internally, the query
creates a set of abstracted SQL objects (such as SelectQuery,
WhereExpressionSet, WhereExpression etc), but allows for a very tight
syntax for nesting logic. The downside is that it doesn't quite match
the grammar of SQL (sorta Japanese vs. English type stuff).

// in the following example, $post->query() uses func_get_args() to
allow arbitrary argument counts, where the like(), and gt() methods
each return an IExpression implementor, as do the and() and or()
methods - with inner expressions wrapped in an IExpressionSet
implementor for nesting
$filter = $post->filter();
$post->select(
$post->and(
$filter->or(
$filter->like('title', '%keyword%'), // returns a new Like
object and adds it to the set
$filter->like('description', '%keyword%')
), // returns the set with two like objects and OR operator/
concatenator
$filter->and( // the and is redundant here
$filter->gt('creationDate', DATE_NOW)
), // returns the set with one filter
$filter->gt('modificationDate', DATE_YESTERDAY), // works outside
a set.
)
)->orderBy(
'property DESC'
); // returns the filter AND concatenator and two nested sets, and a
filter applied

// outputs something along these lines
// SELECT * FROM posts WHERE ((title LIKE %keywords% OR description
LIKE %keywords%) AND (creationDate > '2009-11-16 12:02:33') AND
modificationDate > '2009-11-15 00:00:00'

spronk

unread,
Mar 1, 2011, 4:35:13 PM3/1/11
to SilverStripe Core Development
Damn earthquakes.

Unfortunately the quake has, for the short term at least, wiped out
our source repositories, and our backups (they'll be fine, just
cordoned!), so I haven't been able to put up any source for this.

John Milmine

unread,
Mar 16, 2011, 9:17:01 PM3/16/11
to SilverStripe Core Development
Sam still looking forward to seeing this map idea you've been working
on. Although I do have some reservations, with or being more verbose
than and. Typically in a CMS situation most of the time you do use
AND. However working on some web apps, I've found that OR is used a
lot more and also as spronk mentioned CASE comes in very handy for
increasing performance.

I heard you got that server back now spronk?

keith <spronk>

unread,
Apr 27, 2011, 11:30:09 PM4/27/11
to SilverStripe Core Development
Just to follow this up, I think a lot has been superceded already etc,
but for reference:

Username: guestaccess
Password: guest55

http://dev.tko.co.nz/svn/FoxFramework/trunk/framework/fox/sql/

This is the bulk of the code for the SQL library John and myself
built. It's pretty hacked together and is more proof of concept than
anything particularly S.O.L.I.D. There are also a bunch of unit tests
for this stuff but I don't have time to move them somewhere
permissible.

The idea is you build SQL queries with objects, then render those
objects into SQL strings based on specific drivers later on. We had
planned on implementing a fluent interface similar to that I posted
above on top of this once we had a decent object model down pat, but
development stopped long before that point.

There's a lot of coupling and heavily shortcut oo design in there, but
it could be useful.

Keith
Reply all
Reply to author
Forward
0 new messages