The "perfect" ORM?

Hal Fulton

Oct 25, 2005, 8:11:39 PM

For many weeks I have had this at the back of my mind.

I want a really good ORM that is highly non-intrusive
(e.g., I don't have to inherit and I don't have to
clutter my classes and objects with metadata).

Someone told me that this was much the way Hibernate
works in Java. I can't comment.

So anyway, this is one of my highest priorities -- to
make an ORM (that works the way I like) to wrap
KirbyBase. (With additional code, it should/could wrap
any other db, of course.)

Logan's KirbyRecord was a step in the right direction,
it seemed to me. But it was like ActiveRecord -- it
forced you to store your metadata in your own objects.

Og is cool, but is even more intrusive.

In case anyone is morbidly curious, I want this so that
I can continue work on Tycho without getting bogged
down in data storage issues. (Someone teased me at the
conference that a year had gone by, and Tycho still
hadn't progressed any. And he wonders how that spider
got in his salad.)

Anyhow: Who knows Hibernate and can comment on its
usefulness as a model?

And who if anyone is interested in working on this
project?

Thanks,
Hal

James Britt

Oct 25, 2005, 9:40:06 PM

Hal Fulton wrote:
..

>
> Logan's KirbyRecord was a step in the right direction,
> it seemed to me. But it was like ActiveRecord -- it
> forced you to store your metadata in your own objects.
>
> Og is cool, but is even more intrusive.

What do you mean by 'metadata'? The data types of your object's state?

James
--

http://www.ruby-doc.org - The Ruby Documentation Site
http://www.rubyxml.com - News, Articles, and Listings for Ruby & XML
http://www.rubystuff.com - The Ruby Store for Ruby Stuff
http://www.jamesbritt.com - Playing with Better Toys

Lyndon Samson

Oct 25, 2005, 9:57:00 PM

Ideally you'd be more generic than an ORM.

You'd start with objects.

1. Have some method of querying them (and implicit in this is some
   concept of unique ID/PK).
2. Have some method of storing/retrieving them.
3. Have some form of collection which can contain the full object set
   or a subset.

You could go overboard by being very abstract for 1 & 2, which would
allow you to utilise the storage engine's query facility (map to SQL,
for example), or stay very low-level and rubyesque (blocks as filters on
properties etc).
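
A minimal sketch of the three points above, using the low-level "blocks as filters" style. Everything here (the ObjectStore class, its method names, the Person struct) is made up for illustration and assumes a plain in-memory hash as the backend; it is not taken from any existing library.

```ruby
# Hypothetical in-memory store; class and method names are illustrative only.
class ObjectStore
  include Enumerable

  def initialize
    @objects = {}   # unique id => stored object
    @next_id = 0
  end

  # Point 2: store an object and return the unique id assigned to it.
  def store(obj)
    @next_id += 1
    @objects[@next_id] = obj
    @next_id
  end

  # Point 2: retrieve an object by id (nil if the id is unknown).
  def fetch(id)
    @objects[id]
  end

  # Point 3: the store is itself a collection; Enumerable then supplies
  # select, find, map and friends, which covers point 1 with plain blocks.
  def each(&block)
    @objects.each_value(&block)
  end
end

Person = Struct.new(:name, :age)

store = ObjectStore.new
id = store.store(Person.new("Ann", 41))
store.store(Person.new("Bob", 17))

adults = store.select { |p| p.age >= 18 }   # a block used as the filter
puts adults.map(&:name).inspect
puts store.fetch(id).name
```

A more abstract query layer could replace the block with something that translates to SQL, but that is a separate design decision from the storage interface itself.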

I don't think you should assume a RDBMS as the persistence mechanism.

Also, db4o is worth looking at, it uses QBE.

cheers
lyndon

ES

Oct 25, 2005, 10:12:29 PM

I greatly prefer code defining the database (Og)
versus having to deal with the DB directly (AR).

Using ruby's capabilities it would be fairly easy to create
a completely implicit ORM (obj.instance_variables etc.) even
without subclassing --this is not really a problem. The
problem is the corner cases where, for example, an instance
variable contains transient data which is not useful (or even
counterproductive) when stored so typically some way of either
opting in or out is desirable to have. The other issue may be
managing relationships between the stored objects.
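
As a rough sketch of the "completely implicit" idea with an opt-out for transient data: the helper name to_record, its transient: keyword, and the Session class are all assumptions made for illustration. Only instance_variables and instance_variable_get are standard Ruby.

```ruby
# Hypothetical helper: turn any object into a hash of its instance variables,
# skipping the ones listed in `transient`. No base class or mixin is required.
def to_record(obj, transient: [])
  obj.instance_variables.each_with_object({}) do |ivar, record|
    name = ivar.to_s.sub(/\A@/, "").to_sym   # :@user -> :user
    next if transient.include?(name)         # opt out of storing this one
    record[name] = obj.instance_variable_get(ivar)
  end
end

# An ordinary class that knows nothing about persistence.
class Session
  def initialize(user, token)
    @user = user
    @token = token
    @socket = Object.new   # transient state that is pointless to store
  end
end

record = to_record(Session.new("hal", "abc123"), transient: [:socket])
puts record.inspect
```

This handles the first corner case only; relationships between stored objects would still need some way of saying which instance variables refer to other stored objects.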

That said, I would be happy to contribute what I can
towards such an interface.

> Thanks,
> Hal

E

Kev Jackson

Oct 25, 2005, 10:19:42 PM

>
> Anyhow: Who knows Hibernate and can comment on its
> usefulness as a model?

I've spent a fair bit of time with Hibernate and I can safely say that
it is not the "ruby way" (even from the little experience I have with ruby)

Here's some basic example code for you to look at anyhoo.

Java class:

class Cat {

    private long id;
    private String name;

    public long getId() {
        return this.id;
    }

    /** Note that this is the id field that Hibernate uses; it should not
        be directly settable by external clients, hence private. */
    private void setId(long id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}

Now you have a simple JavaBean style of class, the only thing that
Hibernate imposes here is that it seems to be easier to use id values
that are auto-generated (long maps to number(19) in Oracle etc). You
can use Strings (auto generated hex values) or assign the id/primary key
yourself. Best practice in the Hibernate community is to let the
database auto-generate where possible and to always use surrogate keys.

So yes in the raw Java code for the model, Hibernate does not
interfere. However at this point, you still need to map the JavaBean to
the database table. This is done with a (verbose) xml mapping file. As
these are such a pain to write, most people use XDoclet to generate the
mapping automatically. For XDoclet to do this, you have to sprinkle
attributes into your Java code like fairy dust. So the code would
really look like...

/**
 * @hibernate.class
 * table="cat"
 */
class Cat {

    private long id;
    private String name;

    /**
     * @hibernate.id
     * column="cat_id"
     * generator-class="sequence"
     */
    public long getId() {
        return this.id;
    }

    /** Note that this is the id field that Hibernate uses; it should not
        be directly settable by external clients, hence private. */
    private void setId(long id) {
        this.id = id;
    }

    /**
     * @hibernate.property
     * column="name"
     */
    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}

In Java5, the introduction of annotations allows these special
@whatevers to be placed outside of comments. Hibernate3 supports both
styles (in comment and true annotations). If you can say that these
Attribute/Annotations don't couple themselves to your model code, then
yes, the assertion that Hibernate is unobtrusive is true. On the other
hand, actually keeping the metadata in a separate file (xml mapping in
Hibernate's case) means that the turn-around on a change is fairly
significant. Trust me, coding up a Hibernate app without using Ant +
XDoclet is an exercise in pain; even with Ant + XDoclet, the change
code->deploy cycle is still a drag.

There are some cool parts of Hibernate; being completely flexible on how
you configure every aspect is probably the most 'enterprise' feature of
it. It allows it to be used much more easily with legacy data
(composite business keys, weird table structures etc).

Right now, I'd say that Hibernate3 + Spring + J2EE services are very
useful when you have to actually build an enterprise application (access
n datastores of various forms across different locations etc). But
ActiveRecord (and the rest of Rails+Ruby) is so much more efficient (in
terms of coding time) at getting to 80% of the functionality of
Hibernate3/EJB3 that it doesn't make sense to use Hibernate in all cases.

Erm, wandered off the point there a little.

Summary
Hibernate is very good at allowing you to specify everything, but you
pay the price with overly complex and verbose configuration files that
*must* be in sync with your model code for the application to work -
this synchronization issue is the Achilles heel of Hibernate in my
experience - I've wasted too much time when the server has cached an old
mapping file instead of deploying the new one.

Kev

Trans

Oct 25, 2005, 10:43:05 PM

I think Og has a very good _outward_ design. I'm trying to get George
to trim it up and get rid of the intrusiveness. If this were done I
think Og would be pretty close to perfect.

T.

Jim Weirich

Oct 25, 2005, 11:13:38 PM

On Tuesday 25 October 2005 08:11 pm, Hal Fulton wrote:
> For many weeks I have had this at the back of my mind.
>
> I want a really good ORM that is highly non-intrusive
> (e.g., I don't have to inherit and I don't have to
> clutter my classes and objects with metadata).
>
> Someone told me that this was much the way Hibernate
> works in Java.

Hahahaha ... oh, that's a good one :)

Actually, hibernate is very flexible, but you do end up specifying a lot of
metadata either through annotations, comments (as another poster
demonstrated), or via XML files. Much more intrusive than, say,
ActiveRecord.

--
-- Jim Weirich j...@weirichhouse.org http://onestepback.org
-----------------------------------------------------------------
"Beware of bugs in the above code; I have only proved it correct,
not tried it." -- Donald Knuth (in a memo to Peter van Emde Boas)

Duane Johnson

Oct 25, 2005, 11:18:11 PM

I've been looking into the various options available to us Ruby
developers as well. A coworker (Troy Heninger) and I are looking at
implementing a "knowledge base" or ODBMS, but I haven't worked out
the details yet. Troy is much more knowledgeable in this area.

If you're interested, the 3 interesting packages I've found so far are:

Purple - http://purple.rubyforge.org/
DyBase - http://www.garret.ru/~knizhnik/dybase.html
Madeleine - http://madeleine.sourceforge.net/

Duane Johnson
(canadaduane)

Kev Jackson

Oct 26, 2005, 12:22:48 AM

Jim Weirich wrote:

>On Tuesday 25 October 2005 08:11 pm, Hal Fulton wrote:
>
>
>>For many weeks I have had this at the back of my mind.
>>
>>I want a really good ORM that is highly non-intrusive
>>(e.g., I don't have to inherit and I don't have to
>>clutter my classes and objects with metadata).
>>
>>

Actually re-thinking this again,

If you don't want to inherit from a base class (a la ActiveRecord),
could you build some kind of Dependency Injection to use ActiveRecord
without having to inherit?

Kev

Kirk Haines

Oct 26, 2005, 12:52:48 AM

On Tuesday 25 October 2005 6:11 pm, Hal Fulton wrote:
> For many weeks I have had this at the back of my mind.
>
> I want a really good ORM that is highly non-intrusive
> (e.g., I don't have to inherit and I don't have to
> clutter my classes and objects with metadata).

orm = KSDatabase.new('kirbybase:///var/db/msds',:pollute => true)

sodiums = Chemicals.chemical_name.like('sodium')
chlorides = Chemicals.chemical_name.like('chloride')
sodium_chlorides = sodiums & chlorides

or

sodium_chlorides = Chemicals.select do |c|
  (c.chemical_name =~ 'sodium') & (c.chemical_name =~ 'chloride')
end

Chemicals.new({:chemical_name => 'potassium chloride'})

schools_with_nacl = orm.select(:Schools, :Inventories, :Chemicals) do |s,i,c|
  (c.chemical_name == 'sodium chloride') &
  (i.chemical_idx == c.idx) &
  (i.school_idx == s.idx)
end

If there is sufficient metadata in KirbyBase to identify relationships between tables (such as can be done in some dbs with foreign key constraints and the like):

schools_with_nacl = Chemicals.chemical_name.is('sodium chloride').inventories.collect {|i| i.school}.uniq

Otherwise one would have to specifically declare relationships:

Chemicals.to_many(:relationship => :inventories, :foreign_table => :inventories, :foreign_key => :chemical_idx)
Inventories.to_one(:relationship => :school, :foreign_table => :school, :local_key => :school_idx)

Which, using just a smidge of convention over configuration logic, could be written as simply as:

Chemicals.to_many(:relationship => :inventories)
Inventories.to_one(:relationship => :school)

Now, to be honest, you can't do this, quite like this, yet. As Kansas works right now, database connection is uglier, there is no adaptor for KirbyBase, relationships must be manually declared either with that mechanism or with a class declaration, and a few other things. However, a couple days ago I started gutting parts of Kansas to modularize them, clean up interfaces, sniff metadata and act on it better, and make it easy to do things like write adaptors to non-SQL data sources like KirbyBase, or to give better optimization of generated queries on databases that allow it, such as by using hinting with Oracle queries, for example.

The above examples come directly from my current plan of how I want the library to work, based on my needs and the input that I have gotten from others. It's completely subject to change from internal or external influence at this point, as I'm still working on the modularization of the query generation/db interface code. The motivation for this, quite honestly, is so that I can have an adaptor to KirbyBase or even directly to a directory of CSV files which can be treated as a database of tables, or to other non-db data sources.
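
For readers wondering how a block such as `(c.chemical_name =~ 'sodium') & (c.chemical_name =~ 'chloride')` can be turned into a query at all, here is one possible mechanism, sketched against an in-memory array of hashes. This is not Kansas code: RowProxy, Column, Condition and Table are hypothetical names, and treating =~ as a substring test is an assumption made only for this example.

```ruby
# A condition wraps a test on one row and can be combined with `&`.
class Condition
  def initialize(&test)
    @test = test
  end

  def matches?(row)
    @test.call(row)
  end

  # Both conditions must hold for the combined condition to match.
  def &(other)
    Condition.new { |row| matches?(row) && other.matches?(row) }
  end
end

# A column reference whose operators build Conditions instead of booleans.
class Column
  def initialize(name)
    @name = name
  end

  def ==(value)
    Condition.new { |row| row[@name] == value }
  end

  def =~(fragment)
    Condition.new { |row| row[@name].to_s.include?(fragment) }
  end
end

# The object yielded to the block: any zero-argument call becomes a Column.
class RowProxy
  def method_missing(name, *args)
    args.empty? ? Column.new(name) : super
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

class Table
  def initialize(rows)
    @rows = rows
  end

  # Run the block once to capture the condition, then apply it to each row.
  def select
    condition = yield RowProxy.new
    @rows.select { |row| condition.matches?(row) }
  end
end

chemicals = Table.new([
  { chemical_name: "sodium chloride" },
  { chemical_name: "potassium chloride" },
  { chemical_name: "sodium hydroxide" }
])

found = chemicals.select do |c|
  (c.chemical_name =~ "sodium") & (c.chemical_name =~ "chloride")
end
puts found.inspect
```

Because the block only runs once and returns a description of the test, an adaptor could in principle translate that description into SQL or a KirbyBase query instead of filtering in Ruby.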

Kirk Haines

Alexandru Popescu

Oct 26, 2005, 8:25:13 AM

#: Hal Fulton changed the world a bit at a time by saying on 10/26/2005 2:11 AM :#

Hi Hal!

I've been working with Hibernate for quite a while, and imo it takes the right approach to the
so-called object-relational mismatch.

The really good thing about this approach is that it is not obtrusive in any way on your domain
model objects, and it lets you focus and work only in the object world.

On the dark side of the problem: you have to provide, in some way, the mapping between the object
world and the relational world. While a few things could be simplified a little (like automatic
type conversions), the big problem is that this simplified form cannot be used for relationships.
If parametrized types had been implemented without the erasure mechanism, then this simplification
could be taken further, but for the moment we have to use some other way to describe relations,
and this is where metadata comes into play. A few different approaches are used: metadata through
external XML, metadata through javadoc comments, and lately metadata through annotations.
There have been long discussions about the benefits and pitfalls of each of these approaches, so I
will not enter that discussion here.

hth,

/alex
--
w( the_mindstorm )p.

Bob Hutchison

Oct 26, 2005, 9:11:05 AM

On Oct 25, 2005, at 8:11 PM, Hal Fulton wrote:

> For many weeks I have had this at the back of my mind.
>
> I want a really good ORM that is highly non-intrusive
> (e.g., I don't have to inherit and I don't have to
> clutter my classes and objects with metadata).
>

Do you want an ORM? Or do you want a way to persist classes in a
non-intrusive way? Be careful what you ask for :-)

I am putting the final touches on a project
(<http://rubyforge.org/projects/xampl/>) that I've been working on for a while now. It has
its roots in a Java tool that I've been working on since 1998 or so
and that has been used in eight or nine quite large commercial
products (500k to 2000k lines of code). There is a Common Lisp
version as well. I am working through a small but relatively complex
example (in Ruby) just to make sure I've not missed anything that
Ruby needs (and a good thing I did too).

This will address some of the issues you raise and that you've
mentioned in previous posts.

It is unobtrusive as long as you play along. Persistence is only one
of the goals of the tool, it is also trying to provide a useful
framework for projects that use it.

There is more information on my weblog in the ruby category
<http://recursive.ca/hutch/index.php?cat=16>, and a few additional articles in
the xampl category talking about the Java or CL version, if you are
curious. The articles mostly talk about xampl as an XML binding tool
-- which it also does do.

Right now, xampl is targeted at new code. Fitting it into existing
code can be done but requires some familiarity with the tool, and
there is no guarantee that it would be all that useful in the end.

>
> Someone told me that this was much the way Hibernate
> works in Java. I can't comment.
>
> So anyway, this is one of my highest priorities -- to
> make an ORM (that works the way I like) to wrap
> KirbyBase. (With additional code, it should/could wrap
> any other db, of course.)
>
> Logan's KirbyRecord was a step in the right direction,
> it seemed to me. But it was like ActiveRecord -- it
> forced you to store your metadata in your own objects.
>
> Og is cool, but is even more intrusive.
>
> In case anyone is morbidly curious, I want this so that
> I can continue work on Tycho without getting bogged
> down in data storage issues. (Someone teased me at the
> conference that a year had gone by, and Tycho still
> hadn't progressed any. And he wonders how that spider
> got in his salad.)
>
> Anyhow: Who knows Hibernate and can comment on its
> usefulness as a model?
>
> And who if anyone is interested in working on this
> project?
>
>
> Thanks,
> Hal

----
Bob Hutchison -- blogs at <http://www.recursive.ca/hutch/>
Recursive Design Inc. -- <http://www.recursive.ca/>
Raconteur -- <http://www.raconteur.info/>

Bob Hutchison

Oct 26, 2005, 11:06:56 AM

On Oct 25, 2005, at 11:18 PM, Duane Johnson wrote:

> DyBase - http://www.garret.ru/~knizhnik/dybase.html
>

This guy's stuff is very good. I've not used dybase, but I have used
several of his other tools (have a look around his site, it is
amazing what this one guy has done).

Alexander Lamb

Oct 26, 2005, 11:27:08 AM

Why didn't anyone mention Cayenne? As a comparison it seems closer to
Ruby than Hibernate. It is a sort of advanced clone of what was EOF:
http://www.objectstyle.org/cayenne/

--
Alexander Lamb
Service d'Informatique Médicale
Hôpitaux Universitaires de Genève
Alexande...@sim.hcuge.ch
+41 22 372 88 62
+41 79 420 79 73

rubyh...@gmail.com

Oct 26, 2005, 3:00:18 PM

James Britt wrote:
>
> What do you mean by 'metadata'? The data types of your object's state?
>

Basically, yes.

Hal

rubyh...@gmail.com

Oct 26, 2005, 3:05:44 PM

Lyndon Samson wrote:
> Ideally you'd be more generic than an ORM.

True, but I'd want to walk before I run, so to speak.

> 1.Have some method of querying them ( and implicit in this is some
> concept of unique ID/PK ).
> 2.Have some method of storing/retrieving them.
> 3.Have some form of collection which can contain the full object set
> or a subset.

That's a good analysis.

> You could go overboard by being very abstract for 1&2 which would
> allow you to utilise the storage engines query facility ( map to SQL
> for example), or very low-level rubyesque ( blocks for filters on
> properties etc ).

Mmm, again I wouldn't get too fancy in the early iterations.

> I don't think you should assume a RDBMS as the persistence mechanism.

Maybe not, but I'd want to start with some kind of backend that already
worked, as opposed to creating both a backend and a frontend.

> Also, db4o is worth looking at, it uses QBE.

I've never heard of it, but I will Google.

Thanks,
Hal

rubyh...@gmail.com

Oct 26, 2005, 3:11:08 PM

ES wrote:
>
> Using ruby's capabilities it would be fairly easy to create
> a completely implicit ORM (obj.instance_variables etc.) even
> without subclassing --this is not really a problem. The
> problem is the corner cases where, for example, an instance
> variable contains transient data which is not useful (or even
> counterproductive) when stored so typically some way of either
> opting in or out is desirable to have. The other issue may be
> managing relationships between the stored objects.
>

Yes. The latter problem is worse than the former, I think. And
there are likely others I haven't seen yet.

KirbyBase has the concept of "calculated fields" which might
help relieve us of storing certain fields.

Thanks,
Hal

rubyh...@gmail.com

Oct 26, 2005, 3:24:20 PM

Kev Jackson wrote:
> >
> I've spent a fair bit of time with Hibernate and I can safely say that
> it is not the "ruby way" (even from the little experience I have with ruby)

Thanks for this lengthy and informative post.

There's no question that a "clone" of Hibernate is wrong for Ruby.
But my understanding is that it at least keeps its hands off the
user classes (to some extent).

> Now you have a simple JavaBean style of class, the only thing that
> Hibernate imposes here is that it seems to be easier to use id values
> that are auto-generated (long maps to number(19) in Oracle etc). You
> can use Strings (auto generated hex values) or assign the id/primary key
> yourself. Best practice in the Hibernate community is to let the
> database auto-generate where possible and to always use surrogate keys.

Logan Capaldo and I talked about this. I'm still uncomfortable with it,
but will probably come around. I'd still like to code it the "other" way
first -- manually assigned primary keys. The process of doing that will
clarify it in my mind, and I will be able to see the pros/cons better.

> So yes in the raw Java code for the model, Hibernate does not
> interfere. However at this point, you still need to map the JavaBean to
> the database table. This is done with a (verbose) xml mapping file. As
> these are such a pain to write, most people use XDoclet to generate the
> mapping automatically. For XDoclet to do this, you have to sprinkle
> attributes into your Java code like fairy dust. So the code would
> really look like...

I can tell you right now that whatever solution I end up using will not
employ XML or any XDoclet equivalent, and I hope to avoid fairy dust as
much as possible, as I am allergic to it. I'd like to centralize the
typing/mapping info as much as I can, and likewise use reflection as
much as I can.

> In Java5, the introduction of annotations allows these special
> @whatevers to be placed outside of comments. Hibernate3 supports both
> styles (in comment and true annotations). If you can say that these
> Attribute/Annotations don't couple themselves to your model code, then
> yes, the assertion that Hibernate is unobtrusive is true. On the other
> hand, actually keeping the metadata in a separate file (xml mapping in
> Hibernate's case), means that the turn-around on a change is fairly
> significant. Trust me, coding up a Hibernate app without using Ant +
> XDoclet is an exercise in pain, even with Ant + XDoclet, the change
> code->deploy is still a drag.

The idea of rampant annotation is repugnant to me. So is the concept
of storing such info in a separate file.

Basically, I want the mapping info to be part of my code, but not part
of the objects I want to serialize. And I want it to be fairly
centralized rather than scattered here and there.
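
To make that goal concrete, here is a small sketch of mapping information that lives in Ruby code but outside the stored class. The Mapping module, its map/lookup/row_for methods, and the Book class are invented for this illustration; the field types are just symbols and nothing here talks to KirbyBase.

```ruby
# Hypothetical central registry: one place that knows how classes are stored.
module Mapping
  @tables = {}

  # Declare how a class maps to a table and which fields get stored.
  def self.map(klass, table:, fields:)
    @tables[klass] = { table: table, fields: fields }
  end

  # Look up the mapping for a class (raises KeyError if none was declared).
  def self.lookup(klass)
    @tables.fetch(klass)
  end

  # Build a row via reflection by calling each mapped field's reader.
  def self.row_for(obj)
    lookup(obj.class)[:fields].keys.each_with_object({}) do |field, row|
      row[field] = obj.public_send(field)
    end
  end
end

# A plain class: no base class, no mixin, no storage metadata inside it.
class Book
  attr_reader :title, :pages

  def initialize(title, pages)
    @title = title
    @pages = pages
  end
end

# All mapping info sits here, in code, but separate from Book itself.
Mapping.map(Book, table: :books, fields: { title: :String, pages: :Integer })

row = Mapping.row_for(Book.new("Example", 100))
puts row.inspect
```

The trade-off is the usual one: the class stays clean, but the registry has to be kept in step with it by hand whenever an attribute is added or renamed.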

> There are some cool parts of Hibernate, being completely flexible on how
> you configure every aspect is probably the most 'enterprise' feature of
> it. It allows it to be used so much more easily with legacy data
> (composite business keys, weird table structures etc).

I guess I'm more concerned with legacy objects than legacy tables. I'm
one of those who would like to build database tables from object defs,
rather than the other way around.

> Hibernate is very good at allowing you to specify everything, but you
> pay the price with overly complex and verbose configuration files that
> *must* be in sync with your model code for the application to work -
> this synchronization issue is the Achilles heel of Hibernate in my
> experience - I've wasted too much time when the server has cached an old
> mapping file instead of deploying the new one.

As I said, I'd want this mapping info stored *in* my code, but not
scattered through it, and not in my stored classes. Hopefully that is
a viable world view.

Thanks much,
Hal

rubyh...@gmail.com

Oct 26, 2005, 3:26:36 PM

Jim Weirich wrote:
> >
> > I want a really good ORM that is highly non-intrusive
> > (e.g., I don't have to inherit and I don't have to
> > clutter my classes and objects with metadata).
> >
> > Someone told me that this was much the way Hibernate
> > works in Java.
>
> Hahahaha ... oh, that's a good one :)
>
> Actually, hibernate is very flexible, but you do end up specifying a lot of
> metadata either through annotations, comments (as another poster
> demonstrated), or via XML files. Much more intrusive than, say,
> ActiveRecord.

Very discouraging, Jim. :) Read my post from five mins ago and
see if it seems on-target...

Thanks,
Hal

rubyh...@gmail.com

Oct 26, 2005, 3:29:22 PM

Kev Jackson wrote:
> Actually re-thinking this again,
>
> If you don't want to inherit from a base class (a la ActiveRecord),
> could you build some kind of Dependency Injection to use ActiveRecord
> without having to inherit?

Welll... I don't really understand DI (sorry, Jamis) and I am not
totally thrilled with AR.

I'm going to write my own solution. It may be harder than I think,
and my goals may not be fully reachable. But even if I fail, it will
be a learning experience.

Hal

rubyh...@gmail.com

Oct 26, 2005, 3:32:10 PM

Duane Johnson wrote:
> I've been looking into the various options available to us Ruby
> developers as well. A coworker (Troy Heninger) and I are looking at
> implementing a "knowledge base" or ODBMS, but I haven't worked out
> the details yet. Troy is much more knowledgeable in this area.

The classic ODBMS has some implementation and usage problems. It's
a problem always worth revisiting, though.

Let's stay in touch on this.

> If you're interested, the 3 interesting packages I've found so far are:
>
> Purple - http://purple.rubyforge.org/
> DyBase - http://www.garret.ru/~knizhnik/dybase.html
> Madeleine - http://madeleine.sourceforge.net/

I looked at Madeleine and it seemed very restrictive to me. The others
I've never heard of. (Gosh, more reading...)

Thanks,
Hal

rubyh...@gmail.com

Oct 26, 2005, 3:33:59 PM

Alexander Lamb wrote:
> Why didn't anyone mention Cayenne. As a comparison it seems closer to
> Ruby than Hibernate. It is a sort of advanced clone of what was EOF:
> http://www.objectstyle.org/cayenne/

I've never heard of it, but I will add it to my
mountain^H^H^H^H^H^H^H^H list of things to evaluate.

Thanks,
Hal

rubyh...@gmail.com

Oct 26, 2005, 3:43:53 PM

Kirk Haines wrote:

[snip snip snip]

> The above examples come directly from my current plan of how I want
> the library to work, based on my needs and the input that I have gotten
> from others. It's completely subject to change from internal or
> external influence at this point, as I'm still working on the
> modularization of the query generation/db interface code. The
> motivation for this, quite honestly, is so that I can have an adaptor
> to KirbyBase or even directly to a directory of CSV files which can be
> treated as a database of tables, or to other non-db data sources.

Hmm. I think the idea of multiple adaptors is great, and everybody should
consider doing something similar. Your code as shown above, though, seems
a tiny bit too low-level to me.

Of course, there's a good chance that what I'm trying to implement will
simply blow up in my face (figuratively speaking). If so, I may be very
happy with a lower-level solution.

Thanks,
Hal

rubyh...@gmail.com

Oct 26, 2005, 3:51:06 PM

Alexandru Popescu wrote:
> I've been working with Hibernate for quite a while and imo it is correctly approaching so called
> object - relational mismatch.
>
> The real good thing about this approach is that it is not obtrusive in any ways with your domain
> model objects and it lets you focus and work only on the objectual world.

That sounds good so far.

> On the dark side of the problem: you should provide in some way the mapping between the object world
> and the relational world.

Yes, that is a necessary evil. Naturally I want the toolkit to be as smart
as possible, and its usage to be as painless as possible.

> While there are a few things that could be a little simplified (like
> automatic type conversions), the big problem is the impossibility to use this simplified form on
> relationships. If the parametrized types would have been implemented without the erasure mechanism
> then this simplification could be brought further, but for the moment we have to use some other way
> to describe relations: and here comes into play the metadata.

That sounds interesting, but I did not understand any of it. :) I am not
sure what you mean by parametrized types, or what an erasure mechanism is.

> There are a few different approaches
> used: metadata through external XML, metadata through javadoc comments and lately metadata through
> annotations.

All of these seem wrong to me. My approach will be: Metadata through
Ruby code external to the stored objects.

I will try to give an example in a week or so.

Thanks,
Hal

rubyh...@gmail.com

Oct 26, 2005, 4:01:24 PM

Bob Hutchison wrote:
>
> Do you want an ORM? or do you want a way to persist classes in a
> non-intrusive way? Be careful what you ask for :-)

Haha... I really want the latter, but I am considering implementing
it using the former. ;)

> I am putting the final touches on a project
> (<http://rubyforge.org/projects/xampl/>) that I've been working on for a while now. It has
> its roots in a Java tool that I've been working on since 1998 or so
> and that has been used in eight or nine quite large commercial
> products (500k to 2000k lines of code). There is a Common Lisp
> version as well. I am working through a small but relatively complex
> example (in Ruby) just to make sure I've not missed anything that
> Ruby needs (and a good thing I did too).

Great, I will look that over as soon as I find time.

> It is unobtrusive as long as you play along. Persistence is only one
> of the goals of the tool, it is also trying to provide a useful
> framework for projects that use it.

"Playing along" seems reasonable, for appropriate values of "playing
along." (Another fine tautology from Hal.)

Hmm, what would the framework do besides provide persistence?

> There is more information on my weblog in the ruby category
> <http://recursive.ca/hutch/index.php?cat=16>, a few additional articles in
> the xampl category talking about the Java or CL version, if you are
> curious. The articles mostly talk about xampl as an XML binding tool
> -- which it also does do.

XML! Back, thou fiend, back! /me makes sign of cross

> Right now, xampl is targeted at new code. Fitting it into existing
> code can be done but requires some familiarity with the tool, and
> there is no guarantee that it would be all that useful in the end.
>

A lot of things are like that. I hope to avoid a little of that if
I can.

Thanks,
Hal

Alexandru Popescu

Oct 26, 2005, 5:14:22 PM

#: rubyh...@gmail.com changed the world a bit at a time by saying on 10/26/2005 9:52 PM :#

> Alexandru Popescu wrote:
>> I've been working with Hibernate for quite a while and imo it is correctly approaching so called
>> object - relational mismatch.
>>
>> The real good thing about this approach is that it is not obtrusive in any ways with your domain
>> model objects and it lets you focus and work only on the objectual world.
>
> That sounds good so far.
>
>> On the dark side of the problem: you should provide in some way the mapping between the object world
>> and the relational world.
>
> Yes, that is a necessary evil. Naturally I want the toolkit to be as smart
> as possible, and its usage to be painless as possible.
>
>> While there are a few things that could be a little simplified (like
>> automatic type conversions), the big problem is the impossibility to use this simplified form on
>> relationships. If the parametrized types would have been implemented without the erasure mechanism
>> then this simplification could be brought further, but for the moment we have to use some other way
>> to describe relations: and here comes into play the metadata.
>
> That sounds interesting, but I did not understand any of it. :) I am not
> sure what you mean by parametrized types, or what an erasure mechanism is.
>

He he... no problem. Probably I can help here.

Consider a small example: Foo has a 1 - N relation with Bar. In your object world, assuming
that you would like/need to have navigation both ways, you would have

class Foo {
    List<Bar> myBars;
}

class Bar {
    Foo myParentFoo;
}

List<Bar> is a parametrized type; List<Bar> is a collection whose elements are Bar-s. With this
strongly typed, you could do some magic and not have to describe the relation between Foo and Bar
through metadata. Unfortunately the Java compiler removes the Bar part, so you are left with
an untyped collection => there is no way to know that Foo has some relation to Bar.

>> There are a few different approaches
>> used: metadata through external XML, metadata through javadoc comments and lately metadata through
>> annotations.
>
> All of these seem wrong to me. My approach will be: Metadata through
> Ruby code external to the stored objects.
>

Whether you accept it or not, this is still metadata, and the format is more or less equally
verbose (at least to me). I would probably agree that this may make sense in Ruby, but doing it
in Java would be plain wrong ;-).

/alex
--
w( the_mindstorm )p.

> I will try to give an example in a week or so.
>
>
> Thanks,
> Hal
>
>
>

zimbatm

Oct 26, 2005, 5:37:55 PM

Hi guys,

class Person
  attr_reader :name, :surname, :day_of_birth

  def initialize(name, surname, day_of_birth)
    @name, @surname, @day_of_birth = name, surname, day_of_birth
  end

  # Only the name can be changed after construction.
  def name=(name)
    @name = name
  end
end

Take this simple class.

Many implementations exist to describe relations, data types, ... but
nothing forbids you from separating the metadata description into another
file, even though it is still declared in the class.

I'm really bad at writing English explanations, so I hope you got it.
Unlike Java, Ruby allows you to reopen and extend classes. I don't say
this because I think you don't know it, but because it has not come up
in this particular discussion. I think it's an interesting approach at
first sight.

Cheers,
zimba-tm

Adam Van Den Hoven

Oct 26, 2005, 5:37:52 PM

I'm not sure if this is relevant but in my opinion the perfect ORM is
no ORM.

Why do we think we need ORM? Why do we build wonderful things like
Rails?

Because our applications have objects and we need to persist those
objects in a useful way. We want to be able to find those objects
and change them and save them.

Personally, I'm willing to sacrifice a LOT to get really simple
object persistence. Then again I'm not writing applications that need
to handle tens of thousands of object finds every second.

I would love to be able to do something like:

class Person < ActiveObject::Base
  field :last_name, String
  field :first_name, String
  has_many :aliases, String
  transient :some_transient_value
  belongs_to :team
  has_many_ordered :roles

  # methods and the like
end

This would be enough to create a basic object, its schema, and
everything else. We might also do:

field :last_name, String, validate_max_length( 55 ),
validate_min_length( 10 )

After creating your class you would have to run some sort of
generation script to create your database tables. Beyond that, you
would get an object that works (in my perfect world) exactly like
ActiveRecord objects do.
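
As a rough sketch of how a `field` declaration like the one above could work: the code below is purely illustrative. `ActiveObject::Base` is the wished-for API from this post, not an existing library, and the `max_length:` / `min_length:` keywords are an assumed stand-in for the `validate_max_length` / `validate_min_length` helpers. It only records metadata and checks it; it does not generate any schema.

```ruby
# Hypothetical base class: ActiveObject is a wished-for API, not a real library.
module ActiveObject
  class Base
    # Per-class registry of declared fields.
    def self.fields
      @fields ||= {}
    end

    # Records the field's type and optional length limits, and defines an accessor.
    def self.field(name, type, max_length: nil, min_length: nil)
      fields[name] = { type: type, max_length: max_length, min_length: min_length }
      attr_accessor name
    end

    # Returns a list of human-readable validation errors (empty when valid).
    def errors
      self.class.fields.flat_map do |name, rule|
        value = public_send(name)
        problems = []
        problems << "#{name} must be a #{rule[:type]}" unless value.is_a?(rule[:type])
        if value.respond_to?(:length)
          problems << "#{name} is too long" if rule[:max_length] && value.length > rule[:max_length]
          problems << "#{name} is too short" if rule[:min_length] && value.length < rule[:min_length]
        end
        problems
      end
    end
  end
end

class Person < ActiveObject::Base
  field :last_name, String, max_length: 55, min_length: 2
  field :first_name, String
end

person = Person.new
person.last_name = "X"
person.first_name = "Ada"
errs = person.errors
```

Because the declarations are kept in `Person.fields`, a separate generation script could in principle walk that hash to build table definitions.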

I realize that this makes certain common database optimizations
impossible. I'm not sure that's a problem, like I said, I don't do
large scale applications.

Adam

On 25-Oct-05, at 5:11 PM, Hal Fulton wrote:

> For many weeks I have had this at the back of my mind.
>

> I want a really good ORM that is highly non-intrusive
> (e.g., I don't have to inherit and I don't have to
> clutter my classes and objects with metadata).
>
> Someone told me that this was much the way Hibernate

James Britt

Oct 26, 2005, 5:57:12 PM

Adam Van Den Hoven wrote:
> I'm not sure if this is relevant but in my opinion the perfect ORM is
> no ORM.
>
> Why do we think we need ORM? Why do we build wonderful things like Rails?
>
> Because our applications have objects and we need to persist those
> objects in a useful way. We want to be able to find those objects and
> change them and save them.
>
> Personally, I'm willing to sacrifice a LOT to get really simple object
> persistence. Then again I'm not writing applications that need to
> handle tens of thousands of object finds every second.
>
> I would love to be able to do something like:
>
> class Person < ActiveObject::Base
> field :last_name, String
> field :first_name, String
> has_many :aliases, String
> transient :some_transient_value
> belongs_to :team
> has_many_ordered :roles
>
> #methods and the like
> end

Looks like Og. Except you can leave out the inheritance part

http://www.nitrohq.com

James

--

http://www.ruby-doc.org - The Ruby Documentation Site
http://www.rubyxml.com - News, Articles, and Listings for Ruby & XML
http://www.rubystuff.com - The Ruby Store for Ruby Stuff
http://www.jamesbritt.com - Playing with Better Toys

George Moschovitis

Oct 27, 2005, 9:01:46 AM

> So anyway, this is one of my highest priorities -- to
> make an ORM (that works the way I like) to wrap
> KirbyBase. (With additional code, it should/could wrap
> any other db, of course.)

FYI, the development release of Og includes a KirbyBase wrapper.

> Og is cool, but is even more intrusive.

Why is Og intrusive? Can you elaborate?

--
http://www.gmosx.com
http://www.navel.gr
http://www.nitrohq.com

itsm...@hotmail.com

Oct 27, 2005, 9:35:02 AM

rubyh...@gmail.com wrote:

> As I said, I'd want this mapping info stored *in* my code, but not
> scattered through it, and not in my stored classes. Hopefully that is
> a viable world view.

What would be wrong with re-opening classes for the mapping, e.g.

# file A.rb
class A
  def foo
    # ...
  end

  def bar
    # ...
  end
end

# file A_store.rb
class A
  prop :foo, String
  has_many :bars, B
end

Is it the number of additional instance methods added to A?
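
One way to look at that question is a minimal sketch in which the mapping macros only record metadata on the class and add no instance methods. `Mappable`, `prop`, `has_many` and `mapping` here are hypothetical names made up for the illustration, not the API of Og or any other library.

```ruby
# Hypothetical mapping macros; not the API of Og or any real library.
module Mappable
  def prop(name, type)
    mapping[name] = { kind: :prop, type: type }
  end

  def has_many(name, type)
    mapping[name] = { kind: :has_many, type: type }
  end

  # Class-level registry; nothing is stored on instances.
  def mapping
    @mapping ||= {}
  end
end

# --- a.rb: the plain domain classes ---
class B; end

class A
  attr_accessor :foo, :bars
end

methods_before = A.instance_methods(false).sort

# --- a_store.rb: the class reopened only to declare storage metadata ---
class A
  extend Mappable
  prop :foo, String
  has_many :bars, B
end

methods_after = A.instance_methods(false).sort
```

In this sketch `methods_before` and `methods_after` are identical, since `extend` adds the macros to the class object rather than to its instances. A real mapper would likely also generate finders or accessors, which is where the extra instance methods would come from.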

George Moschovitis

Oct 27, 2005, 3:27:37 PM

> What would be wrong with using re-opening classes for the mapping e.g.

FYI, the development version of Og also supports this :)

-g.

rubyh...@gmail.com

Oct 27, 2005, 3:55:07 PM

George Moschovitis wrote:
> > So anyway, this is one of my highest priorities -- to
> > make an ORM (that works the way I like) to wrap
> > KirbyBase. (With additional code, it should/could wrap
> > any other db, of course.)
>
> FYI, the development release of Og includes a KirbyBase wrapper.

That is interesting.

> > Og is cool, but is even more intrusive.
>
> Why is Og intrusive? can you elaborate?

This is only my opinion.

I dislike putting the metadata for my objects into the objects
themselves.

As someone pointed out, if I "reopen" the class, it is a little
better, but I am still uncomfortable this way.

In addition, my memory of Og is that it encourages thinking in
database terms (like "has_many") -- true or not?

What I want is:

1. To think in object (and persistence) terms, not database terms.
2. To specify the minimum information necessary in order to marshal
each of my types.
3. To store the metadata separately from my classes/objects so as to
minimize impact on them. (But probably not in a separate file.)

Does that make any sense?

Hal

James Britt

Oct 27, 2005, 5:08:19 PM

rubyh...@gmail.com wrote:

> George Moschovitis wrote:
>>Why is Og intrusive? can you elaborate?
>
>
> This is only my opinion.
>
> I dislike putting the metadata for my objects into the objects
> themselves.
>
> As someone pointed out, if I "reopen" the class, it is a little
> better, but I am still uncomfortable this way.

Here's a Devil's Advocate argument. It may have actual merit; I'm not
entirely convinced.

There was a time when people believed you could create distributed
objects that would let you code as if all code was local, and move
objects to different machines at will. You, the coder, did not have to
do anything special when dealing with such objects. Just create an
instance and invoke methods.

But the reality is that sending messages over the wire has a cost, and
one really does need to keep this in mind when designing and working
with distributed objects.

Likewise for autopersisted objects. It might be nice if one could just
use objects and have them magically saved/loaded with no special
consideration from the coder, but since it has a real cost, the coder
benefits from having at least some indication that this is what is
happening. So, putting the metadata in the class definition is Good and
Helpful because it alerts the coder to special conditions. It also makes
more clear when some attributes are to be saved and others are transient.

>
> In addition, my memory of Og is that it encourages thinking in
> database terms (like "has_many") -- true or not?

Interesting. I don't see "has_many" as being database-centric, just a
means for referring to some form of a relationship that can occur with
or without any persistence mechanism. But maybe I've just become immune
to the effects of certain words and phrases.

>
> What I want is:
>
> 1. To think in object (and persistence) terms, not database terms.
> 2. To specify the minimum information necessary in order to marshal
> each of my types.
> 3. To store the metadata separately from my classes/objects so as to
> minimize impact on them. (But probably not in a separate file.)
>
> Does that make any sense?

It does, and this is one of the reasons I prefer Og to ActiveRecord. I
can just code my objects without thinking in terms of a database, and
migrate to a persistence mechanism, if and when I need one, with a few
minor class-code annotations. That my class has explicit indicators of
storage metadata is less of an issue for me, and is arguably a feature.

James Britt

gwt...@mac.com

Oct 27, 2005, 5:29:41 PM

On Oct 27, 2005, at 5:08 PM, James Britt wrote:
> There was a time when people believed you could create distributed
> objects that would let you code as if all code was local, and move
> objects to different machines at will. You, the coder, did not
> have to do anything special when dealing with such objects. Just
> create an instance and invoke methods.

The failure modes are radically different also. If you don't design/code
for those failure modes, the illusion of transparency will dissolve when
you are least prepared.

Gary Wright

Kirk Haines

Oct 27, 2005, 5:31:32 PM

On Thursday 27 October 2005 1:57 pm, rubyh...@gmail.com wrote:

> This is only my opinion.
>
> I dislike putting the metadata for my objects into the objects
> themselves.
>
> As someone pointed out, if I "reopen" the class, it is a little
> better, but I am still uncomfortable this way.
>
> In addition, my memory of Og is that it encourages thinking in
> database terms (like "has_many") -- true or not?

This is a hard thing to deal with, though. A relational database has to use
keys to implement relationships between the tables. Some databases make it
very clear what the relationship is.

If the database has a foreign key constraint on a table, this means that the
database is saying that field X in the table references field Y in another
table. If a database supports this sort of thing, then the ORM can
automatically tell from the database structure that one table, and thus, one
object, has a relationship with another. It can create that relationship for
you.

But if you start from the other end, with the objects, you are not starting
with any of that meta information, so you have to do something to identify
classes which should map to tables, and if you intend for one field to store
only objects or arrays of objects of another class which is also represented
in the db by a table, you have to do something on the ruby side to declare
that. The has_one, has_many, many_to_many, and similar terms are commonly
accepted terms for describing these relationships. In thinking a bit about
it, though, I do suppose that one need not actually use those terms.
I could pretty easily make the following work in Kansas today.

class Schools
  # The line below would indicate that there is a one-to-many relationship
  # between a school and the inventories. So one school can be associated
  # with many inventory records.
  relationship Inventories.school_idx
end

class Inventories
  # This line indicates that there is a one-to-one relationship between
  # an Inventories object (and thus, db record) and a Chemicals object.
  relationship chemical_idx => Chemicals
end

> 3. To store the metadata separately from my classes/objects so as to
> minimize impact on them. (But probably not in a separate file.)

I am not seeing why it would be beneficial to keep this annotation separate
from the classes. The information has to be looked up somewhere, and if the
annotation doesn't interfere with the class otherwise, what is the downside
to having that information attached? If it is separate, you still have to,
somehow, associate the two, and the information still needs to be looked up.
What is the benefit?

Kirk Haines

Jeremy Kemper

Oct 27, 2005, 7:02:39 PM

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Oct 27, 2005, at 2:08 PM, James Britt wrote:
> Likewise for autopersisted objects. It might be nice if one could
> just use objects and have them magically saved/loaded with no
> special consideration from the coder, but since it has a real cost,
> the coder benefits from having at least some indication that this
> is what is happening. So, putting the metadata in the class
> definition is Good and Helpful because it alerts the coder to
> special conditions. It also makes more clear when some attributes
> are to be saved and others are transient.

You hit the nail on the head here, James. Persisting objects to a
relational database has repeatedly and raucously denied pixie dust
treatment. If giving an inch means a simple table <-> class ORM can
take us a mile, we should meditate on the nature of pragmatism.

>> In addition, my memory of Og is that it encourages thinking in
>> database terms (like "has_many") -- true or not?
>
> Interesting. I don't see "has_many" as being database-centric,
> just a means for referring to some form of a relationship that can
> occur with or without any persistence mechanism. But maybe I've
> just become immune to the effects of certain words and phrases.

While terminology like has_many is not restricted to db-think, it is
most commonly found there. Fowleresque ActiveRecord ORMs encourage a
similar pattern of thought: their metadata are clearly relational
hints sitting in your class, so it feels like you're mapping database
-> objects in your head as you develop your app.

This is very different from composing a domain model, coding it up,
then devising a mapper to persist your object graph. As far as I
know there are no Ruby ORMs that attempt this.

Having done it both ways, I prefer those little hints. The apparent
cost of a generic domain mapper is deceptively low due to the "it
seems nice" discount, but its true cost is far higher: high
conceptual overhead, difficult mapping bugs, and carpal tunnel.

> rubyh...@gmail.com wrote:
>> What I want is:
>> 1. To think in object (and persistence) terms, not database terms.
>> 2. To specify the minimum information necessary in order to marshal
>> each of my types.
>> 3. To store the metadata separately from my classes/objects so as to
>> minimize impact on them. (But probably not in a separate file.)
>> Does that make any sense?

For deeper satisfaction, look at how Smalltalkers have done it. Why
introduce the R and M to O at all?

jeremy
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (Darwin)

iD8DBQFDYVxsAQHALep9HFYRApV/AKCcfOZYwXHpzddB9/bvbRzR1c3tLgCgpeJn
XdqzF9f9/nRKYu19KEKe/5Y=
=fbcX
-----END PGP SIGNATURE-----

Hal Fulton

Oct 27, 2005, 7:21:02 PM

Jeremy Kemper wrote:
>
> This is very different from composing a domain model, coding it up,
> then devising a mapper to persist your object graph. As far as I know
> there are no Ruby ORM that attempt this.

If I understand you right, that is the way I want to do it.

> Having done it both ways, I prefer those little hints. The apparent
> cost of a generic domain mapper is deceptively low due to the "it seems
> nice" discount, but its true cost is far higher: high conceptual
> overhead, difficult mapping bugs, and carpal tunnel.

Well, I guess I will work on it and see where it goes. Maybe it will
prove impractical. Maybe no one will like it but me. We'll see.

Thanks,
Hal

George Moschovitis

Oct 28, 2005, 1:37:03 PM

> I dislike putting the metadata for my objects into the objects
> themselves.

The metadata is stored in the object's class, not in the actual instances.

> As someone pointed out, if I "reopen" the class, it is a little

> better.

Og 0.24.0 allows reopening.

> In addition, my memory of Og is that it encourages thinking in
> database terms (like "has_many") -- true or not?

I don't think that has_many is a database term; it just describes object
relations and allows Og to automagically generate some useful methods.
This is an abstraction.

> 1. To think in object (and persistence) terms, not database terms.

I think that, using Og, you almost forget that you are using a database. In
fact you don't need an RDBMS store.

> 2. To specify the minimum information necessary in order to marshal
> each of my types.

Og supports this.

> 3. To store the metadata separately from my classes/objects so as to
> minimize impact on them. (But probably not in a separate file.)

You can do this in the latest version.

Og is constantly evolving; stay tuned for even better abstractions.
Even better you can help us with suggestions and/or patches. Join the
mailing list ;-)

regards,
George.

George Moschovitis

Oct 28, 2005, 1:43:23 PM

> This is very different from composing a domain model, coding it up,
> then devising a mapper to persist your object graph. As far as I
> know there are no Ruby ORM that attempt this.

Ehm, Og does that.

First you write your domain model:

class User
  attr_accessor :name
end

class Book
  attr_accessor :title
end

Then you can annotate the model as needed (even in another source file):

class User
  ann :name, :klass => String
  has_many :books
end

class Book
  belongs_to :user
end

Of course you can combine the two steps into one if you want (and typically you
want this ;-))

Then you just issue:

user.save
to persist your object, or
User.find_by_name(...)
to query, etc.

Og even allows for inheritance and polymorphic relations. You can use
a non RDBMS store if you want, or no store at all (use it just as an
object manager). I suggest you have a look at this.

regards,
George.

rubyh...@gmail.com

Oct 28, 2005, 5:42:15 PM

Kirk Haines wrote:

[snip]

> But if you start from the other end, with the objects, you are not starting
> with any of that meta information, so you have to do something to identify
> classes which should map to tables, and if you intend for one field to store
> only objects or arrays of objects of another class which is also represented
> in the db by a table, you have to do something on the ruby side to declare
> that. The has_one, has_many, many_to_many, and similar terms are commonly
> accepted terms for describing these relationships. In thinking a bit about
> it, though, I do suppose that one need not need to actually use those terms.
> I could pretty easily make the following work in Kansas today.
>
> class Schools
> # The line below would indicate that there is a one to many relationship
> # between a school and the inventories. So one school can be associated
> # with many inventory records.
> relationship Inventories.school_idx
> end
>
> class Inventories
> # This line indicates that there is a one to one relationship between
> # an Inventories object (and thus, db record) and a Chemicals object.
> relationship chemical_idx => Chemicals
> end

Yes, but that's just giving different names to the same things, isn't it?
If I see "relationship" in someone's code, I don't really know what it
means. I assume it's some database stuff.

As far as I can see, all that's really needed is:
1. Let each class map to a table
2. Let each table have a known unique primary key

Then we don't need all this "relationship" stuff, do we?

When I think of an object containing a sub-object (and yes, I certainly
know these are only *references* internally), I don't think "a Foo
object has a one-to-one relationship with a Bar object"; I just think
"Foo has a field bar, which will typically be a Bar." (And no, I'm not
a fan of static typing, either.)

As for a "has-many" relationship (in objects, not in DBs) -- isn't that
just what we call an "array"? The difference being that Ruby arrays are
heterogeneous whereas rows of a table all represent the same type?

Reflection could tell us that a field is an array. It could also tell us
the type of each element in the array.
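
A minimal sketch of that kind of reflection, using only core Ruby (`instance_variables`, `instance_variable_get`). The classes `Foo` and `Bar` and the helper `describe_fields` are made up for the illustration; this is one possible approach, not how any existing mapper does it.

```ruby
class Bar; end

class Foo
  def initialize
    @name  = "example"
    @bars  = [Bar.new, Bar.new]
    @mixed = [1, "two", :three]
  end
end

# Inspects a live object and reports, for each instance variable,
# either its class or (for arrays) the distinct classes of its elements.
def describe_fields(obj)
  obj.instance_variables.each_with_object({}) do |ivar, out|
    value = obj.instance_variable_get(ivar)
    out[ivar] =
      if value.is_a?(Array)
        { array_of: value.map(&:class).uniq }
      else
        { type: value.class }
      end
  end
end

fields = describe_fields(Foo.new)
```

Note the limits: this only works on an instance that already holds data, an empty array reveals nothing about its intended element type, and a heterogeneous array reports several classes.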

Over the years I've stuck thousands of arrays into thousands of objects,
all without ever thinking about the "relationship" of the container to
the containee; the former contains the latter, that's about it. And I've
never dwelled long on the fact that an array indeed "has many" items in it,
or felt the need to annotate that fact explicitly.

I want to do as little specification as possible to store my objects.
That's where I'm coming from. I want the persistence framework to be as
smart as possible and make as many reasonable assumptions as possible.

I want to spend as little time coding the metadata portion as I can, and
I want it all stuck in the same place in my code, in as few lines as
possible.

Again, I'm not criticizing your opinions or anyone else's. This is just
me. This sort of thing is as personal as the choice of variable and
method names.

> I am not seeing why it would be beneficial to keep this annotation separate
> from the classes. The information has to be looked up somewhere, and if the
> annotation doesn't interfere with the class otherwise, what is the downside
> to having that information attached? If it is separate, you still have to,
> somehow, associate the two, and the information still needs to be looked up.
> What is the benefit?

It's highly subjective. If I do reflection and look at the stuff in my
classes, I don't want to see the extraneous stuff.

Again I stress, it's just me. I'm not arguing it's *wrong* to do it the
other way. If a class inherits from another or includes a module, then
those are more tightly coupled than if I simply pass an object into a
method of another class (which is the ultimate decoupling other than not
interacting at all).

Hal

rubyh...@gmail.com

Oct 28, 2005, 5:55:31 PM

George Moschovitis wrote:
> > I dislike putting the metadata for my objects into the objects
> > themselves.
>
> The metadata is stored in the object class not the actual instances.

Yes, I realize that.

> > In addition, my memory of Og is that it encourages thinking in
> > database terms (like "has_many") -- true or not?
>
> I don't think that has_many is a database term; it just describes object
> relations and allows Og to automagically generate some useful methods.
> This is an abstraction.

I think it was originally a database term which then found its way into
things like UML. In classic (non-UML) discussions of OOP, I have never
seen these terms.

> I think, using Og you almost forget that you are using a database. In
> fact you dont need an RDBMS store.

I like Og, it just feels verbose to me.

But I am the type who would rather do

arr.mapf(:downcase)

instead of

arr.map {|x| x.downcase }

And I feel that with Og, I am coupling my classes to the persistence
mechanism. Rather than "teaching my objects to persist" I would like
to give information to a persistence framework, and then just pass in
pristine objects (undecorated, unannotated, no tattoos or bumper
stickers).
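
To make the idea concrete, here is a rough sketch of a framework that is told about a class from the outside and then handed untouched objects. Everything in it (`Store`, `map`, `save`, `find`) is a hypothetical name invented for the illustration, and the "storage" is just an in-memory hash standing in for KirbyBase or any other backend.

```ruby
# Hypothetical persistence framework; names are illustrative only.
class Store
  def initialize
    @mappings = {}                          # class => list of accessor names
    @rows = Hash.new { |h, k| h[k] = [] }   # class => stored rows
  end

  # The metadata lives here, outside the domain class.
  def map(klass, *fields)
    @mappings[klass] = fields
  end

  # Reads the mapped accessors from a plain object and stores a row.
  def save(obj)
    fields = @mappings.fetch(obj.class)
    @rows[obj.class] << fields.to_h { |f| [f, obj.public_send(f)] }
  end

  # Returns rows whose values match every key/value pair in criteria.
  def find(klass, criteria)
    @rows[klass].select { |row| criteria.all? { |k, v| row[k] == v } }
  end
end

# A pristine domain class: no inheritance, no includes, no annotations.
class Person
  attr_accessor :name, :city

  def initialize(name, city)
    @name, @city = name, city
  end
end

store = Store.new
store.map(Person, :name, :city)
store.save(Person.new("Ada", "London"))
store.save(Person.new("Bob", "Paris"))
found = store.find(Person, city: "Paris")
```

The only coupling is the single `store.map` line, which can sit anywhere in the program; reflecting on `Person` shows nothing but its own accessors.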

> > 2. To specify the minimum information necessary in order to marshal
> > each of my types.
>
> Og supports this.
>
> > 3. To store the metadata separately from my classes/objects so as to
> > minimize impact on them. (But probably not in a separate file.)
>
> You can do this in the latest version.

OK, I didn't know that at all. :)

> Og is constantly evolving; stay tuned for even better abstractions.
> Even better you can help us with suggestions and/or patches. Join the
> mailing list ;-)

I will try first to produce something of my own that I like. If I fail,
then Og will be my favorite framework. ;) And then I suppose I would
join the list and so on.

Thanks,
Hal

James Edward Gray II

Oct 28, 2005, 5:58:21 PM

On Oct 28, 2005, at 4:47 PM, rubyh...@gmail.com wrote:

> I want to do as little specification as possible to store my objects.

I just have to say first that I've really enjoyed reading these
planning messages of yours Hal. It's clear you have a vision and
know what you want. I can't wait to see the results.

That said, I don't understand the full vision, so forgive my dumb
question.

The whole time I'm reading your posts though I keep thinking you just
want:

File.open("objects", "wb") { |file| Marshal.dump(whatever, file) }

Or YAML. Or PStore if you also want transactions.

Can you explain how what you want differs from this?

James Edward Gray II

Kirk Haines

Oct 28, 2005, 6:55:36 PM

On Friday 28 October 2005 3:47 pm, rubyh...@gmail.com wrote:

> Yes, but that's just giving different names to the same things, isn't
> it?

Well, yeah. :)

> As far as I can see, all that's really needed is:
> 1. Let each class map to a table
> 2. Let each table have a known unique primary key
>
> Then we don't need all this "relationship" stuff, do we?

In some cases, you can't do without it.

ORM can be approached from two basic directions. It can be database driven,
in that the structure of the database dictates the language structures, or
one can have the language structures dictate the database structure.

For instance, if you have a database that you need to write an application to
interact with, that database structure is going to dictate your Ruby language
structures, and your use of those structures should not change the structure
of the database.

In a case like that, depending on what database you are using, the database
structure might explicitly describe the relationship between tables for you,
or it may not.

If it does, IMHO, the ideal for an ORM, and where I am going with Kansas, is
for the ORM to be able to understand that and act on it, so that one
automatically gets Ruby classes and methods that simply make that database
structure accessible with no effort on the part of the programmer. It's
automatic.

On the other hand, if the database does not provide this information, or the
code is being written so that it can be used on multiple dbs, and one of the
potential targets does not provide this information, then if one wants to
make use of relationships, one must provide this information. There's no way
around it.

And the reason why relationships are useful beyond having a simple mapping of
class to table is because they make the code convenient.

In the MSDS application that my examples come from, one can select a single
chemical and view information on it. One thing that can be done at that
point is to see other chemicals manufactured by the same manufacturer that
the chemical being viewed comes from.

So, if that chemical's record is in @chemical:

@chemical.manufacturer.chemicals

And you have your list. The relationship information provided the necessary
string to tie those together without the programmer having to explicitly write
the code. It is a tremendous timesaver.

Now, the other direction that an ORM can go is to let the language structures
dictate the database structures.

So, for instance, you declare a class:

class Foo
  attr_accessor :a, :b
end

And somehow, that class is mapped to the database, creating or altering the
table definition as necessary. When doing this, if a field is going to store
objects of another class that is also mapped to a table in the database, you
have to tell the ORM about that so that it knows how the tables should look
in order to store your data.

Regardless of the direction that one is approaching the ORM task from, the
annotation still serves the same purpose -- it makes sure that the ORM is
doing what you want it to in cases where interpretation is ambiguous or even
impossible.

> As for a "has-many" relationship (in objects, not in DBs) -- isn't that
>
> just what we call an "array"? The difference being that Ruby arrays are
>
> heterogeneous whereas rows of a table all represent the same type?

Yep.

> I want to do as little specification as possible to store my objects.
> That's where I'm coming from. I want the persistence framework to be as
> smart as possible and make as many reasonable assumptions as possible.
>
> I want to spend as little time coding the metadata portion as I can,
> and
> I want it all stuck in the same place in my code, in as few lines as
> possible.

Those are all my goals, too. :)

Let's say that you have two classes:

class Manufacturers
  attr_accessor :idx, :name, :address
end

class Chemicals
  attr_accessor :idx, :name, :manufacturer
end

Consider the manufacturer field in the Chemicals class. An ORM can not look
at this and know that you intend to store Manufacturers objects in it. And
that information is important to determine the structure of the database.

What if you have a Manufacturers object, and you want to know all of the
chemicals that have that manufacturer? You could write a method manually to
do that:

class Manufacturers
  def chemicals
    # query the db and retrieve an array of Chemicals records where
    # manufacturer == self.idx
  end
end

But if all of that typing can be reduced to simply telling the ORM about the
relationship, using some syntax or other, isn't that a win?

class Manufacturers
relates_to Chemicals.manufacturer
end
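
For illustration only, here is one way such a declaration could generate the finder method. This is not Kansas's actual API: the `Table` module, `create`, `rows`, and the `via:` / `as:` keywords are assumptions made up for the sketch, and an in-memory array stands in for the database.

```ruby
# Hypothetical in-memory "table" support; not Kansas's actual API.
module Table
  # Give each extending class its own row list.
  def self.extended(base)
    base.instance_variable_set(:@rows, [])
  end

  def rows
    @rows
  end

  # Builds an object from attribute pairs and stores it as a row.
  def create(**attrs)
    obj = new
    attrs.each { |k, v| obj.public_send("#{k}=", v) }
    @rows << obj
    obj
  end

  # Declares a one-to-many relation and generates the finder method:
  # rows of `other` whose `via` field equals this object's idx.
  def relates_to(other, via:, as:)
    define_method(as) do
      other.rows.select { |row| row.public_send(via) == idx }
    end
  end
end

class Chemicals
  extend Table
  attr_accessor :idx, :name, :manufacturer
end

class Manufacturers
  extend Table
  attr_accessor :idx, :name, :address
  relates_to Chemicals, via: :manufacturer, as: :chemicals
end

acme = Manufacturers.create(idx: 1, name: "Acme")
Chemicals.create(idx: 10, name: "Acetone", manufacturer: 1)
Chemicals.create(idx: 11, name: "Benzene", manufacturer: 2)
Chemicals.create(idx: 12, name: "Ethanol", manufacturer: 1)
names = acme.chemicals.map(&:name)
```

The one-line declaration replaces the hand-written `chemicals` method; the price is that a reader has to know what `relates_to` generates.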

Thanks,

Kirk Haines

Daniel Amelang

Oct 28, 2005, 10:45:34 PM

> (Someone teased me at the
> conference that a year had gone by, and Tycho still
> hadn't progressed any. And he wonders how that spider
> got in his salad.)

Hehe. That was me. And the spider...hey! That was you!?

Dan Amelang

Hal Fulton

Oct 29, 2005, 2:11:19 PM

Not a dumb question at all. And I wouldn't dignify my ideas as a "vision"
yet. If it works well, I'll retroactively dub it a vision. ;)

Basically the only thing missing from a YAML solution or something is
the ability to do sophisticated queries without storing all the objects
in memory at once.

Given that I may have 100,000 objects or so, I don't want to store them
all in a giant array, but I *do* want to be able to find them by the
values of their accessors.
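
As a rough sketch of the gap being described: with plain Marshal you can at least avoid holding everything in memory by writing one record per `Marshal.dump` call and reading them back one at a time. The `Record` struct and the helpers `dump_all` and `find_all` are made up for the illustration.

```ruby
require "tempfile"

Record = Struct.new(:name, :size)

# Writes each object as its own Marshal record, one after another.
def dump_all(path, objects)
  File.open(path, "wb") do |io|
    objects.each { |obj| Marshal.dump(obj, io) }
  end
end

# Reads records back one at a time and keeps only those the block accepts,
# so only one unmatched object is alive at any moment.
def find_all(path)
  matches = []
  File.open(path, "rb") do |io|
    until io.eof?
      obj = Marshal.load(io)
      matches << obj if yield(obj)
    end
  end
  matches
end

file = Tempfile.new("records")
file.close
dump_all(file.path, (1..1000).map { |i| Record.new("r#{i}", i) })
big = find_all(file.path) { |r| r.size > 998 }
```

This keeps memory use low but every query is still a full scan of the file; finding objects quickly by accessor value is where an indexed store starts to earn its keep.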

Make sense?

Hal

Hal Fulton

Oct 29, 2005, 2:19:55 PM

Kirk Haines wrote:
>
> In some cases, you can't do without it.
>
> ORM can be approached from two basic directions. It can be database driven,
> in that the structure of the database dictates the language structures, or
> one can have the language structures dictate the database structure.

I think I am definitely object-driven.

[snippage]

> And the reason why relationships are useful beyond having a simple mapping of
> class to table is because they make the code convenient.
>
> In the MSDS application that my examples come from, one can select a single
> chemical and view information on it. One thing that can be done at that
> point is to see other chemicals manufactured by the same manufacturer that
> the chemical being viewed comes from.
>
> So, if that chemical's record is in @chemical:
>
> @chemical.manufacturer.chemicals
>
> And you have your list. The relationship information provided the necessary
> string to tie those together without the programmer having to explicitly write
> the code. It is a tremendous timesaver.

It is, and I see the usefulness of it, but it doesn't fit my brain.

That is the sort of thing I'd use a query for, rather than just grabbing the
value of what looks to me like an accessor.

When you call chemicals in that way, is it then doing a query (late binding)
or did it get done recursively when you retrieved @chemical?
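
For reference, the "late binding" option could look like the sketch below: the query runs the first time the accessor is called and the result is cached afterwards. This says nothing about which strategy Kansas actually uses; `FakeDb` and `Manufacturer` are made-up names, and the query counter exists only to make the behaviour visible.

```ruby
# Hypothetical stand-in for a database; counts how many queries run.
class FakeDb
  attr_reader :query_count

  def initialize(rows)
    @rows = rows
    @query_count = 0
  end

  def where(field, value)
    @query_count += 1
    @rows.select { |row| row[field] == value }
  end
end

class Manufacturer
  attr_reader :idx

  def initialize(idx, db)
    @idx = idx
    @db = db
  end

  # Lazy: nothing is fetched at construction time. The query runs on
  # first access, and the result is memoized for later calls.
  def chemicals
    @chemicals ||= @db.where(:manufacturer, @idx)
  end
end

db = FakeDb.new([
  { name: "Acetone", manufacturer: 1 },
  { name: "Benzene", manufacturer: 2 }
])
maker = Manufacturer.new(1, db)
before = db.query_count
first  = maker.chemicals
second = maker.chemicals
```

The eager alternative would run the same query inside `initialize`, which costs a query per object loaded whether or not the list is ever used.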

> Now, the other direction that an ORM can go is to let the language structures
> dictate the database structures.
>
> So, for instance, you declare a class:
>
> class Foo
> attr_accessor :a, :b
> end
>
> And somehow, that class is mapped to the database, creating or altering the
> table definition as necessary. When doing this, if a field is going to store
> objects of another class that is also mapped to a table in the database, you
> have to tell the ORM about that so that it knows how the tables should look
> in order to store your data.

[snip]

Yes, this is my personal preference.

[snip]

> What if you have a Manufacturers object, and you want to know all of the
> chemicals that have that manufacturer? You could write a method manually to
> do that:
>
> class Manufacturers
> def chemicals
> #query the db and retrieve an array of Chemicals records where
> # manufacturer == self.idx
> end
> end
>
> But if all of that typing can be reduced to simply telling the ORM about the
> relationship, using some syntax or other, isn't that a win?
>
> class Manufacturers
> relates_to Chemicals.manufacturer
> end

It's a win if you want to think that way. I'd rather just make the
query syntax easy/flexible and forget about "relates_to" and such.

I understand a simple query. But every time I saw "relates_to" I
would have to stop and ask myself how it worked and what it meant.
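
As a rough illustration of what a declaration like "relates_to" could expand to, here is a sketch in which it simply generates the hand-written query method. The Relations module, the three-argument form of relates_to, and the class-level "all" array are assumptions made for this example; they are not the syntax or implementation of Kirk's library.

```ruby
# Hypothetical mixin: relates_to generates a reader method that runs a query.
module Relations
  def relates_to(method_name, target_class, attribute)
    define_method(method_name) do
      # The generated body is the same query one would write by hand.
      target_class.all.select { |record| record.public_send(attribute) == self }
    end
  end
end

class Chemicals
  attr_reader :name, :manufacturer

  # Stand-in for a table: every instance is remembered in a class-level array.
  def self.all
    @all ||= []
  end

  def initialize(name, manufacturer)
    @name = name
    @manufacturer = manufacturer
    self.class.all << self
  end
end

class Manufacturers
  extend Relations
  attr_reader :name

  def initialize(name)
    @name = name
  end

  relates_to :chemicals, Chemicals, :manufacturer
end

acme = Manufacturers.new("Acme")
other = Manufacturers.new("Other")
Chemicals.new("Acetone", acme)
Chemicals.new("Toluene", acme)
Chemicals.new("Ethanol", other)

acme_names = acme.chemicals.map(&:name)
puts acme_names.inspect
```

Seen this way, the declaration and the explicit query are the same operation; the difference is only whether the reader sees the query or a name for it.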

Hal

DCC

Oct 29, 2005, 3:15:43 PM

Hal Fulton wrote:

> Kirk Haines wrote:
> >
> > In some cases, you can't do without it.
> >
> > ORM can be approached from two basic directions. It can be database driven,
> > in that the structure of the database dictates the language structures, or
> > one can have the language structures dictate the database structure.
>
> I think I am definitely object-driven.
>
> [snippage]
>
> > And the reason why relationships are useful beyond having a simple mapping of
> > class to table is because they make the code convenient.
> >
> > In the MSDS application that my examples come from, one can select a single
> > chemical and view information on it. One thing that can be done at that
> > point is to see other chemicals manufactured by the same manufacturer that
> > the chemical being viewed comes from.
> >
> > So, if that chemical's record is in @chemical:
> >
> > @chemical.manufacturer.chemicals
> >
> > And you have your list. The relationship information provided the necessary
> > string to tie those together without the programmer having to explicitly write
> > the code. It is a tremendous timesaver.
>
> It is, and I see the usefulness of it, but it doesn't fit my brain.
>
> That is the sort of thing I'd use a query for, rather than just grabbing the
> value of what looks to me like an accessor.

I think it's easy to underestimate the value of being able to pass
around views of your data as objects. Queries can quickly
clog up your code (check how clogged up even LINQ queries can become
for the coming C# 3.0). I've dramatically reduced the size of some apps
using that approach. Then again, there's no point in condensing your
code if it doesn't read easily for you.

After much resistance, my brain now fits in with Kirk's view of things.
I see my model as a database and want to design for a database.
However, I want my database to be a Ruby object. I also want to reduce
the number of queries, which always introduce errors and debugging for
me.

This may sound odd, but I want the purity of design of the relational
model, as well as the purity of design of Ruby OO, without one
polluting the other. So I like one layer of my app explicitly for that
purpose.

>
> It's a win if you want to think that way. I'd rather just make the
> query syntax easy/flexible and forget about "relates_to" and such.
>
> I understand a simple query. But every time I saw "relates_to" I
> would have to stop and ask myself how it worked and what it meant.

You wouldn't call relates_to very much apart from when you initialize
the app. I personally find that switching my brain to 'database mode'
in those few necessary cases isn't too expensive in runtime.

DCC

James Edward Gray II

Oct 29, 2005, 6:09:49 PM

On Oct 29, 2005, at 1:11 PM, Hal Fulton wrote:

> Not a dumb question at all. And I wouldn't dignify my ideas as a
> "vision"
> yet. If it works well, I'll retroactively dub it a vision. ;)
>
> Basically the only thing missing from a YAML solution or something is
> the ability to do sophisticated queries without storing all the
> objects
> in memory at once.
>
> Given that I may have 100,000 objects or so, I don't want to store
> them
> all in a giant array, but I *do* want to be able to find them by the
> values of their accessors.
>
> Make sense?

Yes, I think I understand now, finally. It's a very interesting
idea. Can't wait to see what you come up with...

James Edward Gray II

Dave Burt

Oct 30, 2005, 7:46:40 AM

So Marshalled objects plus indexes?

Dave

Duane Johnson

Nov 2, 2005, 12:52:24 PM

On Oct 30, 2005, at 5:52 AM, Dave Burt wrote:
>> Basically the only thing missing from a YAML solution or something is
>> the ability to do sophisticated queries without storing all the
>> objects
>> in memory at once.
>>
>> Given that I may have 100,000 objects or so, I don't want to store
>> them
>> all in a giant array, but I *do* want to be able to find them by the
>> values of their accessors.
>
> So Marshalled objects plus indexes?
>

Nice summary, Dave. If this is indeed what Hal is talking about, it
seems like a very nice "fit".

From off the top of my head, an ideal data repository has the
following qualities:

1. Infinite storage capacity
2. Zero access time
3. Persistent / Failsafe

Current technology (i.e. a hard drive) marries persistence with
storage capacity and unfortunately increases access time. In-memory
data reverses the advantages and disadvantages--it decreases access
time, but it has a smaller storage capacity and it is no longer
persistent.

Indexing is a netherworld. By imposing some structure on the data
(e.g. "The 'id' attribute will always contain an integer") we can
store ordered information about an otherwise haphazard data web. The
ordering gives us the ability to predict where to look for
information (e.g. sort the 'id' attribute numerically). This is
important--we only need structure where we need to predict
something. If we don't need to predict where to find information
then an index is unnecessary. The "imposed structure" of database
tables goes away if we don't need a bird's eye view of the data.

It seems that "Marshalled objects + Indexes" gives us this happy
middle ground--most of the time we don't need to predict where to
find information (e.g. many array attributes) but in the cases where
we do, we could impose that "thread" of structure (aka an index) on a
YAML file.
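
A minimal sketch of "marshalled objects + indexes", under assumptions of my own: the records are appended to a single file with Marshal (rather than YAML), and the index is a plain Hash from the 'id' attribute to the byte offset where each record starts. Record, write_with_index, and fetch are hypothetical names. The index is the only imposed structure; the records themselves can hold anything Marshal can dump.

```ruby
require "tempfile"

Record = Struct.new(:id, :name, :tags)

# Append each record as a Marshal blob and remember where it starts.
# Returns the index: id => byte offset in the file.
def write_with_index(path, records)
  index = {}
  File.open(path, "wb") do |f|
    records.each do |record|
      index[record.id] = f.pos
      f.write(Marshal.dump(record))
    end
  end
  index
end

# Jump straight to the recorded offset and load exactly one object.
def fetch(path, index, id)
  offset = index[id]
  return nil unless offset

  File.open(path, "rb") do |f|
    f.seek(offset)
    Marshal.load(f)
  end
end

file = Tempfile.new("records")
index = write_with_index(file.path, [
  Record.new(1, "Ann", %w[a b]),
  Record.new(2, "Bob", []),
  Record.new(3, "Cy", %w[c])
])

bob = fetch(file.path, index, 2)
puts bob.name
```

Only the small index has to stay in memory; each lookup reads a single record from disk.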

Duane Johnson
(canadaduane)

Jeffrey Moss

Nov 2, 2005, 11:12:08 PM

There is a package called DyBase out there. It's somewhat dated and unmaintained, but it
basically does the marshalled objects with indexes, and it has an API for Ruby, PHP, Python
and some other language I've never heard of.

The only reason not to use it is that it stores data in a binary format file. It's
lightning fast for what it does.

-Jeff

Hal Fulton

Nov 2, 2005, 11:46:54 PM

Duane Johnson wrote:
>
> On Oct 30, 2005, at 5:52 AM, Dave Burt wrote:
>
>>> Basically the only thing missing from a YAML solution or something is
>>> the ability to do sophisticated queries without storing all the objects
>>> in memory at once.
>>>
>>> Given that I may have 100,000 objects or so, I don't want to store them
>>> all in a giant array, but I *do* want to be able to find them by the
>>> values of their accessors.
>>
>>
>> So Marshalled objects plus indexes?
>>
>
> Nice summary, Dave. If this is indeed what Hal is talking about, it
> seems like a very nice "fit".

I guess I never replied to this one. I'm not sure that this is the
way I would state it, but it's mostly correct.

After all, complex queries don't depend on indexes. Indexes just make
them faster.
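
A small sketch of that point, using made-up in-memory data: the same query answered first by scanning every object and then through an index built once in advance. Item and the by_color index are hypothetical names for this example.

```ruby
Item = Struct.new(:id, :color)

colors = %w[red green blue]
items = Array.new(1000) { |i| Item.new(i, colors[i % 3]) }

# Without an index: the query inspects every object.
scanned = items.select { |item| item.color == "green" }

# With an index: group once up front, then look the answer up directly.
by_color = items.group_by(&:color)
indexed = by_color.fetch("green", [])

puts scanned == indexed
```

Both paths return the same objects in the same order; the index changes only how much work each query does.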

> From off the top of my head, an ideal data repository has the
> following qualities:
>
> 1. Infinite storage capacity
> 2. Zero access time
> 3. Persistent / Failsafe

I would add transparency with regard to objects. That is, I don't want
to assemble and disassemble my objects from records manually.

> It seems that "Marshalled objects + Indexes" gives us this happy middle
> ground--most of the time we don't need to predict where to find
> information (e.g. many array attributes) but in the cases where we do,
> we could impose that "thread" of structure (aka an index) on a YAML file.

The paradigm of "marshalling + indexes" is an interesting one indeed. But
when I think of queries, I think databases. That is how my interest in
KirbyBase arose.

So for now I will build some kind of solution on top of KB rather than
add my own indexing/querying scheme to YAML or something.

Hal

Molitor, Stephen L

Nov 3, 2005, 3:16:34 PM

What about one of the Ruby dbm libraries (dbm, gdbm, ...)?

Steve

rubyh...@gmail.com

Nov 3, 2005, 5:03:46 PM

Molitor, Stephen L wrote:
> What about one of the Ruby dbm libraries (dbm, gdbm, ...)?

I used to store YAML inside DBM. That's kind of a clunky solution.
What's more:
1. DBM doesn't work on Windows.
2. DBM files aren't readable cross-platform.

See the very, very beginning of this topic on March 2, my post
"A wish: Simple database" -- to that list of requirements I would
now add object support and *complex* object support.

Note that Jamey Cribbs responded within an hour, beginning
my membership in the KirbyBase fan club.

Hal

rubyh...@gmail.com

Nov 3, 2005, 5:24:45 PM

Molitor, Stephen L wrote:
> What about one of the Ruby dbm libraries (dbm, gdbm, ...)?

I used to store YAML inside DBM. That's kind of a clunky solution.

why the lucky stiff

Nov 3, 2005, 8:06:55 PM

rubyh...@gmail.com wrote:

>I used to store YAML inside DBM. That's kind of a clunky solution.
>What's more:
> 1. DBM doesn't work on Windows.
> 2. DBM files aren't readable cross-platform.
>
>

No, MouseHole does it.

Also, KirbyBase is cool and MouseHole may switch to KirbyBase after all.

_why

Bill Kelly

Nov 3, 2005, 8:30:05 PM

Hi,

From: "why the lucky stiff" <ruby...@whytheluckystiff.net>
>
> rubyh...@gmail.com wrote:
>
>>I used to store YAML inside DBM. That's kind of a clunky solution.
>>What's more:
>> 1. DBM doesn't work on Windows.
>> 2. DBM files aren't readable cross-platform.
>>
>>
> No, MouseHole does it.

Unless something has changed recently, my experience with
SDBM under Windows, on both Ruby and Perl, is that it begins
to malfunction after N keys are inserted, where N varies
depending on size of keys/values inserted. . . .

On October 17, 2004, I tried the following:

One of my tests yesterday was to try to store
1,000,000 key/values. The keys were always length 8.
The values were random, between length 1 and length
511. After several minutes it managed to store about
47,000 keys before it upchucked a "sdbm_store failed".
And the .dbm file was up to 1.7 GB.

As an aside - I *think* Zed Shaw's odeum bindings
[ http://www.zedshaw.com/projects/ruby_odeum/ ] allow access
to the lower level hash-based data storage facilities, not
just the higher level inverse index capability... FWIW

Regards,

Bill

Jimmie Houchin

Nov 22, 2005, 12:14:02 AM

I don't know if you've seen this one or not.
But I think a natural OO/Ruby interface for persistence on top of SQLite
for the backend could be interesting.

Here is one that is being done in Python.
Looks interesting.

http://divmod.org/trac/wiki/DivmodAxiom

Jimmie