renaming path to id


Lukas Kahwe Smith

Mar 7, 2011, 4:48:09 AM
to symfony-cmf-devs
Aloha,

we discussed this briefly at the hackday this saturday. we agreed to rename "path" to "id" to align things with the other ODMs. obviously this can create some confusion, but at least this way we only have to explain it once and it will be much easier to make Bundles that work with other ODMs.

for those wondering about UUIDs: these really should not be used at all. but in the cases where people do want to use them, they will be available as "uuid".

furthermore we will add ID generators. one will be the "assigned" id generator, which is effectively what we have now, just that instead of passing the path with persist, you set it in the document instance. there will be other id generators, for example i can see one with a callback that allows defining the path via a category+date etc in order to better "span" documents across subnodes to prevent hitting the infamous 10k subnode limit of jackrabbit.
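
A minimal sketch of the two generator styles described above. Every name here (`IdGenerator`, `AssignedIdGenerator`, `CallbackIdGenerator`, and the `id`, `category`, `date`, `slug` properties) is an assumption made for illustration; none of it is taken from the phpcr-odm code base.

```php
<?php
// Hypothetical interface: turns a document into its id (here: the absolute path).
interface IdGenerator
{
    public function generate(object $document): string;
}

// "assigned": the id is whatever the caller already set on the document.
final class AssignedIdGenerator implements IdGenerator
{
    public function generate(object $document): string
    {
        if (empty($document->id)) {
            throw new RuntimeException('Document has no id assigned.');
        }
        return $document->id;
    }
}

// Callback-based: user code derives the path from the document's own data.
final class CallbackIdGenerator implements IdGenerator
{
    private $callback;

    public function __construct(callable $callback)
    {
        $this->callback = $callback;
    }

    public function generate(object $document): string
    {
        return ($this->callback)($document);
    }
}

// Spread articles over /articles/<category>/<year>/<month>/<slug> so that no
// single parent node collects an unbounded number of children.
$byCategoryAndDate = new CallbackIdGenerator(function (object $doc): string {
    return sprintf('/articles/%s/%s/%s', $doc->category, $doc->date->format('Y/m'), $doc->slug);
});
```

With a layout like this the number of children per node is bounded by what is published per category and month, which is the kind of "spanning" the callback generator is meant to allow.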

i will work on this sometime this week, unless someone is faster than me :)

regards,
Lukas Kahwe Smith
m...@pooteeweet.org

Jacopo Romei

Mar 7, 2011, 5:50:02 AM
to symfony-...@googlegroups.com
Hi,

> we agreed to rename "path" to "id"[...]. obviously this can create some confusion

We could still provide 'path oriented' proxy methods to feature a more
robust interface.

--
Jacopo Romei
http://www.sviluppoagile.it/
http://twitter.com/jacoporomei
http://www.anonimarmonisti.com/

Benjamin Eberlei

Mar 7, 2011, 1:13:36 PM
to symfony-...@googlegroups.com
I think this is a bad idea. JCR has both a path and UUID (id). They are both important concepts for a tree storage implementation that is jackalope/jackrabbit (JCR).

Lukas Kahwe Smith

Mar 7, 2011, 1:52:20 PM
to symfony-...@googlegroups.com

On 07.03.2011, at 19:13, Benjamin Eberlei wrote:

> I think this is a bad idea. JCR has both a path and UUID (id). They are both important concepts for a tree storage implementation that is jackalope/jackrabbit (JCR).


ok. let me first work on aligning the ODM interfaces. then i will look into making the AdminBundle work with MongoDB and PHPCR ODM and then i will see how much of a problem it is to deal with id vs. path in code that needs to introspect the classmetadata. then we can make a more educated decision, because yes a path is not as immutable as an id.

@ideato: i noticed you guys had an old version of the AdminBundle (BaseApplicationBundle) in one of the cmf-sandbox branches. did you guys get it to run? would be surprised with the obvious API difference with persist().

regards,
Lukas Kahwe Smith
sm...@pooteeweet.org

Lukas Kahwe Smith

Mar 7, 2011, 3:54:58 PM
to symfony-...@googlegroups.com

On 07.03.2011, at 19:52, Lukas Kahwe Smith wrote:

>
> On 07.03.2011, at 19:13, Benjamin Eberlei wrote:
>
>> I think this is a bad idea. JCR has both a path and UUID (id). They are both important concepts for a tree storage implementation that is jackalope/jackrabbit (JCR).
>
>
> ok. let me first work on aligning the ODM interfaces. then i will look into making the AdminBundle work with MongoDB and PHPCR ODM and then i will see how much of a problem it is to deal with id vs. path in code that needs to introspect the classmetadata. then we can make a more educated decision, because yes a path is not as immutable as an id.


here is a quick 30min pull to add id generator support:
https://github.com/doctrine/phpcr-odm/pull/1

Francesco Trucchia

Mar 8, 2011, 6:23:44 PM
to symfony-...@googlegroups.com
On Mon, Mar 7, 2011 at 7:52 PM, Lukas Kahwe Smith <sm...@pooteeweet.org> wrote:
> @ideato: i noticed you guys had an old version of the AdminBundle (BaseApplicationBundle) in one of the cmf-sandbox branches. did you guys get it to run? would be surprised with the obvious API difference with persist().
>

Thank you very much for noticing, but our integration with
BaseApplicationBundle never worked ;-) !! Now we are playing with the cmf
sandbox in our fork https://github.com/ideatosrl/cmf-sandbox where we
removed the BaseApplicationBundle.

Additionally, as mentioned last Saturday during the hack day, we agree
with your decision to make the persist interface compliant with the
standard ODM persist interface.

--
Francesco (cphp) Trucchia

Scrivimi: fran...@cphp.it
Leggi il mio blog: http://www.cphp.it
Guarda il mio profilo: http://www.linkedin.com/in/trucchia
Seguimi su twitter: http://twitter.com/cphp
Chiamami: callto://trucchia

David Buchmann

Mar 9, 2011, 9:49:41 AM
to symfony-...@googlegroups.com, Lukas Kahwe Smith
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

>> I think this is a bad idea. JCR has both a path and UUID (id). They are both important concepts for a tree storage implementation that is jackalope/jackrabbit (JCR).
>
> ok. let me first work on aligning the ODM interfaces. then i will look into making the AdminBundle work with MongoDB and PHPCR ODM and then i will see how much of a problem it is to deal with id vs. path in code that needs to introspect the classmetadata. then we can make a more educated decision, because yes a path is not as immutable as an id.

i feel it's better to have the path be the "id" in odm terms.
otherwise people will start to use the uuid all over the place.
but let's see what lukas tells us once he has worked on the interfaces.

if you want to link nodes together, phpcr has the concept of references,
where you do not pass ids around but directly the node, and the api
handles everything transparently for you. you can have weak references
(attribute points to other element) and strong references (referential
integrity), as well as store paths to nodes in ids. only the path
version will break if you change the path of the target node.

not sure if/how this concept is best mapped into odm terms.

cheers,david

- --
Liip AG // Agile Web Development // T +41 26 422 25 11
CH-1700 Fribourg // PGP 0xA581808B // www.liip.ch
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk13k4AACgkQqBnXnqWBgItSqwCgkWQy88i/33UOBxDc0LCer9/W
A/AAnREt9rofxGXmWUt9yzkcL0UPxBHO
=/eze
-----END PGP SIGNATURE-----

Benjamin Eberlei

Mar 9, 2011, 10:12:42 AM
to symfony-...@googlegroups.com
Yes but the id is not the path.

An ID is unique and it should never change.

Say we have a folder:

jcr:root/www.whitewashing.de/blog/articles/

And i have tons of blog articles in there:

* symfony-cmf-rules
* doctrine-release
* cookies-taste-good

Then the question is: What is the ID?

Neither the nodename, nor the full path can be really thought of as ID.
The shortname could be the id, because "renaming" a node could be thought
of as deleting and re-inserting. However renaming a folder would require
changing the ID of many subnodes if the full-path is the id. The UnitOfWork
should handle the generation of the Full Path of a node for any given
document. So writing the full path into the document is sort of no-go,
because you should use the Object-Graph for moving and all that stuff and
not change a full path and still have the references dangling at the wrong
objects.

I would say rename the Path to ID, if and only if it's only passed the
"shortname" and not the full path.

I am not sure how its done at the moment.

Could any of Liip maybe write up some more detailed information about how
the mapping takes place at the moment? I fail to understand it fully atm.

On Wed, 09 Mar 2011 15:49:41 +0100, David Buchmann
<david.b...@liip.ch>
wrote:

Lukas Kahwe Smith

Mar 9, 2011, 10:29:48 AM
to symfony-...@googlegroups.com

On 09.03.2011, at 16:12, Benjamin Eberlei wrote:

> Yes but the id is not the path.
>
> An ID is unique and it should never change.

you are missing one more: the ID should be required.

a JCR path is unique, required, but it may change, though when it changes you can leave a reference to make it available under the old path and the new path.

> Say we have a folder:
>
> jcr:root/www.whitewashing.de/blog/articles/
>
> And i have tons of blog articles in there:
>
> * symfony-cmf-rules
> * doctrine-release
> * cookies-taste-good
>
> Then the question is: What is the ID?

the path would be the absolute path
aka
* jcr:root/www.whitewashing.de/blog/articles/symfony-cmf-rules
* jcr:root/www.whitewashing.de/blog/articles/doctrine-release
* jcr:root/www.whitewashing.de/blog/articles/cookies-taste-good

now the thing is, JCR does not mandate any other identifier. a UUID is optional and discouraged. and this is really the crux.

> Neither the nodename, nor the full path can be really thought of as ID.
> The shortname could be the id, because "renaming" a node could be thought
> of as deleting and re-inserting. However renaming a folder would require
> changing the ID of many subnodes if the full-path is the id. The UnitOfWork
> should handle the generation of the Full Path of a node for any given
> document. So writing the full path into the document is sort of no-go,
> because you should use the Object-Graph for moving and all that stuff and
> not change a full path and still have the references dangling at the wrong
> objects.

well the entire path management stuff is handled by the JCR backend. internally a JCR backend might store the entire absolute path or it might just store the path of the parent and then the list of relative paths. but the user need not concern himself with this "denormalization" (aka a node path containing the parent path + relative path).

> I would say rename the Path to ID, if and only if its only passed the
> "shortname" and not the full path.

well this would mean that we essentially split the path in parent path and relative path. i am not sure if this will really be helpful, especially once my change makes it in that moves path "management" to id generators.

> I am not sure how its done at the moment.
>
> Could any of Liip maybe write up some more detailed information about how
> the mapping takes place at the moment? I fail to understand it fully atm.


I guess David and Jordi are the experts here ..

Lukas Kahwe Smith

Mar 10, 2011, 4:19:25 AM
to symfony-...@googlegroups.com
BTW: I would appreciate help implementing the ODM interfaces:
http://www.doctrine-project.org/jira/browse/DCOM-28

Lukas Kahwe Smith

Mar 11, 2011, 5:14:32 AM
to symfony-...@googlegroups.com
Hi,

I am spending a bit of time implementing the common ODM interfaces [1] and things are looking quite good. The last class to fix is the ClassMetadata, where I am looking at aligning things also with a ClassMetadataInfo class just like in MongoDB and ORM. While I am at it I also wanted to add support for Generator configuration. So now is sort of the time for me to decide whether to rename "path" to "id".

In most databases the id is a PK which has the following criteria:
1) unique
2) non NULL
3) immutable (though this is not enforced)

Now in our case a path matches 1) and 2) clearly. However it matches 3) "less" than in most other databases, especially since the path contains the parent path as a prefix to the relative node path.

However I think this is still acceptable, since in cases where one is concerned about keeping the path "immutable" one can simply leave a (weak) reference behind. This way the node will then still be available under the previous path and a separate process can then concern itself with maybe eventually cleaning up such references.

So I am still convinced that we should rename "path" to "id".

Now because the parent path is contained in any node's path, we will obviously have to provide facilities for moving entire subnodes as well as some event to easily update a path to match the parent node. For the latter I propose we define an interface which defines methods to read both the parent node and the relative path from a Document instance. Anyone implementing this interface can use this both for the path generator as well as in a provided event callback.

regards,
Lukas Kahwe Smith
sm...@pooteeweet.org

[1] http://www.doctrine-project.org/jira/browse/DCOM-28

Lukas Kahwe Smith

Mar 11, 2011, 7:13:21 AM
to symfony-...@googlegroups.com

On 11.03.2011, at 11:14, Lukas Kahwe Smith wrote:

> Now because the parent path is contained in any node's path, we will obviously have to provide facilities for moving entire subnodes as well as some event to easily update a path to match the parent node. For the latter I propose we define an interface which defines methods to read both the parent node and the relative path from a Document instance. Anyone implementing this interface can use this both for the path generator as well as in a provided event callback.


Thought about this some more and rather than an interface I think it's better to handle this via ClassMetadata, aka along the lines of the ability to inject a node instance, I will add the same option for the parent node and the subpath string.

This means we would have the following terminology:
id: path
node: node instance
parent: parent node instance
subpath: id without the parent path prefix
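
To make the terminology concrete, here is a small sketch of how an id could be split into parent path and subpath and joined back together. `splitId()` and `joinId()` are hypothetical helpers, and treating the subpath as the last path segment is an assumption; the thread only defines it as "id without the parent path prefix".

```php
<?php
// Illustrative helpers, not part of phpcr-odm.

// Splits an absolute path id into its parent path and its subpath
// (assumed here to be the last path segment).
function splitId(string $id): array
{
    if ($id === '' || $id[0] !== '/') {
        throw new InvalidArgumentException('Expected an absolute path, got: ' . $id);
    }
    $id = rtrim($id, '/');
    if ($id === '') {
        // The root node has no parent and an empty subpath.
        return ['parent' => null, 'subpath' => ''];
    }
    $pos = strrpos($id, '/');
    $parent = $pos === 0 ? '/' : substr($id, 0, $pos);

    return ['parent' => $parent, 'subpath' => substr($id, $pos + 1)];
}

// Rebuilds the id from a parent path and a subpath.
function joinId(string $parentPath, string $subpath): string
{
    return rtrim($parentPath, '/') . '/' . $subpath;
}
```

For example, `splitId('/blog/articles/symfony-cmf-rules')` yields the parent `/blog/articles` and the subpath `symfony-cmf-rules`, and `joinId()` applied to those two values gives back the original id.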

Benjamin Eberlei

Mar 11, 2011, 7:54:36 AM
to symfony-...@googlegroups.com
injecting the node should really be the last resort. Maybe we should
really go with using the full path explicitly as id, its not the nicest
solution, because the path is actually determined by references between the
different objects, but that happens to be this way then.

greetings,
Benjamin

Lukas Kahwe Smith

Mar 11, 2011, 8:00:26 AM
to symfony-...@googlegroups.com

On 11.03.2011, at 13:54, Benjamin Eberlei wrote:

> injecting the node should really be the last resort. Maybe we should
> really go with using the full path explicitly as id, its not the nicest
> solution, because the path is actually determined by references between the
> different objects, but that happens to be this way then.


injecting the node is supported already. of course this is all optional. however it's definitely a must have feature, because otherwise users cannot really easily benefit from all the node traversal and versioning API provided by PHPCR/JCR. so right now i expect most people to have configured things so that PHPCR will inject the jackalope node instance.

Benjamin Eberlei

Mar 11, 2011, 8:08:51 AM
to symfony-...@googlegroups.com
Injecting the node is currently only necessary because the PHPCR ODM has
no features with regard to associations (parent and children). If that is
implemented injecting the node becomes completely unnecessary. The
Versioning API should work by passing a PHPCR Document to the
DocumentManager versioning API methods.

/**
 * @jcr:document
 */
class Article
{
    /** @jcr:parent(targetEntity="Blog") */
    private $parent;

    /**
     * @jcr:children(filter="match only the comment children",
     *               targetEntity="Comment")
     */
    private $comments;
}

Lukas Kahwe Smith

Mar 11, 2011, 8:23:59 AM
to symfony-...@googlegroups.com

On 11.03.2011, at 14:08, Benjamin Eberlei wrote:

> Injecting the node is currently only necessary because the PHPCR ODM has
> no features with regard to associations (parent and children). If that is
> implemented injecting the node becomes completely unnecessary. The
> Versioning API should work by passing a PHPCR Document to the
> DocumentManager versioning API methods.
>
> /**
> * @jcr:document
> */
> class Article
> {
> /** @jcr:parent(targetEntity="Blog") */
> private $parent;
>
> /** @jcr:children(filter="match only the comment children",
> targetEntity="Comment") */
> private $comments;
> }


hmm i see. we would then support different fetch modes to also support lazy loading. so yeah in this case the need to access the node will become rare enough that in those cases one would just take the path (aka id) from the Document and talk to jackalope directly.

as for versioning: this isnt part of the node interface in JCR-283 anymore, since they wanted to keep the node interface from getting too big. so here we would still need some API in PHPCR ODM to easily be able to talk to the Jackalope version history manager but still get things cast into PHPCR ODM Document instances.

but this only changes things in so far as when i before talked about injecting the parent node, it would now inject the parent Document when configuring things accordingly. speaking of which since a single node in theory can have multiple parents we will probably need to support some configuration if the parent should be a collection or if it should assume there is only a single parent (and raise an exception if not).

so what i am trying to say is:
- in terms of renaming path to id nothing has changed for me
- we would however drop the support for injecting the node in favor of making it possible to inject (with lazy loading support etc just like in the ORM/ODM) parent/child Document instances

Lukas Kahwe Smith

Mar 11, 2011, 8:31:06 AM
to symfony-...@googlegroups.com


one more thing:
so how would we now move a node to a new place within the node graph?
would we simply on flush() check if the id (aka path) has changed?
maybe optionally in case the parent Document is injected we would instead check that and then automatically update the path.
similarly we would support automatically updating the path on flush() in case the subpath is changed
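
A rough sketch of the flush-time check described above, assuming a unit of work that remembers the id each document had when it was registered. The class name `MoveDetectingUnitOfWork` and the `id`, `parent` and `subpath` properties are invented for illustration. It deliberately ignores ordering between nested changes, so a child is only re-derived correctly if its parent's id is already up to date.

```php
<?php
// Sketch only: class and property names are invented for illustration.
final class MoveDetectingUnitOfWork
{
    /** @var array<string, object> managed documents keyed by spl_object_hash */
    private $documents = [];

    /** @var array<string, string> id each document had when registered or last checked */
    private $originalIds = [];

    public function register(object $document): void
    {
        $oid = spl_object_hash($document);
        $this->documents[$oid] = $document;
        $this->originalIds[$oid] = $document->id;
    }

    /** Returns a list of [oldPath, newPath] pairs for documents that moved. */
    public function computeMoves(): array
    {
        $moves = [];
        foreach ($this->documents as $oid => $document) {
            // If a parent document and a subpath are mapped, they take
            // precedence and the id is re-derived from them.
            if (isset($document->parent, $document->subpath)) {
                $document->id = rtrim($document->parent->id, '/') . '/' . $document->subpath;
            }
            if ($document->id !== $this->originalIds[$oid]) {
                $moves[] = [$this->originalIds[$oid], $document->id];
                $this->originalIds[$oid] = $document->id;
            }
        }

        return $moves;
    }
}
```

Each returned pair could then be handed to the storage layer as one move operation; whether the id, the parent or the subpath should win when they disagree is exactly the open question raised here.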

David Buchmann

Mar 11, 2011, 10:15:02 AM
to symfony-...@googlegroups.com, Lukas Kahwe Smith
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

hi lukas,

good news that you are making progress, looking forward to these changes.

i think using path as id sounds good. unique and not null is guaranteed.
immutable is not exactly guaranteed, but as long as you work within the
phpcr repository only, you can use node references as you said. and if
you need references from outside and expect paths to change, you can go
for the uuid.
the discussion about leaving references when moving: you would not do
that in phpcr-odm (or at least not by default, right?)

on changing path property to move nodes:
changing the path property could give some funny results. if i have
/some/path/parent/node and set the path of "parent" to
/some/other/path/parent and in the same session change the path of node
to /some/path/parent/something/node, result will depend on the order. if
you first move parent, then node, node ends up at the expected path. if
you first move node, then parent, node will end up in
/some/other/path/parent/something/node, because a node is moved with all
its children. (unless phpcr-odm changes this behaviour and does not do a
$session->move() )
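
The order dependence can be reproduced with a toy model. `ToyTree` below is not Jackalope or PHPCR code; it only mimics the "a node is moved with all its children" semantics, and unlike PHPCR it does not check that the destination parent exists.

```php
<?php
// Toy model of subtree moves, for illustration only.
final class ToyTree
{
    /** @var array<string, string> node name => current absolute path */
    public $paths = [];

    public function move(string $name, string $newPath): void
    {
        $oldPath = $this->paths[$name];
        foreach ($this->paths as $other => $path) {
            if ($path === $oldPath) {
                // The moved node itself.
                $this->paths[$other] = $newPath;
            } elseif (strpos($path, $oldPath . '/') === 0) {
                // A descendant: swap the old prefix for the new one.
                $this->paths[$other] = $newPath . substr($path, strlen($oldPath));
            }
        }
    }
}

function freshTree(): ToyTree
{
    $tree = new ToyTree();
    $tree->paths = [
        'parent' => '/some/path/parent',
        'node'   => '/some/path/parent/node',
    ];

    return $tree;
}

// Order A: parent first, then node.
$a = freshTree();
$a->move('parent', '/some/other/path/parent');
$a->move('node', '/some/path/parent/something/node');

// Order B: node first, then parent.
$b = freshTree();
$b->move('node', '/some/path/parent/something/node');
$b->move('parent', '/some/other/path/parent');
```

In order A the node stays at `/some/path/parent/something/node`; in order B the later move of the parent drags it along to `/some/other/path/parent/something/node`.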

in phpcr, you can only add nodes as child of existing nodes, and moving
is only possible if the parent already exists.
the second rule could make sense for phpcr-odm too. the first one is
currently not respected if i am right? phpcr-odm just adds
nt:unstructured nodes to have the parents unless i am mistaken.

about the hierarchy stuff: it would be awesome to have parent and child
information known to doctrine. i dug up an older email from this list,
in reply to jordi asking about relations.
for the parent i just remember that jcr-283 actually allows cyclic
graphs, in other words a node can have more than one parent. however,
the api still has the getParent method that returns you exactly one node.

Bulat Shakirzyanov - 25.1.2011

Well in MongoDB ODM along with

* @mongodb:ReferenceOne(targetDocument="SomeClass")
* @mongodb:ReferenceMany(targetDocument="SomeClass")

we have

* @mongodb:EmbedOne(targetDocument="SomeClass") and
* @mongodb:EmbedMany(targetDocument="SomeClass")


i think we will still need to access the phpcr nodes for other things
(there is a couple of modules in jcr-283, and we would really either be
copying the api or rewriting it in different methods.)
if there are other areas all doctrine-odm does in a consistent way, it
would of course make sense to map this in phpcr-odm if we can model it
on top of phpcr.

cheers,
david

- --

Liip AG // Agile Web Development // T +41 26 422 25 11
CH-1700 Fribourg // PGP 0xA581808B // www.liip.ch
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk16PHYACgkQqBnXnqWBgIt9gQCgrLqOGH/Y+xAozJ93Mco9GXhi
tqkAoMwRbjsGv9mFooglBPQWb8iF0deA
=6xA+
-----END PGP SIGNATURE-----

Lukas Kahwe Smith

Mar 11, 2011, 10:26:03 AM
to David Buchmann, symfony-...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On 11.03.2011, at 16:15, David Buchmann wrote:

> the discussion about leaving references when moving: you would not do
> that in phpcr-odm (or at least not by default, right?)

not by default. we might just provide facilities to make this easier. but the big question is more what you discuss below, aka how to move stuff in the node tree. once we have that it's easy to make it configurable to leave references in case we want to make this easier.

> on changing path property to move nodes:
> changing the path property could give some funny results. if i have
> /some/path/parent/node and set the path of "parent" to
> /some/other/path/parent and in the same session change the path of node
> to /some/path/parent/something/node, result will depend on the order. if
> you first move parent, then node, node ends up at the expected path. if
> you first move node, then parent, node will end up in
> /some/other/path/parent/something/node, because a node is moved with all
> its children. (unless phpcr-odm changes this behaviour and does not do a
> $session->move() )

"same session" you mean "same flush".
the main tricky bit is that of course with such moving objects could get out of sync. then again Doctrine2 has a reference to all of these objects so we could automatically get them in sync again.

as for doing crazy stuff that leads to breakage, wouldnt the fact that we do the flush inside a transaction cause the transaction to fail in the storage backend (aka Jackrabbit) and therefore we need not worry about that too much?

> in phpcr, you can only add nodes as child of existing nodes, and moving
> is only possible if the parent already exists.

right. every workspace has a single root node.

> the second rule could make sense for phpcr-odm too. the first one is
> currently not respected if i am right? phpcr-odm just adds
> nt:unstructured nodes to have the parents unless i am mistaken.

not sure i understand.

> about the hierarchy stuff: it would be awesome to have parent and child
> information known to doctrine. i dug up an older email from this list,
> in reply to jordi asking about relations.
> for the parent i just remember that jcr-283 actually allows cyclic
> graphs, in other words a node can have more than one parent. however,
> the api still has the getParent method that returns you exactly one node.
>
> Bulat Shakirzyanov - 25.1.2011
>
> Well in MongoDB ODM along with
>
> * @mongodb:ReferenceOne(targetDocument="SomeClass")
> * @mongodb:ReferenceMany(targetDocument="SomeClass")
>
> we have
>
> * @mongodb:EmbedOne(targetDocument="SomeClass") and
> * @mongodb:EmbedMany(targetDocument="SomeClass")

right .. we should definitely do references.

> i think we will still need to access the phpcr nodes for other things
> (there is a couple of modules in jcr-283, and we would really either be
> copying the api or rewriting it in different methods.)
> if there are other areas all doctrine-odm does in a consistent way, it
> would of course make sense to map this in phpcr-odm if we can model it
> on top of phpcr.


well the question is how often do you need these APIs? because after all you can always take the path from a Document instance and fetch the node from jackalope directly. not sure if jackalope already does, but i would expect jackalope to have a simple map path -> node instance to check before fetching a node from the backend again. so you should easily with little overhead be able to fetch the jackalope node instance for any Document instance.

regards,
Lukas Kahwe Smith
sm...@pooteeweet.org

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Darwin)

iEYEARECAAYFAk16PwsACgkQ2FnmJDTfRPdtrwCeKOYX1KDZcD/Tzo9r3s0PX6oq
mckAn2JHrRFiXjQsVZSjqAfMkLnA3G74
=+RED
-----END PGP SIGNATURE-----

David Buchmann

Mar 12, 2011, 3:38:25 AM
to symfony-...@googlegroups.com, Lukas Kahwe Smith
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

hi lukas,

>> on changing path property to move nodes:
>> changing the path property could give some funny results. if i have
>> /some/path/parent/node and set the path of "parent" to
>> /some/other/path/parent and in the same session change the path of
>> node to /some/path/parent/something/node, result will depend on the
>> order. if you first move parent, then node, node ends up at the
>> expected path. if you first move node, then parent, node will end up
>> in /some/other/path/parent/something/node, because a node is moved
>> with all its children. (unless phpcr-odm changes this behaviour and
>> does not do a $session->move() )
>
> "same session" you mean "same flush".

arr... yes, exactly. in PHPCR, its SessionInterface::save(). i should
have said transaction instead of session.

> the main tricky bit is that of course with such moving objects could
> get out of sync. then again Doctrine2 has a reference to all of these
> objects so we could automatically get them in sync again.

jackalope should record the changes in order. (not sure if everything is
correctly implemented on that end atm, though. we should make sure that
we have enough tests that check this. i will look into that)

does odm also keep the order of operations? well, i guess it could
at max keep the order of the calls to persist? if it does, then its the
responsibility of the user to make sure he does what he wants and takes
care of the right order. but there could be surprises if you are not
aware of the hierarchical nature of things.

> as for doing crazy stuff that leads to breakage, wouldnt the fact
> that we do the flush inside a transaction cause the transaction to
> fail in the storage backend (aka Jackrabbit) and therefore we need
> not worry about that too much?

if you try something that is violating constraints (for example with
node types restricting allowed children), phpcr-odm will get a phpcr
exception on flush, yes. (most of the constraints are only checked on
the flush -> PHPCR Session->save() operation, and not immediately because
we rely on the jackrabbit backend to check things)

but my example is not leading to breakage, its totally ok for phpcr. but
if order of operations is not maintained, the result is unpredictable.

>> in phpcr, you can only add nodes as child of existing nodes, and
>> moving is only possible if the parent already exists.
>
> right. every workspace has a single root node.
>
>> the second rule could make sense for phpcr-odm too. the first one is
>> currently not respected if i am right? phpcr-odm just adds
>> nt:unstructured nodes to have the parents unless i am mistaken.
>
> not sure i understand.

i think in phpcr-odm you can add a node at /path/to/node to an empty
repository and phpcr-odm would create nodes "path" and "to" in order to
attach "node" at the right place. or am i wrong about this?
if its true, this is a bit different behaviour from phpcr where you
create the tree node by node.

>> i think we will still need to access the phpcr nodes for other things
>> (there is a couple of modules in jcr-283, and we would really either
>> be copying the api or rewriting it in different methods.)
>> if there are other areas all doctrine-odm does in a consistent way,
>> it would of course make sense to map this in phpcr-odm if we can
>> model it on top of phpcr.
>
> well the question is how often do you need these APIs? because after
> all you can always take the path from a Document instance and fetch
> the node from jackalope directly. not sure if jackalope already does,
> but i would expect jackalope to have a simple map path -> node
> instance to check before fetching a node from the backend again. so
> you should easily with little overhead be able to fetch the jackalope
> node instance for any Document instance.

totally agree. i wrote this to support your position on the topic :-)
my point was this can be done at some point when needed, but i see no
need now.
and yes, jackalope caches nodes it loaded during the session, so getting
the node from the phpcr session instead of from doctrine would add just
one more method call and one array lookup, this seems totally acceptable
for me.

cheers,david

- --

Liip AG // Agile Web Development // T +41 26 422 25 11
CH-1700 Fribourg // PGP 0xA581808B // www.liip.ch

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk17MP0ACgkQqBnXnqWBgItiLgCgtB+k/1m7b8ugcjJg1zmCQW4U
glsAoJezDNNUFkW7JfXuA1uQhjXfHGIp
=MNB3
-----END PGP SIGNATURE-----

Benjamin Eberlei

Mar 13, 2011, 6:08:38 AM
to symfony-...@googlegroups.com
We should really think of how the internals work when object references
change in terms of examples:

/jcr:root
/jcr:root/articles/foo/comments/bar
/jcr:root/articles/bar/comments/baz

Now in code we move a comment from article "foo", to article "bar"

$comment = $fooArticle->comments[0];
unset($fooArticle->comments[0]);
$barArticle->comments[] = $comment;
$dm->flush();

At that point the only thing that has changed is the two collections, i.e.
comment removed from one collection and added to another.

So: When adding an object to a collection the path changes. The DM
UnitOfWork has to detect that.

Next Use case with the previous example, renaming "foo" article to "baz"
article. This means all the subpaths of "foo" also change. For this reason
I think we need an identity map that is path based, so that we can find all
objects below a certain path for in memory changes.

Also this operation should use a move() operation on jackalope (if such a
thing exists) to be efficient on the storage side.
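
A minimal sketch of what a path-based identity map could look like, so that all managed objects below a certain path can be found and re-keyed in memory. `PathIdentityMap` and its methods are hypothetical names, documents are assumed to expose their path as a public `id` property, and moving a node into its own subtree is not handled.

```php
<?php
// Hypothetical path-keyed identity map; names are illustrative only.
final class PathIdentityMap
{
    /** @var array<string, object> absolute path => document */
    private $byPath = [];

    public function add(object $document): void
    {
        $this->byPath[$document->id] = $document;
    }

    public function get(string $path): ?object
    {
        return $this->byPath[$path] ?? null;
    }

    /** Returns all managed documents strictly below $path. */
    public function findBelow(string $path): array
    {
        $prefix = rtrim($path, '/') . '/';
        $found = [];
        foreach ($this->byPath as $candidate => $document) {
            if (strpos($candidate, $prefix) === 0) {
                $found[] = $document;
            }
        }

        return $found;
    }

    /** Re-keys a document and everything below it after a rename or move. */
    public function move(string $oldPath, string $newPath): void
    {
        $affected = $this->findBelow($oldPath);
        if (isset($this->byPath[$oldPath])) {
            $affected[] = $this->byPath[$oldPath];
        }
        foreach ($affected as $document) {
            unset($this->byPath[$document->id]);
            // Swap the old path prefix for the new one.
            $document->id = $newPath . substr($document->id, strlen($oldPath));
            $this->byPath[$document->id] = $document;
        }
    }
}
```

Matching on the prefix plus a trailing slash keeps a sibling such as `/articles/foobar` from being treated as a child of `/articles/foo`. The linear scan is the simplest choice; a tree-shaped index would be the obvious refinement if the map grows large.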

These are just two use-cases, which end up pretty complicated. Having the
node inside the Document makes it easier maybe, but then PHPCR ODM is of no
use, importing the node is like injecting a database connection into an
entity.

I can probably think of many more use-cases. We have to really make a list
and start finding a way how we can handle all this internally, thinking
about the datastructure, otherwise this whole concept will end up with lots
of special cases that break the usage. It has to be simple to use and just
work with associations and traversing, modifying them.


On Sat, 12 Mar 2011 09:38:25 +0100, David Buchmann
<david.b...@liip.ch>
wrote:

Lukas Kahwe Smith

Mar 13, 2011, 8:49:33 AM
to symfony-...@googlegroups.com

On 13.03.2011, at 11:08, Benjamin Eberlei wrote:

> We should really think of how the internals work when object references
> change in terms of examples:
>
> /jcr:root
> /jcr:root/articles/foo/comments/bar
> /jcr:root/articles/bar/comments/baz
>
> Now in code we move a comment from article "foo", to article "bar"
>
> $comment = $fooArticle->comments[0];
> unset($fooArticle->comments[0]);
> $barArticle->comments[] = $comment;
> $dm->flush();
>
> At that point the only thing that has changed is the two collections, i.e.
> comment removed from one collection and added to another.
>
> So: When adding an object to a collection the path changes. The DM
> UnitOfWork has to detect that.

in the same spirit we would need to define what happens if the reference in the previous collection isnt unset():
1) clone
2) set a reference
3) set a weak reference

> Next Use case with the previous example, renaming "foo" article to "baz"
> article. This means all the subpaths of "foo" also change. For this reason
> I think we need an identity map that is path based, so that we can find all
> objects below a certain path for in memory changes.
>
> Also this operation should use a move() operation on jackalope (if such a
> thing exists) to be efficient on the storage side.

i think it already has such an in memory map, but i am not 100% sure.
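to make the idea of a path-based identity map concrete, here is a minimal sketch. `PathIdentityMap` and its methods are hypothetical names used only for illustration, they are not part of PHPCR ODM or Jackalope. it only shows how in-memory objects below a path could be found and re-keyed when a subtree is renamed:

```php
<?php
// Hypothetical sketch of a path-keyed identity map (not the real UnitOfWork).
final class PathIdentityMap
{
    /** @var array<string, object> documents keyed by absolute path */
    private array $documents = [];

    public function register(string $path, object $document): void
    {
        $this->documents[$path] = $document;
    }

    public function get(string $path): ?object
    {
        return $this->documents[$path] ?? null;
    }

    /** Return every registered path that is $path itself or lies below it. */
    public function pathsBelow(string $path): array
    {
        $prefix = rtrim($path, '/') . '/';

        return array_values(array_filter(
            array_keys($this->documents),
            static fn (string $p): bool => $p === $path
                || strncmp($p, $prefix, strlen($prefix)) === 0
        ));
    }

    /** Re-key a document and all registered descendants after a rename. */
    public function move(string $from, string $to): void
    {
        if ($from === $to) {
            return;
        }
        foreach ($this->pathsBelow($from) as $old) {
            $new = $to . substr($old, strlen($from));
            $this->documents[$new] = $this->documents[$old];
            unset($this->documents[$old]);
        }
    }
}

$map = new PathIdentityMap();
$map->register('/articles/foo', new stdClass());
$map->register('/articles/foo/comments/bar', new stdClass());
$map->register('/articles/foobar', new stdClass());

// Renaming article "foo" to "baz" also re-keys its comments,
// but leaves the unrelated sibling "/articles/foobar" alone.
$map->move('/articles/foo', '/articles/baz');
```

the prefix check appends a trailing slash so that a sibling whose name merely starts with the same characters is not treated as a child.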

> These are just two use cases, which end up pretty complicated. Having the
> node inside the Document makes it easier maybe, but then PHPCR ODM is of no
> use, importing the node is like injecting a database connection into an
> entity.
>
> I can probably think of many more use cases. We really have to make a list
> and start finding a way to handle all this internally, thinking
> about the data structure, otherwise this whole concept will end up with lots
> of special cases that break the usage. It has to be simple to use and just
> work with associations and traversing, modifying them.


yes this is probably the right approach. aka looking for use cases that would lead to manipulations of the path.

so i think the game plan from my POV for next 1-2 weeks:
- go through and define as many unique use cases as possible
- define the API based on this but keep an eye on the existing ODMs
- implement the changes

i will talk to Liip mgmt to ensure we can allocate the necessary resources, but obviously now is the time for others to join in to help make this happen. i expect this to be the last major API refactoring of the PHPCR ODM that we need ahead of our first release. but i also feel like we should move this forward quickly because everything else needs to build on this.

David Buchmann

Mar 13, 2011, 9:17:29 AM
to symfony-...@googlegroups.com, Lukas Kahwe Smith

>> Also this operation should use a move() operation on jackalope (if such a
>> thing exists) to be efficient on the storage side.
>
> i think it already has such an in memory map, but i am not 100% sure.

yes, PHPCR defines a Session->move(srcpath, targpath) operation that
moves a node and all its children somewhere else.

>> These are just two use cases, which end up pretty complicated. Having the
>> node inside the Document makes it easier maybe, but then PHPCR ODM is of no
>> use, importing the node is like injecting a database connection into an
>> entity.
>>
>> I can probably think of many more use cases. We really have to make a list
>> and start finding a way to handle all this internally, thinking
>> about the data structure, otherwise this whole concept will end up with lots
>> of special cases that break the usage. It has to be simple to use and just
>> work with associations and traversing, modifying them.
>
> yes this is probably the right approach. aka looking for use cases that would lead to manipulations of the path.

i wonder if we really want this kind of "implicit move". it seems like
we have to define a lot of things and implement a lot of logic.
at the same time, it will become much more complicated for the user to
understand what he is doing implicitly with some setter operation.
instead we could restrict the odm to just add and remove stuff and
declare the path immutable.
when the user needs a move operation, he can explicitly do that through
phpcr (or phpcr exposed with explicit move and clone operations by
phpcr-odm)

thus, references would still be possible, but containment would be read
only. changing hierarchy would happen through explicit move operations.
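a rough sketch of what "immutable path plus explicit move" could mean for the user. `TreeManager` is a made-up name for illustration and the in-memory bookkeeping is an assumption, in a real implementation the move would be delegated to the PHPCR session:

```php
<?php
// Hypothetical sketch: a path is assigned once and only changes via move().
final class TreeManager
{
    /** @var array<int, string> object id => absolute path */
    private array $paths = [];

    /** @var array<int, object> keeps documents alive so object ids stay unique */
    private array $documents = [];

    public function persist(object $document, string $path): void
    {
        $id = spl_object_id($document);
        if (isset($this->paths[$id])) {
            throw new LogicException('Path is immutable once assigned; use move() instead.');
        }
        $this->paths[$id] = $path;
        $this->documents[$id] = $document;
    }

    public function getPath(object $document): string
    {
        $id = spl_object_id($document);
        if (!isset($this->paths[$id])) {
            throw new LogicException('Document is not managed.');
        }

        return $this->paths[$id];
    }

    /** Move a document together with every managed descendant. */
    public function move(object $document, string $targetPath): void
    {
        $source = $this->getPath($document);
        $prefix = $source . '/';
        foreach ($this->paths as $id => $path) {
            if ($path === $source) {
                $this->paths[$id] = $targetPath;
            } elseif (strncmp($path, $prefix, strlen($prefix)) === 0) {
                $this->paths[$id] = $targetPath . substr($path, strlen($source));
            }
        }
    }
}

$manager = new TreeManager();
$article = new stdClass();
$comment = new stdClass();
$manager->persist($article, '/articles/foo');
$manager->persist($comment, '/articles/foo/comments/bar');

// The only way to change the hierarchy is the explicit call.
$manager->move($article, '/articles/baz');
```

with this shape, adding a document to a collection never changes its path as a side effect; the user has to ask for the move.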


just in case you prefer the other way, some thoughts on adding an entity
somewhere else:

> in the same spirit we would need to define what happens if the
> reference in the previous collection isn't unset():
> 1) clone
> 2) set a reference
> 3) set a weak reference

maybe have multiple parents? i think all of the 3 propositions would be
unexpected behaviour

http://www.day.com/specs/jcr/2.0/3_Repository_Model.html#3.9%20Shareable%20Nodes%20Model

i think references should only be used for reference type, never for
containment.

cheers,david


- --
Liip AG // Agile Web Development // T +41 26 422 25 11
CH-1700 Fribourg // PGP 0xA581808B // www.liip.ch

Uwe Jäger

Mar 14, 2011, 6:01:38 AM
to symfony-...@googlegroups.com
Hi,

already a bit late in the discussion, but i agree with David that keeping things simple in the ODM will save us trouble and effort. PHPCR or JCR basically differs from other data stores in that it is a hierarchical storage (the most common storage of that kind that comes to my mind is a filesystem). The filesystem could be an analogy for the ODM operations that everybody should easily understand. I also think that the PHPCR-ODM should expose that hierarchical structure in the API and should differ from other ODMs (what would be the point of using PHPCR anyway if it works the same as any other storage). The main practical reason to use the ODM is actually the possibility to use simple PHP objects (I need to get used to writing POPOs ...) in your code, which then integrates easily with other components.

While we work along the line to get more cm-functionality we will see what is really needed and useful in the API.

Kind regards
Uwe


2011/3/13 David Buchmann <david.b...@liip.ch>

David Buchmann

Mar 14, 2011, 6:16:29 AM
to symfony-...@googlegroups.com, Uwe Jäger

hi,

i checked some things on PHPCR and i think there is one operation we
should ask the user to do through the odm rather than directly on
phpcr: Session::move

https://github.com/doctrine/phpcr-odm/wiki/Tree-operations

cheers,
david

On 14.03.2011 11:01, Uwe Jäger wrote:
> Hi,
>
> already a bit late in the discussion but i agree with David, that
> keeping things simple in the ODM will save us trouble and effort. PHPCR
> or JCR basically differs from other data stores that it is a
> hierarchical storage (the most common storage of that kind that comes
> into my mind is a filesystem). The filesystem could be an analogy for
> the ODM operations that everybody should easily understand. I also think
> that the PHPCR-ODM should expose that hierarchical structure in the API
> and should differ from other ODMs (what would the point be to use PHPCR
> anyway if it works the same as any other storage). The main practical
> reason to use the ODM is actually the possibility to use simple PHP
> objects (I need to get used to writing POPOs ...) in your code that then
> easily integrates with other components.
>
> While we work along the line to get more cm-functionality we will see
> what is really needed and useful in the API.
>
> Kind regards
> Uwe
>
>
> 2011/3/13 David Buchmann <david.b...@liip.ch>


Uwe Jäger

Mar 14, 2011, 7:06:11 AM
to symfony-...@googlegroups.com
Hi,

yes, agree. And having getParent/getChildren might be useful as well. In case some nodes are not mapped by the ODM we need either to ignore them or have a generic "document" that has the node injected.

The clone operations mentioned in the wiki page could be helpful when you want to implement a kind of staging system. But I think that is beyond the scope of the ODM ...

Cheers
Uwe


2011/3/14 David Buchmann <david.b...@liip.ch>

hi,

i checked some things on PHPCR and i think there is one operation we
should ask the user to do through the odm rather than directly on
phpcr: Session::move

https://github.com/doctrine/phpcr-odm/wiki/Tree-operations

cheers,
david

Lukas Kahwe Smith

Mar 14, 2011, 8:02:44 AM
to symfony-...@googlegroups.com
Hi,

I was just talking to David over lunch.
In ODMs there is the distinction between references and embedded documents.

Now for a referenced Document, assigning it to another Document as a child, it would set a (weak) reference. For an embedded Document it might however do a clone or a move. Anyway, I will try to work on a use case overview tonight.

Everybody is invited to help once I have created that page, or start it before me.

A few more notes on the aims of PHPCR ODM from my POV:
- making it easier for developers to take their ODM knowledge and apply it to the CMF
- making it easier to share code that works with other ODMs

Now neither of these is binary (aka it might end up that things are 80% compatible, but that's already better than only 30%).

But everything that doesn't sensibly fit within the ODM API, we will just support via Jackalope. For example, when one manually wants to create a transaction with the ORM, it's done on the DBAL connection. We do not duplicate the method on the EntityManager. In the same spirit, if there is something PHPCR specific for which we do not find a sensible representation that matches other ODMs, we should probably just require that the user deal directly with Jackalope.

regards,
Lukas

Lukas Kahwe Smith

Mar 15, 2011, 5:11:45 AM
to symfony-...@googlegroups.com
Hi,

I just quickly wrote down what I have learned from this thread so far:
https://github.com/doctrine/phpcr-odm/wiki/Path-handling

Lukas Kahwe Smith

Mar 17, 2011, 11:21:04 AM
to symfony-...@googlegroups.com

On 15.03.2011, at 10:11, Lukas Kahwe Smith wrote:

> Hi,
>
> I just quickly wrote down what I have learned from this thread so far:
> https://github.com/doctrine/phpcr-odm/wiki/Path-handling


In case it wasn't obvious .. I am kind of waiting on feedback on this :)

Also help in fleshing out the concrete examples would be appreciated. Especially if we do decide to leverage the separation between path renaming and referencing via embedded/reference annotations as used in MongoDB.

Uwe Jäger

Mar 21, 2011, 8:46:20 AM
to symfony-...@googlegroups.com
Hi there,

just my comments:

The JCR defines a hierarchical data storage which has no equivalent in MongoDB or CouchDB. I agree with Lukas that we should have the stuff that is the same in all ODMs named the same way and working the same way. But what should we do with the things that are different?

1. Path: while the path acts like an id and the ODM and code on top actually have no use for the internal id used by the JCR, it also expresses more than the well-known id: first, we learn where in the hierarchy the node lives (this is also used by the ODM to create nodes), and second, the path can act as a human-readable textual representation of the node (aka slugs); for pages it could map directly to URLs. So I would prefer to keep the path named path. The JCR id could be exposed in case someone really needs it.

2. Changing paths: as Lukas points out changing the path of a node through the ODM can have different reasons and maybe different results are expected. In order to keep things simple (at least for now) I propose that there be a move operation (maybe in the DocumentManager) or PHPCR must be used. Oh, and path should be immutable ...

3. References: they should be implemented the JCR way: the document has the Reference annotation, the underlying node a property of either type REFERENCE or WEAKREFERENCE. The difference could be specified by a flag weak of the annotation, which defaults to true. The JCR spec requires referenced nodes to have the type mix:referenceable, so either this must be declared on the document or added to the node when it is detected that the node will be referenced. The latter would require a check every time ...

4. Parent-Child relation: this is a different concept than embedded documents and basically a USP for JCR (compared to the others mentioned above). While in most cases a child will be created by giving it the right path sometimes it might be easier to have this cascaded by using a child annotation. That way the ODM will take care of creating the specified children and also pull the children from the repository when the parent is fetched. Since the children are just ordinary nodes they can be fetched, modified and referenced just like any other node. Btw this was quite useful while trying to implement files in the ODM.

5. PHPCR: for things not in the ODM we can always use PHPCR directly. But that makes all code on top of the ODM dependent on PHPCR. So if there is anything PHPCR-special that can be abstracted by the ODM it should be done because it makes application code on top simpler and maybe a bit more portable. 
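To illustrate point 3, here is a small sketch of how a "weak" flag on a reference mapping could select the JCR property type, with weak as the default. The function names and the array-shaped mapping are assumptions made for this example only; the type names follow the JCR property type names and `mix:referenceable` is the mixin the spec requires on the target:

```php
<?php
// Hypothetical helpers, not the PHPCR ODM API.
const TYPE_REFERENCE = 'Reference';
const TYPE_WEAKREFERENCE = 'WeakReference';

/** Pick the property type for a reference mapping; "weak" defaults to true. */
function referencePropertyType(array $referenceMapping): string
{
    $weak = $referenceMapping['weak'] ?? true;

    return $weak ? TYPE_WEAKREFERENCE : TYPE_REFERENCE;
}

/** Refuse to reference a node that does not carry mix:referenceable. */
function assertReferenceable(array $targetMixins): void
{
    if (!in_array('mix:referenceable', $targetMixins, true)) {
        throw new InvalidArgumentException('Referenced node must have mix:referenceable.');
    }
}

$defaultType = referencePropertyType([]);
$hardType = referencePropertyType(['weak' => false]);
assertReferenceable(['mix:referenceable']);
```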

Kind regards
Uwe

2011/3/17 Lukas Kahwe Smith <sm...@pooteeweet.org>

Lukas Kahwe Smith

Mar 21, 2011, 9:11:17 AM
to symfony-...@googlegroups.com

On 21.03.2011, at 13:46, Uwe Jäger wrote:

> Hi there,
>
> just my comments:
>
> The JCR defines a hierarchical data storage which has no equivalent in MongoDB or CouchDB. I agree with Lukas that we should have the stuff that is the same in all ODMs named the same way and working the same way. But what should we do with the things that are different?
>
> 1. Path: while the path acts like an id and the ODM and code on top actually has no use of the internal id used by the jcr, it also expresses more than the well known id: first we learn where in the hierarchy the node lives (this is also used by the ODM to create nodes) and the path can act as a human readable textual representation of the node (aka slugs), for pages it could map directly to urls. So I would prefer to keep the path named path. The jcr id could be exposed in case someone really needs it.

With "jcr id" you mean uuid?
My point is that the fact that a JCR path also provides more than an id doesn't mean that we shouldn't rename it to id for better compatibility with the other ODMs. Aka having it renamed to id doesn't mean you cannot also map things so that the id (or rather subpath) is also included in a slug property for PHPCR ODM, while it's mapped and stored separately in MongoDB.

As for the hierarchy. I think in practice one would rather want to have a way to access parent or child Document instances rather than introspecting the path, especially since a path can also be misleading in case of references and multiple parents.

> 3. References: they should be implemented the JCR way: the document has the Reference annotation, the underlying node a property of either type REFERENCE or WEAKREFERENCE. The difference could be specified by a flag weak of the annotation, which defaults to true. The JCR spec requires referenced nodes to have the type mix:referenceable, so either this must be declared on the document or added to the node when it is detected that the node will be referenced. The latter would require a check every time ...

Right, aside from requiring "reference-able" and giving the option of "weak" references, this is fairly in line with how things are in the other ODMs. The other ODMs also support "properties" on references which we can use to flag if the reference is weak or not, though imho we should default to "weak".

As for reference-able, we can determine if a node is referenced since we have the complete mapping configuration, so we can set this automatically, though for the "paranoid" we could make this behavior optional. Also, I am not sure if this is relevant for us, but in theory there could also be references to nodes which are not mapped in PHPCR ODM, for which we would then not be able to automatically set reference-able. Oh, and of course if a previously unreferenced node becomes referenced we wouldn't update the nodes to become referenceable at runtime, though we might provide some tool to update existing nodes.

> 4. Parent-Child relation: this is a different concept than embedded documents and basically a USP for JCR (compared to the others mentioned above). While in most cases a child will be created by giving it the right path sometimes it might be easier to have this cascaded by using a child annotation. That way the ODM will take care of creating the specified children and also pull the children from the repository when the parent is fetched. Since the children are just ordinary nodes they can be fetched, modified and referenced just like any other node. Btw this was quite useful while trying to implement files in the ODM.

Yeah, I agree that probably making "children" equivalent to "embedded" was stretching things a bit too far. Then again at least with MongoDB you can interact quite easily with embedded documents directly as well.

The main reason why I proposed to think of them in equivalent terms was more about a "special" kind of child relation and not the general child relations.

Aka if you have a page tree, then I wouldn't consider subpages to be "embedded documents". Rather what I meant was content that really is considered inseparable from the parent, a concept that doesn't really exist in JCR, but that still might be useful when it comes to fetching content with fewer roundtrips.

The idea being that if I have a blog post and I attach a specific image to that blog post (instead of placing it in the media manager and referencing it in my blog post) along with a teaser and a body, then I might store all of these as children, but in effect I will always fetch the parent and children as one. This is then fairly similar to embedded documents.

> 5. PHPCR: for things not in the ODM we can always use PHPCR directly. But that makes all code on top of the ODM dependent on PHPCR. So if there is anything PHPCR-special that can be abstracted by the ODM it should be done because it makes application code on top simpler and maybe a bit more portable.

Right. But if we have to invent a specialized API to expose the feature in PHPCR ODM for which we already have a usable API in Jackalope, it would be needless maintenance and runtime overhead to wrap it into another API in PHPCR ODM. In the end Jackalope is a hard dependency anyway already, so all we have to take care of is making sure that the Jackalope instance is easily reachable. For this we should add a method on the DocumentManager along the lines of getConnection() in the other ODMs.

Uwe Jäger

Mar 21, 2011, 10:20:14 AM
to symfony-...@googlegroups.com
Hi,

2011/3/21 Lukas Kahwe Smith <sm...@pooteeweet.org>


On 21.03.2011, at 13:46, Uwe Jäger wrote:

> Hi there,
>
> just my comments:
>
> The JCR defines a hierarchical data storage which has no equivalent in MongoDB or CouchDB. I agree with Lukas that we should have the stuff that is the same in all ODMs named the same way and working the same way. But what should we do with the things that are different?
>
> 1. Path: while the path acts like an id and the ODM and code on top actually has no use of the internal id used by the jcr, it also expresses more than the well known id: first we learn where in the hierarchy the node lives (this is also used by the ODM to create nodes) and the path can act as a human readable textual representation of the node (aka slugs), for pages it could map directly to urls. So I would prefer to keep the path named path. The jcr id could be exposed in case someone really needs it.

> With "jcr id" you mean uuid?

Yes, I mean jcr:uuid as the property is called in JCR.
 
> My point is that the fact that a JCR path also provides more than an id doesn't mean that we shouldn't rename it to id for better compatibility with the other ODMs. Aka having it renamed to id doesn't mean you cannot also map things so that the id (or rather subpath) is also included in a slug property for PHPCR ODM, while it's mapped and stored separately in MongoDB.

I can live with renaming path to id ... :-)
 
> As for the hierarchy. I think in practice one would rather want to have a way to access parent or child Document instances rather than introspecting the path, especially since a path can also be misleading in case of references and multiple parents.

Yes, see below. I think in most cases you will already know the parent when you traverse the hierarchy. But it is easy to add a @Parent to the mapping. In case a repository node is not mapped by the ODM we can simply return the PHPCR node or wrap it in a "Node" document.

No opinion on Changing paths?
 
> 3. References: they should be implemented the JCR way: the document has the Reference annotation, the underlying node a property of either type REFERENCE or WEAKREFERENCE. The difference could be specified by a flag weak of the annotation, which defaults to true. The JCR spec requires referenced nodes to have the type mix:referenceable, so either this must be declared on the document or added to the node when it is detected that the node will be referenced. The latter would require a check every time ...

> Right, aside from requiring "reference-able" and giving the option of "weak" references, this is fairly in line with how things are in the other ODMs. The other ODMs also support "properties" on references which we can use to flag if the reference is weak or not, though imho we should default to "weak".
>
> As for reference-able, we can determine if a node is referenced since we have the complete mapping configuration, so we can set this automatically, though for the "paranoid" we could make this behavior optional. Also, I am not sure if this is relevant for us, but in theory there could also be references to nodes which are not mapped in PHPCR ODM, for which we would then not be able to automatically set reference-able. Oh, and of course if a previously unreferenced node becomes referenced we wouldn't update the nodes to become referenceable at runtime, though we might provide some tool to update existing nodes.

I think we might not know in advance if a document will be referenced since you can keep the reference type free. So I would opt for declaring something as reference-able in the Document annotation. That way the required mixin will be added on node creation and doesn't need to be checked. Same now with the versionable. 
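A minimal sketch of that idea: the mixins are derived once from flags declared on the document mapping, so nothing has to be checked later when a reference is set. The function name and the array keys `referenceable` and `versionable` are assumptions for illustration; `mix:referenceable` and `mix:versionable` are the JCR mixin names:

```php
<?php
// Hypothetical sketch: derive JCR mixins from flags declared on the document mapping.
function mixinsForDocument(array $mapping): array
{
    $mixins = [];
    if ($mapping['referenceable'] ?? false) {
        $mixins[] = 'mix:referenceable';
    }
    if ($mapping['versionable'] ?? false) {
        $mixins[] = 'mix:versionable';
    }

    return $mixins;
}

// Mixins that would be added when a node for this document type is created.
$mixins = mixinsForDocument(['referenceable' => true]);
```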
 
> 4. Parent-Child relation: this is a different concept than embedded documents and basically a USP for JCR (compared to the others mentioned above). While in most cases a child will be created by giving it the right path sometimes it might be easier to have this cascaded by using a child annotation. That way the ODM will take care of creating the specified children and also pull the children from the repository when the parent is fetched. Since the children are just ordinary nodes they can be fetched, modified and referenced just like any other node. Btw this was quite useful while trying to implement files in the ODM.

> Yeah, I agree that probably making "children" equivalent to "embedded" was stretching things a bit too far. Then again at least with MongoDB you can interact quite easily with embedded documents directly as well.
>
> The main reason why I proposed to think of them in equivalent terms was more about a "special" kind of child relation and not the general child relations.
>
> Aka if you have a page tree, then I wouldn't consider subpages to be "embedded documents". Rather what I meant was content that really is considered inseparable from the parent, a concept that doesn't really exist in JCR, but that still might be useful when it comes to fetching content with fewer roundtrips.
>
> The idea being that if I have a blog post and I attach a specific image to that blog post (instead of placing it in the media manager and referencing it in my blog post) along with a teaser and a body, then I might store all of these as children, but in effect I will always fetch the parent and children as one. This is then fairly similar to embedded documents.

That would be the main use for such a mapping, agree, but I would vote for Child. With EmbeddedDocument there is the notion that something is stored in the node as property (of course you could do this: serialize the document and put it in a property ...).
 
> 5. PHPCR: for things not in the ODM we can always use PHPCR directly. But that makes all code on top of the ODM dependent on PHPCR. So if there is anything PHPCR-special that can be abstracted by the ODM it should be done because it makes application code on top simpler and maybe a bit more portable.

> Right. But if we have to invent a specialized API to expose the feature in PHPCR ODM for which we already have a usable API in Jackalope, it would be needless maintenance and runtime overhead to wrap it into another API in PHPCR ODM. In the end Jackalope is a hard dependency anyway already, so all we have to take care of is making sure that the Jackalope instance is easily reachable. For this we should add a method on the DocumentManager along the lines of getConnection() in the other ODMs.

getPhpcrSession() is already there ...

Kind regards
Uwe

Lukas Kahwe Smith

Mar 21, 2011, 10:26:48 AM
to symfony-...@googlegroups.com

On 21.03.2011, at 15:20, Uwe Jäger wrote:

> Yes, see below. I think in most cases you will already know the parent when you traverse the hierarchy. But it is easy to add a @Parent to the mapping. In case a repository node is not mapped by the ODM we can simply return the PHPCR node or wrap it in a "Node" document.

Yes, I agree it would be good to add some annotation to include the parent(s) and children Document instances for easy access (including support for lazy loading).

> I think we might not know in advance if a document will be referenced since you can keep the reference type free. So I would opt for declaring something as reference-able in the Document annotation. That way the required mixin will be added on node creation and doesn't need to be checked. Same now with the versionable.

Yeah, agreed. This however will not apply to already existing and stored node instances of that Document type. This is where I meant that some tool to resync this could come in handy. We can definitely support "eventual migrations" along the lines of MongoDB ODM:
http://www.doctrine-project.org/docs/mongodb_odm/1.0/en/reference/migrating-schemas.html

> Right. But if we have to invent a specialized API to expose the feature in PHPCR ODM for which we already have a usable API in Jackalope, it would be needless maintenance and runtime overhead to wrap it into another API in PHPCR ODM. In the end Jackalope is a hard dependency anyway already, so all we have to take care of is making sure that the Jackalope instance is easily reachable. For this we should add a method on the DocumentManager along the lines of getConnection() in the other ODMs.
>
> getPhpcrSession() is already there ...

Ah ok .. we might want to rename it to getConnection() though :)

David Buchmann

Mar 21, 2011, 2:26:20 PM
to symfony-...@googlegroups.com
hi,

>> Yes, see below. I think in most cases you will already know the parent when you traverse the hierarchy. But it is easy to add a @Parent to the mapping. In case a repository node is not mapped by the ODM we can simply return the PHPCR node or wrap it in a "Node" document.
>
> Yes, I agree it would be good to add some annotation to include the parent(s) and children Document instances for easy access (including support for lazy loading).

if for example you render a page with data identified by the path, you
do not necessarily know parents. in the cmf NavigationBundle, i
currently use phpcr to traverse the tree.
it would be convenient to support the Node::getParent method (where you
specify the depth) and directly wrap into your model class.
of course this is wrapping another layer around the phpcr api, but it's
nicer than working on phpcr and then again with doctrine to get your
model class.
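as a sketch of the depth-based lookup on plain path strings: `ancestorPath` is a hypothetical helper, not a phpcr call. as far as i know the depth-based variant in JCR is Item::getAncestor(depth), with depth 0 being the root, while getParent() takes no argument; the helper below assumes that convention:

```php
<?php
// Hypothetical helper: compute the ancestor path at a given depth (0 = root).
function ancestorPath(string $path, int $depth): string
{
    $segments = array_values(array_filter(
        explode('/', $path),
        static fn (string $segment): bool => $segment !== ''
    ));

    if ($depth < 0 || $depth > count($segments)) {
        throw new OutOfRangeException("No ancestor at depth $depth for $path");
    }

    return '/' . implode('/', array_slice($segments, 0, $depth));
}

$root = ancestorPath('/cms/menu/main/about', 0);
$menu = ancestorPath('/cms/menu/main/about', 2);
$self = ancestorPath('/cms/menu/main/about', 4);
```

an odm-level wrapper could then load the document at the returned path and hand back the mapped model class instead of a raw node.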

lazy loading would be a must, otherwise one request would load up the
whole tree...
we might want to do this even for the embed relation - or have some
option when loading a node to tell doctrine-odm not to load the embedded
nodes. you might not always need them and it could be possibly expensive.
* can one have nested embeds?
* is the embedded entity itself an entity (that i could load directly)?


oh, and one more point for the path as id. i think it was mentioned
a few times already, but: nodes that are not referenceable are not required to have
a uuid. internally jackrabbit seems to assign one, but that is implementation
specific and thus relying on it would not be very proper.

cheers,
david

David Buchmann

Mar 21, 2011, 4:06:22 PM
to symfony-...@googlegroups.com, Lukas Kahwe Smith

hi lukas,

i finally read the use cases. i think they cover about what we have to
support. just two remarks:

> Creating a new Document instance and storing it

=> we definitely need a way to specify where to attach the document to.
people should be encouraged to structure their content. the id generator
should only be a last resort. if you do not want to structure where you
store your content, you should probably use mongo anyway.

> Renaming a subpath of a node

i am not sure if i understand fully. by path you mean the path to the
current document, right? should we not call this "move"?
in the doc it should also be called rename a document, because moving is
what has to be used to change the name. (the name used in the path, of
course, not any kind of name property you might want to set)


cheers,david


oh, by the way, do we have an annotation to access Node::getName(), that
is the element name of the node? we could use the last part of path, of
course, but it could be important enough to merit its own annotation.

i documented the available annotations in the readme, but did not find
that. can somebody check if the list is correct or if i missed something?

when we decide to really use path as id, we should also add some
information to the path annotation that this is defining the desired
path where to store a node and immutable after persisting the node.

- --

Liip AG // Agile Web Development // T +41 26 422 25 11
CH-1700 Fribourg // PGP 0xA581808B // www.liip.ch

Lukas Kahwe Smith

Mar 21, 2011, 5:30:07 PM
to symfony-...@googlegroups.com

On 21.03.2011, at 19:26, David Buchmann wrote:

> hi,
>
>>> Yes, see below. I think in most cases you will already know the parent when you traverse the hierarchy. But it is easy to add a @Parent to the mapping. In case a repository node is not mapped by the ODM we can simply return the PHPCR node or wrap it in a "Node" document.
>>
>> Yes, I agree it would be good to add some annotation to include the parent(s) and children Document instances for easy access (including support for lazy loading).
>
> if for example you render a page with data identified by the path, you
> do not necessarily know parents. in the cmf NavigationBundle, i
> currently use phpcr to traverse the tree.
> it would be convenient to support the Node::getParent method (where you
> specify the depth) and directly wrap into your model class.
> of course this is wrapping another layer around the phpcr api, but it's
> nicer than working on phpcr and then again with doctrine to get your
> model class.
>
> lazy loading would be a must, otherwise one request would load up the
> whole tree...
> we might want to do this even for the embed relation - or have some
> option when loading a node to tell doctrine-odm not to load the embedded
> nodes. you might not always need them and it could be possibly expensive.
> * can one have nested embeds?
> * is the embedded entity itself an entity (that i could load directly)?


Doctrine has quite a lot of infrastructure for lazy loading, and it's supported in the other ORM/ODMs via proxy objects.
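
The principle behind such proxies can be sketched in a few lines. This is a simplified illustration only: Doctrine generates real proxy classes per document or entity, whereas the hypothetical `LazyDocument` below just defers a loader callback until a property is first read:

```php
<?php
// Hypothetical sketch of lazy loading; not how Doctrine's generated proxies are built.
final class LazyDocument
{
    /** @var callable loader that fetches the real document on demand */
    private $loader;

    private ?object $document = null;

    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    /** Forward property reads to the real document, loading it on first use. */
    public function __get(string $name)
    {
        return $this->load()->$name;
    }

    private function load(): object
    {
        if ($this->document === null) {
            $this->document = ($this->loader)();
        }

        return $this->document;
    }
}

$loads = 0;
$parent = new LazyDocument(function () use (&$loads): object {
    $loads++; // stands in for a repository roundtrip
    $document = new stdClass();
    $document->title = 'Home';

    return $document;
});

$loadsBeforeAccess = $loads;  // nothing has been fetched yet
$title = $parent->title;      // first access triggers the load
$titleAgain = $parent->title; // second access reuses the loaded document
```

With parents and children exposed this way, traversing one level of the tree would not pull in the whole tree.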

Lukas Kahwe Smith

Mar 21, 2011, 5:39:25 PM
to David Buchmann, symfony-...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On 21.03.2011, at 21:06, David Buchmann wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> hi lukas,
>
> i finally read the use cases. i think they cover about what we have to
> support. just two remarks:
>
>> Creating a new Document instance and storing it
>
> => we definitely need a way to specify where to attach the document to.
> people should be encouraged to structure their content. the id generator
> should only be a last resort. if you do not want to structure where you
> store your content, you should probably use mongo anyway.

right, this is what i see as the role of the RepositoryGenerator. it basically delegates determining the path to the DocumentRepository::generatePath() method, which receives the $document instance as a parameter. it will usually do something like:

$path = $document->getParent()->getPath().'/'.$document->getCreatedAt().'/'.$document->getSlug();

>> Renaming a subpath of a node
>
> i am not sure if i understand fully. by path you mean the path to the
> current document, right? should we not call this "move"?
> in the doc it should also be called rename a document, because moving is
> what has to be used to change the name. (the name used in the path, of
> course, not any kind of name property you might want to set)

whenever I said subpath I meant the last portion of the path

so for a path:
/foo/bar/ding/ding/lala

with subpath i meant:
lala
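
To make the terminology concrete, here is a minimal PHP sketch that pulls the last segment out of an absolute path. The helper name lastPathSegment is made up for illustration and is not part of PHPCR or the ODM; in PHPCR the same value is what Node::getName() returns.

```php
<?php
// Hypothetical helper (not part of PHPCR or the ODM): returns the last
// segment of an absolute path, i.e. what this thread calls the "subpath".
function lastPathSegment(string $path): string
{
    // Ignore a trailing slash so "/foo/bar/" still yields "bar".
    $trimmed = rtrim($path, '/');
    $pos = strrpos($trimmed, '/');

    return $pos === false ? $trimmed : substr($trimmed, $pos + 1);
}

echo lastPathSegment('/foo/bar/ding/ding/lala'), "\n"; // lala
```

Changing only this segment is still a move from the repository's point of view, since the full path changes.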

> oh, by the way, do we have an annotation to access Node::getName(), that
> is the element name of the node? we could use the last part of path, of
> course, but it could be important enough to merit its own annotation.

so getName() is "lala" from the above example?

> i documented the available annotations in the readme, but did not find
> that. can somebody check if the list is correct or if i missed something?

regards,
Lukas Kahwe Smith
sm...@pooteeweet.org

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Darwin)

iEYEARECAAYFAk2HxY0ACgkQ2FnmJDTfRPf0oQCgoNWG7mDmFNSObyDAkZ5SkGl+
t5wAnjmBdmAB/aYu9c7MDkmUFzemjv6u
=EcMJ
-----END PGP SIGNATURE-----

David Buchmann

Mar 22, 2011, 11:01:48 AM
to Lukas Kahwe Smith, symfony-...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

>> => we definitely need a way to specify where to attach the document to.
>> people should be encouraged to structure their content. the id generator
>> should only be a last resort. if you do not want to structure where you
>> store your content, you should probably use mongo anyway.
>
> right this is what i see as the role of the RepositoryGenerator. it basically delegates determining the path to the DocumentRepository::generatePath() method that received the $document instance as a parameter. It will usually do something like
>
> $path = $document->getParent()->getPath().'/'.$document->getCreatedAt().'/'.$document->getSlug();

a parent is only known when you define it. how does this work when you
add a new node to the repository?

> whenever I said subpath I meant the last portion of the path
>
> so for a path:
> /foo/bar/ding/ding/lala
>
> with subpath i meant:
> lala

ah, ok. well, i could want to move anything.
like move node ding at /foo/bar/ to /some/thing/else/
so we could change more than one level of the hierarchy. but this is
definitely a move.
i vote for considering everything a move operation, even the special
case of only changing the last part of the path.

there is a reason PHPCR has no setName method to change the thing you
call subpath.

>> oh, by the way, do we have a annotation to access Node::getName(), that
>> is the element name of the node? we could use the last part of path, of
>> course, but it could be important enough to merit its own annotation.
>
> so getName() is "lala" from the above example?

yes, it is.

cheers,david
- --
Liip AG // Agile Web Development // T +41 26 422 25 11
CH-1700 Fribourg // PGP 0xA581808B // www.liip.ch

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk2IudwACgkQqBnXnqWBgIspSgCfRQpFGbds+NZvjNv2zf1NpP2Q
fk4An1s5e6GEpwhfsiAUIO0s25grh1/m
=rolU
-----END PGP SIGNATURE-----

Lukas Kahwe Smith

Mar 22, 2011, 11:10:53 AM
to David Buchmann, symfony-...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On 22.03.2011, at 16:01, David Buchmann wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>>> => we definitely need a way to specify where to attach the document to.
>>> people should be encouraged to structure their content. the id generator
>>> should only be a last resort. if you do not want to structure where you
>>> store your content, you should probably use mongo anyway.
>>
>> right this is what i see as the role of the RepositoryGenerator. it basically delegates determining the path to the DocumentRepository::generatePath() method that received the $document instance as a parameter. It will usually do something like
>>
>> $path = $document->getParent()->getPath().'/'.$document->getCreatedAt().'/'.$document->getSlug();
>
> a parent is only known when you define it. how does this work when you
> add a new node to the repository?

then you would need to manually assign the parent. however i think in many situations a parent should be a given.
e.g. if i want to add a new news page, i will likely first have to select a main category (which in this case would be a specific parent path) etc.
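
A rough PHP sketch of that flow, assuming a manually assigned parent. The classes NewsCategory, NewsPage and NewsPageRepository are illustrative stand-ins, not real ODM classes; only the generatePath() body mirrors the one-liner quoted above.

```php
<?php
// Illustrative stand-ins only; these are not the real ODM classes.
class NewsCategory
{
    private $path;

    public function __construct(string $path) { $this->path = $path; }
    public function getPath(): string { return $this->path; }
}

class NewsPage
{
    private $parent;
    private $createdAt;
    private $slug;

    public function __construct(string $slug, string $createdAt)
    {
        $this->slug = $slug;
        $this->createdAt = $createdAt;
    }

    public function setParent(NewsCategory $parent): void { $this->parent = $parent; }
    public function getParent(): ?NewsCategory { return $this->parent; }
    public function getCreatedAt(): string { return $this->createdAt; }
    public function getSlug(): string { return $this->slug; }
}

class NewsPageRepository
{
    // Same expression as in the thread, but it fails loudly when no
    // parent has been assigned instead of building a broken path.
    public function generatePath(NewsPage $document): string
    {
        $parent = $document->getParent();
        if ($parent === null) {
            throw new RuntimeException('Assign a parent before persisting.');
        }

        return $parent->getPath().'/'.$document->getCreatedAt().'/'.$document->getSlug();
    }
}

$page = new NewsPage('hello-world', '2011-03-22');
$page->setParent(new NewsCategory('/news/tech')); // the selected main category
$repository = new NewsPageRepository();
echo $repository->generatePath($page), "\n"; // /news/tech/2011-03-22/hello-world
```

Note that the date segment implies an intermediate node between the category and the page, which would have to exist before the page can be attached.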

>> whenever I said subpath I meant the last portion of the path
>>
>> so for a path:
>> /foo/bar/ding/ding/lala
>>
>> with subpath i meant:
>> lala
>
> ah, ok. well, i could want to move anything.
> like move node ding at /foo/bar/ to /some/thing/else/
> so we could change more than one level of the hierarchy. but this is
> definitely a move.
> i vote for considering everything a move operation, even the special
> case of only changing the last part of the path.
>
> there is a reason PHPCR has no setName method to change the thing you
> call subpath.

k .. imho we do not need to support this via the PHPCR ODM API.

regards,
Lukas Kahwe Smith
sm...@pooteeweet.org

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Darwin)

iEYEARECAAYFAk2Iu/4ACgkQ2FnmJDTfRPdD+gCeN7C/pYbmjeb4fpc/llKs139/
ZUwAnRgcrTulA+xNGdtBRenCxbwfOYxE
=voqV
-----END PGP SIGNATURE-----

Lukas Kahwe Smith

Apr 3, 2011, 9:12:56 AM
to symfony-...@googlegroups.com, David Buchmann
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

So what's the conclusion here?

Do we rename "path" to ID or not?
In the "or not" case we will have to rename a few things that currently use "id".

I am still +1 on renaming path to id, because I believe it will make it easier for people to start and also to share code with other ODM implementations. Furthermore from the POV of the ODM it will be an ID, since "path operations" will need to be done via Jackalope. So imho the only drawback is of course that people who then do use Jackalope need to realize that indeed the Jackalope (PHPCR/JCR) path is the id in the ODM.

regards,
Lukas Kahwe Smith
sm...@pooteeweet.org

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Darwin)

iEYEARECAAYFAk2YclkACgkQ2FnmJDTfRPfalwCgpVrjAuQ46GPilThINiekA6bZ
GckAnA1antBlB0OLWTXaelwHnT1jGucd
=y/Ea
-----END PGP SIGNATURE-----

David Buchmann

Apr 3, 2011, 12:36:50 PM
to symfony-...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

hi lukas,

still +1 from me. it's the only real option imo, as uuid is just not
there for non-referenceable objects.

cheers,
david

ps: as a side note about the move: we currently have sort of two layers
that handle UnitOfWork. first in the odm code and then in the jackalope
implementation. i am not so sure if this is a good thing. but this
really deserves its own thread...


On 03.04.2011 15:12, Lukas Kahwe Smith wrote:
> Hi,
>
> So whats the conclusion here?
>
> Do we rename "path" to ID or not?
> In the "or not" case we will have to rename a few things that currently use "id".
>
> I am still +1 on renaming path to id, because I believe it will make it easier for people to start and also to share code with other ODM implementations. Furthermore from the POV of the ODM it will be an ID, since "path operations" will need to be done via Jackalope. So imho the only drawback is of course that people who then do use Jackalope need to realize that indeed the Jackalope (PHPCR/JCR) path is the id in the ODM.
>
> regards,
> Lukas Kahwe Smith
> sm...@pooteeweet.org
>
>
>

- --

Liip AG // Agile Web Development // T +41 26 422 25 11
CH-1700 Fribourg // PGP 0xA581808B // www.liip.ch

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk2YoiIACgkQqBnXnqWBgIsj2wCeL35yvwxqgLxM9XguqTHsAOsp
6ZIAoLXPm7db+K/EJRE+vpZxbYNaFird
=N6Yt
-----END PGP SIGNATURE-----

Uwe Jäger

Apr 3, 2011, 3:27:27 PM
to symfony-...@googlegroups.com
Hi,

there is one more thing we should consider: what do we do, when we
want to use the soon to be implemented @Parent relation to add nodes
to the hierarchy? What we then need for the current node is what PHPCR
calls a name, basically the last part of an absolute path. The current
implementation in the ODM requires the node to have an absolute path,
so in that case id is just fine. I think it makes sense to think a
little bit about the way we create nodes in the CMF bundles before we
just rename. Therefore I would just postpone that decision a little
bit: contrary to an id a path can also be relative to the parent and
could be used in the case described with the parent annotation.

Kind regards
Uwe

2011/4/3 David Buchmann <david.b...@liip.ch>:

Lukas Kahwe Smith

Apr 3, 2011, 3:30:30 PM
to symfony-...@googlegroups.com

On 03.04.2011, at 21:27, Uwe Jäger wrote:

> Hi,
>
> there is one more thing we should consider: what do we do, when we
> want to use the soon to be implemented @Parent relation to add nodes
> to the hierarchy? What we then need for the current node is what PHPCR
> calls a name, basically the last part of an absolute path. The current
> implementation in the ODM requires the node to have an absolute path,
> so in that case id is just fine. I think it makes sense to think a
> little bit about the way we create nodes in the CMF bundles before we
> just rename. Therefore I would just postpone that decision a little
> bit: contrary to an id a path can also be relative to the parent and
> could be used in the case described with the parent annotation.


To me it's pretty clear how it will work:
- you will assign a parent node to some property
- you will assign a node name to some property
- you will set the RepositoryPathGenerator on the Document

done ..

Uwe Jäger

Apr 3, 2011, 3:40:05 PM
to symfony-...@googlegroups.com
Hi,

OK, we can do it that way. Then the id would not be used during
creation of the node and just filled afterwards. You could even
implement the RepositoryPathGenerator (should be RepositoryIDGenerator
then :-)) to use a property of the parent. And you actually do not
need to store the name giving property in PHPCR. OK, case closed. Go.

Kind regards
Uwe


2011/4/3 Lukas Kahwe Smith <sm...@pooteeweet.org>:

Lukas Kahwe Smith

Apr 3, 2011, 6:30:40 PM
to symfony-...@googlegroups.com

On 03.04.2011, at 21:40, Uwe Jäger wrote:

> Hi,
>
> OK, we can do it that way. Then the id would not be used during
> creation of the node and just filled afterwards. You could even
> implement the RepositoryPathGenerator (should be RepositoryIDGenerator
> then :-)) to use a property of the parent. And you actually do not
> need to store the name giving property in PHPCR. OK, case closed. Go.


ok .. i will work on a pull to do the renaming tomorrow.
i will then follow up with another pull to finally add support for selecting an ID generator .. unless someone does it before I do *nudge nudge* :)

Lukas Kahwe Smith

Apr 4, 2011, 4:39:12 AM
to symfony-...@googlegroups.com

On 04.04.2011, at 00:30, Lukas Kahwe Smith wrote:

>
> On 03.04.2011, at 21:40, Uwe Jäger wrote:
>
>> Hi,
>>
>> OK, we can do it that way. Then the id would not be used during
>> creation of the node and just filled afterwards. You could even
>> implement the RepositoryPathGenerator (should be RepositoryIDGenerator
>> then :-)) to use a property of the parent. And you actually do not
>> need to store the name giving property in PHPCR. OK, case closed. Go.
>
>
> ok .. i will work on a pull to do the renaming tomorrow.

https://github.com/doctrine/phpcr-odm/pull/12

David Buchmann

Apr 4, 2011, 4:59:52 AM
to symfony-...@googlegroups.com, Uwe Jäger
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

hi,

> OK, we can do it that way. Then the id would not be used during
> creation of the node and just filled afterwards.

yes, and this is good, because it prevents you from trying to add a node
somewhere where the parent does not exist :-)

then the node name will be some special annotation, right? this is cool, i
just saw that one of my developers used a normal string property for the
name and thus just duplicated the information :-)

cheers,david


- --
Liip AG // Agile Web Development // T +41 26 422 25 11
CH-1700 Fribourg // PGP 0xA581808B // www.liip.ch
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk2ZiIUACgkQqBnXnqWBgItSLACdHTDvzVIxxjD8FzQDKXtTPHeK
kjgAoLco/v+hMd6sXHMk7BFYU19Y7Uvz
=Ks29
-----END PGP SIGNATURE-----

Uwe Jäger

Apr 4, 2011, 5:37:35 AM
to symfony-...@googlegroups.com
Hi,

I don't think we need an extra annotation for the node name. Say
you are creating a blog system. Every blog entry might have a title.
So on the BlogEntry class there is a (PHP) property $title (which in
this case would be mapped to a PHPCR property). Then you create a
custom DocumentRepository class that implements
RepositoryPathGenerator (for now ...) and specify that class in the
document annotation in the BlogEntry. And your custom
DocumentRepository class implements the generatePath method in a way
that it first gets the parent's path and then appends a name derived
from the $title property (e.g. filtering/replacing invalid characters
and so on).
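
As a minimal sketch of that "derive a name from the title" step in PHP: the function names titleToNodeName and generatePath are illustrative, and the allowed character set is an assumption that is deliberately stricter than what JCR itself permits in node names.

```php
<?php
// Hypothetical helper: turns a title into a conservative node name.
// Restricting names to a-z, 0-9 and dashes is an assumption made for
// this sketch, not a rule taken from JCR or the ODM.
function titleToNodeName(string $title): string
{
    $name = strtolower(trim($title));
    // Replace every run of other characters with a single dash.
    $name = preg_replace('/[^a-z0-9]+/', '-', $name);
    $name = trim($name, '-');

    return $name === '' ? 'untitled' : $name;
}

// Appends the derived name to the parent's path.
function generatePath(string $parentPath, string $title): string
{
    return rtrim($parentPath, '/').'/'.titleToNodeName($title);
}

echo generatePath('/blog', 'Renaming "path" to id: why?'), "\n";
// /blog/renaming-path-to-id-why
```

Two entries with the same title under the same parent would collide here, so a real generator would need some disambiguation on top of this.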

Kind regards
Uwe

2011/4/4 David Buchmann <david.b...@liip.ch>:

Benjamin Eberlei

Apr 4, 2011, 6:15:17 AM
to symfony-...@googlegroups.com
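For an assigned ID Generator i propose to tell people to use something
along these lines:

// "Parent" is a reserved word in PHP, so the class needs another name.
class ParentDocument
{
    /** @Id */
    private $path;

    private $children = array();

    public function addChild($data)
    {
        // The child receives its parent and the parent's path.
        $this->children[] = new Child($this, $this->path, $data);
    }
}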


For the RepositoryPath Generator I propose:

class Child
{
    /** @Id @GeneratedValue(name="other") */
    private $path;

    /** @Parent */
    private $parent;

    /** @Field */
    private $other;
}

Requirements:

1. Node HAS to have @Parent
2. Node HAS to set a name property to the "name".

For example name could be "2011-04-04-slugfoo" for an Article, or
"2011-04-04-14:15-username" for a Comment.

greetings,
Benjamin

David Buchmann

Apr 4, 2011, 7:35:28 AM
to symfony-...@googlegroups.com, Uwe Jäger
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

> I don't think we need an extra annotation for the node name. Say
> you are creating a blog system. Every blog entry might have a title.
> So on the BlogEntry class there is a (PHP) property $title (which in
> this case would be mapped to a PHPCR property). Then you create a
> custom DocumentRepository class that implements
> RepositoryPathGenerator (for now ...) and specify that class in the
> document annotation in the BlogEntry. And your custom
> DocumentRepository class implements the generatePath method in a way
> that it first gets the parent's path and then append a name derived
> from the $title property (e.g. filtering/replacing invalid characters
> and so on).

it's helpful if there is magic to do this, but maybe you want to let the
(end) user override what the magic would do.
and you need a way to read the system name too, to use in urls. of
course you could just take the last token from the path, but imho the name
is important enough to merit its own annotation.

cheers,david

- --
Liip AG // Agile Web Development // T +41 26 422 25 11
CH-1700 Fribourg // PGP 0xA581808B // www.liip.ch
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk2ZrPwACgkQqBnXnqWBgIs5GgCgp/zzvY0zVkWk+MP1IHzhviGM
5dEAn0dhLk+LAys4wrFf4b7AnQu/9ant
=FvUx
-----END PGP SIGNATURE-----

Benjamin Eberlei

Apr 4, 2011, 7:44:34 AM
to symfony-...@googlegroups.com
See my example. The @Id and associated generator annotations should
specify the field that is used for generating the path.

Another approach would be, like FLOW3's persistence layer, to specify
"@Identifier" on one or several columns, which are then used to generate
the path:

http://git.typo3.org/FLOW3/Packages/Blog.git?a=blob;f=Classes/Domain/Model/Post.php;h=02ddf19b5a22328a5e63b0674e001666fe790d19;hb=9df90bddf137a946b51362ac7abcbfa3fb36906d#l53

However, in this case you have to specify the order, so i would still
prefer something like:

/** @Id @GeneratedValue(name={"fieldA", "fieldB"}) */
private $path;

Uwe Jäger

Apr 4, 2011, 8:42:03 AM
to symfony-...@googlegroups.com
Hi,

maybe Lukas can shed some light here on his comment in this context:

> To me its pretty clear how it will work:
> - you will assign a parent node to some property
> - you will assign a node name to some property
> - you will set the RepositoryPathGenerator on the Document

I just wrote what I thought this means.
Maybe we just need some code to see how this works.

Kind regards
Uwe

2011/4/4 Benjamin Eberlei <kon...@beberlei.de>:

Lukas Kahwe Smith

Apr 4, 2011, 8:49:56 AM
to symfony-...@googlegroups.com

On 04.04.2011, at 14:42, Uwe Jäger wrote:

> Hi,
>
> maybe Lukas can shed some light here on his comment in this context:
>
>> To me its pretty clear how it will work:
>> - you will assign a parent node to some property
>> - you will assign a node name to some property
>> - you will set the RepositoryPathGenerator on the Document


basically it all comes down to how generateId() is implemented in the repository. imho it can do whatever it needs to. i do not think we need to force anything onto the user. what Benjamin describes could be another generator that follows a more rigid structure.

one thing i have been thinking of is making something other than the assigned id generator the default. we could have a generator that, in case no parent is configured, simply falls back on the root and, if no name is configured, simply tries to do some stupid hashing. this would be more in line with the default "auto increment" of the other ORM/ODM implementations.
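
A small PHP sketch of what such a fallback generator could look like. The function name generateFallbackId and the choice of a truncated sha1 over the serialized field values are assumptions made purely for illustration; nothing like this is confirmed to exist in the ODM.

```php
<?php
// Sketch of the fallback idea. Both the function name and the hashing
// scheme are assumptions for illustration, not ODM behaviour.
function generateFallbackId(?string $parentPath, ?string $nodeName, array $fields): string
{
    // No parent configured: attach directly below the root node.
    $parent = $parentPath === null ? '' : rtrim($parentPath, '/');

    // No name configured: derive a repeatable one by hashing the data.
    $name = $nodeName ?? substr(sha1(serialize($fields)), 0, 12);

    return $parent.'/'.$name;
}

echo generateFallbackId('/blog', 'first-post', []), "\n"; // /blog/first-post
echo generateFallbackId(null, null, ['title' => 'Hello']), "\n"; // "/" plus 12 hex characters
```

Hashing the content means two documents with identical fields get the same id, which is one reason to treat this as a last resort only.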

Uwe Jäger

Apr 4, 2011, 9:00:42 AM
to symfony-...@googlegroups.com
Hi,

so in code:

in the repository class:

public function generateId($document)
{
    $parentId = $document->parent->id;
    $blogTitle = $document->title;
    return $parentId . '/' . urlencode($blogTitle); // or better jcrEncode ...
}

with
/** @Document(alias="blogentry", repositoryClass="BlogEntryRepository") */
class BlogEntry
{
    /** @Parent */
    public $parent;

    /** @Id(strategy="repository") */
    public $id;

    public $title;
}

Kind regards
Uwe

2011/4/4 Lukas Kahwe Smith <sm...@pooteeweet.org>:

Benjamin Eberlei

Apr 4, 2011, 9:10:40 AM
to symfony-...@googlegroups.com
1. I am against having a method determining the ID. This should really
be configured in the mapping. My example would work if you'd use a
method in the constructor to generate an ID for example. So my case
covers the method generation case if people want to implement this in
userland.

2. Assigned ID Generator is the default in ORM, Mongo and Couch as well.
You have to set the @GeneratedValue to make it use some auto increment
stuff or the like.

Lukas Kahwe Smith

Apr 4, 2011, 9:24:18 AM
to symfony-...@googlegroups.com

On 04.04.2011, at 15:10, Benjamin Eberlei wrote:

> 1. I am against having a method determining the ID. This should really be configured by in the mapping. My example would work if you'd use a method in the constructor to generate an ID for example. So my case covers the method generation case if people want to implement this in userland.

That seems pretty inflexible. Imagine I have some method that builds me the Document instance and now I want to associate it with a parent and persist it. I do not want the parent to handle the Document creation and even less I want to have to pass my data for the new Document through the parent.

Moreover, can you explain why you are against letting the Repository generate the ID? It seems the most flexible to me (think of use cases where the Repository is supposed to automatically spread out articles into subpaths for example using the creation date).

> 2. Assigned ID Generator is the default in ORM, Mongo and Couch aswell. You have to set the @GeneratedValue to make it use some auto increment stuff or the likes.


ah ok.

Benjamin Eberlei

Apr 4, 2011, 10:06:47 AM
to symfony-...@googlegroups.com
On Mon, 4 Apr 2011 15:24:18 +0200, Lukas Kahwe Smith wrote:
> On 04.04.2011, at 15:10, Benjamin Eberlei wrote:
>
>> 1. I am against having a method determining the ID. This should
>> really be configured by in the mapping. My example would work if you'd
>> use a method in the constructor to generate an ID for example. So my
>> case covers the method generation case if people want to implement
>> this in userland.
>
> That seems pretty inflexible. Imagine I have some method that builds
> me the Document instance and now I want to associate it with a parent
> and persist it. I do not want the parent to handle the Document
> creation and even less I want to have to pass my data for the new
> Document through the parent.

This was the example for the Assigned ID generator. You can of course
use any way to assign the right path, just an example.

>
> Moreover, can you explain why you are against letting the Repository
> generate the ID? It seems the most flexible to me (think of use cases
> where the Repository is supposed to automatically spread out articles
> into subpaths for example using the creation date).

Don't you need more nodes when allowing for subpaths? How would this
work?

I am not really sure how we should handle this use-case. Say you have a
node "Article" at /foo/bar/Article/1, and comments are supposed to be spread
into subfolders by creation date, "/foo/bar/Article/1/2010-03-04/1". How
do we bridge this gap of two nodes instead of one? There is not really
a way to describe what type of node "2010-03-04" should be.

Btw, for the Repository Path Generator you need the @Parent to be
assigned, otherwise there is no way to do it, or am i wrong?

Three examples for Generated Path Values, Parent is always required,
otherwise exception:

class Comment
{
    /** @Id @GeneratedValue(name="commentid") */
    private $path;

    /** @Parent */
    private $article;

    private $commentid;

    public function __construct(Article $parent, $username)
    {
        $this->commentid = date('Y-m-d_H:i:s', time())."-".$username;
        $this->article = $parent;
    }
}

class Comment
{
    /** @Id @GeneratedValue(method="getCommentId") */
    private $path;

    /** @Parent */
    private $article;

    private $username;
    private $created;

    public function __construct(Article $parent, $username)
    {
        $this->article = $parent;
        $this->username = $username;
        $this->created = new \DateTime("now");
    }

    public function getCommentId()
    {
        return $this->created->format('Y-m-d_H:i:s')."-".$this->username;
    }
}

class Comment
{
    /** @Id @GeneratedValue(name="commentid") */
    private $path;

    /** @Parent */
    private $article;

    private $commentid;

    public function __construct(Article $parent, $username)
    {
        // Next sequential number below this article.
        $this->commentid = count($parent->getComments()) + 1;
        $this->article = $parent;
    }
}

Uwe Jäger

Apr 4, 2011, 4:13:40 PM
to symfony-...@googlegroups.com
Hi,

2011/4/4 Benjamin Eberlei <kon...@beberlei.de>:

I think what Benjamin proposes is a simple and easy way to get the ids
constructed that does not need an additional class. The method
approach looks equivalent to the RepositoryIdGenerator. It delegates
responsibility to user code. The only difference is that it is in the
document class and not in the repository class. One reason to do it in
the repository class might be that we have more information there that
is not available to the document itself. But I think that that will
matter less in reality.

Regarding the subpaths issue I also came to the conclusion that we
need to create the intermediate nodes and have to filter them when we
want to populate the @Children when we create the document of the
parent node. Therefore it wouldn't be possible to just use PHPCR's
Node::getNodes which would actually return the intermediate nodes to
get the child documents. When we look again at a blog example, these
intermediate nodes might represent years and months which should be
modelled explicitly at the document level because you might want to
give the enduser the ability to navigate to different blog entries in
different months or years.

So for me the @GeneratedValue(name="...") and
@GeneratedValue(method="...") would be fine.

Kind regards
Uwe

Lukas Kahwe Smith

Apr 4, 2011, 4:17:25 PM
to symfony-...@googlegroups.com

On 04.04.2011, at 22:13, Uwe Jäger wrote:

> I think what Benjamin proposes is a simple and easy way to get the ids
> constructed that does not need an additional class. The method
> approach looks equivalent to the RepositoryIdGenerator. It delegates
> responsibility to user code. The only difference is that it is in the
> document class and not in the repository class. One reason to do it in
> the repository class might be that we have more information there that
> is not available to the document itself. But I think that that will
> matter less in reality.

Well, the main reason why I put it into the Repository is that imho the Document doesn't really care that much about the id; it's more of an outside concept. More importantly, generating the ID might depend on other dependencies, and injecting these into the Repository seems easier than injecting them into every Document instance.

> Regarding the subpaths issue I also came to the conclusion that we
> need to create the intermediate nodes and have to filter them when we
> want to populate the @Children when we create the document of the
> parent node. Therefore it wouldn't be possible to just use PHPCR's
> Node::getNodes which would actually return the intermediate nodes to
> get the child documents. When we look again at a blog example, these
> intermediate nodes might represent years and months which should be
> modelled explicitly at the document level because you might want to
> give the enduser the ability to navigate to different blog entries in
> different months or years.
>
> So for me the @GeneratedValue(name="...") and
> @GeneratedValue(method="...") would be fine.


yeah, i haven't fully thought through this "creating of entire subtrees" in an on-demand fashion yet. i must admit i am also not quite sure how this all works in Jackalope.

David Buchmann

Apr 5, 2011, 3:35:31 AM
to symfony-...@googlegroups.com, Lukas Kahwe Smith
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

hi,

disclaimer: i am no doctrine expert. i agree we should do something that
does not surprise people coming from other doctrine implementations
too much.

i think we should encourage the proper use of the tree. if you store all
your documents directly in root, you'd better use mongodb.

i would propose to have the default id generator look at parent and
name. there should be an option to control whether the generator may
fall back to some hash under root node. if you are sure, i am ok with
making the default allowing the hash as last resort.

any more advanced strategies like spreading children could be injected,
but i am very unsure if you really want this as this would mess up the
loading of the nodes, as uwe says.
maybe we should just document a strategy to spread large numbers of
nodes with subnodes with the date or similar. then the user can really
control what's going on. we could provide some helper that automatically
creates the year, month and day documents for you and attaches the
content at the right place.

for jackalope (that is, phpcr and jcr in general) the concept is clear:
you can only add a node to an existing parent, there is no such thing as
adding something at an arbitrary path. and moving is only possible if
the target node where to attach the moved node is already existing.
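
To illustrate both points, here is a toy PHP sketch of such a helper. ToyTree is an in-memory stand-in for the repository and ensureDatePath is a made-up name; neither uses the real PHPCR API. The tree refuses to add a node below a missing parent, and the helper creates the year, month and day levels on demand.

```php
<?php
// Toy in-memory tree keyed by absolute path. It stands in for a real
// PHPCR session, which is not used in this sketch.
class ToyTree
{
    private $nodes = ['/' => true];

    public function has(string $path): bool
    {
        return isset($this->nodes[$path]);
    }

    // Like the repository, refuse to add a node below a missing parent.
    public function addNode(string $parentPath, string $name): string
    {
        if (!$this->has($parentPath)) {
            throw new RuntimeException("Parent $parentPath does not exist");
        }
        $path = rtrim($parentPath, '/').'/'.$name;
        $this->nodes[$path] = true;

        return $path;
    }
}

// Hypothetical helper: creates any missing year/month/day nodes below
// $basePath and returns the path of the day node.
function ensureDatePath(ToyTree $tree, string $basePath, DateTimeInterface $date): string
{
    $current = $basePath;
    foreach ([$date->format('Y'), $date->format('m'), $date->format('d')] as $segment) {
        $next = rtrim($current, '/').'/'.$segment;
        $current = $tree->has($next) ? $next : $tree->addNode($current, $segment);
    }

    return $current;
}

$tree = new ToyTree();
$blog = $tree->addNode('/', 'blog');
$day = ensureDatePath($tree, $blog, new DateTimeImmutable('2011-04-05'));
echo $tree->addNode($day, 'my-entry'), "\n"; // /blog/2011/04/05/my-entry
```

Because the intermediate levels are created explicitly, the user keeps control over which nodes exist, which is the point of documenting the strategy rather than hiding it in a generator.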

cheers,david


On 04.04.2011 22:17, Lukas Kahwe Smith wrote:

- --

Liip AG // Agile Web Development // T +41 26 422 25 11
CH-1700 Fribourg // PGP 0xA581808B // www.liip.ch
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk2axkMACgkQqBnXnqWBgIstIgCfVRpxDQ9Bl2md26vakh/xKVap
3gkAoKToY/5YQZ3Da18f+L9/hBFViI9w
=KWKa
-----END PGP SIGNATURE-----

Lukas Kahwe Smith

Apr 5, 2011, 9:26:09 AM
to David Buchmann, symfony-...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On 05.04.2011, at 09:35, David Buchmann wrote:

> i think we should encourage the proper use of the tree. if you store all
> your documents directly in root, you'd better use mongodb.
>
> i would propose to have the default id generator look at parent and
> name. there should be an option to control whether the generator may
> fall back to some hash under root node. if you are sure, i am ok with
> making the default allowing the hash as last resort.

i guess the hash is easy enough to implement yourself if you want it and know what you are doing.
we will add a new generator that uses the annotated parent and name properties once we have these annotations.

> any more advanced strategies like spreading children could be injected,
> but i am very unsure if you really want this as this would mess the
> loading of the nodes, as uwe says.
> maybe we should just document a strategy to spread large numbers of
> nodes with subnodes with the date or similar. then the user can really
> control whats going on. we could provide some helper that automatically
> creates the year, month and day documents for you and attaches the
> content at the right place.
>
> for jackalope (that is, phpcr and jcr in general) the concept is clear:
> you can only add a node to an existing parent, there is no such thing as
> adding something at an arbitrary path. and moving is only possible if
> the target node where to attach the moved node is already existing.


i think we definitely need to offer such a high level "magic" solution, but we will need some playing around before we can determine the final approach it seems.

regards,
Lukas Kahwe Smith
sm...@pooteeweet.org

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Darwin)

iEYEARECAAYFAk2bGHEACgkQ2FnmJDTfRPcVfACfR3MqlJf3UVd5Ord4u7mPcosH
TCgAoJmKu/KCBE5U48ulKcoxrBMsyh4T
=pG1N
-----END PGP SIGNATURE-----

Lukas Kahwe Smith

Apr 12, 2011, 9:38:04 AM
to symfony-...@googlegroups.com, David Buchmann
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Aloha,

So https://github.com/doctrine/phpcr-odm/pull/12 has been merged.
As a side effect choosing an ID strategy is now also possible.

I have updated the LiipHelloBundle accordingly:
https://github.com/liip/HelloBundle/commit/0feadf10aa3f90610572e5b7f45358df71241da1

For existing code the only thing that really needs to be done is:
https://github.com/liip/HelloBundle/commit/0feadf10aa3f90610572e5b7f45358df71241da1#L1R11

The rest of the change set just shows how to use the RepositoryIdGenerator strategy.

regards,
Lukas


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Darwin)

iEYEARECAAYFAk2kVbwACgkQ2FnmJDTfRPecvQCgiw679iG1cj8LtiVkEzbrcMEB
7mkAn05WnfeoEIPFUZcbcj/y6IbCb17n
=eqPo
-----END PGP SIGNATURE-----

Benjamin Eberlei

Apr 12, 2011, 9:41:56 AM
to symfony-...@googlegroups.com
That's good!