Problematic purge (order of operations)

40 views
Skip to first unread message

Lars Janssen

unread,
Apr 10, 2013, 2:43:58 PM4/10/13
to symfony-...@googlegroups.com
Hi all,

I'm using PHPCR with Jackalope/Jackrabbit.

While messing around with fixtures and cloning across workspaces, I forgot to purge and managed to get myself some same name siblings (SNS):

/cr
/cr[2]
/cr[3]
etc.

So, the problem: given the above, PHPCRPurger->purge() results in a PHPCR\PathNotFoundException (via Jackrabbit HTTP 409) for node /cr[n] (where 'n' is around halfway down the list).

I thought this might be reasonably simple to solve, but then I started digging around... ;)

The problem can be observed in Jackalope\Transport\Jackrabbit\Client::finishSave() [1], although this method is as much the messenger as the culprit. In fact, I eventually came across a post on the jackrabbit-dev list by Christian Stocker from just over a year ago [2], so it seems I wasn't the first to come across this.

Here's an example of the string the Jackalope-Jackrabbit transport sends in my failing case:
    :diff=-/cr : \r-/cr[2] : \r-/cr[3] : 

If I hard-wire that for a moment, but in reverse, i.e.
:diff=-/crr[3] : \r-/cr[2] : \r-/cr : 
 - then it works!

So, a bit of fiddly string manipulation and we're done, right? Well, when I looked at the transport client, there are a number of possible operations, with these prefixes:

Delete: -
Move: >
Store property: ^
Create: +

(I might have missed/misunderstood some).

If there are only deletes, as with the purge case, then just changing the order is ok. But what about a mixture of deletes, moves, creates etc? In the mailing list exchange mentioned above, it was suggested that this possibly should be handled on the server side. A Jira issue was mentioned but I couldn't find this.

So should we attempt to resolve the order of operations in the transport, before sending the request? Or even in the object manager's save() method? I think this will be very difficult to get right in all cases though.

I'm not sure if it's related, but I've had similar problems in the the past with a form that allowed renaming multiple nested blocks. To get this working I had to carefully order things around in my own code, and perform a manager->flush() part-way through.

For the purging use case though, I think the expected behaviour would be that it will always result in an empty workspace. Perhaps a work-around for PHPCRPurger is in order? 

For example, add a 'deep' option (or similar):
    public function purge($deep = true)

and
    purgeWorkspace(SessionInterface $session, $deep = true)

This latter function would then do a session->save() and a fresh $root->getNodes() each time (perhaps only if SNS are detected). Probably not good for performance, but only significant if the root node has a lot of SNS children.

Failing that, it's not too hard to do effectively the same thing in my own application's fixtures loader/purger, but I suspect this will affect other people too.


Hoping this all makes sense!

Thanks,

Lars.

Lars Janssen

unread,
Apr 10, 2013, 4:36:18 PM4/10/13
to symfony-...@googlegroups.com
Hi all,

Just in case anyone else wants a work-around for this, I've made a simple command here:


It should be easy to get this working, just change the namespace and argument to setName() to suit your needs.

Thanks,

Lars.

Lukas Smith

unread,
Apr 10, 2013, 4:43:35 PM4/10/13
to symfony-...@googlegroups.com
Hi,

a better solution is to just disallow same name siblings entirely as we do not support them. maybe we should even provide an "phpcr:unstructured" node type exactly with this configured.

i think this CND definition would do just that

<foo='http://www.foo.com/jcr/bar/1.0'>
[foo:unstructured]
orderable
- * (UNDEFINED) multiple
- * (UNDEFINED)
+ * (nt:base) = foo:unstructured VERSION

regards,
Lukas
> --
> You received this message because you are subscribed to the Google Groups "symfony-cmf-devs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to symfony-cmf-de...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>

Lars Janssen

unread,
Apr 10, 2013, 5:04:19 PM4/10/13
to symfony-...@googlegroups.com
Hi,

Good idea, I'll give that a go.

Even though same-name siblings are not supported, I think this is how I created them:

 - Clone a referenceable node "/cr" from "preview" workspace to "published"
 - Purge the "preview" workspace, but *not* the "published" one
 - Create a new referenceable node "/cr" in the "preview" workspace
   (now, even though the nodes have the same path, they have different UUIDs)
 - Clone the "/cr" node over again
   (this doesn't cause an error, or overwrite the destination node, it creates one with the same name)

So, I can avoid it using your suggestion and/or clean it up using the above Gist that I created.

Should we also add a check in session->cloneFrom() and throw a NotImplementedException if the node already exists with a different UUID? Or will that just cause problems later?

Thanks,

Lars.

Lukas Kahwe Smith

unread,
Apr 10, 2013, 5:07:22 PM4/10/13
to symfony-...@googlegroups.com

On Apr 10, 2013, at 23:04 , Lars Janssen <la...@fazy.net> wrote:

> Hi,
>
> Good idea, I'll give that a go.
>
> Even though same-name siblings are not supported, I think this is how I created them:
>
> - Clone a referenceable node "/cr" from "preview" workspace to "published"
> - Purge the "preview" workspace, but *not* the "published" one
> - Create a new referenceable node "/cr" in the "preview" workspace
> (now, even though the nodes have the same path, they have different UUIDs)
> - Clone the "/cr" node over again
> (this doesn't cause an error, or overwrite the destination node, it creates one with the same name)
>
> So, I can avoid it using your suggestion and/or clean it up using the above Gist that I created.
>
> Should we also add a check in session->cloneFrom() and throw a NotImplementedException if the node already exists with a different UUID? Or will that just cause problems later?

It might make sense .. for import there are different modes of operation that allow you to decide what to do in this case, not sure if that is also available for clone.


regards,
Lukas Kahwe Smith
sm...@pooteeweet.org



David Buchmann

unread,
Apr 12, 2013, 7:35:09 AM4/12/13
to symfony-...@googlegroups.com
ideally, we would make jackalope-jackrabbit check if target nodes
already exists for the clone operation. it is an error of
jackalope-jackrabbit to ever create same-name-siblings as it is not
supported. i guess on jackalope-doctrine-dbal it can't ever happen as
the underlying database prevents it (apart from the fact that cloneFrom
is not implemented yet).
the only case where no exception should be is if we have
$removeExisting=true AND both the existing and the new node have the
same uuid.

do you want to do a PR to fix this, lars? i would fix it in the
jackalope-jackrabbit client, as its a storage backend issue that
jackrabbit can do sns while jackalope can't.

we could additinoally offer a phpcr:unstructured node type for people
who want to be forward compatible with future jackalope versions that
might support same-name siblings. but basically we should not require
people to use that to work around bugs in jackalope-jackrabbit.

i think having a fix to properly delete same-name-siblings if they
happen to be created would make sense. phpcr-utils sounds the wrong
place, maybe just in finishSave of jackalope-jackrabbit detect the sns
case and reorder sns? its not perfect but will fix i.e. the purge use
case. (btw, the real problem is that for jackalope, the [x] part is just
part of the nodename. if we would support sns, right after the delete,
the index of the other nodes would be decreased, ending up in several
"-/cr" without index, which would be what jackrabbit seems to expect)

cheers,david
--
Liip AG // Agile Web Development // T +41 26 422 25 11
CH-1700 Fribourg // PGP 0xA581808B // www.liip.ch

Lars Janssen

unread,
Apr 12, 2013, 10:04:40 AM4/12/13
to symfony-...@googlegroups.com
Hi all,

On Friday, 12 April 2013 12:35:09 UTC+1, David Buchmann wrote:
ideally, we would make jackalope-jackrabbit check if target nodes
already exists for the clone operation. it is an error of
jackalope-jackrabbit to ever create same-name-siblings as it is not
supported. i guess on jackalope-doctrine-dbal it can't ever happen as
the underlying database prevents it (apart from the fact that cloneFrom
is not implemented yet).
the only case where no exception should be is if we have
$removeExisting=true AND both the existing and the new node have the
same uuid.

Agreed. 
 
do you want to do a PR to fix this, lars? i would fix it in the
jackalope-jackrabbit client, as its a storage backend issue that
jackrabbit can do sns while jackalope can't.

Sure, I'll give it a go.

I wonder what is the best exception to use in the case where the user tries to clone over an existing node (same name, different or no uuid)? Think I suggested NotImplementedException before, but if the we don't offer SNS support, then is something like RepositoryException more appropriate?
 
we could additinoally offer a phpcr:unstructured node type for people
who want to be forward compatible with future jackalope versions that
might support same-name siblings. but basically we should not require
people to use that to work around bugs in jackalope-jackrabbit.

How would Jackalope-Jackrabbit behave with this node type? i.e. if the user tried to clone over an existing, non-corresponding node?

i think having a fix to properly delete same-name-siblings if they
happen to be created would make sense. phpcr-utils sounds the wrong
place, maybe just in finishSave of jackalope-jackrabbit detect the sns
case and reorder sns? its not perfect but will fix i.e. the purge use
case. (btw, the real problem is that for jackalope, the [x] part is just
part of the nodename. if we would support sns, right after the delete,
the index of the other nodes would be decreased, ending up in several
"-/cr" without index, which would be what jackrabbit seems to expect)

Ok, I guess it's better to partly fix the problem than not at all. ;) I'll take a look at this one too.

Thanks,

Lars.

David Buchmann

unread,
Apr 12, 2013, 10:47:49 AM4/12/13
to symfony-...@googlegroups.com
> I wonder what is the best exception to use in the case where the user
> tries to clone over an existing node (same name, different or no uuid)?
> Think I suggested NotImplementedException before, but if the we don't
> offer SNS support, then is something like RepositoryException more
> appropriate?

i think it should be ItemExistsException which extends the
RepositoryException. sns is an optional feature and jackalope does not
claim to support it.

>
> we could additinoally offer a phpcr:unstructured node type for people
> who want to be forward compatible with future jackalope versions that
> might support same-name siblings. but basically we should not require
> people to use that to work around bugs in jackalope-jackrabbit.
>
>
> How would Jackalope-Jackrabbit behave with this node type? i.e. if the
> user tried to clone over an existing, non-corresponding node?

if we have no bugs in jackalope then it would make no difference as we
don't support sns anyways. it will only help whan jackalope is telling
jackrabbit to do something that results in a sns. then jackrabbit will
throw some exception, hopefully the ItemExistsException, and tell that
sns are not allowed here.

> i think having a fix to properly delete same-name-siblings if they
> happen to be created would make sense. phpcr-utils sounds the wrong
> place, maybe just in finishSave of jackalope-jackrabbit detect the sns
> case and reorder sns? its not perfect but will fix i.e. the purge use
> case. (btw, the real problem is that for jackalope, the [x] part is
> just
> part of the nodename. if we would support sns, right after the delete,
> the index of the other nodes would be decreased, ending up in several
> "-/cr" without index, which would be what jackrabbit seems to expect)
>
>
> Ok, I guess it's better to partly fix the problem than not at all. ;)
> I'll take a look at this one too.

cool. just be sure to not reorder anything else than sns directly
following each other. anything else could lead to weird bugs. if we have
delete for /path/other and /path the order must be kept. if we have
delete /path/node, move /path/other -> /somewhere, delete /path the
order matters.

cheers,david

Lars Janssen

unread,
Apr 12, 2013, 2:28:38 PM4/12/13
to symfony-...@googlegroups.com
Hi,

On Friday, 12 April 2013 15:47:49 UTC+1, David Buchmann wrote:
i think it should be ItemExistsException which extends the
RepositoryException. sns is an optional feature and jackalope does not
claim to support it.

No problem there.

However, it seems working across workspaces has thrown up another challenge.

There's already (from my previous PR) some logic in Jackalope\ObjectManager::cloneFromImmediately that checks for an existing destination node. I need to expand the logic as follows:

1. removeExisting = true
  a. check for existence of an existing, non-corresponding item at the destination path
    
2. removeExisting = false
  a. check for existence of an existing item of any type at the destination path
  b. check for existence of an existing item with the same uuid anywhere in the destination workspace

1a and 2a should be ok.

For 2b, I can't work backwards from the destination. In the destination session I only know the intended path (but there might not be a node there). The only way is to get the source node from the source workspace.

So, how can I get something from the source workspace, when the user has called this from the destination object manager/session?

We had a similar issue with Node::getCorrespondingNodePath, resolved by adding a workspaceName parameter to the method in the transport interface. I looked at doing a similar thing in one of two ways:

- Use Jackalope\ObjectManager::getNodeByPath - but this does a lot of caching (by path) which would clash.
- Call getNode on the transport - but I notice this gives back a \stdClass object and hasn't been 'hydrated' (if that's the right term?) to the Jackalope\Node type.
- Create a completely new session based on the existing one; however as before I'm not sure if this is possible/appropriate from within the ObjectManager.

Also, with the first two of the above options, there is the need to add another $workspace = null option onto the end of a TransportInterface method. Doing this on some of the methods but not all seems a bit inconsistent.

Any thoughts?

cheers,

Lars.

Lars Janssen

unread,
Apr 16, 2013, 5:56:07 AM4/16/13
to symfony-...@googlegroups.com
Hi all,

Ok, I managed to avoid most of the complexity mentioned in my last posting. Some of the cases mentioned are already handled by Jackrabbit itself, so I just had to worry about "filling in the gaps". Thanks to David for being helpful as usual...

I think the latest PR in https://github.com/jackalope/jackalope-jackrabbit/pull/41 should do the trick, all being well.

Thanks,

Lars.

Lars Janssen

unread,
Apr 18, 2013, 7:14:14 AM4/18/13
to symfony-...@googlegroups.com
Just a final update here, this is all merged in now. Thanks for your help & patience David! :)

So, it should no longer be possible to create same name siblings by cloning across workspaces. If you try to purge an existing repository that has same name siblings at the root level, the purge should no longer fail.

David Buchmann

unread,
Apr 18, 2013, 7:23:46 AM4/18/13
to symfony-...@googlegroups.com, Lars Janssen
thanks a lot for tracking it down and fixing it, lars!

and as i updated the sandbox yesterday, it hopefully is also ready there.

i plan to tag versions of jackalope tomorrow, then the fixes will also
be in tagged versions. so if you work on any open RPs on phpcr-utils and
jackalope, lets finish them :-)

cheers,david
> --
> You received this message because you are subscribed to the Google
> Groups "symfony-cmf-devs" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to symfony-cmf-de...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>

Lars Janssen

unread,
Apr 19, 2013, 4:39:15 AM4/19/13
to symfony-...@googlegroups.com, Lars Janssen
The only open PR from my point of view is https://github.com/phpcr/phpcr-api-tests/pull/85 - but it'll probably be a few weeks before I get a chance to look into this issue again (I'm doing preview first in my project, then versions later).
Reply all
Reply to author
Forward
0 new messages