Re: [PHPCR] extending nt:resource

Lukas Kahwe Smith

Dec 4, 2012, 9:30:09 AM
to phpcr...@googlegroups.com

On Dec 4, 2012, at 14:50 , zniaznwuz <znia...@gmail.com> wrote:

> Hi,
>
> I hope this is the right place to ask, but I'm kind of stuck on a problem regarding PHPCR/Doctrine ODM. I'm using PHPCR as a pure file system in my web application, and it worked well with Doctrine DBAL. I now have to store large binary files (>300 MB), and this pollutes my database quite a bit, making backups more difficult; on top of that, the memory consumption is huge (about 700 MB for a 100 MB file). Therefore I tried out Midgard2, which should store its blobs in a folder. Midgard2 broke almost all functionality for my application. I debugged a few use cases, but I realised it would take a really long time to fix them all (because I couldn't find satisfactory documentation). Therefore I moved on to Apache Jackrabbit, which worked quite well out of the box (memory consumption only 200 MB for a 100 MB file). My problem now is that I was extending the nt:resource node of my files to add a multiple key/value property to store meta information. This worked with Doctrine DBAL, but not with Jackrabbit, which says Node type definition does not allow to set the property with name . My (I hope not too fuzzy) questions are:

Sounds like there are some omissions in the node type validation of the Doctrine DBAL implementation in terms of allowed properties.
Adding a config option to store larger binary data outside of the RDBMS is certainly something to consider and might not even be that hard to implement.

> 1) What is the proper way to add custom fields to defined nodeTypes (e.g. nt:resource, nt:file, etc)

Extending the node type definition.
nt:resource is defined like this:
http://wiki.apache.org/jackrabbit/nt%3Aresource

You could extend it via node type inheritance.

However you could also define a new node type that inherits from nt:hierarchyNode:
http://wiki.apache.org/jackrabbit/nt%3Afile

I am not sure if we have an example. You will need to write your node type definition either in the CND format or write a script using the NodeTypeManager. For inspiration have a look at:
https://github.com/phpcr/phpcr-utils/blob/master/src/PHPCR/Util/Console/Command/RegisterNodeTypesCommand.php
https://github.com/doctrine/phpcr-odm/blob/master/lib/Doctrine/ODM/PHPCR/Tools/Console/Command/RegisterSystemNodeTypesCommand.php
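
As a rough sketch of the inheritance approach, a CND file could look something like the following. The `app` prefix, its namespace URI, and the names `app:resource` and `app:meta` are made up for illustration; only nt:resource comes from the discussion above.

```cnd
// Hypothetical namespace for application-specific types
<app = 'http://example.com/ns/app/1.0'>

// Hypothetical type: everything nt:resource allows,
// plus one multi-valued string property for meta information
[app:resource] > nt:resource
  - app:meta (STRING) multiple
```

With PHPCR this would typically be registered through the NodeTypeManager (for example with registerNodeTypesCnd()), and the jcr:content node of each file would then use app:resource instead of nt:resource as its primary type.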

> 2) Every implementation so far just loads the binary jcr:data into a PHP variable, causing memory consumption to get really high. Is it planned to change this behaviour to real streaming, making working with large objects possible?
>
> I know these questions are rather implementation specific, but I hope that some people here may have knowledge about them.

PHPCR mandates that implementations are able to consume and return streams, but it's their choice whether to lazy load binary data. In Jackalope it should by default lazy load them into a stream if the "jackalope.disable_stream_wrapper" option is not set to true. This at least ensures that the binary data is not read from the server until it's actually accessed. Now the stream should also allow outputting without needing to store the data in a variable, but maybe we have an issue in there.

regards,
Lukas Kahwe Smith
m...@pooteeweet.org



Lukas Kahwe Smith

Dec 4, 2012, 9:34:24 AM
to phpcr...@googlegroups.com

On Dec 4, 2012, at 15:30 , Lukas Kahwe Smith <m...@pooteeweet.org> wrote:

>> 2) Every implementation so far just loads the binary jcr:data into a PHP variable, causing memory consumption to get really high. Is it planned to change this behaviour to real streaming, making working with large objects possible?
>>
>> I know these questions are rather implementation specific, but I hope that some people here may have knowledge about them.
>
> PHPCR mandates that implementations are able to consume and return streams, but it's their choice whether to lazy load binary data. In Jackalope it should by default lazy load them into a stream if the "jackalope.disable_stream_wrapper" option is not set to true. This at least ensures that the binary data is not read from the server until it's actually accessed. Now the stream should also allow outputting without needing to store the data in a variable, but maybe we have an issue in there.


I just reviewed the code of Jackalope Doctrine DBAL, so for writing we currently do not support streaming even if the RDBMS supports it. For reading we should be using lazy loading streams if the RDBMS supports it (which afaik is the case for PostgreSQL but not for MySQL). With Jackrabbit we definitely support streaming on reads, but not on writes.

Piotr Pokora

Dec 4, 2012, 9:36:33 AM
to phpcr...@googlegroups.com
Hi!

> Midgard2 broke almost all functionality

I'll be glad to fix them as soon as I know them :)
https://github.com/midgardproject/phpcr-midgard2/issues

Piotras