Modified Preorder Tree Traversal

Bjarte Skogøy

unread,

Sep 11, 2012, 4:13:09 AM9/11/12

to rav...@googlegroups.com

Hi,

I'm looking into storing some hierarchical data in RavenDB using this concept: http://www.codeproject.com/Articles/31669/Hierarchical-Tree-Represented-by-Modified-Preorder

I'm not sure how/where to implement and when to execute the logic of modifying the left and right columns. I'm thinking that the most "correct" way of doing it would be to create a PUT-trigger and update the left and right values in AfterPut, to ensure that everything happens in one transaction, but I'm not sure what the consequences of this would be (stale indexes, other updates at the same time, and so on).

Have anyone else solved this? Any thoughts or objections?

Oren Eini (Ayende Rahien)

unread,

Sep 11, 2012, 5:09:41 AM9/11/12

to rav...@googlegroups.com

Why would you want to do things this way?

This is a solution for a relational database.

What is it that you are trying to do?

Bjarte Skogøy

unread,

Sep 11, 2012, 7:05:17 AM9/11/12

to rav...@googlegroups.com

:) I'm not quite sure, actually. I need to be able to store different kind of trees.

And one of these kinds can potentially contain a lot of levels. In one way it's very similar to a file system. I've got folders with sub-folders and content (not being folders), kind of. I've looked into storing everything in one document. But that won't quite be the right thing. I have to be able to load a sub-part of the tree. The tree will potentially be too large to load everything at once, so I would like to be able to lazy-load sub-folders.

I've also looked into storing the entire parent-path in each folder-document. Storing both the parent_id and generating the parent-path on updates would maybe do what I want? But this solution, as well as the Modified Preorder Tree Traversal, will require a lot of processing when changing the folder structure.

I've also considered another approach; I wouldn't mind if moving a folder around in the structure "locks" the entire structure when updating, but I would like to be able to change the name of the folder without this happening. To accomplish this I have wondered if it would be a solution to keep one document for organizing the folder-structure, but keeping all of the data (folder name, the folders content items etc) in another document. Would look something like this:

folderstructure/1

{

"Package": "Resource collection 1",

"Folders": [

{

"ThisFolder": "folder/1",

"Children": [

{

"ThisFolder": "folder/2",

"Children": []

}

]

}

]

}

folder/1

{

"Name":"Documentation",

"Content": [

"content/2",

"content/19"

]

}

This would also allow me to place a folder in multiple folder structures. This would in a way solve my problem, but would this be a stupid way of using a document-database? Given the extensive usage of document-links ...

What do you think?

Oren Eini (Ayende Rahien)

unread,

Sep 11, 2012, 7:07:17 AM9/11/12

to rav...@googlegroups.com

Let us take several steps back.

What are you _doing_?

Bjarte Skogøy

unread,

Sep 11, 2012, 7:26:10 AM9/11/12

to rav...@googlegroups.com

From the start ...

We have some million content-items. Being text, images, audio and so on.

All of these items are now, in small portions, part of desktop installations and distributed in access databases. We want to have all of our data online and could of course merge everything together into an mssql-database and keep things as they are. But some of the underlying structures are approaching an age of 20 years and is ready for rewriting. We are evaluating RavenDB in other solutions and would like to use it in this one too.

The main issue is to decide how to save a folder structure where the structure and usages of it does not invite to storing everything in one document. We need to support that a folder can be placed in multiple parent-folders. Folders can in theory have a very large count of both sub-folders and content items.

Oren Eini (Ayende Rahien)

unread,

Sep 11, 2012, 7:52:15 AM9/11/12

to rav...@googlegroups.com

Okay, so far, so good.

Now, let us assume that we can treat each content item as its own document.

Is there a reason not to have something like:

Parents: ["folders/123", "folders/431"]

Etc...

You make a distinction from folder name and id, so renaming a folder is cheap.

You can have multiple parents, and you store only the immediate parent.

What sort of operations do you need to do on those?

Bjarte Skogøy

unread,

Sep 11, 2012, 8:25:24 AM9/11/12

to rav...@googlegroups.com

So you're thinking of saving a list of parents in both folders and content items? That would work.

I need to fill a tree in the UI from the folder structure, but that will be easy. I would also need a way of checking if a content item is a child (at any level) of a given folder. How would I do that?

Chris Marisic

unread,

Sep 11, 2012, 8:54:16 AM9/11/12

to rav...@googlegroups.com

If it's really content items like you say, it sounds like this should be stored directly in the file system to begin with.

Bjarte Skogøy

unread,

Sep 11, 2012, 12:07:32 PM9/11/12

to rav...@googlegroups.com

The actual content(the binary data) of the content items will be stored in the cloud.

Chris Marisic

unread,

Sep 11, 2012, 1:08:16 PM9/11/12

to rav...@googlegroups.com

With that being said, I really love RavenDB and strongly push people to it. However since the graph support for RavenDB was abandoned, I would have to atleast suggest taking a look a db that directly understands hierarchical information, MS sql server 2008+ does, Neo4j are 2 of the top picks.

Bjarte Skogøy

unread,

Sep 12, 2012, 2:19:20 AM9/12/12

to rav...@googlegroups.com

What was the difficulties/issues of implementing graph support in RavenDB?

Oren Eini (Ayende Rahien)

unread,

Sep 12, 2012, 2:27:01 AM9/12/12

to rav...@googlegroups.com

Time

Mostly

Bjarte Skogøy

unread,

Sep 12, 2012, 3:06:59 AM9/12/12

to rav...@googlegroups.com

Ok. Would this be a good solution for hierarchical documents: http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene? Not sure if it's implemented in Lucene.Net yet, but I think it's implemented in the Java version.

Is the source for the start of graph support available somewhere?

Oren Eini (Ayende Rahien)

unread,

Sep 12, 2012, 3:12:58 AM9/12/12

to rav...@googlegroups.com

That feature would be extremely useful to have.

I am not sure it is relevant to what you are doing, but if it is there, I'll use it in a heart beat.

Bjarte Skogøy

unread,

Sep 12, 2012, 3:35:08 AM9/12/12

to rav...@googlegroups.com

https://issues.apache.org/jira/browse/LUCENE-2454

Matt Warren

unread,

Sep 12, 2012, 5:29:18 AM9/12/12

to rav...@googlegroups.com

Also take a look at http://blog.mikemccandless.com/2012/01/searching-relational-content-with.html and http://blog.mikemccandless.com/2012/01/tochildblockjoinquery-in-lucene.html.

It seems like some of this functionality got added to Lucene 3.4.0 and IIRC Lucene.NET is currently 3.0.3

Oren Eini (Ayende Rahien)

unread,

Sep 12, 2012, 5:43:40 AM9/12/12

to rav...@googlegroups.com

I can already think of several very interesting ways to use this.

Anyone here want to do the porting of this feature to Lucene?

Bjarte Skogøy

unread,

Sep 13, 2012, 7:10:20 AM9/13/12

to rav...@googlegroups.com

Could you elaborate a little on what you're thinking of as interesting ways to use this?

Oren Eini (Ayende Rahien)

unread,

Sep 13, 2012, 8:09:55 AM9/13/12

to rav...@googlegroups.com

Depending on how what we can do with the API.

I would like to see if I can merge results from multiple maps to allow us to do searches across entities.

Reply all

Reply to author

Forward