Traverse query question

Thomas Müller

Mar 8, 2015, 11:22:22 AM
to orient-...@googlegroups.com
Hello,

For testing, I generated a category hierarchy ("V_Category") with 4 levels and 4 categories per level, starting from one ROOT category (#17:0). The hierarchy is built with "E_CH" category hierarchy edges ( parentCategory.addEdge(EEdges.E_CH.name(), childCategory); ), for a total of 1364 categories.

I also generated 500 documents, each randomly assigned 1 to 10 of the above categories:

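
    // Assuming Commons Lang 2 RandomUtils: nextInt(9) returns 0..8, so this yields 1..9.
    nextInt = RandomUtils.nextInt(9) + 1;
    for (int j = 0; j < nextInt; j++) {
        // Link the document to a randomly chosen category vertex.
        OrientVertex orientVertex = nextRandomCategory().getOrientVertex();
        document.addEdge(EEdges.E_Category.name(), orientVertex);
    }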



I want to select the number of documents assigned to a category. What I tried is:

select count(in('E_Category')) from (traverse out('E_CH'), in('E_Category') from #17:0)

I expected to get 500, since I have 500 documents. Instead I get 1866, which is 1364 (number of categories) + 500 (number of documents) + 2 (?). Even if I change the number of categories, this formula seems to hold: number of categories + number of documents + 2. The good thing: duplicates (one document may have several categories) already seem to be removed, so no distinct is necessary.

How do I have to change the query to get only the number of documents that are assigned one or more categories from the given root category and all of its child categories?
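
The observed formula (categories + documents + a small constant) suggests that the traverse returns every vertex it visits, categories included, and that the outer count then counts all of them. The following is only a minimal in-memory sketch of that behaviour, not OrientDB code: the Vertex class, the isDocument flag, and the sample graph are hypothetical stand-ins. It shows that counting all visited vertices mixes categories with documents, while filtering on the vertex type gives the document count alone.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class TraverseCountSketch {

    // Hypothetical in-memory stand-in for a vertex; this is not the OrientDB API.
    static final class Vertex {
        final String name;
        final boolean isDocument;
        final List<Vertex> outCh = new ArrayList<>();      // mimics out('E_CH')
        final List<Vertex> inCategory = new ArrayList<>(); // mimics in('E_Category')

        Vertex(String name, boolean isDocument) {
            this.name = name;
            this.isDocument = isDocument;
        }
    }

    // Creates a category and links it below its parent (null for the root).
    static Vertex category(String name, Vertex parent) {
        Vertex c = new Vertex(name, false);
        if (parent != null) {
            parent.outCh.add(c);
        }
        return c;
    }

    // Creates a document and registers it as an incoming E_Category neighbour of each category.
    static void document(String name, Vertex... categories) {
        Vertex d = new Vertex(name, true);
        for (Vertex c : categories) {
            c.inCategory.add(d);
        }
    }

    // Tiny sample graph: 7 category vertices (root included) and 3 documents.
    static Vertex buildSampleRoot() {
        Vertex root = category("ROOT", null);
        Vertex c1 = category("1", root);
        Vertex c2 = category("2", root);
        Vertex c11 = category("1.1", c1);
        Vertex c12 = category("1.2", c1);
        category("2.1", c2);
        Vertex c22 = category("2.2", c2);
        document("doc1", c11, c2);
        document("doc2", c11);
        document("doc3", root, c22, c12);
        return root;
    }

    // Follows both edge kinds from the start vertex and returns each visited vertex once.
    static Set<Vertex> traverse(Vertex start) {
        Set<Vertex> visited = new LinkedHashSet<>();
        Deque<Vertex> stack = new ArrayDeque<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            Vertex v = stack.pop();
            if (!visited.add(v)) {
                continue; // already seen, so duplicates never appear in the result
            }
            for (Vertex child : v.outCh) {
                stack.push(child);
            }
            for (Vertex doc : v.inCategory) {
                stack.push(doc);
            }
        }
        return visited;
    }

    // Counts only the document vertices in a traversal result.
    static long countDocuments(Set<Vertex> visited) {
        return visited.stream().filter(v -> v.isDocument).count();
    }

    public static void main(String[] args) {
        Set<Vertex> visited = traverse(buildSampleRoot());
        System.out.println("all visited vertices: " + visited.size());    // 10
        System.out.println("documents only: " + countDocuments(visited)); // 3
    }
}
```

In OrientDB SQL the equivalent would presumably be a filter on the document class in the outer select. The document class name is not given in this thread, so that part is left open.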

Luca Garulli

Mar 9, 2015, 9:56:32 PM
to orient-database
Hi Thomas,
Could you describe your graph better? 

Lvc@


Thomas Müller

Mar 10, 2015, 4:50:20 AM
to orient-...@googlegroups.com
Hi Luca,

Thanks for your reply!

My category graph looks like this:

Level 1:
(#17:0_ROOT_Categorie)-['E_CH']->(#17:1_1.Category)
(#17:0_ROOT_Categorie)-['E_CH']->(#17:2_2.Category)
(#17:0_ROOT_Categorie)-['E_CH']->(#17:3_3.Category)
(#17:0_ROOT_Categorie)-['E_CH']->(#17:4_4.Category)


...
Level 2:
(#17:1_1.Category)-['E_CH']->(#17:11_1.1.Category)
(#17:1_1.Category)-['E_CH']->(#17:12_1.2.Category)
(#17:1_1.Category)-['E_CH']->(#17:13_1.3.Category)
(#17:1_1.Category)-['E_CH']->(#17:14_1.4.Category)


...
Level 3:
(#17:11_1.1.Category)-['E_CH']->(#17:111_1.1.1.Category)
(#17:11_1.1.Category)-['E_CH']->(#17:112_1.1.2.Category)
(#17:11_1.1.Category)-['E_CH']->(#17:113_1.1.3.Category)
(#17:11_1.1.Category)-['E_CH']->(#17:114_1.1.4.Category)


... and so on... 4 levels deep and 4 categories per level
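
A quick sanity check on the hierarchy size, sketched under two assumptions that the thread does not confirm: every category has exactly 4 children, and the root is not included in the reported total. The class and method names are hypothetical. Under that reading, 4 levels below the root would give 340 categories, and the reported 1364 corresponds to 5 levels below the root.

```java
public class HierarchySize {

    // Number of categories below the root when every category has `fanOut` children
    // and the tree is `levels` levels deep (the root itself is not counted).
    static long categoriesBelowRoot(int fanOut, int levels) {
        long total = 0;
        long perLevel = 1;
        for (int level = 1; level <= levels; level++) {
            perLevel *= fanOut; // categories on this level: fanOut^level
            total += perLevel;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(categoriesBelowRoot(4, 4)); // 4 + 16 + 64 + 256 = 340
        System.out.println(categoriesBelowRoot(4, 5)); // 340 + 1024 = 1364
    }
}
```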

I have assigned 1 to 10 categories to each of my 500 documents:

(#23:1_1.Doc)-['E_Category']->(#17:111_1.1.1.Category)
(#23:1_1.Doc)-['E_Category']->(#17:2_2.Category)
(#23:1_1.Doc)-['E_Category']->(#17:113_1.1.3.Category)


...
(#23:50_50.Doc)-['E_Category']->(#17:4_4.Category)
(#23:50_50.Doc)-['E_Category']->(#17:11_1.1.Category)


...
(#23:500_500.Doc)-['E_Category']->(#17:4_4.Category)
(#23:500_500.Doc)-['E_Category']->(#17:112_1.1.2.Category)
(#23:500_500.Doc)-['E_Category']->(#17:11_1.1.Category)
(#23:500_500.Doc)-['E_Category']->(#17:114_1.1.4.Category)
... and so on, 500 documents.

I want to select the number of documents that are assigned the given category or any of its child categories, without duplicates. So for the given ROOT category (#17:0), I expect my query to return 500 documents, but I got 1866, as mentioned above.



