Titan - ElasticSearch SET Cardinality


Vijai Shankar Natarajan

Nov 22, 2016, 7:05:41 PM
to Aurelius
I have a question about Titan with Elasticsearch (ES) as the search index.
According to the code here https://github.com/thinkaurelius/titan/blob/titan11/titan-es/src/main/java/com/thinkaurelius/titan/diskstorage/es/ElasticSearchIndex.java#L662 duplicates will be added to Elasticsearch even if the property's cardinality is SET.

To test this, I created a vertex and added a new property with SET cardinality, then added values that included duplicates. Querying the vertex back through the graph returns the property values without duplicates, but I still see duplicates in the Elasticsearch index. To make sure there are no duplicates in ES, the script needs to do a ".contains" check before adding, right?

Am I missing something?
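
To make the ".contains" idea concrete, here is a minimal sketch of the check I have in mind. It is plain Java and does not touch Titan or Elasticsearch: the list only stands in for the array of values held in one ES document field, and the class name SetFieldDedup and the method addIfAbsent are hypothetical names made up for illustration, not anything from the Titan code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Titan: the list models the array of
// values stored in a single ES document field.
class SetFieldDedup {

    // Appends value only if the field does not already hold it.
    // Returns true when the value was added, false when it was a duplicate.
    static boolean addIfAbsent(List<Object> fieldValues, Object value) {
        if (fieldValues.contains(value)) {
            return false;
        }
        fieldValues.add(value);
        return true;
    }

    public static void main(String[] args) {
        List<Object> field = new ArrayList<>();
        // Input with duplicates, as in my test with the SET property.
        for (String v : Arrays.asList("a", "b", "a", "c", "b")) {
            addIfAbsent(field, v);
        }
        System.out.println(field); // prints [a, b, c]
    }
}
```

Without the check, the same input would leave five entries in the field instead of three, which is what I seem to be observing in the index.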

thomas prelle

Nov 23, 2016, 7:15:14 PM
to Aurelius
Yes, you are right. It wastes space in ES, but the query result is still correct: when ES searches for a document, it does not distinguish between a document that contains the matching value once and a document that contains it many times.

For TitanDB, I think pushing duplicate values to the index is useless even for properties with LIST cardinality.
