Hi,
I have been using Titan for a while and have recently moved to JanusGraph. Thanks for revitalising the project!
I am using JanusGraph to analyse social chat that generates approximately 7 million messages a day. I want to keep one week's worth of data and, when loading today's data, purge the messages that are more than 7 days old. Are there any recommendations for performing this kind of bulk delete of edges?
I am using Cassandra as the backend store, and I model messages as edges with a timestamp property. I have a vertex-centric edge index on the timestamp property, and I also have an Elasticsearch index on the same property. I have been running drops in batches of 10,000 and committing the transaction after each batch, trying to use one of the indexes to do this at speed.
The two approaches I have tried are:
g.V().bothE('message').has('timestamp', lt(start_date)).limit(10000).drop().iterate() // uses the edgeIndex
g.E().has('timestamp', lt(start_date)).limit(10000).drop().iterate() // uses the elasticsearch index
Both of these are proving to be very slow, so I am probably missing something?
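For context, the batching loop around either query looks roughly like the sketch below. It is written in Groovy for the Gremlin console and assumes `graph` is the open JanusGraph instance. The `purgeOldMessages` helper and its parameter names are only illustrative, and it fetches each batch with `toList()` and removes the edges one by one instead of calling `drop()`, so treat it as an outline of the pattern and not the exact code I run.

```groovy
import org.apache.tinkerpop.gremlin.process.traversal.P

// Hypothetical helper (name and parameters are illustrative):
//   graph     - an already opened JanusGraph instance
//   cutoff    - timestamp; edges with 'timestamp' below this are purged
//   batchSize - number of edges removed per transaction
def purgeOldMessages(graph, cutoff, int batchSize = 10000) {
    def g = graph.traversal()
    long total = 0
    while (true) {
        // Fetch at most one batch of expired edges
        def batch = g.E().has('timestamp', P.lt(cutoff)).limit(batchSize).toList()
        if (batch.isEmpty()) break

        // Remove each edge, then commit so every transaction stays small
        batch.each { it.remove() }
        graph.tx().commit()
        total += batch.size()
    }
    return total
}
```

It would be called as `purgeOldMessages(graph, start_date)`, with `start_date` being the same cutoff value used in the two queries above.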
Thanks for any advice!
Kevin