Restore of snapshots of titan-cassandra

Liloo

Feb 26, 2014, 11:15:07 AM
to aureliu...@googlegroups.com
Hi folks!

I can't really figure out how to back up and restore Titan on a Cassandra backend.

I've taken snapshots of all the Cassandra nodes in my cluster and want to restore them on another cluster with the same number of nodes.
I'm restoring by:
- opening the properties file to recreate the schema
- creating the keys and indexes (I also tried skipping this step)
- shutting down all the nodes
- copying the snapshot folders into <data_directory_location>/<keyspace_name>
- restarting the nodes

After doing this, I'm never able to retrieve data by querying on indexes; I always get the error "Could not find type for id: "

Has anybody else had the same issue? Or how do you manage to back up and restore easily?

Matthias Broecheler

Feb 28, 2014, 6:59:15 PM
to aureliu...@googlegroups.com
Hi Liloo,

What do you mean by opening the properties file to recreate the schema? When you back up and restore all the CFs in the keyspace, the snapshot contains all the data and, in particular, the Titan schema. Your exception suggests that the schema got overwritten.

Best,
Matthias
--
You received this message because you are subscribed to the Google Groups "Aurelius" group.
To unsubscribe from this group and stop receiving emails from it, send an email to aureliusgraph...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.



--
Matthias Broecheler
http://www.matthiasb.com

Amy Nguyen

Mar 13, 2014, 12:46:09 PM
to aureliu...@googlegroups.com
We are running into the same issue here. We can successfully restore a standalone (single-node) cluster, but when we repeat the same process with a clustered Cassandra we see the same issue. We can count the vertices in Gremlin, but when we go to view the properties of a vertex, nothing is returned. All the indexes are also gone.

We take a snapshot of the Titan keyspace. Do we have to snapshot the system keyspace as well, or maybe just certain CFs in it? The issue with snapshotting the entire system keyspace is the peers definition. We tried this once and never got it to work either.

Can you detail the steps needed to back up and perform a restore of Titan running on a Cassandra cluster?

Dan LaRocque

Mar 24, 2014, 11:37:37 PM
to aureliu...@googlegroups.com
Hi,

Cassandra-snapshot-based backup and restore on 0.4.x is still kind of
"expert friendly" compared to using Faunus to read and write data, to
put it euphemistically. I would caution that snapshot-based
backup/restore, as of 0.4.2, should be done by an operator familiar with
both Titan's CF schemata and Cassandra's snapshot procedure and
filesystem layout. Snapshot works underneath Titan; it's just that all
the pointy bits are exposed and it's easy to poke an eye out.

With those caveats... here's an outline of what's involved...

Steps to take snapshot:

1. If you want consistent indices and locks, then suspend writes
2. Issue nodetool snapshot on the titan keyspace on all Cassandra nodes:
http://www.datastax.com/documentation/cassandra/1.2/cassandra/operations/ops_backup_takes_snapshot_t.html
3. Resume writes, if suspended earlier
4. Backup snapshot files to long-term storage, if desired
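
As an illustration of step 2 in the snapshot list above, here is a minimal Python sketch that builds one nodetool snapshot command line per node. It is not from the thread: the host names, the snapshot tag, and the helper name snapshot_commands are hypothetical, and the option order should be checked against the nodetool shipped with your Cassandra version. The sketch only builds the commands; it does not run them.

```python
def snapshot_commands(hosts, keyspace="titan", tag="titan-backup"):
    """Build one `nodetool snapshot` command line per Cassandra node.

    hosts, keyspace and tag are illustrative placeholders (assumptions),
    not values taken from the thread.
    """
    return [
        # -h selects the node, -t names the snapshot, the keyspace comes last
        ["nodetool", "-h", host, "snapshot", "-t", tag, keyspace]
        for host in hosts
    ]
```

Each returned list could be handed to subprocess.run(cmd, check=True) once writes have been suspended, so that every node is snapshotted under the same tag.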

Steps to restore a snapshot when the cluster already has a titan KS --
e.g. a replica died but the cluster survived and a replacement node has
been bootstrapped into the ring:

1. Follow
http://www.datastax.com/documentation/cassandra/1.2/cassandra/operations/ops_backup_snapshot_restore_t.html

Steps to restore a snapshot when the cluster is blank slate without a
titan KS -- e.g. somebody unintentionally typed drop keyspace titan into
cassandra-cli:

1. Open Titan once against the cluster and commit a vertex addition.
This should define the Titan KS and attendant CFs in the system schema.
You can turn up Titan's loglevel to verify that it's defining the KS
and CFs, or inspect the KS with cassandra-cli afterward. We're going to
blow away the data, we just want Titan to define its KS and CF entries
in the system KS.

2. Follow the same link above, creating the directory structure on any
nodes where it doesn't already exist.
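
The file-copy part of step 2 might look roughly like the Python sketch below. It rests on assumptions not stated in the thread: the backup is laid out as one subdirectory per column family (a hypothetical layout), the target follows the <data_dir>/<keyspace>/<cf>/ layout, and the Cassandra node is stopped while files are copied. The function name restore_snapshot_files is made up for illustration, and the sketch covers only the copy step, not the rest of the linked procedure.

```python
import shutil
from pathlib import Path


def restore_snapshot_files(backup_dir, data_dir, keyspace="titan"):
    """Copy backed-up SSTable files into <data_dir>/<keyspace>/<cf>/.

    Assumes backup_dir holds one subdirectory per column family (a
    hypothetical backup layout) and that the Cassandra node is stopped.
    Returns the destination paths of the copied files.
    """
    copied = []
    for cf_backup in sorted(Path(backup_dir).iterdir()):
        if not cf_backup.is_dir():
            continue
        cf_target = Path(data_dir) / keyspace / cf_backup.name
        # Create the directory structure where it doesn't already exist.
        cf_target.mkdir(parents=True, exist_ok=True)
        for sstable_file in sorted(cf_backup.iterdir()):
            if sstable_file.is_file():
                dest = cf_target / sstable_file.name
                shutil.copy2(sstable_file, dest)
                copied.append(dest)
    return copied
```

This would be run on every node with that node's own snapshot files before the nodes are restarted.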

Broadly, we are aware that backup/restore is annoying in Titan and
are looking to improve it.

thanks,
Dan

--
Dan LaRocque
http://www.thinkaurelius.com

Amy Nguyen

Mar 27, 2014, 12:08:27 AM
to aureliu...@googlegroups.com
I am curious about this step you listed below:

> 1. Open Titan once against the cluster and commit a vertex addition.
> This should define the Titan KS and attendant CFs in the system schema.
> You can turn up Titan's loglevel to verify that it's defining the KS
> and CFs, or inspect the KS with cassandra-cli afterward. We're going to
> blow away the data, we just want Titan to define its KS and CF entries
> in the system KS.

When you start Titan, it checks whether the keyspace already exists in Cassandra; if it doesn't, it creates the keyspace. So does committing a new vertex do something else besides creating the CFs?

I will say that running Titan 0.4.1 with Cassandra 1.2.x, we can just snapshot the data and then load it back via sstableloader. The steps you listed also work on a single-node Cassandra, on both Cassandra 1.2.x and 2.0.x.
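
For illustration, the sstableloader route could be scripted along these lines. This is a minimal Python sketch under assumptions not given in the thread: the snapshot files have been copied into a <keyspace>/<cf>/ directory tree (sstableloader takes the keyspace and column family names from the last two path components), and the contact host name and the helper name sstableloader_commands are hypothetical. It only builds the command lines.

```python
from pathlib import Path


def sstableloader_commands(snapshot_copy_dir, contact_host, keyspace="titan"):
    """Build one `sstableloader` command line per column family directory.

    snapshot_copy_dir is assumed to contain <keyspace>/<cf>/ directories
    holding the snapshot's SSTable files; contact_host is a placeholder
    for a live node in the target cluster.
    """
    ks_dir = Path(snapshot_copy_dir) / keyspace
    return [
        # -d gives sstableloader an initial host to discover the ring from
        ["sstableloader", "-d", contact_host, str(cf_dir)]
        for cf_dir in sorted(ks_dir.iterdir())
        if cf_dir.is_dir()
    ]
```

Unlike copying files into the data directory, this streams the data to the running target cluster, so the keyspace and CFs have to exist there first.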

Dan LaRocque

Apr 2, 2014, 6:09:54 PM
to aureliu...@googlegroups.com
On 03/27/2014 12:08 AM, Amy Nguyen wrote:
> When you start titan it checks to see if the keyspace already exists in
> Cassandra if it doesn't it then creates the keyspace. So does committing
> a new vertex do something else besides creating the CFs?
>

Not really. TitanFactory.open(...) should create the KS and all CFs
(ID, data, locks, config). Adding a vertex is superfluous, I think,
though it doesn't hurt if you blow away the old data before restoring
sstables.

Denis Giovan Marques

Jun 1, 2016, 6:51:12 PM
to Aurelius
I wrote some scripts that may be useful to anyone researching this in the future.
