How to start Titan with HBase?


Todd Leo

Nov 3, 2015, 6:28:21 AM
To: Aurelius

Hi Titan group,

There are multiple conf files, so how do I start Titan via titan.sh?

Let’s say I have already configured titan-hbase.properties, do I start Titan with titan.sh -c hbase start?

I am aware that there is another way to start Titan, using Gremlin console:

gremlin> g = TitanFactory.open('hbase:...') // can't remember the parameter used

but I’m not sure whether Titan and its connection to HBase keep running after I exit the Gremlin console.

Also, does Titan read conf/titan-hbase.properties in either of the ways above?

Any tips are welcome. =D

BR,
Todd Leo

Jason Plurad

Nov 3, 2015, 8:43:48 AM
To: Aurelius
It is all covered here in the docs. Post with additional questions if you have any issues. http://s3.thinkaurelius.com/docs/titan/1.0.0/hbase.html
--
You received this message because you are subscribed to the Google Groups "Aurelius" group.
To unsubscribe from this group and stop receiving emails from it, send an email to aureliusgraph...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/aureliusgraphs/3da37ad6-9f4d-45de-902b-b6214341614a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

David

Nov 4, 2015, 3:12:17 PM
To: Aurelius
Hi Todd,

If you start the gremlin shell using gremlin.sh from a command line, and connect to a Titan graph like this:

gremlin> graph=TitanFactory.open('conf/titan-hbase.properties')
==>standardtitangraph[hbase:[99.999.223.99:2181]]

or like this:

gremlin> graph=TitanFactory.build().set('storage.backend', 'hbase').set('storage.hostname', '99.999.223.99:2181').set('storage.hbase.ext.zookeeper.znode.parent', '/hbase-unsecure').open()
==>standardtitangraph[hbase:[99.999.223.99:2181]]

then Titan "stops" when you leave the gremlin shell, because its main thread of execution was part of the gremlin shell.

To review, the relevant properties for a basic titan-hbase.properties file are:
storage.backend=hbase
# NOTE: this should really be called the zookeeper server hostname:port, and it can be a comma-separated list
storage.hostname=99.999.223.99:2181
storage.hbase.tablename=titandb
# must match the setting in your hbase xml files for your installation or your attempt to connect to hbase will just "hang"
storage.hbase.ext.zookeeper.znode.parent=/hbase-unsecure

I am having issues with the HBase tablename setting shown above in 1.0 - could just be amnesia on my part - but buyer beware on that property spelling.  It could also be broken in the code.  Not sure.
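If you want to double-check which value a properties file really sets for a given key (for example after hand-editing titan-hbase.properties), a small shell helper along these lines might help. This is only a sketch: the function name prop_value is made up for illustration and is not part of Titan, and it only understands simple key=value lines.

```shell
# Hypothetical helper (not part of Titan): print the value of one key
# from a simple Java-style properties file.
# Handles "key=value" and "key = value"; skips "#" comment lines.
prop_value() {
    # $1 = path to the properties file, $2 = key to look up
    awk -v key="$2" '
        /^[ \t]*#/ { next }            # skip comment lines
        index($0, "=") == 0 { next }   # skip lines that have no "="
        {
            k = substr($0, 1, index($0, "=") - 1)
            v = substr($0, index($0, "=") + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim whitespace around the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim whitespace around the value
            if (k == key) { print v; exit }
        }
    ' "$1"
}
```

For example, prop_value conf/titan-hbase.properties storage.hostname should print the zookeeper host:port list that Titan will be given.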

If you start a gremlin shell using gremlin.sh and then supply the :remote connect… command
as shown in Chapter 7, then you are using the gremlin shell as a client to a long-running server process
called Gremlin server, which has to be configured and started before you can use it with Titan.
Gremlin server does not exit when you exit gremlin.sh.

According to the documentation for Titan 1.0.0 here:
http://s3.thinkaurelius.com/docs/titan/1.0.0/server.html
if you use the bin/titan.sh command:
"This step will start Gremlin Server with Cassandra/ES forked into a separate process."

You do not want Cassandra.  You want HBase.  So don't use titan.sh.  It doesn’t do what you want.

Someone pasted a link to Chapter 7:
http://s3.thinkaurelius.com/docs/titan/1.0.0/server.html

But the text in that chapter leaves pretty much everything to the imagination, which is not helpful.

You can review this, but it may not help you with Titan:
http://tinkerpop.incubator.apache.org/docs/3.0.1-incubating/#gremlin-server

I think these are the basics of what you want to do to set up Gremlin server with HBase/Titan:

Edit the conf/titan-hbase.properties file for your environment - save it - and copy it to the
conf/gremlin-server directory under Titan.  Edit the titan-hbase.properties
file you just copied to gremlin-server directory, and add the additional property shown
below (gremlin.graph).

gremlin.graph=com.thinkaurelius.titan.core.TitanFactory
storage.backend=hbase
storage.hostname=<zookeeperserverip:port>,<zookeeperserverip:port>
storage.hbase.tablename = titan
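The copy-and-add-a-property step above could be scripted roughly as follows. This is a sketch under assumptions: the function name prepare_server_props is invented for illustration, and the paths in the usage note are the ones mentioned in this post.

```shell
# Hypothetical helper: copy a Titan properties file into the Gremlin Server
# config directory and make sure the copy declares gremlin.graph.
prepare_server_props() {
    # $1 = source properties file, $2 = destination file
    cp "$1" "$2" || return 1
    # If the copy does not end with a newline, add one so the appended
    # property starts on its own line.
    if [ -n "$(tail -c 1 "$2")" ]; then
        printf '\n' >> "$2"
    fi
    # Append gremlin.graph only when it is not already present.
    if ! grep -q '^gremlin\.graph[ =]' "$2"; then
        printf '%s\n' 'gremlin.graph=com.thinkaurelius.titan.core.TitanFactory' >> "$2"
    fi
}
```

From the Titan directory this would be called as: prepare_server_props conf/titan-hbase.properties conf/gremlin-server/titan-hbase.properties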

Copy the conf/gremlin-server/gremlin-server.yaml file that is shipped with Titan to a back up copy.
You need to change that file, and just in case you mess it up, you want a backup.

Edit the gremlin-server.yaml file.
Change:
host: localhost
to:
host: 0.0.0.0

Change the entry for graphs to be:
graphs: {
  graph: conf/gremlin-server/titan-hbase.properties}

I don't think you need to touch the channelizer for this.
You probably also don’t need to change the host setting, but I do.
The conf/remote.yaml file you will use below should be ok as is.
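The backup and the host change could be done in one go with something like the sketch below. The function name set_server_host is hypothetical, and it assumes the yaml file has a single top-level "host:" line; the graphs entry is still edited by hand.

```shell
# Hypothetical helper: back up a gremlin-server.yaml and rewrite its
# top-level "host:" entry. The graphs entry is left for manual editing.
set_server_host() {
    # $1 = path to gremlin-server.yaml, $2 = new host value (e.g. 0.0.0.0)
    # Keep a backup copy next to the original, as suggested above.
    cp "$1" "$1.bak" || return 1
    # Write to a temporary file first, because "sed -i" is not portable.
    sed "s/^host:.*/host: $2/" "$1.bak" > "$1.tmp" && mv "$1.tmp" "$1"
}
```

For example: set_server_host conf/gremlin-server/gremlin-server.yaml 0.0.0.0 (the original is kept as gremlin-server.yaml.bak).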

At this point, do this from a command prompt:
bin/gremlin-server.sh

which should start the gremlin server.

From another shell, do this:
bin/gremlin.sh
Then, because we are using gremlin.sh as a client to the gremlin server, do this:
:remote connect tinkerpop.server conf/remote.yaml

I can’t get the example in Chapter 7 to work as it is shown.  But here is
a command line session I did that works for me:

gremlin> :remote connect tinkerpop.server conf/remote.yaml

==>Connected - localhost/127.0.0.1:8182

gremlin> :> graph.addVertex("name", "david")

==>v[4200]

gremlin> :> graph.tx().commit()
gremlin> :> g=graph.traversal();  g.V().values('name')

 ==>david

If you want to access this same graph via gremlin.sh - but not as a client, i.e. not going through gremlin server,
just use the graph=TitanFactory.open('conf/titan-hbase.properties') command at the gremlin.sh prompt.
You should be able to see the same data in your graph as through the gremlin server.

Todd Leo

Nov 30, 2015, 5:32:51 AM
To: Aurelius

Hi David,

I followed your steps, but I still have trouble starting gremlin-server.sh. I found no storage.hbase.ext.zookeeper.znode.parent property specified in hdfs-site.xml, and according to the manual the default value of this property is /hbase. I therefore set storage.hbase.ext.zookeeper.znode.parent=/hbase in titan-hbase.properties, but gremlin-server failed to start after dozens of attempts:

746  [main] INFO  org.apache.zookeeper.ZooKeeper  - Initiating client connection, connectString=s3:2181 sessionTimeout=90000 watcher=hconnection-0x1e16c0aa, quorum=s3:2181, baseZNode=/hbase
758  [main] INFO  org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper  - Process identifier=hconnection-0x1e16c0aa connecting to ZooKeeper ensemble=s3:2181
760  [main-SendThread(s3.intra.testcluster.com:2181)] INFO  org.apache.zookeeper.ClientCnxn  - Opening socket connection to server s3.intra.testcluster.com/10.255.5.103:2181. Will not attempt to authenticate using SASL (unknown error)
765  [main-SendThread(s3.intra.testcluster.com:2181)] INFO  org.apache.zookeeper.ClientCnxn  - Socket connection established to s3.intra.testcluster.com/10.255.5.103:2181, initiating session
781  [main-SendThread(s3.intra.testcluster.com:2181)] INFO  org.apache.zookeeper.ClientCnxn  - Session establishment complete on server s3.intra.testcluster.com/10.255.5.103:2181, sessionid = 0x3513863d8f663f2, negotiated timeout = 40000
861  [main] INFO  org.apache.zookeeper.ZooKeeper  - Initiating client connection, connectString=s3:2181 sessionTimeout=90000 watcher=hconnection-0xd771cc9, quorum=s3:2181, baseZNode=/hbase
862  [main] INFO  org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper  - Process identifier=hconnection-0xd771cc9 connecting to ZooKeeper ensemble=s3:2181
863  [main-SendThread(s3.intra.testcluster.com:2181)] INFO  org.apache.zookeeper.ClientCnxn  - Opening socket connection to server s3.intra.testcluster.com/10.255.5.103:2181. Will not attempt to authenticate using SASL (unknown error)
863  [main-SendThread(s3.intra.testcluster.com:2181)] INFO  org.apache.zookeeper.ClientCnxn  - Socket connection established to s3.intra.testcluster.com/10.255.5.103:2181, initiating session
865  [main-SendThread(s3.intra.testcluster.com:2181)] INFO  org.apache.zookeeper.ClientCnxn  - Session establishment complete on server s3.intra.testcluster.com/10.255.5.103:2181, sessionid = 0x3513863d8f663f3, negotiated timeout = 40000
868  [main] INFO  org.apache.zookeeper.ZooKeeper  - Initiating client connection, connectString=s3:2181 sessionTimeout=90000 watcher=catalogtracker-on-hconnection-0xd771cc9, quorum=ksdc-s3:2181, baseZNode=/hbase
869  [main] INFO  org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper  - Process identifier=catalogtracker-on-hconnection-0xd771cc9 connecting to ZooKeeper ensemble=s3:2181
869  [main-SendThread(s3.intra.testcluster.com:2181)] INFO  org.apache.zookeeper.ClientCnxn  - Opening socket connection to server s3.intra.testcluster.com/10.255.5.103:2181. Will not attempt to authenticate using SASL (unknown error)
870  [main-SendThread(s3.intra.testcluster.com:2181)] INFO  org.apache.zookeeper.ClientCnxn  - Socket connection established to s3.intra.testcluster.com/10.255.5.103:2181, initiating session
871  [main-SendThread(s3.intra.testcluster.com:2181)] INFO  org.apache.zookeeper.ClientCnxn  - Session establishment complete on server s3.intra.testcluster.com/10.255.5.103:2181, sessionid = 0x3513863d8f663f4, negotiated timeout = 40000
1229 [main] INFO  org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation  - Closing zookeeper sessionid=0x3513863d8f663f3
1232 [main] INFO  org.apache.zookeeper.ZooKeeper  - Session: 0x3513863d8f663f3 closed
1232 [main-EventThread] INFO  org.apache.zookeeper.ClientCnxn  - EventThread shut down
1334 [main] INFO  org.apache.zookeeper.ZooKeeper  - Session: 0x3513863d8f663f4 closed
1334 [main-EventThread] INFO  org.apache.zookeeper.ClientCnxn  - EventThread shut down

...
8956 [gremlin-server-boss-1] ERROR org.apache.tinkerpop.gremlin.server.GremlinServer  - Gremlin Server was unable to start and will now begin shutdown: Could not bind to 0.0.0.0 and 8182 - perhaps something else is bound to that address.

I also tried your setting, storage.hbase.ext.zookeeper.znode.parent=/hbase-unsecure. With it the connection can be established, but the command gremlin> :remote connect tinkerpop.server conf/remote.yaml hangs, with neither results nor errors. I tried host: localhost instead of host: 0.0.0.0, but that doesn’t seem to help.

BR,
Todd

David

Dec 1, 2015, 8:05:11 AM
To: Aurelius
Taking this a step at a time...

Find where zkCli.sh lives, run it, and do the following after you are in the zookeeper shell:

[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, hbase-unsecure, rmstore]

What do you see as your entry for hbase?

That should be the setting you use for storage.hbase.ext.zookeeper.znode.parent.
Based on this example, it would be:
storage.hbase.ext.zookeeper.znode.parent=/hbase-unsecure
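
Turning the ls / output into the property line can be sketched in shell as below. The function name znode_parent_property is made up, and it assumes the HBase znode is the first child whose name starts with "hbase", which may not hold on every cluster.

```shell
# Hypothetical helper: read the bracketed child list that "ls /" prints in
# zkCli.sh on stdin and print the matching Titan property line.
znode_parent_property() {
    # 1. drop brackets and spaces
    # 2. put one znode name per line
    # 3. keep names starting with "hbase" (assumption, see above)
    # 4. take the first match and prefix it with the property name
    sed 's/[][ ]//g' |
        tr ',' '\n' |
        grep '^hbase' |
        head -n 1 |
        sed 's|^|storage.hbase.ext.zookeeper.znode.parent=/|'
}
```

For example, echo '[zookeeper, hbase-unsecure, rmstore]' | znode_parent_property prints the same property line as the example above; if no hbase entry is found, it prints nothing.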



SLiZn Liu

Dec 1, 2015, 9:36:11 PM
To: aureliu...@googlegroups.com

Hi David,
I got the following when I executed ls / in zkCli.sh:

[mesos, zookeeper, swarm, hbase, hadoop-ha]

Does this imply the value of the setting should be hbase?


BR,
Todd Leo



David

Dec 2, 2015, 4:26:02 PM
To: Aurelius
Hey Todd,

Yes: hbase, so the property value would be /hbase.

Next, I suggest writing a small Titan program, running it from the command line,
and connecting directly to HBase in your environment - forget
about gremlin server, for the moment.

Can you get that to work?

----
By the way, you may want to run:
ps -ef | grep gremlin
on the machine where you are trying to start a gremlin server.
It looks like one might already be running that you are not aware of.
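
To narrow the ps output down to just the process IDs, something like this sketch could be used. The function name gremlin_server_pids is invented, and it assumes the server shows up in the process list under the class name seen in the log above (org.apache.tinkerpop.gremlin.server.GremlinServer).

```shell
# Hypothetical helper: read "ps -ef" style output on stdin and print the PID
# of every line that mentions the Gremlin Server main class.
gremlin_server_pids() {
    # Column 2 of "ps -ef" output is the PID. The backslashes in the pattern
    # also keep this awk command from matching its own entry in the list.
    awk '/gremlin\.server\.GremlinServer/ { print $2 }'
}
```

Usage: ps -ef | gremlin_server_pids
If that prints a PID, a server is probably still running and holding port 8182.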