HBase with Django

Sanjay Bhosale

unread,

Sep 27, 2013, 5:07:37 AM9/27/13

to chenn...@googlegroups.com

Is it possible to fetch data in django from HBase?

If yes, please let me know how to do this.

Ashwanth Kumar

unread,

Sep 27, 2013, 5:19:28 AM9/27/13

to chenn...@googlegroups.com

HBase has inbulit thrift server. So any thrift supported languages (http://thrift.apache.org/docs/features/) can access them.

In your case, since you are using Python, evaluate the following options

- https://github.com/simplegeo/python-hbase-thrift

- https://github.com/adamhadani/HBasta

- https://github.com/mozilla/socorro/blob/master/socorro/external/hbase/hbase_client.py

This library works using HBase REST API

- https://pypi.python.org/pypi/starbase

I have heard good reviews about this library (thought not personally used it)

- http://happybase.readthedocs.org/en/latest/

On Fri, Sep 27, 2013 at 2:37 PM, Sanjay Bhosale <yourssa...@gmail.com> wrote:

Is it possible to fetch data in django from HBase?
If yes, please let me know how to do this.

--
You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--

Ashwanth Kumar / ashwanthkumar.in

Sanjay Bhosale

unread,

Sep 27, 2013, 9:17:45 AM9/27/13

to chenn...@googlegroups.com

thanks for your reply. I have another question how can i transfer data from pig to hive?

Ashwanth Kumar

unread,

Sep 27, 2013, 11:07:24 AM9/27/13

to chenn...@googlegroups.com

I am sorry, haven't used Pig much. But I am confused, why do you want to load data from Pig to hive? What is that you are trying to achieve?

--

You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Sanjay Bhosale

unread,

Sep 30, 2013, 1:02:59 AM9/30/13

to chenn...@googlegroups.com

What i need to do is : In first step i need to store data in some database/hbase (which is successful).

then in the second step when new data comes in i also need two store that data into database/hbase.

in third step i need to do processing on two data by fetching them from database/hbase and store resulting data into database/hbase.

second and third steps are dynamic and run when new data comes in.

I have written Pig script to do first step and which is almost similar in the second stage also but problem of fetching structured data comes in pig.

so how can i solve this problem?

On Friday, 27 September 2013 14:37:37 UTC+5:30, Sanjay Bhosale wrote:

Ashwanth Kumar

unread,

Sep 30, 2013, 3:55:10 AM9/30/13

to chenn...@googlegroups.com

Could you tell us a bit more about the data? What is the structure of the data, and how is your schema to store the data onto HBase?

HBase is not structured either so, if you are planning to use HBase, you are well of using something like Pig / Hive, but using both is a over-kill.

If you want to store structured data into HBase try something like Kiji, again it is it not a purely structured system, but a complex Avro model inside a HBase Cell.

[1] - http://kiji.org/getstarted/#Quick_Start_Guide

--

You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Sanjay Bhosale

unread,

Oct 1, 2013, 6:52:34 AM10/1/13

to chenn...@googlegroups.com

Now i am getting following error when trying load data into HBase through my pig script

script:

file = LOAD '/user/hadoop/out.csv' USING PigStorage(',') AS (File_Name, File_Description);

file1 = STORE file into 'hbase://File_Info_Table' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('File_Name File_Description')

Error :

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/apache/zookeeper/ZooKeeper

In Log File :

ERROR 2998: Unhandled internal error. org/apache/zookeeper/ZooKeeper

java.lang.NoClassDefFoundError: org/apache/zookeeper/ZooKeeper

How to solve this?

On Friday, 27 September 2013 14:37:37 UTC+5:30, Sanjay Bhosale wrote:

Ashwanth Kumar

unread,

Oct 1, 2013, 6:53:23 AM10/1/13

to chenn...@googlegroups.com

Check if the HBase libs on the PIG classpath. Seems like it doesn't have access to ZooKeeper classes.

--

You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Sanjay Bhosale

unread,

Oct 1, 2013, 7:24:04 AM10/1/13

to chenn...@googlegroups.com

Still getting the same error after editing pig-env.sh for PIG_CLASSPATH.

As well i haven't installed zookeeper, i have only installed hbase.

So is their a need to install zookeeper also?

On Friday, 27 September 2013 14:37:37 UTC+5:30, Sanjay Bhosale wrote:

Ashwanth Kumar

unread,

Oct 1, 2013, 7:27:10 AM10/1/13

to chenn...@googlegroups.com

HBase has ZK inside it. Given that HBase is running, it means the you have ZK running as well.

So can you share the change you made to PIG_CLASSPATH ?

--

You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Sanjay Bhosale

unread,

Oct 1, 2013, 7:30:43 AM10/1/13

to chenn...@googlegroups.com

file name : pig-env.sh

export PIG_HADOOP_VERSION=0.20.2

export PIG_CLASSPATH=$HADOOP_HOME/conf/

export PIG_CLASSPATH=$PIG_CLASSPATH:/home/hadoop/hbase-0.94.12/lib/

I have just added the last line

On Friday, 27 September 2013 14:37:37 UTC+5:30, Sanjay Bhosale wrote:

Ashwanth Kumar

unread,

Oct 1, 2013, 7:33:26 AM10/1/13

to chenn...@googlegroups.com

Try changing the last line to as below and try

export PIG_CLASSPATH=$PIG_CLASSPATH:/home/hadoop/hbase-0.94.12/lib/

export PIG_CLASSPATH=$PIG_CLASSPATH:/home/hadoop/hbase-0.94.12/lib/*

On Friday, 27 September 2013 14:37:37 UTC+5:30, Sanjay Bhosale wrote:

Is it possible to fetch data in django from HBase?
If yes, please let me know how to do this.

--

You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Sanjay Bhosale

unread,

Oct 1, 2013, 8:04:59 AM10/1/13

to chenn...@googlegroups.com

After correcting PIG_CLASSPATH, i stuck while running job

Below are some output after running commands said previously

2013-10-01 17:08:47,383 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/usr/local/hadoop/libexec/../lib/native/Linux-amd64-64
2013-10-01 17:08:47,383 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
2013-10-01 17:08:47,383 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
2013-10-01 17:08:47,383 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
2013-10-01 17:08:47,383 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
2013-10-01 17:08:47,383 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:os.version=3.2.0-29-generic
2013-10-01 17:08:47,383 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:user.name=hadoop
2013-10-01 17:08:47,383 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:user.home=/home/hadoop
2013-10-01 17:08:47,383 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/home/hadoop/hbase-0.94.12/bin
2013-10-01 17:08:47,383 [main] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection
2013-10-01 17:08:47,396 [main] INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper - The identifier of this process is 18516@hadoop-master
2013-10-01 17:08:47,398 [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2013-10-01 17:08:47,398 [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to localhost/127.0.0.1:2181, initiating session
2013-10-01 17:08:47,436 [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x14173cff5850004, negotiated timeout = 180000
2013-10-01 17:08:47,628 [main] INFO org.apache.hadoop.hbase.mapreduce.TableOutputFormat - Created table instance for File_Info_Table

And the job stuck in the pending state.

On Friday, 27 September 2013 14:37:37 UTC+5:30, Sanjay Bhosale wrote:

Ashwanth Kumar

unread,

Oct 1, 2013, 8:21:00 AM10/1/13

to chenn...@googlegroups.com

Quick things

- Is the Job submitted to JT? (Just in case)

- What does the RS Logs say?

--

You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Sanjay Bhosale

unread,

Oct 1, 2013, 8:25:29 AM10/1/13

to chenn...@googlegroups.com

Following are the last lines i can see:

2013-10-01 17:08:51,419 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: file[1,7],file[-1,-1] C: R:
2013-10-01 17:08:51,419 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://hadoop-master:50030/jobdetails.jsp?jobid=job_201309191024_0661

and what are RS logs?Where they are kept?Is it the problem of connection?

On Friday, 27 September 2013 14:37:37 UTC+5:30, Sanjay Bhosale wrote:

Ashwanth Kumar

unread,

Oct 1, 2013, 8:26:23 AM10/1/13

to chenn...@googlegroups.com

RS Logs are Region Server Logs.

The last line says that the job has been submitted to your JobTracker. You can use that link to open the status of the job.

--

You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Sanjay Bhosale

unread,

Oct 1, 2013, 8:35:14 AM10/1/13

to chenn...@googlegroups.com

status of the job remains pending and after some time interval gets killed.

and in the region server log i can see following contents:

2013-10-01 17:08:42,339 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open 1 region(s)
2013-10-01 17:08:42,339 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open region: File_Info_Table,,1380620981711.623e18dfe6cc856800bcf169a5892910.
2013-10-01 17:08:42,348 DEBUG org.apache.hadoop.hbase.backup.HFileArchiver: Finished archiving file from: class org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile, file:hdfs://hadoop-master:54310/hbase/.META./1028785192/info/e0ccbfdc73964a2fa1e88aab4ac001c2, to: hdfs://hadoop-master:54310/hbase/.archive/.META./1028785192/info/e0ccbfdc73964a2fa1e88aab4ac001c2
2013-10-01 17:08:42,348 DEBUG org.apache.hadoop.hbase.backup.HFileArchiver: Archiving:class org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile, file:hdfs://hadoop-master:54310/hbase/.META./1028785192/info/b0681c378a754fe487e500dd543a8f4d
2013-10-01 17:08:42,348 DEBUG org.apache.hadoop.hbase.backup.HFileArchiver: No existing file in archive for:hdfs://hadoop-master:54310/hbase/.archive/.META./1028785192/info/b0681c378a754fe487e500dd543a8f4d, free to archive original file.
2013-10-01 17:08:42,362 DEBUG org.apache.hadoop.hbase.backup.HFileArchiver: Finished archiving file from: class org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile, file:hdfs://hadoop-master:54310/hbase/.META./1028785192/info/b0681c378a754fe487e500dd543a8f4d, to: hdfs://hadoop-master:54310/hbase/.archive/.META./1028785192/info/b0681c378a754fe487e500dd543a8f4d
2013-10-01 17:08:42,362 INFO org.apache.hadoop.hbase.regionserver.Store: Completed major compaction of 4 file(s) in info of .META.,,1.1028785192 into e60bf21d8a694ca38aaf4885e09f4839, size=2.5k; total size for store is 2.5k
2013-10-01 17:08:42,363 INFO org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest: completed compaction: regionName=.META.,,1.1028785192, storeName=info, fileCount=4, fileSize=4.8k, priority=3, time=1649421149430799; duration=0sec
2013-10-01 17:08:42,363 DEBUG org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest: CompactSplitThread status: compaction_queue=(0:0), split_queue=0
2013-10-01 17:08:42,368 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x14173cff5850001 Attempting to transition node 623e18dfe6cc856800bcf169a5892910 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2013-10-01 17:08:42,386 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x14173cff5850001 Successfully transitioned node 623e18dfe6cc856800bcf169a5892910 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2013-10-01 17:08:42,387 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Opening region: {NAME => 'File_Info_Table,,1380620981711.623e18dfe6cc856800bcf169a5892910.', STARTKEY => '', ENDKEY => '', ENCODED => 623e18dfe6cc856800bcf169a5892910,}
2013-10-01 17:08:42,387 INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor config now ...
2013-10-01 17:08:42,388 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated File_Info_Table,,1380620981711.623e18dfe6cc856800bcf169a5892910.
2013-10-01 17:08:42,391 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store Company_Name
2013-10-01 17:08:42,391 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
2013-10-01 17:08:42,397 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store File_Description
2013-10-01 17:08:42,398 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
2013-10-01 17:08:42,398 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store File_Name
2013-10-01 17:08:42,399 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
2013-10-01 17:08:42,400 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store File_Version
2013-10-01 17:08:42,400 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
2013-10-01 17:08:42,401 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store Internal_Name
2013-10-01 17:08:42,401 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
2013-10-01 17:08:42,403 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store Legal_Copyright
2013-10-01 17:08:42,403 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
2013-10-01 17:08:42,410 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store Original_File_Name
2013-10-01 17:08:42,410 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
2013-10-01 17:08:42,412 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store Product_Name
2013-10-01 17:08:42,412 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
2013-10-01 17:08:42,413 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store Product_Version
2013-10-01 17:08:42,413 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
2013-10-01 17:08:42,414 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store Signature
2013-10-01 17:08:42,414 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
2013-10-01 17:08:42,416 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined File_Info_Table,,1380620981711.623e18dfe6cc856800bcf169a5892910.; next sequenceid=1
2013-10-01 17:08:42,416 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x14173cff5850001 Attempting to transition node 623e18dfe6cc856800bcf169a5892910 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2013-10-01 17:08:42,430 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x14173cff5850001 Successfully transitioned node 623e18dfe6cc856800bcf169a5892910 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2013-10-01 17:08:42,431 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Post open deploy tasks for region=File_Info_Table,,1380620981711.623e18dfe6cc856800bcf169a5892910., daughter=false
2013-10-01 17:08:42,433 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Looked up root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@739bf34a; serverName=hadoop-master,60020,1380627518540
2013-10-01 17:08:42,436 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Cached location for .META.,,1.1028785192 is hadoop-master:60020
2013-10-01 17:08:42,455 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 1 replicas but expecting no less than 3 replicas. Requesting close of hlog.
2013-10-01 17:08:42,455 DEBUG org.apache.hadoop.hbase.regionserver.LogRoller: HLog roll requested
2013-10-01 17:08:42,456 INFO org.apache.hadoop.hbase.util.FSUtils: FileSystem doesn't support getDefaultReplication
2013-10-01 17:08:42,456 INFO org.apache.hadoop.hbase.util.FSUtils: FileSystem doesn't support getDefaultBlockSize
2013-10-01 17:08:42,456 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Updated row File_Info_Table,,1380620981711.623e18dfe6cc856800bcf169a5892910. with server=hadoop-master,60020,1380627518540
2013-10-01 17:08:42,456 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Done with post open deploy task for region=File_Info_Table,,1380620981711.623e18dfe6cc856800bcf169a5892910., daughter=false
2013-10-01 17:08:42,456 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x14173cff5850001 Attempting to transition node 623e18dfe6cc856800bcf169a5892910 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
2013-10-01 17:08:42,469 DEBUG org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: using new createWriter -- HADOOP-6840
2013-10-01 17:08:42,469 DEBUG org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Path=hdfs://hadoop-master:54310/hbase/.logs/hadoop-master,60020,1380627518540/hadoop-master%2C60020%2C1380627518540.1380627522455, syncFs=true, hflush=false, compression=false
2013-10-01 17:08:42,484 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x14173cff5850001 Successfully transitioned node 623e18dfe6cc856800bcf169a5892910 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
2013-10-01 17:08:42,484 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: region transitioned to opened in zookeeper: {NAME => 'File_Info_Table,,1380620981711.623e18dfe6cc856800bcf169a5892910.', STARTKEY => '', ENDKEY => '', ENCODED => 623e18dfe6cc856800bcf169a5892910,}, server: hadoop-master,60020,1380627518540
2013-10-01 17:08:42,484 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opened File_Info_Table,,1380620981711.623e18dfe6cc856800bcf169a5892910. on server:hadoop-master,60020,1380627518540
2013-10-01 17:08:42,498 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/hadoop-master,60020,1380627518540/hadoop-master%2C60020%2C1380627518540.1380627522187, entries=1, filesize=422. for /hbase/.logs/hadoop-master,60020,1380627518540/hadoop-master%2C60020%2C1380627518540.1380627522455
2013-10-01 17:13:38,610 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=75, hits=68, hitRatio=90.66%, , cachingAccesses=71, cachingHits=68, cachingHitsRatio=95.77%, , evictions=0, evicted=0, evictedPerRun=NaN
2013-10-01 17:18:38,609 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=76, hits=69, hitRatio=90.78%, , cachingAccesses=72, cachingHits=69, cachingHitsRatio=95.83%, , evictions=0, evicted=0, evictedPerRun=NaN
2013-10-01 17:23:38,609 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=77, hits=70, hitRatio=90.90%, , cachingAccesses=73, cachingHits=70, cachingHitsRatio=95.89%, , evictions=0, evicted=0, evictedPerRun=NaN
2013-10-01 17:28:38,609 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=78, hits=71, hitRatio=91.02%, , cachingAccesses=74, cachingHits=71, cachingHitsRatio=95.94%, , evictions=0, evicted=0, evictedPerRun=NaN
2013-10-01 17:33:38,609 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=79, hits=72, hitRatio=91.13%, , cachingAccesses=75, cachingHits=72, cachingHitsRatio=95.99%, , evictions=0, evicted=0, evictedPerRun=NaN
2013-10-01 17:38:38,609 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=80, hits=73, hitRatio=91.25%, , cachingAccesses=76, cachingHits=73, cachingHitsRatio=96.05%, , evictions=0, evicted=0, evictedPerRun=NaN
2013-10-01 17:43:38,609 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=81, hits=74, hitRatio=91.35%, , cachingAccesses=77, cachingHits=74, cachingHitsRatio=96.10%, , evictions=0, evicted=0, evictedPerRun=NaN
2013-10-01 17:48:38,609 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=89, hits=82, hitRatio=92.13%, , cachingAccesses=85, cachingHits=82, cachingHitsRatio=96.47%, , evictions=0, evicted=0, evictedPerRun=NaN
2013-10-01 17:53:38,609 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=90, hits=83, hitRatio=92.22%, , cachingAccesses=86, cachingHits=83, cachingHitsRatio=96.51%, , evictions=0, evicted=0, evictedPerRun=NaN
2013-10-01 17:58:38,609 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=92, hits=85, hitRatio=92.39%, , cachingAccesses=88, cachingHits=85, cachingHitsRatio=96.59%, , evictions=0, evicted=0, evictedPerRun=NaN

On Friday, 27 September 2013 14:37:37 UTC+5:30, Sanjay Bhosale wrote:

Sanjay Bhosale

unread,

Oct 1, 2013, 8:57:48 AM10/1/13

to chenn...@googlegroups.com

Also in zookeeper log i can see some exceptions as

EndOfStreamException: Unable to read additional data from client sessionid 0x14173cff5850004, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:722)

Plz tell me how to solve this issue.

On Friday, 27 September 2013 14:37:37 UTC+5:30, Sanjay Bhosale wrote:

Ashwanth Kumar

unread,

Oct 1, 2013, 9:05:49 AM10/1/13

to chenn...@googlegroups.com

I will not be concerned so much by the Error, unless I am using ZK manually in my application. If its respective to HBase, then the client will take care of auto connecting after specific (incremental) intervals.

--

You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Sanjay Bhosale

unread,

Oct 1, 2013, 9:10:12 AM10/1/13

to chenn...@googlegroups.com

Thanks for your help. In future if you get answer fro this kind of problem please let me know.

On Friday, 27 September 2013 14:37:37 UTC+5:30, Sanjay Bhosale wrote:

Sanjay Bhosale

unread,

Oct 3, 2013, 2:44:40 AM10/3/13

to chenn...@googlegroups.com

After searching on net i came to understanding that

file = LOAD '/user/hadoop/out.csv' USING PigStorage(',') AS (File_Name, File_Version, Company_Name, Product_Name);

file1 = STORE file into 'hbase://File_Info_Table' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('File_Name:File_Name File_Version:File_Version Company_Name:Company_Name Product_Name:Product_Name');

1) Above code works in pig's local mode and dose not works in distributed mode.

2) Above doe stores data in completely different order. It stores data of one column to another column

How to solve this problem?

On Friday, 27 September 2013 14:37:37 UTC+5:30, Sanjay Bhosale wrote:

Sanjay Bhosale

unread,

Oct 3, 2013, 3:43:28 AM10/3/13

to chenn...@googlegroups.com

Solved 2) problem. Issue of only 1) problem

On Friday, 27 September 2013 14:37:37 UTC+5:30, Sanjay Bhosale wrote:

Ashwanth Kumar

unread,

Oct 3, 2013, 4:22:28 AM10/3/13

to chenn...@googlegroups.com

What error do you get?

--

You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Sanjay Bhosale

unread,

Oct 3, 2013, 5:01:56 AM10/3/13

to chenn...@googlegroups.com

Basically the problem is the same it stops in the pending state. and no error is displayed.

Might i need to configure to work in distributed mode.

But what i need to configure, i don't know.

On Friday, 27 September 2013 14:37:37 UTC+5:30, Sanjay Bhosale wrote:

Ashwanth Kumar

unread,

Oct 3, 2013, 5:05:16 AM10/3/13

to chenn...@googlegroups.com

Again, has the job been submitted? Did you check the JobTracker for submitted job?

There is no configuration change to make it work on local mode / distributed mode. It connects to the JT configured using mapred.job.tracker. Rest hadoop framework takes care.

Sometimes, if the input file you are trying to process is very big, it might take some time to calculate the splits.

--

You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Sanjay Bhosale

unread,

Oct 3, 2013, 5:26:27 AM10/3/13

to chenn...@googlegroups.com

Job is submitted but remains in pending state and gets killer after some time with no execption.

On Friday, 27 September 2013 14:37:37 UTC+5:30, Sanjay Bhosale wrote:

Ashwanth Kumar

unread,

Oct 3, 2013, 5:28:10 AM10/3/13

to chenn...@googlegroups.com

Your pig logs will not say anything about the job logs generally. Check the submitted job on the job tracker to see if it has failed / succeeded.

--

You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

swapnil joshi

unread,

Oct 3, 2013, 5:31:23 AM10/3/13

to chenn...@googlegroups.com

Hi Sanjay,

You have to check JobTracker or TaskTracker log file which contain some information that can helpful to identify problem.

On Thu, Oct 3, 2013 at 2:56 PM, Sanjay Bhosale <yourssa...@gmail.com> wrote:

--

You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--
Regards,
Swapnil K. Joshi

Sanjay Bhosale

unread,

Oct 3, 2013, 6:46:31 AM10/3/13

to chenn...@googlegroups.com

 Job Setup failed => Error: java.lang.ClassNotFoundException: com.google.protobuf.Message

swapnil joshi

unread,

Oct 3, 2013, 7:12:00 AM10/3/13

to chenn...@googlegroups.com

Hi Sanjay

I don't know. what are your doing in your MapReduce job?

But from your previous log information.

I think, It's dependency issue Just download this jar -
http://grepcode.com/snapshot/repo1.maven.org/maven2/com.google.protobuf/protobuf-java/2.5.0/

Before that you have to see log detail:

What is cause of exception is it occur from your Java code / from Hadoop-xx.xx

All The Best

--

You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Sanjay Bhosale

unread,

Oct 3, 2013, 7:20:23 AM10/3/13

to chenn...@googlegroups.com

Hi Swapnil,

I am not using my own jar. I have just written Pig script which can load the csv file into HBase and can also able to generate csv file out of HBase.

I have 2 nodes in my configuration. In one node i have configured environment to include jar files from the lib folder of the hbase and on other node i didn't.

Is that creating problem? In local Pig script working correctly but it fails after waiting for somewhere around 1 hr.

swapnil joshi

unread,

Oct 3, 2013, 7:23:24 AM10/3/13

to chenn...@googlegroups.com

Hi Sanjay,

Can you tell me your hadoop, HBase Version?

--

You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Sanjay Bhosale

unread,

Oct 3, 2013, 7:28:30 AM10/3/13

to chenn...@googlegroups.com

Hadoop version : 1.0.3

HBase version : 0.94.12

Ashwanth Kumar

unread,

Oct 3, 2013, 7:30:19 AM10/3/13

to chenn...@googlegroups.com

Ideally you don't have to setup the paths, since Pig is more like a client level program. It creates a fat jar (called job.jar) and sends it along. Can you try setting up the paths on all the boxes and try?

--

You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--

Ashwanth Kumar / ashwanthkumar.in

Sanjay Bhosale

unread,

Oct 3, 2013, 7:36:39 AM10/3/13

to chenn...@googlegroups.com

yes i tried but no success.

swapnil joshi

unread,

Oct 3, 2013, 7:37:24 AM10/3/13

to chenn...@googlegroups.com

Hi Sanjay,

It's unsolved issue: (issue status: Open)
https://issues.apache.org/jira/browse/PIG-3285

For you problem just visit:
http://hortonworks.com/community/forums/topic/noclassdeffounderror-orgapachehadoophbasehbaseconfiguration/

--

You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--
Regards,
Swapnil K. Joshi

Sanjay Bhosale

unread,

Oct 3, 2013, 8:36:37 AM10/3/13

to chenn...@googlegroups.com

Thanks to all.

Finally solved using -Dpig.additional.jars=/home/hadoop/hbase-0.94.12/lib/protobuf-java-2.4.0a.jar

Thanks Ashwanth, Swapnil.

swapnil joshi

unread,

Oct 3, 2013, 8:39:25 AM10/3/13

to chenn...@googlegroups.com

Wish you all the best Friend :)

--

You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Selva Kumar

unread,

Oct 6, 2013, 5:19:46 AM10/6/13

to chenn...@googlegroups.com

Hi senthil,

Could you Please share that yesterday's Hadoop 2 presentation?.

krishna chaitanya

unread,

Aug 5, 2014, 7:40:38 AM8/5/14

to chenn...@googlegroups.com

Hi Sanjay

I am too facing almost the same issue as yours. But, in my case i won't get any MR job error log too.

.mapReduceLayer.MapReduceLauncher - 0% complete

2014-08-06 01:10:53,098 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.

2014-08-06 01:10:53,098 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1407265692382_0002 has failed! Stop running all dependent jobs

I have added HBASE_JARS, ZOOKEEPER_JARS to HADOOP_CLASSPATH and even i can see all the required jars including protobuf*.jar in the pig startup logs.

Since i am using HDP 2.1 setup, i am restricted to use PIG 0.12.0 version. The code runs fine in local mode.

Am i missing out anything ?

Regards

Krishna

Subrata Biswas

unread,

Aug 6, 2014, 3:57:32 AM8/6/14

to chenn...@googlegroups.com

Hi Bhosale,

As per your requirement I will look for below architecture/design-

Instead having it internal table for Hbase make it External table.

Now you can easily switch between Hive and PIG., but as per your description in below mail chain, it should be fine.

Not sure if it works for you

Regards

Subrata

On Fri, Sep 27, 2013 at 6:47 PM, Sanjay Bhosale <yourssa...@gmail.com> wrote:

thanks for your reply. I have another question how can i transfer data from pig to hive?

On Friday, 27 September 2013 14:37:37 UTC+5:30, Sanjay Bhosale wrote:

Is it possible to fetch data in django from HBase?
If yes, please let me know how to do this.

--

Subrata Biswas

unread,

Aug 6, 2014, 4:02:01 AM8/6/14

to chenn...@googlegroups.com

Hi Ashwant / Senthil,

Do you know any one having HANDS ON with below expertise-

Data Analytics using PIG/Hive/Java MR also can write UDF for Hive and PIG.

Importing data using Sqoop or by any means to Hadoop environment from different source of data (Mostly from different RDBMS)

Having a requirement, one of my friend is looking for his team.

Regards

Subrata

On Fri, Sep 27, 2013 at 8:37 PM, Ashwanth Kumar <ashwan...@googlemail.com> wrote:

I am sorry, haven't used Pig much. But I am confused, why do you want to load data from Pig to hive? What is that you are trying to achieve?

On Fri, Sep 27, 2013 at 6:47 PM, Sanjay Bhosale <yourssa...@gmail.com> wrote:

thanks for your reply. I have another question how can i transfer data from pig to hive?

On Friday, 27 September 2013 14:37:37 UTC+5:30, Sanjay Bhosale wrote:

Is it possible to fetch data in django from HBase?
If yes, please let me know how to do this.

--
You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--

Ashwanth Kumar / ashwanthkumar.in

Reply all

Reply to author

Forward