HBase with Django


Sanjay Bhosale

Sep 27, 2013, 5:07:37 AM
to chenn...@googlegroups.com
Is it possible to fetch data from HBase in Django?
If yes, please let me know how to do this.

Ashwanth Kumar

Sep 27, 2013, 5:19:28 AM
to chenn...@googlegroups.com
HBase has an inbuilt Thrift server, so any Thrift-supported language (http://thrift.apache.org/docs/features/) can access it.
In your case, since you are using Python, evaluate the following options:

This library works using the HBase REST API.

I have heard good reviews about this library (though I have not personally used it).
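[Editor's note: the library links above did not survive the archive. As an illustration only, here is a rough sketch of reading an HBase row from Python (for example, inside a Django view) using the third-party happybase client, which talks to the Thrift server mentioned above. The table name, column names, and host are hypothetical.]

```python
def decode_row(raw_row):
    """Convert happybase's {b'family:qualifier': b'value'} row dict
    into a plain dict of str -> str."""
    return {key.decode("utf-8"): value.decode("utf-8")
            for key, value in raw_row.items()}

def fetch_file_info(row_key, host="localhost"):
    """Fetch one row from a hypothetical File_Info_Table via HBase Thrift."""
    import happybase  # third-party client: pip install happybase
    connection = happybase.Connection(host)  # Thrift server, port 9090 by default
    try:
        table = connection.table("File_Info_Table")
        return decode_row(table.row(row_key.encode("utf-8")))
    finally:
        connection.close()

# The pure helper can be exercised without a running cluster:
sample = {b"info:File_Name": b"out.csv", b"info:File_Description": b"sample file"}
print(decode_row(sample))
```

decode_row is split out so it can be tested without a cluster; fetch_file_info itself needs the HBase Thrift server running (started with `hbase thrift start`).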




On Fri, Sep 27, 2013 at 2:37 PM, Sanjay Bhosale <yourssa...@gmail.com> wrote:
Is it possible to fetch data from HBase in Django?
If yes, please let me know how to do this.

--
You received this message because you are subscribed to the Google Groups "Hadoop Users Group (HUG) Chennai" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chennaihug+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.



--

Ashwanth Kumar / ashwanthkumar.in

Sanjay Bhosale

Sep 27, 2013, 9:17:45 AM
to chenn...@googlegroups.com

Thanks for your reply. I have another question: how can I transfer data from Pig to Hive?

Ashwanth Kumar

Sep 27, 2013, 11:07:24 AM
to chenn...@googlegroups.com
I am sorry, I haven't used Pig much. But I am confused: why do you want to load data from Pig to Hive? What is it that you are trying to achieve?






Sanjay Bhosale

Sep 30, 2013, 1:02:59 AM
to chenn...@googlegroups.com
What I need to do is: in the first step, I need to store data in some database/HBase (which is successful).
Then, in the second step, when new data comes in, I also need to store that data into the database/HBase.
In the third step, I need to process the two data sets by fetching them from the database/HBase and store the resulting data back into the database/HBase.

The second and third steps are dynamic and run whenever new data comes in.
I have written a Pig script to do the first step, and the second stage is almost the same, but the problem of fetching structured data comes up in Pig.
How can I solve this problem?



Ashwanth Kumar

Sep 30, 2013, 3:55:10 AM
to chenn...@googlegroups.com
Could you tell us a bit more about the data? What is the structure of the data, and what is your schema for storing it in HBase?
HBase is not structured either, so if you are planning to use HBase you are well off using something like Pig / Hive, but using both is overkill.

If you want to store structured data in HBase, try something like Kiji. Again, it is not a purely structured system, but a complex Avro model inside an HBase cell.





Sanjay Bhosale

Oct 1, 2013, 6:52:34 AM
to chenn...@googlegroups.com
Now I am getting the following error when trying to load data into HBase through my Pig script.

script:

 file = LOAD '/user/hadoop/out.csv' USING PigStorage(',') AS (File_Name, File_Description);
 STORE file INTO 'hbase://File_Info_Table' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('File_Name File_Description');

Error :
 ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/apache/zookeeper/ZooKeeper

In Log File :
 ERROR 2998: Unhandled internal error. org/apache/zookeeper/ZooKeeper
 java.lang.NoClassDefFoundError: org/apache/zookeeper/ZooKeeper

How to solve this?


Ashwanth Kumar

Oct 1, 2013, 6:53:23 AM
to chenn...@googlegroups.com
Check whether the HBase libs are on the Pig classpath. It seems Pig doesn't have access to the ZooKeeper classes.



Sanjay Bhosale

Oct 1, 2013, 7:24:04 AM
to chenn...@googlegroups.com
Still getting the same error after editing pig-env.sh to set PIG_CLASSPATH.
Also, I haven't installed ZooKeeper; I have only installed HBase.
So is there a need to install ZooKeeper as well?



Ashwanth Kumar

Oct 1, 2013, 7:27:10 AM
to chenn...@googlegroups.com
HBase has ZooKeeper inside it. Given that HBase is running, you have ZooKeeper running as well.

So can you share the change you made to PIG_CLASSPATH?



Sanjay Bhosale

Oct 1, 2013, 7:30:43 AM
to chenn...@googlegroups.com
file name : pig-env.sh

export PIG_HADOOP_VERSION=0.20.2
export PIG_CLASSPATH=$HADOOP_HOME/conf/
export PIG_CLASSPATH=$PIG_CLASSPATH:/home/hadoop/hbase-0.94.12/lib/

I have just added the last line



Ashwanth Kumar

Oct 1, 2013, 7:33:26 AM
to chenn...@googlegroups.com
Try changing the last line from

export PIG_CLASSPATH=$PIG_CLASSPATH:/home/hadoop/hbase-0.94.12/lib/

to

export PIG_CLASSPATH=$PIG_CLASSPATH:/home/hadoop/hbase-0.94.12/lib/*


Sanjay Bhosale

Oct 1, 2013, 8:04:59 AM
to chenn...@googlegroups.com
After correcting PIG_CLASSPATH, I got stuck while running the job.
Below is some output after running the commands mentioned previously:

  1. 2013-10-01 17:08:47,383 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/usr/local/hadoop/libexec/../lib/native/Linux-amd64-64
  2. 2013-10-01 17:08:47,383 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
  3. 2013-10-01 17:08:47,383 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
  4. 2013-10-01 17:08:47,383 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
  5. 2013-10-01 17:08:47,383 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
  6. 2013-10-01 17:08:47,383 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.version=3.2.0-29-generic
  7. 2013-10-01 17:08:47,383 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.name=hadoop
  8. 2013-10-01 17:08:47,383 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.home=/home/hadoop
  9. 2013-10-01 17:08:47,383 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/home/hadoop/hbase-0.94.12/bin
  10. 2013-10-01 17:08:47,383 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection
  11. 2013-10-01 17:08:47,396 [main] INFO  org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper - The identifier of this process is 18516@hadoop-master
  12. 2013-10-01 17:08:47,398 [main-SendThread(localhost:2181)] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
  13. 2013-10-01 17:08:47,398 [main-SendThread(localhost:2181)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to localhost/127.0.0.1:2181, initiating session
  14. 2013-10-01 17:08:47,436 [main-SendThread(localhost:2181)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x14173cff5850004, negotiated timeout = 180000
  15. 2013-10-01 17:08:47,628 [main] INFO  org.apache.hadoop.hbase.mapreduce.TableOutputFormat - Created table instance for File_Info_Table
And the job is stuck in the pending state.



Ashwanth Kumar

Oct 1, 2013, 8:21:00 AM
to chenn...@googlegroups.com
Two quick things:
- Is the job submitted to the JT? (Just in case)
- What do the RS logs say?




Sanjay Bhosale

Oct 1, 2013, 8:25:29 AM
to chenn...@googlegroups.com
The following are the last lines I can see:
  1. 2013-10-01 17:08:51,419 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: file[1,7],file[-1,-1] C:  R:
  2. 2013-10-01 17:08:51,419 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://hadoop-master:50030/jobdetails.jsp?jobid=job_201309191024_0661
Also, what are RS logs? Where are they kept? Is it a connection problem?



Ashwanth Kumar

Oct 1, 2013, 8:26:23 AM
to chenn...@googlegroups.com
RS logs are RegionServer logs.

The last line says that the job has been submitted to your JobTracker. You can use that link to check the status of the job.



Sanjay Bhosale

Oct 1, 2013, 8:35:14 AM
to chenn...@googlegroups.com
The status of the job remains pending, and after some time interval it gets killed.
In the RegionServer log I can see the following contents:

  1. 2013-10-01 17:08:42,339 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open 1 region(s)
  2. 2013-10-01 17:08:42,339 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open region: File_Info_Table,,1380620981711.623e18dfe6cc856800bcf169a5892910.
  3. 2013-10-01 17:08:42,348 DEBUG org.apache.hadoop.hbase.backup.HFileArchiver: Finished archiving file from: class org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile, file:hdfs://hadoop-master:54310/hbase/.META./1028785192/info/e0ccbfdc73964a2fa1e88aab4ac001c2, to: hdfs://hadoop-master:54310/hbase/.archive/.META./1028785192/info/e0ccbfdc73964a2fa1e88aab4ac001c2
  4. 2013-10-01 17:08:42,348 DEBUG org.apache.hadoop.hbase.backup.HFileArchiver: Archiving:class org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile, file:hdfs://hadoop-master:54310/hbase/.META./1028785192/info/b0681c378a754fe487e500dd543a8f4d
  5. 2013-10-01 17:08:42,348 DEBUG org.apache.hadoop.hbase.backup.HFileArchiver: No existing file in archive for:hdfs://hadoop-master:54310/hbase/.archive/.META./1028785192/info/b0681c378a754fe487e500dd543a8f4d, free to archive original file.
  6. 2013-10-01 17:08:42,362 DEBUG org.apache.hadoop.hbase.backup.HFileArchiver: Finished archiving file from: class org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile, file:hdfs://hadoop-master:54310/hbase/.META./1028785192/info/b0681c378a754fe487e500dd543a8f4d, to: hdfs://hadoop-master:54310/hbase/.archive/.META./1028785192/info/b0681c378a754fe487e500dd543a8f4d
  7. 2013-10-01 17:08:42,362 INFO org.apache.hadoop.hbase.regionserver.Store: Completed major compaction of 4 file(s) in info of .META.,,1.1028785192 into e60bf21d8a694ca38aaf4885e09f4839, size=2.5k; total size for store is 2.5k
  8. 2013-10-01 17:08:42,363 INFO org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest: completed compaction: regionName=.META.,,1.1028785192, storeName=info, fileCount=4, fileSize=4.8k, priority=3, time=1649421149430799; duration=0sec
  9. 2013-10-01 17:08:42,363 DEBUG org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest: CompactSplitThread status: compaction_queue=(0:0), split_queue=0
  10. 2013-10-01 17:08:42,368 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x14173cff5850001 Attempting to transition node 623e18dfe6cc856800bcf169a5892910 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
  11. 2013-10-01 17:08:42,386 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x14173cff5850001 Successfully transitioned node 623e18dfe6cc856800bcf169a5892910 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
  12. 2013-10-01 17:08:42,387 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Opening region: {NAME => 'File_Info_Table,,1380620981711.623e18dfe6cc856800bcf169a5892910.', STARTKEY => '', ENDKEY => '', ENCODED => 623e18dfe6cc856800bcf169a5892910,}
  13. 2013-10-01 17:08:42,387 INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor config now ...
  14. 2013-10-01 17:08:42,388 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated File_Info_Table,,1380620981711.623e18dfe6cc856800bcf169a5892910.
  15. 2013-10-01 17:08:42,391 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store Company_Name
  16. 2013-10-01 17:08:42,391 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
  17. 2013-10-01 17:08:42,397 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store File_Description
  18. 2013-10-01 17:08:42,398 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
  19. 2013-10-01 17:08:42,398 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store File_Name
  20. 2013-10-01 17:08:42,399 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
  21. 2013-10-01 17:08:42,400 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store File_Version
  22. 2013-10-01 17:08:42,400 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
  23. 2013-10-01 17:08:42,401 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store Internal_Name
  24. 2013-10-01 17:08:42,401 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
  25. 2013-10-01 17:08:42,403 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store Legal_Copyright
  26. 2013-10-01 17:08:42,403 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
  27. 2013-10-01 17:08:42,410 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store Original_File_Name
  28. 2013-10-01 17:08:42,410 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
  29. 2013-10-01 17:08:42,412 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store Product_Name
  30. 2013-10-01 17:08:42,412 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
  31. 2013-10-01 17:08:42,413 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store Product_Version
  32. 2013-10-01 17:08:42,413 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
  33. 2013-10-01 17:08:42,414 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store Signature
  34. 2013-10-01 17:08:42,414 INFO org.apache.hadoop.hbase.regionserver.Store: hbase.hstore.compaction.min = 3
  35. 2013-10-01 17:08:42,416 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined File_Info_Table,,1380620981711.623e18dfe6cc856800bcf169a5892910.; next sequenceid=1
  36. 2013-10-01 17:08:42,416 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x14173cff5850001 Attempting to transition node 623e18dfe6cc856800bcf169a5892910 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
  37. 2013-10-01 17:08:42,430 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x14173cff5850001 Successfully transitioned node 623e18dfe6cc856800bcf169a5892910 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
  38. 2013-10-01 17:08:42,431 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Post open deploy tasks for region=File_Info_Table,,1380620981711.623e18dfe6cc856800bcf169a5892910., daughter=false
  39. 2013-10-01 17:08:42,433 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Looked up root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@739bf34a; serverName=hadoop-master,60020,1380627518540
  40. 2013-10-01 17:08:42,436 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Cached location for .META.,,1.1028785192 is hadoop-master:60020
  41. 2013-10-01 17:08:42,455 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 1 replicas but expecting no less than 3 replicas.  Requesting close of hlog.
  42. 2013-10-01 17:08:42,455 DEBUG org.apache.hadoop.hbase.regionserver.LogRoller: HLog roll requested
  43. 2013-10-01 17:08:42,456 INFO org.apache.hadoop.hbase.util.FSUtils: FileSystem doesn't support getDefaultReplication
  44. 2013-10-01 17:08:42,456 INFO org.apache.hadoop.hbase.util.FSUtils: FileSystem doesn't support getDefaultBlockSize
  45. 2013-10-01 17:08:42,456 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Updated row File_Info_Table,,1380620981711.623e18dfe6cc856800bcf169a5892910. with server=hadoop-master,60020,1380627518540
  46. 2013-10-01 17:08:42,456 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Done with post open deploy task for region=File_Info_Table,,1380620981711.623e18dfe6cc856800bcf169a5892910., daughter=false
  47. 2013-10-01 17:08:42,456 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x14173cff5850001 Attempting to transition node 623e18dfe6cc856800bcf169a5892910 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
  48. 2013-10-01 17:08:42,469 DEBUG org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: using new createWriter -- HADOOP-6840
  49. 2013-10-01 17:08:42,469 DEBUG org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Path=hdfs://hadoop-master:54310/hbase/.logs/hadoop-master,60020,1380627518540/hadoop-master%2C60020%2C1380627518540.1380627522455, syncFs=true, hflush=false, compression=false
  50. 2013-10-01 17:08:42,484 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x14173cff5850001 Successfully transitioned node 623e18dfe6cc856800bcf169a5892910 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
  51. 2013-10-01 17:08:42,484 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: region transitioned to opened in zookeeper: {NAME => 'File_Info_Table,,1380620981711.623e18dfe6cc856800bcf169a5892910.', STARTKEY => '', ENDKEY => '', ENCODED => 623e18dfe6cc856800bcf169a5892910,}, server: hadoop-master,60020,1380627518540
  52. 2013-10-01 17:08:42,484 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opened File_Info_Table,,1380620981711.623e18dfe6cc856800bcf169a5892910. on server:hadoop-master,60020,1380627518540
  53. 2013-10-01 17:08:42,498 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/hadoop-master,60020,1380627518540/hadoop-master%2C60020%2C1380627518540.1380627522187, entries=1, filesize=422.  for /hbase/.logs/hadoop-master,60020,1380627518540/hadoop-master%2C60020%2C1380627518540.1380627522455
  54. 2013-10-01 17:13:38,610 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=75, hits=68, hitRatio=90.66%, , cachingAccesses=71, cachingHits=68, cachingHitsRatio=95.77%, , evictions=0, evicted=0, evictedPerRun=NaN
  55. 2013-10-01 17:18:38,609 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=76, hits=69, hitRatio=90.78%, , cachingAccesses=72, cachingHits=69, cachingHitsRatio=95.83%, , evictions=0, evicted=0, evictedPerRun=NaN
  56. 2013-10-01 17:23:38,609 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=77, hits=70, hitRatio=90.90%, , cachingAccesses=73, cachingHits=70, cachingHitsRatio=95.89%, , evictions=0, evicted=0, evictedPerRun=NaN
  57. 2013-10-01 17:28:38,609 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=78, hits=71, hitRatio=91.02%, , cachingAccesses=74, cachingHits=71, cachingHitsRatio=95.94%, , evictions=0, evicted=0, evictedPerRun=NaN
  58. 2013-10-01 17:33:38,609 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=79, hits=72, hitRatio=91.13%, , cachingAccesses=75, cachingHits=72, cachingHitsRatio=95.99%, , evictions=0, evicted=0, evictedPerRun=NaN
  59. 2013-10-01 17:38:38,609 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=80, hits=73, hitRatio=91.25%, , cachingAccesses=76, cachingHits=73, cachingHitsRatio=96.05%, , evictions=0, evicted=0, evictedPerRun=NaN
  60. 2013-10-01 17:43:38,609 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=81, hits=74, hitRatio=91.35%, , cachingAccesses=77, cachingHits=74, cachingHitsRatio=96.10%, , evictions=0, evicted=0, evictedPerRun=NaN
  61. 2013-10-01 17:48:38,609 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=89, hits=82, hitRatio=92.13%, , cachingAccesses=85, cachingHits=82, cachingHitsRatio=96.47%, , evictions=0, evicted=0, evictedPerRun=NaN
  62. 2013-10-01 17:53:38,609 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=90, hits=83, hitRatio=92.22%, , cachingAccesses=86, cachingHits=83, cachingHitsRatio=96.51%, , evictions=0, evicted=0, evictedPerRun=NaN
  63. 2013-10-01 17:58:38,609 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.99 MB, free=239.7 MB, max=241.69 MB, blocks=3, accesses=92, hits=85, hitRatio=92.39%, , cachingAccesses=88, cachingHits=85, cachingHitsRatio=96.59%, , evictions=0, evicted=0, evictedPerRun=NaN


Sanjay Bhosale

Oct 1, 2013, 8:57:48 AM
to chenn...@googlegroups.com
Also, in the ZooKeeper log I can see some exceptions such as:
  1. EndOfStreamException: Unable to read additional data from client sessionid 0x14173cff5850004, likely client has closed socket
  2. at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
  3. at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
  4. at java.lang.Thread.run(Thread.java:722)
Please tell me how to solve this issue.


Ashwanth Kumar

Oct 1, 2013, 9:05:49 AM
to chenn...@googlegroups.com
I would not be too concerned by that error, unless you are using ZooKeeper directly in your application. If it is related to HBase, the client will take care of reconnecting automatically after specific (incremental) intervals.



Sanjay Bhosale

Oct 1, 2013, 9:10:12 AM
to chenn...@googlegroups.com
Thanks for your help. In the future, if you find an answer for this kind of problem, please let me know.



Sanjay Bhosale

Oct 3, 2013, 2:44:40 AM
to chenn...@googlegroups.com
After searching on the net, I came to the following understanding:

           file = LOAD '/user/hadoop/out.csv' USING PigStorage(',') AS (File_Name, File_Version, Company_Name, Product_Name);
           STORE file INTO 'hbase://File_Info_Table' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('File_Name:File_Name File_Version:File_Version Company_Name:Company_Name Product_Name:Product_Name');

1) The above code works in Pig's local mode but does not work in distributed mode.
2) The above code stores data in a completely different order: it stores the data of one column into another column.

How can I solve this problem?


Sanjay Bhosale

Oct 3, 2013, 3:43:28 AM
to chenn...@googlegroups.com
Problem 2) is solved; only problem 1) remains.



Ashwanth Kumar

Oct 3, 2013, 4:22:28 AM
to chenn...@googlegroups.com
What error do you get? 



Sanjay Bhosale

Oct 3, 2013, 5:01:56 AM
to chenn...@googlegroups.com
Basically the problem is the same: the job stays in the pending state, and no error is displayed.
Maybe I need to configure something to make it work in distributed mode,
but I don't know what I need to configure.



Ashwanth Kumar

Oct 3, 2013, 5:05:16 AM
to chenn...@googlegroups.com
Again, has the job been submitted? Did you check the JobTracker for the submitted job?

There is no configuration change needed to make it work in local mode versus distributed mode. It connects to the JT configured via mapred.job.tracker; the Hadoop framework takes care of the rest.

Sometimes, if the input file you are trying to process is very big, it might take some time to calculate the splits.




Sanjay Bhosale

Oct 3, 2013, 5:26:27 AM
to chenn...@googlegroups.com
The job is submitted but remains in the pending state and gets killed after some time, with no exception.



Ashwanth Kumar

Oct 3, 2013, 5:28:10 AM
to chenn...@googlegroups.com
Generally, your Pig logs will not say anything about the job itself. Check the submitted job on the JobTracker to see whether it has failed or succeeded.



swapnil joshi

Oct 3, 2013, 5:31:23 AM
to chenn...@googlegroups.com
Hi Sanjay,

You have to check the JobTracker or TaskTracker log files, which contain information that can help identify the problem.






--
Regards,
Swapnil K. Joshi

Sanjay Bhosale

Oct 3, 2013, 6:46:31 AM
to chenn...@googlegroups.com
 Job Setup failed => Error: java.lang.ClassNotFoundException: com.google.protobuf.Message

swapnil joshi

Oct 3, 2013, 7:12:00 AM
to chenn...@googlegroups.com
Hi Sanjay,

I don't know what you are doing in your MapReduce job, but based on your previous log information, you first have to look at the log details to see the cause of the exception: does it come from your Java code or from the Hadoop jars?

All the best.



Sanjay Bhosale

Oct 3, 2013, 7:20:23 AM
to chenn...@googlegroups.com
Hi Swapnil,

    I am not using my own jar. I have just written a Pig script that loads the CSV file into HBase and can also generate a CSV file out of HBase.
    I have 2 nodes in my configuration. On one node I configured the environment to include the jar files from HBase's lib folder, and on the other node I didn't.
    Is that creating the problem? Locally the Pig script works correctly, but in distributed mode it fails after waiting for somewhere around 1 hour.

swapnil joshi

Oct 3, 2013, 7:23:24 AM
to chenn...@googlegroups.com
Hi Sanjay,
Can you tell me your Hadoop and HBase versions?



Sanjay Bhosale

Oct 3, 2013, 7:28:30 AM
to chenn...@googlegroups.com
Hadoop version : 1.0.3
HBase version : 0.94.12

Ashwanth Kumar

Oct 3, 2013, 7:30:19 AM
to chenn...@googlegroups.com
Ideally you don't have to set up the paths, since Pig is more of a client-side program. It creates a fat jar (called job.jar) and ships it along with the job. Can you try setting up the paths on all the boxes and trying again?






Sanjay Bhosale

Oct 3, 2013, 7:36:39 AM
to chenn...@googlegroups.com
Yes, I tried, but no success.


Sanjay Bhosale

unread,
Oct 3, 2013, 8:36:37 AM10/3/13
to chenn...@googlegroups.com
Thanks to all.
Finally solved it using -Dpig.additional.jars=/home/hadoop/hbase-0.94.12/lib/protobuf-java-2.4.0a.jar.
Thanks Ashwanth, Swapnil.
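For anyone who hits this later, the full command was along these lines (the script name here is just a placeholder):

```shell
# The -D property must come before the script name; it tells Pig to ship
# the HBase protobuf jar along with the job so the cluster nodes can find it.
pig -Dpig.additional.jars=/home/hadoop/hbase-0.94.12/lib/protobuf-java-2.4.0a.jar \
    myscript.pig
```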

swapnil joshi

unread,
Oct 3, 2013, 8:39:25 AM10/3/13
to chenn...@googlegroups.com
Wish you all the best, friend :)



Selva Kumar

unread,
Oct 6, 2013, 5:19:46 AM10/6/13
to chenn...@googlegroups.com
Hi Senthil,

Could you please share yesterday's Hadoop 2 presentation?

krishna chaitanya

unread,
Aug 5, 2014, 7:40:38 AM8/5/14
to chenn...@googlegroups.com
Hi Sanjay

I am also facing almost the same issue as yours, but in my case I don't get any MR job error log either.

.mapReduceLayer.MapReduceLauncher - 0% complete
2014-08-06 01:10:53,098 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2014-08-06 01:10:53,098 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1407265692382_0002 has failed! Stop running all dependent jobs


I have added the HBase and ZooKeeper jars (HBASE_JARS, ZOOKEEPER_JARS) to HADOOP_CLASSPATH, and I can even see all the required jars, including protobuf*.jar, in the Pig startup logs.
Since I am using an HDP 2.1 setup, I am restricted to Pig 0.12.0. The code runs fine in local mode.
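Concretely, the classpath setup I mean is along these lines (simplified; one common way to do it, with a placeholder script name):

```shell
# Prepend the full HBase classpath (which includes the ZooKeeper and protobuf
# jars) to Hadoop's classpath before launching Pig in mapreduce mode.
export HADOOP_CLASSPATH="$(hbase classpath):$HADOOP_CLASSPATH"
pig -x mapreduce myscript.pig
```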

Am I missing anything?

Regards
Krishna

Subrata Biswas

unread,
Aug 6, 2014, 3:57:32 AM8/6/14
to chenn...@googlegroups.com
Hi Bhosale,

 As per your requirement, I would look at the below architecture/design:

 Instead of using an internal (managed) Hive table for the HBase data, make it an external table.
 Then you can easily switch between Hive and Pig; as per your description in the mail chain below, that should be fine.
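To make the suggestion concrete, something along these lines (the table and column names are examples only, not from your setup):

```shell
# An EXTERNAL Hive table backed by an existing HBase table: dropping the Hive
# table does not delete the HBase data, and Pig can keep reading HBase directly.
hive -e "
CREATE EXTERNAL TABLE hbase_mytable (rowkey STRING, col1 STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:col1')
TBLPROPERTIES ('hbase.table.name' = 'mytable');
"
```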

Not sure if it works for you

 Regards 
Subrata



Subrata Biswas

unread,
Aug 6, 2014, 4:02:01 AM8/6/14
to chenn...@googlegroups.com
Hi Ashwanth / Senthil,

 Do you know anyone with hands-on experience in the following:
  Data analytics using Pig/Hive/Java MR, including writing UDFs for Hive and Pig.
  Importing data into a Hadoop environment using Sqoop, or by any other means, from different data sources (mostly different RDBMSs).

 One of my friends has a requirement and is looking for people for his team.

Regards
Subrata


