Pointing Beeswax to external hive metastore


Anki

Dec 27, 2010, 8:13:21 PM
to Hue-Users
Hi,

I am trying to install Hue on a separate machine from the ones in my
cluster. I was able to run Hue successfully. However, I am not able to
point Beeswax to the remote Hive metastore.
Here are the steps that I followed:
1. Started the remote metastore server using:
hive --service metastore
2. Copied the hive-site.xml to the Hue machine and pointed
hue-beeswax.ini at this location.
3. Modified the hive-site.xml to set:
a. hive.metastore.local = false
b. hive.metastore.uris = thrift://<hive machine>:<metastore port>
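
For reference, after step 3 the relevant part of hive-site.xml on the Hue
machine looks roughly like this (host and port here are the ones from my
log below):

<property>
  <name>hive.metastore.local</name>
  <value>false</value>
</property>
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://hpublicdnode03:9083</value>
</property>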


Then, on restarting the hue server, in the beeswax_server.log I am
seeing:
"Beeswax configured to use external metastore at hpublicdnode03:9083"

However, when I try to connect to Beeswax from Hue, it gives the
following exception:
"Exception communicating with Hive Metadata (Hive UI) Server at
localhost:8003: Could not connect to localhost:8003"

Also, in beeswax_server.out, I am seeing one more error:
" ERROR beeswax.Server: Could not create /user/hive/warehouse-hue"
I checked that /user/hive has its permissions set to 777, so I am not
sure what the problem is here.

Any pointers/ thoughts are greatly appreciated.

Thanks,
Ankita

bc Wong

Dec 29, 2010, 1:27:01 PM
to Anki, Hue-Users

Hi Ankita,

I'm on CDH3, with security turned off, and tried to reproduce the problem with
exactly the same external metastore setting. Instead, Beeswax went ahead and
created the directory:

10/12/29 10:19:51 INFO beeswax.Server: Created /user/hive/warehouse-hue with
world-writable permissions.
10/12/29 10:19:51 INFO beeswax.Server: Starting beeswaxd at port 8002

A few quick things to check:
(1) Is the Hadoop on the Hue node configured correctly? Does `hadoop fs ...`
work?
(2) Can you get the corresponding error message from the NN log to see
why it couldn't create the directory?
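
For (1), the kind of sanity check I have in mind is roughly the
following, run on the Hue node as the user Hue runs as; the second
command just mirrors the directory creation Beeswax attempts:

  hadoop fs -ls /user/hive
  hadoop fs -mkdir /user/hive/warehouse-hue

For (2), grep the NameNode log on the NN host for "warehouse-hue" around
the time Beeswax starts up.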

Cheers,
--
bc Wong
Cloudera Software Engineer

Anki

Dec 29, 2010, 6:00:04 PM
to Hue-Users
Thanks for the reply. I am curious: on the remote machine where you
installed Hive, was there any difference between its hive-site.xml and
the one you were using with Hue?

Also, I am not able to do hadoop fs -ls from my Hue machine, as this
machine is not part of the Hadoop cluster. I think this is not a
requirement, since I am able to browse my Hadoop filesystem from the Hue
file browser correctly. Am I missing something?

Also, I checked the namenode logs; they don't have any errors, warnings,
or any activity from the Hue machine.
One question: when I try to open Beeswax from the Hue desktop, why does
the error message say it is not able to communicate with the "Hive
Metadata (Hive UI) Server at localhost:8003"? I was expecting this to be
my remote Hive machine.

Here are other details about my setup:
1. I am using CDH3b2.
2. Hive is installed in one of the hadoop nodes and configured to use
Mysql as metastore.
3. Hue is installed in a separate machine.

Let me know if you need more details.

Thanks,
Ankita

bc Wong

Dec 30, 2010, 3:33:52 AM
to Anki, Hue-Users
On Wed, Dec 29, 2010 at 3:00 PM, Anki <ankita...@gmail.com> wrote:
> Thanks for the reply. I am curious, in the remote machine where you
> have installed hive, was there any difference between hive-site.xml
> than the one you were using in Hue?

It's the same. Hue runs BeeswaxServer, which is essentially a Hive
client. So it should have an identical setup.

> Also, I am not able to do hadoop fs -ls from my Hue machine as this
> machine is not part of hadoop cluster. I think this is not an
> requirement as I am able to browse my hadoop filesystem from Hue file
> browser correctly. Am I missing something?

Hue requires a properly configured Hadoop client locally. The file
upload runs Hadoop, job submission requires Hadoop, and BeeswaxServer
requires Hadoop.

In Hue, the left-hand side of the application bar should show a little
red exclamation mark. Click on it and it'll show you the common
misconfigurations.

Anki

Dec 30, 2010, 4:28:27 PM
to Hue-Users
Thanks, I configured Hadoop on the Hue machine and after that I am
able to browse my HDFS filesystem using hadoop fs -ls. However, now
when I start Hue, it is not able to load any of the app (File Browser,
Beeswax, etc.) icons.

I checked the logs; the beeswax_server.out/.log files are not written at
this time. Also, the following error message exists in supervisor.log:
[30/Dec/2010 13:06:48 +0000] supervisor ERROR Exception in
supervisor main loop
Traceback (most recent call last):
File "/usr/share/hue/desktop/core/src/desktop/supervisor.py", line
335, in main
wait_loop(sups, options)
File "/usr/share/hue/desktop/core/src/desktop/supervisor.py", line
346, in wait_loop
time.sleep(1)
File "/usr/share/hue/desktop/core/src/desktop/supervisor.py", line
198, in sig_handler
raise Exception("Signal %d received. Exiting" % signum)
Exception: Signal 15 received. Exiting
[30/Dec/2010 13:06:48 +0000] supervisor WARNING Supervisor shutting
down!
[30/Dec/2010 13:06:48 +0000] supervisor WARNING Waiting for
children to exit for 5 seconds...
[30/Dec/2010 13:06:48 +0000] supervisor INFO Command "/usr/share/
hue/build/env/bin/hue runcpserver" exited normally.
[30/Dec/2010 13:06:59 +0000] supervisor INFO Starting process /
usr/share/hue/build/env/bin/hue runcpserver
[30/Dec/2010 13:06:59 +0000] supervisor INFO Started proceses
(pid 27783) /usr/share/hue/build/env/bin/hue runcpserver


Also, I checked namenode logs, there is still no activity related to
this machine.
Let me know if you need more details.

Appreciate your help!
Ankita


bc Wong

Dec 30, 2010, 10:45:04 PM
to Anki, Hue-Users
On Thu, Dec 30, 2010 at 1:28 PM, Anki <ankita...@gmail.com> wrote:
> Thanks, I configured hadoop in the Hue machine and after that I am
> able to browse my hdfs filesystem using hadoop fs -ls. However, now
> when I start Hue, it is not able to load any of the app(filebrowser,
> beeswax etc.) icons.

So it seems that Hue starts and stays up, just that you're missing
most of the apps. Right? Can you take a look at your /etc/hue/hue.ini
and make sure the "hadoop_home" is set correctly?
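
For reference, on a 1.x install that setting should point at the local
Hadoop installation, roughly like the snippet below (the exact section
name may differ between Hue versions):

[hadoop]
# local Hadoop installation that Hue/Beeswax should use
hadoop_home=/usr/lib/hadoop-0.20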

What you mentioned below, esp. "Supervisor shutting down" would
suggest that Hue can't even start. If so, you'd need to scan the other
log files in /var/log/hue for errors. (Error reporting is badly done
in 1.0.x. In 1.1.0, there is a central error.log for all errors.)

Anki

Jan 3, 2011, 2:41:31 PM
to Hue-Users
My hadoop_home was set correctly, and I did scan all the logs.
Apparently, the only error that was reported was in the supervisor logs,
and it wasn't very helpful.
I re-installed hue, and now the hue desktop is loading properly. Also,
I am able to browse my hdfs both from hue file browser as well as from
command line using 'hadoop fs -ls'. Also, this time Beeswax went ahead
and created warehouse-hue directory in the hdfs.

However, I am getting the same error when I click Beeswax icon from
hue-desktop:
"Exception communicating with Hive Metadata (Hive UI) Server at
localhost:8003: Could not connect to localhost:8003"

From one of the log messages it seems that Beeswax is configured to use
the external metastore. I am not sure why it is trying to connect to
localhost. I am wondering if anything else is required to stop Beeswax
from connecting to localhost.

Also, from logs:
runcpserver.log:
03/Jan/2011 11:03:25 +0000] thrift_util INFO Thrift exception;
retrying: Could not connect to localhost:8003
[03/Jan/2011 11:03:25 +0000] thrift_util INFO Thrift exception;
retrying: Could not connect to localhost:8003
[03/Jan/2011 11:03:25 +0000] thrift_util WARNING Out of retries for
thrift call: get_tables
[03/Jan/2011 11:03:25 +0000] thrift_util INFO Thrift saw
exception: Could not connect to localhost:8003
[03/Jan/2011 11:03:25 +0000] middleware INFO Processing
exception: Could not connect to localhost:8003: Traceback (most recent
call last):


Beeswax.out:
11/01/03 10:49:17 INFO beeswax.Server: Created /user/hive/warehouse-
hue with world-writable permissions.
11/01/03 10:49:17 INFO beeswax.Server: Starting beeswaxd at port 8002
11/01/03 10:49:17 INFO beeswax.Server: Parsed core-default.xml
sucessfully. Learned 52 descriptions.
11/01/03 10:49:17 INFO beeswax.Server: Parsed hdfs-default.xml
sucessfully. Learned 47 descriptions.
11/01/03 10:49:17 INFO beeswax.Server: Parsed mapred-default.xml
sucessfully. Learned 101 descriptions.
11/01/03 10:49:17 INFO beeswax.Server: Parsed hive-default.xml
sucessfully. Learned 62 descriptions.
11/01/03 10:49:17 INFO beeswax.Server: Starting beeswax server on port
8002, talking back to Desktop at 10.2.40.30:8088

Beeswax.log
03/Jan/2011 10:49:13 +0000] settings INFO Welcome to Hue 1.0.1
[03/Jan/2011 10:49:13 +0000] settings WARNING secret_key should
be configured
[03/Jan/2011 10:49:14 +0000] beeswax_server INFO Beeswax
configured to use external metastore at hpublicdnode03:9083
[03/Jan/2011 10:49:14 +0000] beeswax_server INFO Executing '/usr/
share/hue/apps/beeswax/src/beeswax/../../
beeswax_server.sh' (['beeswax_server.sh', '--beeswax', '8002', '--
desktop-host', '10.2.40.30', '--desktop-port', '8088']) ({'LOGNAME':
'root', 'USER': 'root', 'HOME': '/home/stratify', 'PATH': '/usr/local/
sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin', 'LANG':
'en_US.UTF-8', 'TERM': 'xterm', 'SHELL': '/bin/bash', 'TZ': 'America/
Los_Angeles', 'SHLVL': '1', 'SUDO_USER': 'stratify', 'USERNAME':
'root', 'DESKTOP_LOG_DIR': '/var/log/hue', 'SUDO_UID': '1000',
'HIVE_CONF_DIR': '/etc/hue/conf', '_': '/sbin/start-stop-daemon',
'SUDO_COMMAND': '/etc/init.d/hue start', 'SUDO_GID': '1000',
'PYTHON_EGG_CACHE': '/tmp/.hue-python-eggs', 'PWD': '/home/stratify',
'DJANGO_SETTINGS_MODULE': 'desktop.settings', 'MAIL': '/var/mail/
stratify', 'LS_COLORS':
'rs=0:di=01;34:ln=01;36:hl=44;37:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.
7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.axa=00;36:*.oga=00;36:*.spx=00;36:*.xspf=00;36:',
'HADOOP_HOME': '/usr/lib/hadoop-0.20'})
[

Thanks for your help,
Ankita


bc Wong

Jan 6, 2011, 8:32:31 AM
to Anki, Hue-Users
On Mon, Jan 3, 2011 at 11:41 AM, Anki <ankita...@gmail.com> wrote:
> However, I am getting the same error when I click Beeswax icon from
> hue-desktop:
> "Exception communicating with Hive Metadata (Hive UI) Server at
> localhost:8003: Could not connect to localhost:8003"

Anki,

You're absolutely right. I probably did not set up my hive conf correctly when I
said I couldn't reproduce your problem. I filed
https://issues.cloudera.org/browse/HUE-393 for this problem. The patch in the
jira should work for you.

Anki

Jan 6, 2011, 8:54:13 PM
to Hue-Users
Thanks, Wong, for the quick patch! Somehow I am not able to access the
link. I tried both https and http.
To give you some background, we have installed Hive on one of our
datanodes, and we did not want to install Hue on the same machine.
This is how it all started.
Plan B is that we can always install Hue on the same machine as Hive.

Thanks again for your timely help,
Ankita


On Jan 6, 5:32 am, bc Wong <bcwal...@cloudera.com> wrote:
> On Mon, Jan 3, 2011 at 11:41 AM, Anki <ankita.bak...@gmail.com> wrote:
> > However, I am getting the same error when I click Beeswax icon from
> > hue-desktop:
> > "Exception communicating with Hive Metadata (Hive UI) Server at
> > localhost:8003: Could not connect to localhost:8003"
>
> Anki,
>
> You're absolutely right. I probably did not set up my hive conf correctly when I
> said I couldn't reproduce your problem. I filed https://issues.cloudera.org/browse/HUE-393 for this problem. The patch in the

Vinithra Varadharajan

Jan 6, 2011, 9:46:05 PM
to Anki, Hue-Users
Anki,

This morning there was an outage of https://issues.cloudera.org, that included the loss of the attached patch. You can now access https://issues.cloudera.org/browse/HUE-393 and download 
0002-HUE-393.-Beeswax-doesn-t-work-with-external-metastore.patch. Sorry for the inconvenience.

-Vinithra

Anki

Jan 7, 2011, 2:07:55 PM
to Hue-Users
Thanks Vinithra.
I was able to access the patch now. After applying the patch, Beeswax
is able to communicate with hive server :)

Cheers!
Ankita



On Jan 6, 6:46 pm, Vinithra Varadharajan <vinit...@cloudera.com>
wrote:
> Anki,
>
> This morning there was an outage of https://issues.cloudera.org, that
> included the loss of the attached patch. You can now access https://issues.cloudera.org/browse/HUE-393 and download
> 0002-HUE-393.-Beeswax-doesn-t-work-with-external-metastore.patch. Sorry for
> the inconvenience.
>
> -Vinithra
>

mtanquary

Feb 9, 2011, 5:47:24 PM
to Hue-Users
I'm not able to apply the patch. I get:

]# patch -p0 <*.patch
can't find file to patch at input line 4
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff --git a/apps/beeswax/src/beeswax/db_utils.py b/apps/beeswax/src/
beeswax/db_utils.py
|--- a/apps/beeswax/src/beeswax/db_utils.py
|+++ b/apps/beeswax/src/beeswax/db_utils.py
--------------------------
File to patch: /usr/share/hue/apps/beeswax/src/beeswax/db_utils.py
patching file /usr/share/hue/apps/beeswax/src/beeswax/db_utils.py
Hunk #2 FAILED at 297.
1 out of 2 hunks FAILED -- saving rejects to file /usr/share/hue/apps/
beeswax/src/beeswax/db_utils.py.rej

bc Wong

Feb 9, 2011, 5:54:58 PM
to mtanquary, Hue-Users
On Wed, Feb 9, 2011 at 2:47 PM, mtanquary <matt.t...@gmail.com> wrote:
> I'm not able to apply the patch. I get:
>
> ]# patch -p0 <*.patch
> can't find file to patch at input line 4
> Perhaps you used the wrong -p or --strip option?
> The text leading up to this was:
> --------------------------
> |diff --git a/apps/beeswax/src/beeswax/db_utils.py b/apps/beeswax/src/
> beeswax/db_utils.py
> |--- a/apps/beeswax/src/beeswax/db_utils.py
> |+++ b/apps/beeswax/src/beeswax/db_utils.py
> --------------------------
> File to patch: /usr/share/hue/apps/beeswax/src/beeswax/db_utils.py
> patching file /usr/share/hue/apps/beeswax/src/beeswax/db_utils.py
> Hunk #2 FAILED at 297.
> 1 out of 2 hunks FAILED -- saving rejects to file /usr/share/hue/apps/
> beeswax/src/beeswax/db_utils.py.rej

Hi Matt,

Can you try `patch -p1 ...'?
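
For example, running it from the root of the Hue install (the paths in
the patch are relative to that directory) would look something like:

  cd /usr/share/hue
  patch -p1 < 0002-HUE-393.-Beeswax-doesn-t-work-with-external-metastore.patch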

mtanquary

Feb 9, 2011, 5:59:16 PM
to Hue-Users
This is what I got:

# patch -p1 <*.patch
can't find file to patch at input line 4
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff --git a/apps/beeswax/src/beeswax/db_utils.py b/apps/beeswax/src/
beeswax/db_utils.py
|--- a/apps/beeswax/src/beeswax/db_utils.py
|+++ b/apps/beeswax/src/beeswax/db_utils.py
--------------------------
File to patch: /usr/share/hue/apps/beeswax/src/beeswax/db_utils.py
patching file /usr/share/hue/apps/beeswax/src/beeswax/db_utils.py
Reversed (or previously applied) patch detected! Assume -R? [n] y
Hunk #2 FAILED at 296.
1 out of 2 hunks FAILED -- saving rejects to file /usr/share/hue/apps/
beeswax/src/beeswax/db_utils.py.rej



bc Wong

Feb 9, 2011, 6:45:44 PM
to mtanquary, Hue-Users
On Wed, Feb 9, 2011 at 2:59 PM, mtanquary <matt.t...@gmail.com> wrote:
> This is what I got:
>
> # patch -p1  <*.patch
> can't find file to patch at input line 4
> Perhaps you used the wrong -p or --strip option?
> The text leading up to this was:
> --------------------------
> |diff --git a/apps/beeswax/src/beeswax/db_utils.py b/apps/beeswax/src/
> beeswax/db_utils.py
> |--- a/apps/beeswax/src/beeswax/db_utils.py
> |+++ b/apps/beeswax/src/beeswax/db_utils.py
> --------------------------
> File to patch:  /usr/share/hue/apps/beeswax/src/beeswax/db_utils.py
> patching file /usr/share/hue/apps/beeswax/src/beeswax/db_utils.py
> Reversed (or previously applied) patch detected!  Assume -R? [n] y
> Hunk #2 FAILED at 296.
> 1 out of 2 hunks FAILED -- saving rejects to file /usr/share/hue/apps/
> beeswax/src/beeswax/db_utils.py.rej

Hi Matt,

What's your Hue version? Could you send me your
apps/beeswax/src/beeswax/db_utils.py file?

mtanquary

Feb 10, 2011, 4:24:44 PM
to Hue-Users
My version is 1.1. I'll send the file as well. Thank you for your
assistance!


bc Wong

Feb 10, 2011, 5:45:30 PM
to mtanquary, Hue-Users
On Thu, Feb 10, 2011 at 1:24 PM, mtanquary <matt.t...@gmail.com> wrote:
> My version is 1.1. I'll send the file as well. Thank you for your
> assistance!

I see. The patch wouldn't apply directly on 1.1. (I was a bit mistaken
about what's in which version.) On your version, the patch should go
in around L305 rather than L296. It's a small change. Do you feel
comfortable hand-editing your db_utils.py?

bc Wong

Feb 11, 2011, 11:03:44 AM
to Matt Tanquary, Hue-Users
On Fri, Feb 11, 2011 at 7:51 AM, Matt Tanquary <matt.t...@gmail.com> wrote:
> Now getting this error again:

>
> Exception communicating with Hive Metadata (Hive UI) Server at
> localhost:8003: timed out
>
> Here is the content of my hue-beeswax.ini:
>
> # Configuration options for the Hive UI (Beeswax).
>
> [beeswax]
>
> #
> # Configure the port the internal metastore daemon runs on. Used only if
> # hive.metastore.local is true.
> ## beeswax_meta_server_port=8003
>
> #
> # Configure the port the beeswax thrift server runs on
> ## beeswax_server_port=8002
>
> #
> # Hive configuration directory, where hive-site.xml is located
> hive_conf_dir=/etc/hive/conf

Hi Matt,

Is hive.metastore.local set to false in your
/etc/hive/conf/hive-site.xml? Could you please attach that file?

Cheers,
bc


> I have verified that hive from cli connects to my derby server and works as
> expected.
>
> I also attached my current db_utils.py file as a sanity check.
>
> Thanks again for all of your help!
> -M@
>
> On Fri, Feb 11, 2011 at 8:34 AM, bc Wong <bcwa...@cloudera.com> wrote:
>>
>> On Fri, Feb 11, 2011 at 7:20 AM, Matt Tanquary <matt.t...@gmail.com>
>> wrote:
>> > Thanks!
>> >
>> > Now I get this error when trying to start hive from hue: An error
>> > occurred:
>> > 'module' object has no attribute 'METASTORE_CONN_TIMEOUT'
>>
>> Hi Matt,
>>
>> Your original file has this for the code block in question:
>>
>>  client = thrift_util.get_client(ThriftHiveMetastore.Client,
>>                                conf.BEESWAX_META_SERVER_HOST.get(),
>>                                conf.BEESWAX_META_SERVER_PORT.get(),
>>                                service_name="Hive Metadata (Hive UI) Server",
>>                                timeout_seconds=METASTORE_THRIFT_TIMEOUT)
>>
>> Your new file should modify two arguments:
>>
>>  client = thrift_util.get_client(ThriftHiveMetastore.Client,
>>                                host,
>>                                port,
>>                                service_name="Hive Metadata (Hive UI) Server",
>>                                timeout_seconds=METASTORE_THRIFT_TIMEOUT)
>>
>> Cheers,
>> bc

bc Wong

Feb 11, 2011, 11:58:59 AM
to Matt Tanquary, Hue-Users
On Fri, Feb 11, 2011 at 8:17 AM, Matt Tanquary <matt.t...@gmail.com> wrote:
> It's set to true. After you mentioned that, I set to false and get some
> errors in hive and hue. The hue error I get is
>
> Exception communicating with Hive Metadata (Hive UI) Server at undefined:0:
> Could not connect to undefined:0
>
> hive-site.xml attached.

[cc-ing hue-users@. Please reply-all to keep a record for future users.]

Where is your external metastore? You were using the Hive CLI
earlier, which connected to the external metastore. So your CLI
session cannot be using /etc/hive/conf/hive-site.xml. You should
point hive_conf_dir in `hue-beeswax.ini' to wherever the real
configuration is.
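
For example, if the real client configuration lived in (say)
/etc/hive/conf-real, the relevant part of hue-beeswax.ini would look
roughly like:

[beeswax]
hive_conf_dir=/etc/hive/conf-real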

The hive-site.xml you attached is misconfigured. It doesn't have
hive.metastore.uris.

bc Wong

Feb 11, 2011, 12:42:52 PM
to Matt Tanquary, Hue-Users
On Fri, Feb 11, 2011 at 9:17 AM, Matt Tanquary <matt.t...@gmail.com> wrote:
> The metastore is on the piodev02 server. I had defined the ConnectionURL to
> point to: jdbc:derby://piodev02:1527/metastore_db.
>
> This configuration I got from
> http://wiki.apache.org/hadoop/HiveDerbyServerMode
>
> I have been trying to set up the uris but haven't gotten something to work.
> What would you suggest as a URI for this?

Matt,

Please include hue-users@ in the reply. This is useful to help
other people troubleshoot.

The Hive terminology might be confusing you:
http://wiki.apache.org/hadoop/Hive/AdminManual/MetastoreAdmin
What you have is not necessarily an external (remote) metastore.
A remote metastore is a Hive metastore server daemon. That daemon
proxies all metastore requests, and the clients don't directly
talk to the metastore DB. In your case, it sounds like your
clients are connecting to the DB directly, just that the DB is on
a remote machine. That is still classified as an internal
metastore.

To go into Hive best practices a bit: you shouldn't use Derby
for a shared metastore. You'll run into concurrency problems
(and definitely performance ones). I'd recommend MySQL.
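
To make the distinction concrete, the two setups differ roughly as
follows in hive-site.xml (the metastore host is a placeholder; the JDBC
URL is the one you mentioned):

<!-- Remote (external) metastore: clients talk Thrift to a metastore
     daemon, and hive.metastore.local is false -->
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://<metastore host>:9083</value>
</property>

<!-- Internal metastore, even when the DB lives on another host:
     clients open the JDBC connection themselves, and
     hive.metastore.local stays true -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby://piodev02:1527/metastore_db</value>
</property>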

turna...@gmail.com

Feb 18, 2013, 10:05:49 AM
to hue-...@cloudera.org, Matt Tanquary, bcwa...@cloudera.com

Hi,
I am trying to figure out the Hive MySQL metastore configuration. I have implemented everything from Cloudera's website
https://ccp.cloudera.com/display/CDH4DOC/Hive+Installation#HiveInstallation-ConfiguringtheHiveMetastore
but now when I connect to the Hue GUI I am getting this error:
Exception communicating with Hive Metastore Server at localhost:8003: timed out


I installed the MySQL server on my master node, where I installed Cloudera Manager as well.
I have created a new database in MySQL named metastore, and my hive-site.xml looks like this:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://n1.example.com/metastore</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>

<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>

<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>q*****</value>
</property>

<property>
<name>datanucleus.autoCreateSchema</name>
<value>false</value>
</property>

<property>
<name>datanucleus.fixedDatastore</name>
<value>true</value>
</property>

Can anybody help me solve this error?


Chris Conner

Feb 19, 2013, 8:55:40 AM
to hue-...@cloudera.org
Hey,

When you made these changes, did you only make them in /etc/hive/conf/hive-site.xml, or did you also make them in the Beeswax configuration of the Hue service in CM?

Thanks!

turna...@gmail.com

Feb 19, 2013, 1:50:37 PM
to hue-...@cloudera.org
Hi,
I made them manually in /etc/hive/conf/hive-site.xml,
but hive-conf/hive-site.xml under the configuration files of beeswax_server (n1) still looks like this
(via the web browser link http://n1.example.com:7180/cmf/process/190/config?filename=hive-conf%2Fhive-site.xml):

<?xml version="1.0" encoding="UTF-8"?>

<!--Autogenerated by Cloudera CM on 2013-02-19T14:25:59.866Z-->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore_db?useUnicode=true&amp;characterEncoding=UTF-8</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/beeswax/warehouse</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hue</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value></value>
  </property>
</configuration>
and also
hbase-conf/hbase-site.xml
under the configuration files of beeswax_server (n1) looks like this:
<?xml version="1.0" encoding="UTF-8"?>

<!--Autogenerated by Cloudera CM on 2013-02-19T14:25:59.524Z-->
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://n1.example.com:8020/hbase</value>
  </property>
  <property>
    <name>hbase.client.write.buffer</name>
    <value>2097152</value>
  </property>
  <property>
    <name>hbase.client.pause</name>
    <value>1000</value>
  </property>
  <property>
    <name>hbase.client.retries.number</name>
    <value>10</value>
  </property>
  <property>
    <name>hbase.client.scanner.caching</name>
    <value>1</value>
  </property>
  <property>
    <name>hbase.client.keyvalue.maxsize</name>
    <value>10485760</value>
  </property>
  <property>
    <name>hbase.security.authentication</name>
    <value>simple</value>
  </property>
  <property>
    <name>zookeeper.session.timeout</name>
    <value>60000</value>
  </property>
  <property>
    <name>zookeeper.znode.parent</name>
    <value>/hbase</value>
  </property>
  <property>
    <name>zookeeper.znode.rootserver</name>
    <value>root-region-server</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>n1.example.com</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>

I installed the MySQL server on the same node (n1), then built a new database named 'metastore' and did all the configuration from Cloudera's website. The metastore database exists under var/log/mysql/metastore with all the metadata tables (which I created with the command SOURCE /usr/lib/hive/scripts/metastore/upgrade/mysql/hive-schema-0.9.0.mysql.sql;).
While creating the metastore database I configured the database user like so:
 CREATE USER hive@localhost  IDENTIFIED BY 'xxxxxxx' ;

And my hbase-conf/hdfs-site.xml under the Beeswax server in CM is:

<?xml version="1.0" encoding="UTF-8"?>

<!--Autogenerated by Cloudera CM on 2013-02-19T14:25:59.521Z-->
<configuration>
  <property>
    <name>dfs.https.port</name>
    <value>50470</value>
  </property>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>n1.example.com:50070</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
  </property>
  <property>
    <name>dfs.client.use.datanode.hostname</name>
    <value>false</value>
  </property>
  <property>
    <name>fs.permissions.umask-mode</name>
    <value>022</value>
  </property>
  <property>
    <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
    <value>true</value>
  </property>
</configuration>


I am trying to solve this problem but I have not been able to find the cause yet...

Thank you for your attention.

Chris Conner

Feb 19, 2013, 2:26:50 PM
to hue-...@cloudera.org
OK, for Beeswax to work in Hue, you have to configure the DB for the metastore warehouse in CM for Beeswax as well.  Go to the Hue Service configuration, then under Beeswax one of the configuration areas should be for the DB.  I'll try and send a screen shot shortly.

Thanks
Chris

turna...@gmail.com

Feb 19, 2013, 2:42:41 PM
to hue-...@cloudera.org
hi,
my configuration under the Beeswax configuration area looks like this:

Configuration options for the Hive UI (Beeswax).

hive_conf_dir = /var/run/cloudera-scm-agent/process/191-hue-HUE_SERVER/hive-conf
  Hive configuration directory, where hive-site.xml is located.
  Default: /var/run/cloudera-scm-agent/process/191-hue-HUE_SERVER/hive-conf

share_saved_queries = True
  Share saved queries with all users. If set to false, saved queries are
  only visible to the owner and administrators.
  Default: True

metastore_conn_timeout = 10
  Timeout in seconds for Thrift calls to the Hive metastore. This timeout
  should take into account that the metastore may be talking to an
  external database.
  Default: 10

beeswax_server_port = 8002
  Configure the port the Beeswax Thrift server runs on.
  Default: 8002

beeswax_running_query_lifetime = 604800000
  Duration in seconds for which Beeswax keeps queries in its cache.
  Default: 604800000

hive_home_dir = /usr/lib/hive
  Path to the root of the Hive installation; falls back to the
  environment variable if not set.
  Default: /usr/lib/hive

browse_partitioned_table_limit = 250
  Apply a LIMIT clause when browsing a partitioned table. A positive
  value is used as the LIMIT; if 0 or negative, no limit is applied.
  Default: 250

beeswax_server_heapsize = 53
  Maximum Java heap size (in megabytes) used by the Beeswax server. Note
  that setting HADOOP_HEAPSIZE in $HADOOP_CONF_DIR/hadoop-env.sh may
  override this setting.
  Default: 1000

beeswax_server_conn_timeout = 120
  Timeout in seconds for Thrift calls to the Beeswax service.
  Default: 120

beeswax_meta_server_port = 8003
  Configure the port the internal metastore daemon runs on. Used only if
  hive.metastore.local is true.
  Default: 8003

beeswax_meta_server_only = None
  Disable Beeswax as the query server. This is used when Beeswax is just
  used for talking to the meta store and Hue is using another query
  server. Just fill in an unused port.
  Default: None

local_examples_data_dir = /usr/share/hue/apps/beeswax/src/beeswax/../../data
  The path on the local filesystem that contains the Beeswax examples.
  Default: /usr/share/hue/apps/beeswax/src/beeswax/../../data

I am waiting for your reply.

thanks,
onur

Chris Conner

Feb 19, 2013, 2:57:40 PM
to hue-...@cloudera.org
Hey,

What version of CM are you running?

Romain Rigaux

Feb 19, 2013, 2:58:06 PM
to Chris Conner, hue-...@cloudera.org
Yes, as Chris is saying, if you use CM, manually editing '/etc/hive/conf/hive-site.xml' does not have an impact on Beeswax/Hue, since they are using '/var/run/cloudera-scm-agent/process/191-hue-HUE_SERVER/hive-conf', so the Hive metastore DB must be configured in CM.

Romain

turna...@gmail.com

Feb 19, 2013, 4:26:21 PM
to hue-...@cloudera.org
Hey,
I am using CM 4.1.3 at the moment. How can I configure the Hive metastore in CM? Is that configuration possible through the Beeswax section of the CM GUI? If so, how?

Thanks
Onur

Chris Conner

Feb 19, 2013, 5:01:35 PM
to hue-...@cloudera.org
In CM, you should have a configuration area that looks like this:


Is that where you are making the change for Beeswax?

turna...@gmail.com

Feb 19, 2013, 7:28:26 PM
to hue-...@cloudera.org
hi,

I configured it according to your screenshot, and now the Beeswax GUI works fine. Thank you a lot.
I built a table with the HBase shell. I can see it in Hue's file browser but not among the Beeswax tables. How can I see my HBase tables in Beeswax, in HBase table format?
My file browser looks like this, and the tables I built are cars and test. As far as I know, HBase holds data in the meta folder, but here they are in the hbase folder. I need a little help to figure this out. I am writing my thesis on this subject and I have to import a .csv file into HBase and then, as a first stage, query it with Hive.


thanks,
onur 

turna...@gmail.com

Feb 20, 2013, 5:19:41 AM
to hue-...@cloudera.org
hey,

I am still trying to figure out Beeswax. I installed Hue's sample database. I can see the tables, but when I run a query (including an example saved query) I am getting this error:

Driver returned: 1.  Errors: Hive history file=/tmp/hue/hive_job_log_hue_201302200154_1490517290.txt
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Job Submission failed with exception 'org.apache.hadoop.security.AccessControlException(Permission denied: user=admin, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:205)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:186)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:135)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4547)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4518)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:2880)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:2844)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:2823)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:639)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:417)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44096)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)
)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask

As far as I understand from this error, Hue's user does not have permission to access the data in HDFS.

Thanks
Onur

Chris Conner

Feb 20, 2013, 9:43:36 AM
to hue-...@cloudera.org
Hey Onur,

Unfortunately there is no way in Hue 2.1 to see HBase tables. In a future release (2.2, I think) there will be an HBase shell in Hue to see HBase tables; however, Beeswax will never have the functionality to directly see HBase tables. You can create an external Hive table that maps to the HBase table if you want to see the data through Beeswax. If that's something you'd like to know more about, I can send you the syntax for the external Hive table.
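
For reference, the mapping syntax looks roughly like the sketch below
(the Hive table, column, and column family names here are only examples,
and the hive-hbase-handler and HBase jars need to be available to Hive):

CREATE EXTERNAL TABLE hbase_cars (key STRING, val STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,vi:val")
TBLPROPERTIES ("hbase.table.name" = "cars");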

Thanks
Chris

Chris Conner

Feb 20, 2013, 9:46:10 AM
to hue-...@cloudera.org
Two things:

1.  Does the 'hive.metastore.warehouse.dir' exist already in HDFS?  I think you used "/user/beeswax/warehouse".
2.  What are the permissions on that directory?

Thanks
Chris

turna...@gmail.com

Feb 20, 2013, 12:46:19 PM
to hue-...@cloudera.org
Hey Chris,
I am a bit disappointed; I did not know that. I thought I could also see the tables I created with the HBase shell command line, and that if I create a table with Beeswax, that table would also be built like an HBase table on HDFS (eventually as HFiles) in a specific folder on each datanode. I also tried to create a table with Beeswax (I watched Cloudera's Beeswax tutorial as well) and I got the error that I posted. I want to ask how I can define column families and their related columns; as far as I can see, during the create-table process there is only an option to create columns (not column families).

I am a real rookie with HDFS. I tried to look at the HDFS files from one node (the master node); with "hadoop fs -ls" I am getting this message: ls: `.': No such file or directory
Also, in the /var/lib/hive/metastore directory, where I point my MySQL metastore, there is nothing. Is that normal?
But the default directory of Hue Beeswax, /var/lib/hive/hue_beeswax_metastore/metastore_db, has log, seg0, tmp, dblck, dbex, service.properties and similar Derby database files.
And where should that directory be, and under which directory? Under "bin" or "dfs" or "usr"?
I thought it should be in HDFS, but that command does not work out.

Under the HBase Web UI I can see my tables; it looks like this:

http://n1.example.com:60010/master-status

Master: n1.example.com:60000


Attributes

Attribute Name Value Description
HBase Version 0.92.1-cdh4.1.3, rUnknown HBase version and revision
HBase Compiled Sat Jan 26 17:11:38 PST 2013, jenkins When HBase version was compiled and by whom
Hadoop Version 2.0.0-cdh4.1.3, rdbc7a60f9a798ef63afb7f5b723dc9c02d5321e1 Hadoop version and revision
Hadoop Compiled Sat Jan 26 16:46:14 PST 2013, jenkins When Hadoop version was compiled and by whom
HBase Root Directory hdfs://n1.example.com:8020/hbase Location of HBase home directory
HBase Cluster ID eab1a97c-2d1b-4d7d-8315-dcaf1c151f8d Unique identifier generated for each HBase cluster
Load average 1 Average number of regions per regionserver. Naive computation.
Zookeeper Quorum n1.example.com:2181 Addresses of all registered ZK servers. For more, see zk dump.
Coprocessors [] Coprocessors currently loaded loaded by the master
HMaster Start Time Tue Feb 19 15:25:28 CET 2013 Date stamp of when this HMaster was started
HMaster Active Time Tue Feb 19 15:25:28 CET 2013 Date stamp of when this HMaster became active
Tasks

No tasks currently running on this node.
Tables

Catalog Table Description
-ROOT- The -ROOT- table holds references to all .META. regions.
.META. The .META. table holds references to all User Table regions
2 table(s) in set. [Details]

User Table Description
cars {NAME => 'cars', FAMILIES => [{NAME => 'vi', MIN_VERSIONS => '0'}]}
test {NAME => 'test', FAMILIES => [{NAME => 'cf1', MIN_VERSIONS => '0'}]}
Region Servers

ServerName Start time Load
n1.example.com,60020,1361283928017 Tue Feb 19 15:25:28 CET 2013 requestsPerSecond=0, numberOfOnlineRegions=1, usedHeapMB=33, maxHeapMB=65
n2.example.com,60020,1361284069894 Tue Feb 19 15:27:49 CET 2013 requestsPerSecond=0, numberOfOnlineRegions=1, usedHeapMB=27, maxHeapMB=185
n3.example.com,60020,1361284067501 Tue Feb 19 15:27:47 CET 2013 requestsPerSecond=0, numberOfOnlineRegions=0, usedHeapMB=30, maxHeapMB=185
n4.example.com,60020,1361284298009 Tue Feb 19 15:31:38 CET 2013 requestsPerSecond=0, numberOfOnlineRegions=2, usedHeapMB=27, maxHeapMB=185
Total: servers: 4 requestsPerSecond=0, numberOfOnlineRegions=4
Load is requests per second and count of regions loaded

Dead Region Servers

Regions in Transition

No regions in transition.

thank you for your attention .
Onur

turna...@gmail.com

Feb 20, 2013, 12:59:54 PM
to hue-...@cloudera.org
Hi Chris,

I tried a few things now and they worked:
[root@n1 ~]# hadoop fs -ls /hbase
Found 10 items
drwxr-xr-x - hbase hbase 0 2013-02-08 04:11 /hbase/-ROOT-
drwxr-xr-x - hbase hbase 0 2013-02-08 04:11 /hbase/.META.
drwxr-xr-x - hbase hbase 0 2013-02-08 04:12 /hbase/.corrupt
drwxr-xr-x - hbase hbase 0 2013-02-19 15:27 /hbase/.logs
drwxr-xr-x - hbase hbase 0 2013-02-19 15:28 /hbase/.oldlogs
drwxr-xr-x - hbase hbase 0 2013-02-16 22:20 /hbase/cars
-rw-r--r-- 3 hbase hbase 38 2013-02-08 04:11 /hbase/hbase.id
-rw-r--r-- 3 hbase hbase 3 2013-02-08 04:11 /hbase/hbase.version
drwxr-xr-x - hbase hbase 0 2013-02-19 15:26 /hbase/splitlog
drwxr-xr-x - hbase hbase 0 2013-02-16 22:10 /hbase/test
[root@n1 ~]# hadoop fs -ls /hbase/cars
Found 3 items
-rw-r--r-- 3 hbase hbase 509 2013-02-16 22:20 /hbase/cars/.tableinfo.0000000001
drwxr-xr-x - hbase hbase 0 2013-02-16 22:20 /hbase/cars/.tmp
drwxr-xr-x - hbase hbase 0 2013-02-18 04:39 /hbase/cars/7c91bdc9437420e2896525114c0a0499
[root@n1 ~]# hadoop fs -ls /
Found 3 items
drwxr-xr-x - hbase hbase 0 2013-02-16 22:20 /hbase
drwxrwxrwt - hdfs hdfs 0 2013-02-20 10:39 /tmp
drwxr-xr-x - hdfs supergroup 0 2013-02-08 04:14 /user


[root@n1 ~]# hadoop fs -ls /user/beeswax/warehouse
Found 2 items
drwxr-xr-x - hue hdfs 0 2013-02-20 10:38 /user/beeswax/warehouse/sample_07
drwxr-xr-x - hue hdfs 0 2013-02-20 10:39 /user/beeswax/warehouse/sample_08

[root@n1 ~]# hadoop fs -ls /user/beeswax/warehouse/sample_07
Found 1 items
-rw-r--r-- 3 hue hdfs 46055 2013-02-20 10:39 /user/beeswax/warehouse/sample_07/sample_07.csv

Now I can see all the tables I created on Hadoop, but I could not figure out these errors:
Driver returned: 1. Errors: Hive history file=/tmp/hue/hive_job_log_hue_201302191549_48687276.txt
FAILED: Error in metadata: MetaException(message:Got exception: org.apache.hadoop.security.AccessControlException Permission denied: user=admin, access=WRITE, inode="/user/beeswax/warehouse":hue:hive:drwxrwxr-x


at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:205)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:186)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:135)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4547)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4518)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:2880)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:2844)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:2823)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:639)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:417)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44096)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)
)

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask


and

thanks
Onur


Romain Rigaux

Feb 20, 2013, 1:26:14 PM
to turna...@gmail.com, hue-...@cloudera.org
Original error was because the user 'admin' did not have a home (/user/admin). This is fixed in the 2.2 version coming this month. In the meantime you can create it in the FileBrowser if you are logged in as the 'hdfs' user.

The second error is because '/user/beeswax/warehouse' should be chmoded to 1777 (cf https://ccp.cloudera.com/display/CDH4DOC/Hue+Installation#HueInstallation-HiveConfiguration).

About HBase, I think that as long as you register the jars it will work (Beeswax is similar to an embedded Hive client).

Romain

turna...@gmail.com

Feb 20, 2013, 2:42:27 PM
to hue-...@cloudera.org
Hi Romain,
Should I create a new directory, or manipulate my existing admin user? (If I just rename it to hdfs, will that not be enough?) If not, how exactly should I create the new folder with the File Browser GUI?


Thanks
Onur

Romain Rigaux

Feb 20, 2013, 2:51:01 PM
to turna...@gmail.com, hue-...@cloudera.org
The latest error will be solved if you create a 'hdfs' superuser in Hue, login as 'hdfs' in Hue and then go to FileBrowser and change the permissions of '/user/beeswax/warehouse' to 1777 (check all the boxes).

You can also do in a shell:

sudo -u hdfs chmod 1777 /user/beeswax/warehouse

Romain

turna...@gmail.com

Feb 20, 2013, 7:58:17 PM
to hue-...@cloudera.org
hey,

I created a new user 'hdfs' in the Hue GUI. Then I erased the Hue example tables in Hue, and after that I preferred to change the permission settings on the command line. When I tried to use that command in the shell:

sudo -u hdfs chmod 1777 /user/beeswax/warehouse
I got this error:
chmod: cannot access /user/beeswax/warehouse : No such file or directory

I also tried: hadoop fs -ls /user/beeswax/warehouse and got nothing back.

After I created the new hdfs user, I tried to build the Hue example tables again. Although I had erased these tables, I am getting this error: 'There was an error processing your request: Beeswax examples already installed.'


thank you for your attention

Onur

Romain Rigaux

Feb 20, 2013, 8:26:24 PM
to turna...@gmail.com, hue-...@cloudera.org
If it is not there can you create it?

sudo -u hdfs mkdir /user/beeswax/warehouse

sudo -u hdfs chmod 1777 /user/beeswax/warehouse

To reset the examples in Hue:
/usr/share/hue/bin/hue dbshell
sqlite> delete from beeswax_metainstall;

Romain

turna...@gmail.com

Feb 26, 2013, 7:14:59 AM
to hue-...@cloudera.org, turna...@gmail.com
Hello Romain,

I could not respond to your last post earlier; I was away from the keyboard.

I tried everything you said, but I am still getting more than one error.

I created a new user for Beeswax named "hdfs".

I had deleted the examples from Hue's metastore (sqlite> delete from beeswax_metainstall;).

After that I tried to create the warehouse directory, but I could not; I got errors.

Then I realized the directory already exists. I reinstalled the sample tables from Hue; while the sample tables were installing, I got an error,

but I can see the tables through Beeswax.

I tried to run an example query on the sample tables, but it did not work out:



13/02/26 03:28:15 INFO exec.HiveHistory: Hive history file=/tmp/hue/hive_job_log_hue_201302260328_1211673658.txt
13/02/26 03:28:15 INFO ql.Driver: <PERFLOG method=compile>
13/02/26 03:28:15 INFO parse.ParseDriver: Parsing command: SELECT s07.description, s07.total_emp, s08.total_emp, s07.salary
FROM
  sample_07 s07 JOIN 
  sample_08 s08
ON ( s07.code = s08.code )
WHERE
( s07.total_emp > s08.total_emp
 AND s07.salary > 100000 )
SORT BY s07.salary DESC
13/02/26 03:28:15 INFO parse.ParseDriver: Parse Completed
13/02/26 03:28:15 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
13/02/26 03:28:15 INFO parse.SemanticAnalyzer: Completed phase 1 of Semantic Analysis
13/02/26 03:28:15 INFO parse.SemanticAnalyzer: Get metadata for source tables
13/02/26 03:28:15 INFO metastore.HiveMetaStore: 6: get_table : db=default tbl=sample_07
13/02/26 03:28:15 INFO HiveMetaStore.audit: ugi=hdfs	ip=unknown-ip-addr	cmd=get_table : db=default tbl=sample_07	
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_07 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO metastore.HiveMetaStore: 6: get_table : db=default tbl=sample_08
13/02/26 03:28:15 INFO HiveMetaStore.audit: ugi=hdfs	ip=unknown-ip-addr	cmd=get_table : db=default tbl=sample_08	
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_08 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO parse.SemanticAnalyzer: Get metadata for subqueries
13/02/26 03:28:15 INFO parse.SemanticAnalyzer: Get metadata for destination tables
13/02/26 03:28:15 INFO parse.SemanticAnalyzer: Completed getting MetaData in Semantic Analysis
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_07 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_08 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for FS(12)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for OP(11)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for RS(10)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for SEL(9)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for FIL(8)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Pushdown Predicates of FIL For Alias : s07
13/02/26 03:28:15 INFO ppd.OpProcFactory: 	(_col3 > 100000)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for JOIN(7)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Pushdown Predicates of JOIN For Alias : s07
13/02/26 03:28:15 INFO ppd.OpProcFactory: 	(VALUE._col3 > 100000)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for RS(5)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Pushdown Predicates of RS For Alias : s07
13/02/26 03:28:15 INFO ppd.OpProcFactory: 	(salary > 100000)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for TS(3)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Pushdown Predicates of TS For Alias : s07
13/02/26 03:28:15 INFO ppd.OpProcFactory: 	(salary > 100000)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for RS(6)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for TS(4)
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_07 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_07 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_07 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_08 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_08 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_08 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO physical.MetadataOnlyOptimizer: Looking for table scans where optimization is applicable
13/02/26 03:28:15 INFO physical.MetadataOnlyOptimizer: Found 0 metadata only table scans
13/02/26 03:28:15 INFO physical.MetadataOnlyOptimizer: Looking for table scans where optimization is applicable
13/02/26 03:28:15 INFO physical.MetadataOnlyOptimizer: Found 0 metadata only table scans
13/02/26 03:28:15 INFO parse.SemanticAnalyzer: Completed plan generation
13/02/26 03:28:15 INFO ql.Driver: Semantic Analysis Completed
13/02/26 03:28:15 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:description, type:string, comment:null), FieldSchema(name:total_emp, type:int, comment:null), FieldSchema(name:total_emp, type:int, comment:null), FieldSchema(name:salary, type:int, comment:null)], properties:null)
13/02/26 03:28:15 INFO ql.Driver: </PERFLOG method=compile start=1361878095433 end=1361878095784 duration=351>
Hive history file=/tmp/hue/hive_job_log_hue_201302260328_755410322.txt
13/02/26 03:28:15 INFO exec.HiveHistory: Hive history file=/tmp/hue/hive_job_log_hue_201302260328_755410322.txt
13/02/26 03:28:15 INFO ql.Driver: <PERFLOG method=Driver.execute>
13/02/26 03:28:15 INFO ql.Driver: Starting command: SELECT s07.description, s07.total_emp, s08.total_emp, s07.salary
FROM
  sample_07 s07 JOIN 
  sample_08 s08
ON ( s07.code = s08.code )
WHERE
( s07.total_emp > s08.total_emp
 AND s07.salary > 100000 )
SORT BY s07.salary DESC
Total MapReduce jobs = 2
13/02/26 03:28:15 INFO ql.Driver: Total MapReduce jobs = 2
13/02/26 03:28:15 INFO ql.Driver: </PERFLOG method=TimeToSubmit end=1361878095885>
Launching Job 1 out of 2
13/02/26 03:28:15 INFO ql.Driver: Launching Job 1 out of 2
13/02/26 03:28:16 INFO exec.Utilities: Cache Content Summary for hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_08 length: 46069 file count: 1 directory count: 1
13/02/26 03:28:16 INFO exec.Utilities: Cache Content Summary for hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07 length: 46055 file count: 1 directory count: 1
13/02/26 03:28:16 INFO exec.ExecDriver: BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=92124
Number of reduce tasks not specified. Estimated from input data size: 1
13/02/26 03:28:16 INFO exec.Task: Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
13/02/26 03:28:16 INFO exec.Task: In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
13/02/26 03:28:16 INFO exec.Task:   set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
13/02/26 03:28:16 INFO exec.Task: In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
13/02/26 03:28:16 INFO exec.Task:   set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
13/02/26 03:28:16 INFO exec.Task: In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
13/02/26 03:28:16 INFO exec.Task:   set mapred.reduce.tasks=<number>
13/02/26 03:28:16 INFO exec.ExecDriver: Using org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
13/02/26 03:28:16 INFO exec.ExecDriver: adding libjars: file:///usr/lib/hive/lib/hive-builtins-0.9.0-cdh4.1.3.jar
13/02/26 03:28:16 INFO exec.ExecDriver: Processing alias s07
13/02/26 03:28:16 INFO exec.ExecDriver: Adding input file hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07
13/02/26 03:28:16 INFO exec.Utilities: Content Summary hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07length: 46055 num files: 1 num directories: 1
13/02/26 03:28:16 INFO exec.ExecDriver: Processing alias s08
13/02/26 03:28:16 INFO exec.ExecDriver: Adding input file hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_08
13/02/26 03:28:16 INFO exec.Utilities: Content Summary hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_08length: 46069 num files: 1 num directories: 1
13/02/26 03:28:17 INFO exec.ExecDriver: Making Temp Directory: hdfs://n1.example.com:8020/tmp/hive-beeswax-hdfs/hive_2013-02-26_03-28-15_434_7027909062045907076/-mr-10002
13/02/26 03:28:21 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/02/26 03:28:22 INFO io.CombineHiveInputFormat: CombineHiveInputSplit creating pool for hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07; using filter path hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07
13/02/26 03:28:22 INFO io.CombineHiveInputFormat: CombineHiveInputSplit creating pool for hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_08; using filter path hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_08
13/02/26 03:28:22 INFO mapred.FileInputFormat: Total input paths to process : 2
13/02/26 03:28:22 INFO io.CombineHiveInputFormat: number of splits 2
Starting Job = job_201302210135_0001, Tracking URL = http://n1.example.com:50030/jobdetails.jsp?jobid=job_201302210135_0001
13/02/26 03:28:32 INFO exec.Task: Starting Job = job_201302210135_0001, Tracking URL = http://n1.example.com:50030/jobdetails.jsp?jobid=job_201302210135_0001
Kill Command = /usr/lib/hadoop/bin/hadoop job  -Dmapred.job.tracker=n1.example.com:8021 -kill job_201302210135_0001
13/02/26 03:28:32 INFO exec.Task: Kill Command = /usr/lib/hadoop/bin/hadoop job  -Dmapred.job.tracker=n1.example.com:8021 -kill job_201302210135_0001
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
13/02/26 03:28:49 INFO exec.Task: Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
13/02/26 03:28:49 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-02-26 03:28:49,177 Stage-1 map = 0%,  reduce = 0%
13/02/26 03:28:49 INFO exec.Task: 2013-02-26 03:28:49,177 Stage-1 map = 0%,  reduce = 0%
13/02/26 03:29:49 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-02-26 03:29:49,548 Stage-1 map = 0%,  reduce = 0%
13/02/26 03:29:49 INFO exec.Task: 2013-02-26 03:29:49,548 Stage-1 map = 0%,  reduce = 0%
13/02/26 03:30:37 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-02-26 03:30:37,517 Stage-1 map = 100%,  reduce = 100%
13/02/26 03:30:37 INFO exec.Task: 2013-02-26 03:30:37,517 Stage-1 map = 100%,  reduce = 100%
13/02/26 03:30:37 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
Ended Job = job_201302210135_0001 with errors
13/02/26 03:30:37 ERROR exec.Task: Ended Job = job_201302210135_0001 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
13/02/26 03:30:37 ERROR ql.Driver: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
13/02/26 03:30:37 INFO ql.Driver: </PERFLOG method=Driver.execute start=1361878095875 end=1361878237752 duration=141877>
MapReduce Jobs Launched: 
13/02/26 03:30:37 INFO ql.Driver: MapReduce Jobs Launched: 
Job 0: Map: 2  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
13/02/26 03:30:37 INFO ql.Driver: Job 0: Map: 2  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
13/02/26 03:30:37 INFO ql.Driver: Total MapReduce CPU Time Spent: 0 msec
13/02/26 03:30:38 ERROR beeswax.BeeswaxServiceImpl: Exception while processing query
BeeswaxException(message:Driver returned: 2.  Errors: Hive history file=/tmp/hue/hive_job_log_hue_201302260328_755410322.txt
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201302210135_0001, Tracking URL = http://n1.example.com:50030/jobdetails.jsp?jobid=job_201302210135_0001
Kill Command = /usr/lib/hadoop/bin/hadoop job  -Dmapred.job.tracker=n1.example.com:8021 -kill job_201302210135_0001
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2013-02-26 03:28:49,177 Stage-1 map = 0%,  reduce = 0%
2013-02-26 03:29:49,548 Stage-1 map = 0%,  reduce = 0%
2013-02-26 03:30:37,517 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201302210135_0001 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched: 
Job 0: Map: 2  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
, log_context:e7199542-f187-458c-b3dd-887560485a81, handle:QueryHandle(id:e7199542-f187-458c-b3dd-887560485a81, log_context:e7199542-f187-458c-b3dd-887560485a81), SQLState:     )
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState.execute(BeeswaxServiceImpl.java:319)
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1$1.run(BeeswaxServiceImpl.java:577)
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1$1.run(BeeswaxServiceImpl.java:566)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:337)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1312)
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1.run(BeeswaxServiceImpl.java:566)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
13/02/26 03:30:39 ERROR security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:BeeswaxException(message:Driver returned: 2.  Errors: Hive history file=/tmp/hue/hive_job_log_hue_201302260328_755410322.txt
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201302210135_0001, Tracking URL = http://n1.example.com:50030/jobdetails.jsp?jobid=job_201302210135_0001
Kill Command = /usr/lib/hadoop/bin/hadoop job  -Dmapred.job.tracker=n1.example.com:8021 -kill job_201302210135_0001
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2013-02-26 03:28:49,177 Stage-1 map = 0%,  reduce = 0%
2013-02-26 03:29:49,548 Stage-1 map = 0%,  reduce = 0%
2013-02-26 03:30:37,517 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201302210135_0001 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched: 
Job 0: Map: 2  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
, log_context:e7199542-f187-458c-b3dd-887560485a81, handle:QueryHandle(id:e7199542-f187-458c-b3dd-887560485a81, log_context:e7199542-f187-458c-b3dd-887560485a81), SQLState:     )
13/02/26 03:30:39 ERROR beeswax.BeeswaxServiceImpl: Caught BeeswaxException
BeeswaxException(message:Driver returned: 2.  Errors: Hive history file=/tmp/hue/hive_job_log_hue_201302260328_755410322.txt
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201302210135_0001, Tracking URL = http://n1.example.com:50030/jobdetails.jsp?jobid=job_201302210135_0001
Kill Command = /usr/lib/hadoop/bin/hadoop job  -Dmapred.job.tracker=n1.example.com:8021 -kill job_201302210135_0001
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2013-02-26 03:28:49,177 Stage-1 map = 0%,  reduce = 0%
2013-02-26 03:29:49,548 Stage-1 map = 0%,  reduce = 0%
2013-02-26 03:30:37,517 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201302210135_0001 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched: 
Job 0: Map: 2  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
, log_context:e7199542-f187-458c-b3dd-887560485a81, handle:QueryHandle(id:e7199542-f187-458c-b3dd-887560485a81, log_context:e7199542-f187-458c-b3dd-887560485a81), SQLState:     )
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState.execute(BeeswaxServiceImpl.java:319)
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1$1.run(BeeswaxServiceImpl.java:577)
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1$1.run(BeeswaxServiceImpl.java:566)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:337)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1312)
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1.run(BeeswaxServiceImpl.java:566)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Duration: 0:02:12
Ended: 02/26/13 03:30:36
ID: job_201302210135_0001
User: hdfs
Mapred Input Dir: hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07
                  hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_08
Mapred Input Format Class: org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
Mapred Mapper Class: org.apache.hadoop.hive.ql.exec.ExecMapper
Mapred Output Format Class: org.apache.hadoop.hive.ql.io.HiveOutputFormatImpl
Mapred Reducer Class: org.apache.hadoop.hive.ql.exec.ExecReducer
Maps: 0 of 2
Reduces: 0 of 1
Started: 02/26/13 03:28:24
Status: FAILED
 
I cannot import my own database until I fix these errors. On the other hand, I can now see the sample databases on HDFS.


Romain Rigaux

unread,
Feb 26, 2013, 12:30:27 PM2/26/13
to turna...@gmail.com, hue-...@cloudera.org
So the 'hadoop fs' part was missing from the command:
sudo -u hdfs hadoop fs -mkdir /user/beeswax/warehouse
sudo -u hdfs hadoop fs -chmod 1777 /user/beeswax/warehouse

But as the files are already in the warehouse, you don't need to create it. If the tables appear in the 'Tables' tab you are fine (if not, delete them with 'sudo -u hdfs hadoop fs -rmr /user/beeswax/warehouse/*', do the same in the sqlite DB, and reinstall them in Hue by clicking on the button).

However, looking at your logs, it seems that the tables are fine but the underlying MapReduce jobs are failing. You need to look at their logs by clicking on one of the IDs below "MR JOBS" while the query is running.
You can also follow this link from the logs:


Starting Job = job_201302210135_0001, Tracking URL = http://n1.example.com:50030/jobdetails.jsp?jobid=job_201302210135_0001

Romain

turna...@gmail.com

unread,
Feb 27, 2013, 2:26:13 PM2/27/13
to hue-...@cloudera.org, turna...@gmail.com
Hi Romain,

I tried the last command to change the permissions, and that works fine. Thank you very much.
I checked all the logs while the query was running. I guess MapReduce is not working properly: every time I run a query, the jobs fail. They look like this.

This is the output of the logs:
Starting Job = job_201302210135_0002, Tracking URL = http://n1.example.com:50030/jobdetails.jsp?jobid=job_201302210135_0002
Kill Command = /usr/lib/hadoop/bin/hadoop job  -Dmapred.job.tracker=n1.example.com:8021 -kill job_201302210135_0002
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2013-02-27 10:56:24,583 Stage-1 map = 0%,  reduce = 0%
2013-02-27 10:57:24,870 Stage-1 map = 0%,  reduce = 0%
2013-02-27 10:57:50,623 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201302210135_0002 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched: 
Job 0: Map: 2  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
, log_context:f8067762-4935-472e-9437-8f550b65e54a, handle:QueryHandle(id:f8067762-4935-472e-9437-8f550b65e54a, log_context:f8067762-4935-472e-9437-8f550b65e54a), SQLState:     )
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState.execute(BeeswaxServiceImpl.java:319)
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1$1.run(BeeswaxServiceImpl.java:577)
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1$1.run(BeeswaxServiceImpl.java:566)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:337)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1312)
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1.run(BeeswaxServiceImpl.java:566)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)


Driver returned: 2.  Errors: Hive history file=/tmp/hue/hive_job_log_hue_201302271055_1265331267.txt
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201302210135_0002, Tracking URL = http://n1.example.com:50030/jobdetails.jsp?jobid=job_201302210135_0002
Kill Command = /usr/lib/hadoop/bin/hadoop job  -Dmapred.job.tracker=n1.example.com:8021 -kill job_201302210135_0002
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2013-02-27 10:56:24,583 Stage-1 map = 0%,  reduce = 0%
2013-02-27 10:57:24,870 Stage-1 map = 0%,  reduce = 0%
2013-02-27 10:57:50,623 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201302210135_0002 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched: 
Job 0: Map: 2  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec



These are from the MR job:

Failed Tasks

Task        Type
m_000000    MAP
m_000001    MAP

Recent Tasks

Task        Type
m_000002    JOB_CLEANUP
m_000003    JOB_SETUP

That's from one of the tasks above:


Attempt ID | Progress | State  | Task Tracker                                     | Start Time        | End Time          | Output Size | Phase
000000_0   | 100%     | failed | tracker_n3.example.com:localhost/127.0.0.1:36555 | 02/27/13 11:00:06 | 02/27/13 11:00:19 | -1          | CLEANUP
000000_1   | 100%     | failed | tracker_n2.example.com:localhost/127.0.0.1:46543 | 02/27/13 11:00:20 | 02/27/13 11:00:32 | -1          | CLEANUP
000000_2   | 100%     | failed | tracker_n4.example.com:localhost/127.0.0.1:41410 | 02/27/13 11:04:10 | 02/27/13 11:04:19 | -1          | CLEANUP
000000_3   | 0%       | killed | tracker_n1.example.com:localhost/127.0.0.1:59170 | 02/27/13 10:57:37 | 02/27/13 10:57:46 | -1          | MAP
(Shuffle Finish, Sort Finish and Map Finish columns were empty)



Hadoop job_201302210135_0002 on n1

User: hdfs
Job Name: SELECT s07.description, s07.salary, s...DESC(Stage-1)
Job File: hdfs://n1.example.com:8020/user/hdfs/.staging/job_201302210135_0002/job.xml
Submit Host: n1.example.com
Submit Host Address: 192.168.0.241
Job-ACLs: All users are allowed
Job Setup: Successful
Status: Failed
Failure Info:NA
Started at: Wed Feb 27 19:56:16 CET 2013
Failed at: Wed Feb 27 19:57:50 CET 2013
Failed in: 1mins, 34sec
Job Cleanup: Successful

Kind   | % Complete | Num Tasks | Pending | Running | Complete | Killed | Failed/Killed Task Attempts
map    | 100.00%    | 2         | 0       | 0       | 0        | 2      | 7 / 1
reduce | 100.00%    | 1         | 0       | 0       | 0        | 1      | 0 / 0


Counter (Job Counters)                                             | Map | Reduce | Total
Failed map tasks                                                   | 0   | 0      | 1
Launched map tasks                                                 | 0   | 0      | 8
Data-local map tasks                                               | 0   | 0      | 2
Rack-local map tasks                                               | 0   | 0      | 6
Total time spent by all maps in occupied slots (ms)                | 0   | 0      | 58,457
Total time spent by all reduces in occupied slots (ms)             | 0   | 0      | 0
Total time spent by all maps waiting after reserving slots (ms)    | 0   | 0      | 0
Total time spent by all reduces waiting after reserving slots (ms) | 0   | 0      | 0


Map Completion Graph - close 

Reduce Completion Graph - close 

I suppose I should correct the MapReduce configuration, but I do not know why this is happening.


thank you for your attention 
Onur Turna

Romain Rigaux

unread,
Feb 27, 2013, 2:43:07 PM2/27/13
to turna...@gmail.com, hue-...@cloudera.org
If MapReduce is not configured, you can test it by running an example like described in https://ccp.cloudera.com/display/CDH4DOC/Installing+CDH4+on+a+Single+Linux+Node+in+Pseudo-distributed+Mode#InstallingCDH4onaSingleLinuxNodeinPseudo-distributedMode-RunninganexampleapplicationwithMRv1
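For reference, the MRv1 example on that page boils down to roughly the following shell commands (run as a user that already has an HDFS home directory; the example jar path matches the one used later in this thread, but exact paths can differ between installs):

  hadoop fs -mkdir input
  hadoop fs -put /etc/hadoop/conf/*.xml input
  hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar grep input output 'dfs[a-z.]+'
  hadoop fs -cat output/part-00000 | head

If that grep job succeeds, MapReduce itself is fine and the problem is specific to the Hive/Beeswax jobs.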

About Beeswax, it is nice too look at the tasks of the jobs in JobBrowser but could you drill down to the 'log' level of a failed task? (the log from the MapReduce task itself should show what's going wrong)

Romain

turna...@gmail.com

unread,
Feb 27, 2013, 3:50:56 PM2/27/13
to hue-...@cloudera.org, turna...@gmail.com

Hi Romain,


I did not configure MapReduce myself; the current MapReduce configuration is Cloudera's default.
I checked where you told me to. It looks like this:


FAILED MAP task list for 0002_1361991376229_hdfs

Task Id                          | Start Time     | Finish Time            | Error
task_201302210135_0002_m_000000  | 27/02 20:00:06 | 27/02 20:00:19 (13sec) | Error: Java heap space
task_201302210135_0002_m_000000  | 27/02 20:00:20 | 27/02 20:00:32 (12sec) | Error: Java heap space
task_201302210135_0002_m_000000  | 27/02 20:04:10 | 27/02 20:04:19 (8sec)  | Error: Java heap space
task_201302210135_0002_m_000001  | 27/02 20:03:45 | 27/02 20:03:57 (12sec) | Error: Java heap space
task_201302210135_0002_m_000001  | 27/02 20:00:20 | 27/02 20:00:31 (11sec) | Error: Java heap space
task_201302210135_0002_m_000001  | 27/02 19:56:55 | 27/02 19:57:36 (41sec) | Error: Java heap space
task_201302210135_0002_m_000001  | 27/02 20:01:17 | 27/02 20:01:26 (8sec)  | Error: Java heap space


Attempt                               | Start Time     | Finish Time            | Machine        | Error                  | Logs                      | Counters
attempt_201302210135_0002_m_000000_0 | 27/02 20:00:06 | 27/02 20:00:19 (13sec) | n3.example.com | Error: Java heap space | Last 4KB / Last 8KB / All | 0
attempt_201302210135_0002_m_000000_1 | 27/02 20:00:20 | 27/02 20:00:32 (12sec) | n2.example.com | Error: Java heap space | Last 4KB / Last 8KB / All | 0
attempt_201302210135_0002_m_000000_2 | 27/02 20:04:10 | 27/02 20:04:19 (8sec)  | n4.example.com | Error: Java heap space | Last 4KB / Last 8KB / All | 0
attempt_201302210135_0002_m_000000_3 | 27/02 19:57:37 | 27/02 19:57:46 (9sec)  | n1.example.com |                        | Last 4KB / Last 8KB / All | 0





Why could there be a heap space error? Could it be because of insufficient hardware in the cluster? All my cluster nodes (a 4-VM cluster) run on VMware on my personal laptop, which has 8 GB of RAM. They look like this:



Name           | IP            | Rack     | CDH Version | Health | Last Heartbeat | Cores | Load Average   | Disk Usage         | Physical Memory     | Swap Space
n1.example.com | 192.168.0.241 | /default | CDH4        | Good   | 9.51s ago      | 2     | 0,84 0,84 0,76 | 5.1 GiB / 37.2 GiB | 1.4 GiB / 1.4 GiB   | 549.8 MiB / 2.9 GiB
n2.example.com | 192.168.0.242 | /default | CDH4        | Good   | 1.90s ago      | 2     | 0,00 0,00 0,00 | 4.7 GiB / 37.2 GiB | 462.3 MiB / 1.4 GiB | 0 B / 2.9 GiB
n3.example.com | 192.168.0.243 | /default | CDH4        | Good   | 3.33s ago      | 2     | 0,05 0,01 0,00 | 4.7 GiB / 37.2 GiB | 483.5 MiB / 1.4 GiB | 0 B / 2.9 GiB
n4.example.com | 192.168.0.246 | /default | CDH4        | Good   | 6.52s ago      | 2     | 0,00 0,00 0,00 | 4.7 GiB / 37.2 GiB | 476.3 MiB / 1.4 GiB | 0 B / 2.9 GiB




In addition, I also looked at the syslog, which looks like this:



2013-02-27 19:57:41,710 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-02-27 19:57:43,371 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /mapred/local/taskTracker/distcache/1804431985416186481_538532255_486741148/n1.example.com/tmp/hive-beeswax-hdfs/hive_2013-02-27_10-55-58_866_1132759753206554877/-mr-10004/eee2a09a-a215-43c3-aab5-1de316044d27 <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0002/attempt_201302210135_0002_m_000000_3/work/HIVE_PLANeee2a09a-a215-43c3-aab5-1de316044d27
2013-02-27 19:57:43,619 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0002/jars/.job.jar.crc <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0002/attempt_201302210135_0002_m_000000_3/work/.job.jar.crc
2013-02-27 19:57:43,622 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0002/jars/job.jar <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0002/attempt_201302210135_0002_m_000000_3/work/job.jar
2013-02-27 19:57:43,700 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
2013-02-27 19:57:43,719 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2013-02-27 19:57:46,132 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2013-02-27 19:57:46,242 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@101a0ae6
2013-02-27 19:57:46,564 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-02-27 19:57:46,568 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.IOException: Failed on local exception: java.io.IOException; Host Details : local host is: "n1.example.com/192.168.0.241"; destination host is: "n1.example.com":8020; 


Thanks,
Onur Turna

Romain Rigaux

unread,
Feb 27, 2013, 4:21:00 PM2/27/13
to turna...@gmail.com, hue-...@cloudera.org
Hmm, that's a lot of VMs for one physical machine (I personally use a pseudo-distributed cluster).

What does it say when you click on the 'Last 4KB' of a failed task with the heap space error?

Error: Java heap space Last 4KB
Last 8KB
All


Could you please try to run the Hadoop example? https://ccp.cloudera.com/display/CDH4DOC/Installing+CDH4+on+a+Single+Linux+Node+in+Pseudo-distributed+Mode#InstallingCDH4onaSingleLinuxNodeinPseudo-distributedMode-RunninganexampleapplicationwithMRv1

About this one I don't know. Maybe the JobTracker logs have more information, or a proxy user setting is missing.

2013-02-27 19:57:46,568 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.IOException: Failed on local exception: java.io.IOException; Host Details : local host is: "n1.example.com/192.168.0.241"; destination host is: "n1.example.com":8020; 
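If it does turn out to be a missing proxy-user setting, the entries usually involved live in core-site.xml on the cluster and look roughly like the sketch below (the wildcard values are only an illustration; in a real setup they should be restricted to the Hue host and group):

  <property>
    <name>hadoop.proxyuser.hue.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hue.groups</name>
    <value>*</value>
  </property>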

Romain

turna...@gmail.com

unread,
Feb 28, 2013, 5:32:34 PM2/28/13
to hue-...@cloudera.org, turna...@gmail.com
Hi,
I want to run a performance test; that's why I need as large a cluster as possible on my computer. As far as I know, performance depends a lot on the number of nodes in the cluster.


Hadoop job_201302210135_0003 failures on n1

Attempt                               | Task                             | Machine        | State  | Error                  | Logs
attempt_201302210135_0003_m_000000_0 | task_201302210135_0003_m_000000  | n3.example.com | FAILED | Error: Java heap space | Last 4KB / Last 8KB / All
attempt_201302210135_0003_m_000000_1 | task_201302210135_0003_m_000000  | n1.example.com | FAILED | Error: Java heap space | Last 4KB / Last 8KB / All
attempt_201302210135_0003_m_000000_2 | task_201302210135_0003_m_000000  | n2.example.com | FAILED | Error: Java heap space | Last 4KB / Last 8KB / All
attempt_201302210135_0003_m_000000_3 |


When I clicked "All", the first one is:

Task Logs: 'attempt_201302210135_0003_m_000000_0'



stdout logs



stderr logs



syslog logs
2013-02-28 23:01:24,052 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-02-28 23:01:24,908 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /mapred/local/taskTracker/distcache/5651432844641255173_-1826301297_584014340/n1.example.com/tmp/hive-beeswax-hdfs/hive_2013-02-28_13-57-19_791_3663977264499258484/-mr-10003/fb822c07-e95c-413f-b2b5-b6edb08f63c6 <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_0/work/HIVE_PLANfb822c07-e95c-413f-b2b5-b6edb08f63c6
2013-02-28 23:01:24,918 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/jars/.job.jar.crc <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_0/work/.job.jar.crc
2013-02-28 23:01:24,921 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/jars/job.jar <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_0/work/job.jar
2013-02-28 23:01:24,974 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
2013-02-28 23:01:24,975 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2013-02-28 23:01:25,622 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2013-02-28 23:01:25,635 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3e152f4
2013-02-28 23:01:26,156 WARN org.apache.hadoop.hive.conf.HiveConf: hive-site.xml not found on CLASSPATH
2013-02-28 23:01:26,436 INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07/sample_07.csv
2013-02-28 23:01:26,436 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and  BYTES_READ as counter name instead
2013-02-28 23:01:26,441 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
2013-02-28 23:01:26,452 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 50
2013-02-28 23:01:26,629 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-02-28 23:01:26,634 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:797)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:385)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
	at org.apache.hadoop.mapred.Child.main(Child.java:262)


The second one:

Task Logs: 'attempt_201302210135_0003_m_000000_1'



stdout logs



stderr logs



syslog logs
2013-02-28 22:57:53,287 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-02-28 22:57:54,499 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /mapred/local/taskTracker/distcache/4869933896674749018_-1826301297_584014340/n1.example.com/tmp/hive-beeswax-hdfs/hive_2013-02-28_13-57-19_791_3663977264499258484/-mr-10003/fb822c07-e95c-413f-b2b5-b6edb08f63c6 <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_1/work/HIVE_PLANfb822c07-e95c-413f-b2b5-b6edb08f63c6
2013-02-28 22:57:54,506 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/jars/.job.jar.crc <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_1/work/.job.jar.crc
2013-02-28 22:57:54,508 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/jars/job.jar <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_1/work/job.jar
2013-02-28 22:57:54,558 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
2013-02-28 22:57:54,560 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2013-02-28 22:57:55,155 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2013-02-28 22:57:55,170 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@2326a29c
2013-02-28 22:57:55,881 WARN org.apache.hadoop.hive.conf.HiveConf: hive-site.xml not found on CLASSPATH
2013-02-28 22:57:56,469 INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07/sample_07.csv
2013-02-28 22:57:56,469 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and  BYTES_READ as counter name instead
2013-02-28 22:57:56,475 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
2013-02-28 22:57:56,499 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 50
2013-02-28 22:57:56,676 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-02-28 22:57:56,680 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:797)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:385)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
	at org.apache.hadoop.mapred.Child.main(Child.java:262)

The third one:

Task Logs: 'attempt_201302210135_0003_m_000000_2'



stdout logs



stderr logs



syslog logs
2013-02-28 23:01:43,350 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-02-28 23:01:44,477 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /mapred/local/taskTracker/distcache/-6064914283392164676_-1826301297_584014340/n1.example.com/tmp/hive-beeswax-hdfs/hive_2013-02-28_13-57-19_791_3663977264499258484/-mr-10003/fb822c07-e95c-413f-b2b5-b6edb08f63c6 <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_2/work/HIVE_PLANfb822c07-e95c-413f-b2b5-b6edb08f63c6
2013-02-28 23:01:44,510 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/jars/.job.jar.crc <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_2/work/.job.jar.crc
2013-02-28 23:01:44,528 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/jars/job.jar <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_2/work/job.jar
2013-02-28 23:01:44,659 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
2013-02-28 23:01:44,661 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2013-02-28 23:01:45,147 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2013-02-28 23:01:45,169 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@7d15d06c
2013-02-28 23:01:45,535 WARN org.apache.hadoop.hive.conf.HiveConf: hive-site.xml not found on CLASSPATH
2013-02-28 23:01:45,842 INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07/sample_07.csv
2013-02-28 23:01:45,843 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and  BYTES_READ as counter name instead
2013-02-28 23:01:45,850 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
2013-02-28 23:01:45,866 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 50
2013-02-28 23:01:46,034 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-02-28 23:01:46,037 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:797)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:385)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
	at org.apache.hadoop.mapred.Child.main(Child.java:262)


And the last one:

Task Logs: 'attempt_201302210135_0003_m_000000_3'



stdout logs



stderr logs



syslog logs
2013-02-28 23:05:29,916 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-02-28 23:05:30,877 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /mapred/local/taskTracker/distcache/2824718158320928215_-1826301297_584014340/n1.example.com/tmp/hive-beeswax-hdfs/hive_2013-02-28_13-57-19_791_3663977264499258484/-mr-10003/fb822c07-e95c-413f-b2b5-b6edb08f63c6 <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_3/work/HIVE_PLANfb822c07-e95c-413f-b2b5-b6edb08f63c6
2013-02-28 23:05:30,920 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/jars/.job.jar.crc <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_3/work/.job.jar.crc
2013-02-28 23:05:30,924 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/jars/job.jar <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_3/work/job.jar
2013-02-28 23:05:30,989 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
2013-02-28 23:05:30,990 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2013-02-28 23:05:31,447 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2013-02-28 23:05:31,463 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@67071c84
2013-02-28 23:05:31,819 WARN org.apache.hadoop.hive.conf.HiveConf: hive-site.xml not found on CLASSPATH
2013-02-28 23:05:32,099 INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07/sample_07.csv
2013-02-28 23:05:32,100 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and  BYTES_READ as counter name instead
2013-02-28 23:05:32,103 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
2013-02-28 23:05:32,111 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 50
2013-02-28 23:05:32,282 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-02-28 23:05:32,286 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:797)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:385)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
	at org.apache.hadoop.mapred.Child.main(Child.java:262)




Also:

I followed the tutorial from the Cloudera website that you pointed me to ( https://ccp.cloudera.com/display/CDH4DOC/Installing+CDH4+on+a+Single+Linux+Node+in+Pseudo-distributed+Mode#InstallingCDH4onaSingleLinuxNodeinPseudo-distributedMode-RunninganexampleapplicationwithMRv1 ). As far as I understood it, I created a user 'onur', but I could not create a new directory (input) as the user onur.

Actually, I guess I have to learn more about Hadoop commands. If you know of a hands-on tutorial for Hadoop and MapReduce, I would appreciate it.


thank you for your attention
Onur Turna



...

turna...@gmail.com

unread,
Mar 1, 2013, 5:14:13 PM3/1/13
to hue-...@cloudera.org, turna...@gmail.com
Hi,

I tried MapReduce following that manual. When I try to put the XML files from the Hadoop configuration into HDFS, I get a permission denied error, so I cannot test the MapReduce functionality. I suppose I have to create a new Hadoop user on my CentOS machine?
If I have to create a new Hadoop user on my CentOS machine, how should I do that?


thanks
onur turna

Romain Rigaux

unread,
Mar 1, 2013, 6:29:03 PM3/1/13
to turna...@gmail.com, hue-...@cloudera.org
The memory problem is a well-known one; you can find how to bump the heap size at:
http://stackoverflow.com/questions/8464048/out-of-memory-error-in-hadoop
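Your task logs hint at the usual combination: io.sort.mb = 50 while the task JVM heap is too small to allocate that sort buffer. A minimal sketch of the two knobs discussed at that link, set in mapred-site.xml (the exact values below are only examples):

  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m</value>
  </property>
  <property>
    <name>io.sort.mb</name>
    <value>32</value>
  </property>

Either raising the child heap or lowering io.sort.mb (or both) should let the map tasks start.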

For the second question, you need to create a Unix user 'onur'. e.g.

adduser onur
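If that user also needs to write to HDFS (as in the example), a typical follow-up is to give it a home directory there; a sketch using the user name from this thread:

  sudo -u hdfs hadoop fs -mkdir /user/onur
  sudo -u hdfs hadoop fs -chown onur /user/onur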


Romain

turna...@gmail.com

unread,
Mar 1, 2013, 7:35:14 PM3/1/13
to hue-...@cloudera.org, turna...@gmail.com

I tried to run the MapReduce example. I get the same error:
[root@n1 ~]# sudo -u hdfs /usr/bin/hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar grep /user/onur/input /user/onur/output 'dfs[a-z.]+'
13/03/02 01:26:55 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/03/02 01:26:55 INFO mapred.FileInputFormat: Total input paths to process : 3
13/03/02 01:26:56 INFO mapred.JobClient: Running job: job_201302210135_0006
13/03/02 01:26:57 INFO mapred.JobClient:  map 0% reduce 0%
13/03/02 01:27:17 INFO mapred.JobClient: Task Id : attempt_201302210135_0006_m_000000_0, Status : FAILED
Error: Java heap space
13/03/02 01:27:19 INFO mapred.JobClient: Task Id : attempt_201302210135_0006_m_000001_0, Status : FAILED
Error: Java heap space
13/03/02 01:27:27 INFO mapred.JobClient: Task Id : attempt_201302210135_0006_m_000002_0, Status : FAILED
Error: Java heap space
13/03/02 01:27:35 INFO mapred.JobClient: Task Id : attempt_201302210135_0006_m_000000_1, Status : FAILED
Error: Java heap space
13/03/02 01:27:35 INFO mapred.JobClient: Task Id : attempt_201302210135_0006_m_000001_1, Status : FAILED
Error: Java heap space
13/03/02 01:27:46 INFO mapred.JobClient: Task Id : attempt_201302210135_0006_m_000002_1, Status : FAILED
Error: Java heap space
13/03/02 01:27:49 INFO mapred.JobClient: Task Id : attempt_201302210135_0006_m_000001_2, Status : FAILED
Error: Java heap space
13/03/02 01:27:51 INFO mapred.JobClient: Task Id : attempt_201302210135_0006_m_000000_2, Status : FAILED
Error: Java heap space
13/03/02 01:28:03 INFO mapred.JobClient: Task Id : attempt_201302210135_0006_m_000002_2, Status : FAILED
13/03/02 01:28:10 INFO mapred.JobClient: Job complete: job_201302210135_0006
13/03/02 01:28:10 INFO mapred.JobClient: Counters: 8
13/03/02 01:28:10 INFO mapred.JobClient:   Job Counters 
13/03/02 01:28:10 INFO mapred.JobClient:     Failed map tasks=1
13/03/02 01:28:10 INFO mapred.JobClient:     Launched map tasks=12
13/03/02 01:28:10 INFO mapred.JobClient:     Data-local map tasks=9
13/03/02 01:28:10 INFO mapred.JobClient:     Rack-local map tasks=3
13/03/02 01:28:10 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=86972
13/03/02 01:28:10 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=0
13/03/02 01:28:10 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/03/02 01:28:10 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/03/02 01:28:10 INFO mapred.JobClient: Job Failed: NA
java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1322)
        at org.apache.hadoop.examples.Grep.run(Grep.java:69)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.examples.Grep.main(Grep.java:93)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

I'm going to try to figure it out, hopefully soon.


thanks
onur turna

...

Romain Rigaux

unread,
Mar 1, 2013, 7:56:41 PM3/1/13
to turna...@gmail.com, hue-...@cloudera.org
If playing with the options from the link in my previous post does not help, I would recommend just setting up a pseudo-distributed cluster.

I also guess that '/user/onur/input' is pretty small?

Romain

Abraham Elmahrek

unread,
Mar 1, 2013, 8:21:00 PM3/1/13
to Romain Rigaux, turna...@gmail.com, hue-...@cloudera.org
Hey Onur,

Please check your system limits via 'ulimit -a'... sometimes the out-of-memory exception is thrown if your system cannot create a new process or if there aren't enough file descriptors available.

-Abe

turna...@gmail.com

unread,
Mar 1, 2013, 9:50:29 PM3/1/13
to hue-...@cloudera.org, turna...@gmail.com
Hi,

I changed the child heap size just like the example on the website (1024 MiB), and it worked:
[root@n1 ~]#  sudo -u hdfs /usr/bin/hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar grep /user/onur/input /user/onur/output 'dfs[a-z.]+'
13/03/02 03:14:08 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/03/02 03:14:11 INFO mapred.FileInputFormat: Total input paths to process : 3
13/03/02 03:14:14 INFO mapred.JobClient: Running job: job_201303020304_0001
13/03/02 03:14:15 INFO mapred.JobClient:  map 0% reduce 0%
13/03/02 03:14:40 INFO mapred.JobClient:  map 33% reduce 0%
13/03/02 03:14:45 INFO mapred.JobClient:  map 66% reduce 0%
13/03/02 03:14:56 INFO mapred.JobClient:  map 100% reduce 0%
13/03/02 03:15:06 INFO mapred.JobClient:  map 100% reduce 100%
13/03/02 03:15:08 INFO mapred.JobClient: Job complete: job_201303020304_0001
13/03/02 03:15:08 INFO mapred.JobClient: Counters: 33
13/03/02 03:15:08 INFO mapred.JobClient:   File System Counters
13/03/02 03:15:08 INFO mapred.JobClient:     FILE: Number of bytes read=200
13/03/02 03:15:08 INFO mapred.JobClient:     FILE: Number of bytes written=704772
13/03/02 03:15:08 INFO mapred.JobClient:     FILE: Number of read operations=0
13/03/02 03:15:08 INFO mapred.JobClient:     FILE: Number of large read operations=0
13/03/02 03:15:08 INFO mapred.JobClient:     FILE: Number of write operations=0
13/03/02 03:15:08 INFO mapred.JobClient:     HDFS: Number of bytes read=4190
13/03/02 03:15:08 INFO mapred.JobClient:     HDFS: Number of bytes written=416
13/03/02 03:15:08 INFO mapred.JobClient:     HDFS: Number of read operations=8
13/03/02 03:15:08 INFO mapred.JobClient:     HDFS: Number of large read operations=0
13/03/02 03:15:08 INFO mapred.JobClient:     HDFS: Number of write operations=4
13/03/02 03:15:08 INFO mapred.JobClient:   Job Counters 
13/03/02 03:15:08 INFO mapred.JobClient:     Launched map tasks=3
13/03/02 03:15:08 INFO mapred.JobClient:     Launched reduce tasks=2
13/03/02 03:15:08 INFO mapred.JobClient:     Data-local map tasks=3
13/03/02 03:15:08 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=76944
13/03/02 03:15:08 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=13855
13/03/02 03:15:08 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/03/02 03:15:08 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/03/02 03:15:08 INFO mapred.JobClient:   Map-Reduce Framework
13/03/02 03:15:08 INFO mapred.JobClient:     Map input records=147
13/03/02 03:15:08 INFO mapred.JobClient:     Map output records=7
13/03/02 03:15:08 INFO mapred.JobClient:     Map output bytes=188
13/03/02 03:15:08 INFO mapred.JobClient:     Input split bytes=329
13/03/02 03:15:08 INFO mapred.JobClient:     Combine input records=7
13/03/02 03:15:08 INFO mapred.JobClient:     Combine output records=7
13/03/02 03:15:08 INFO mapred.JobClient:     Reduce input groups=7
13/03/02 03:15:08 INFO mapred.JobClient:     Reduce shuffle bytes=256
13/03/02 03:15:08 INFO mapred.JobClient:     Reduce input records=7
13/03/02 03:15:08 INFO mapred.JobClient:     Reduce output records=7
13/03/02 03:15:08 INFO mapred.JobClient:     Spilled Records=14
13/03/02 03:15:08 INFO mapred.JobClient:     CPU time spent (ms)=33400
13/03/02 03:15:08 INFO mapred.JobClient:     Physical memory (bytes) snapshot=604430336
13/03/02 03:15:08 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=7996264448
13/03/02 03:15:08 INFO mapred.JobClient:     Total committed heap usage (bytes)=267726848
13/03/02 03:15:08 INFO mapred.JobClient:   org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
13/03/02 03:15:08 INFO mapred.JobClient:     BYTES_READ=3861
13/03/02 03:15:08 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/03/02 03:15:08 INFO mapred.FileInputFormat: Total input paths to process : 2
13/03/02 03:15:08 INFO mapred.JobClient: Running job: job_201303020304_0002
13/03/02 03:15:09 INFO mapred.JobClient:  map 0% reduce 0%
13/03/02 03:15:21 INFO mapred.JobClient:  map 100% reduce 0%
13/03/02 03:15:26 INFO mapred.JobClient:  map 100% reduce 100%
13/03/02 03:15:28 INFO mapred.JobClient: Job complete: job_201303020304_0002
13/03/02 03:15:28 INFO mapred.JobClient: Counters: 33
13/03/02 03:15:28 INFO mapred.JobClient:   File System Counters
13/03/02 03:15:28 INFO mapred.JobClient:     FILE: Number of bytes read=165
13/03/02 03:15:28 INFO mapred.JobClient:     FILE: Number of bytes written=415431
13/03/02 03:15:28 INFO mapred.JobClient:     FILE: Number of read operations=0
13/03/02 03:15:28 INFO mapred.JobClient:     FILE: Number of large read operations=0
13/03/02 03:15:28 INFO mapred.JobClient:     FILE: Number of write operations=0
13/03/02 03:15:28 INFO mapred.JobClient:     HDFS: Number of bytes read=658
13/03/02 03:15:28 INFO mapred.JobClient:     HDFS: Number of bytes written=146
13/03/02 03:15:28 INFO mapred.JobClient:     HDFS: Number of read operations=7
13/03/02 03:15:28 INFO mapred.JobClient:     HDFS: Number of large read operations=0
13/03/02 03:15:28 INFO mapred.JobClient:     HDFS: Number of write operations=2
13/03/02 03:15:28 INFO mapred.JobClient:   Job Counters 
13/03/02 03:15:28 INFO mapred.JobClient:     Launched map tasks=2
13/03/02 03:15:28 INFO mapred.JobClient:     Launched reduce tasks=1
13/03/02 03:15:28 INFO mapred.JobClient:     Data-local map tasks=2
13/03/02 03:15:28 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=13984
13/03/02 03:15:28 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=4753
13/03/02 03:15:28 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/03/02 03:15:28 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/03/02 03:15:28 INFO mapred.JobClient:   Map-Reduce Framework
13/03/02 03:15:28 INFO mapred.JobClient:     Map input records=7
13/03/02 03:15:28 INFO mapred.JobClient:     Map output records=7
13/03/02 03:15:28 INFO mapred.JobClient:     Map output bytes=188
13/03/02 03:15:28 INFO mapred.JobClient:     Input split bytes=242
13/03/02 03:15:28 INFO mapred.JobClient:     Combine input records=0
13/03/02 03:15:28 INFO mapred.JobClient:     Combine output records=0
13/03/02 03:15:28 INFO mapred.JobClient:     Reduce input groups=1
13/03/02 03:15:28 INFO mapred.JobClient:     Reduce shuffle bytes=184
13/03/02 03:15:28 INFO mapred.JobClient:     Reduce input records=7
13/03/02 03:15:28 INFO mapred.JobClient:     Reduce output records=7
13/03/02 03:15:28 INFO mapred.JobClient:     Spilled Records=14
13/03/02 03:15:28 INFO mapred.JobClient:     CPU time spent (ms)=2910
13/03/02 03:15:28 INFO mapred.JobClient:     Physical memory (bytes) snapshot=365584384
13/03/02 03:15:28 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=4818108416
13/03/02 03:15:28 INFO mapred.JobClient:     Total committed heap usage (bytes)=169353216
13/03/02 03:15:28 INFO mapred.JobClient:   org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
13/03/02 03:15:28 INFO mapred.JobClient:     BYTES_READ=244
[root@n1 ~]#  sudo -u hdfs hadoop fs -ls
Found 1 items
drwx------   - hdfs hdfs          0 2013-03-02 03:15 .staging
[root@n1 ~]#  sudo -u hdfs hadoop fs -ls /user/onur
Found 2 items
drwxr-xr-x   - hdfs supergroup          0 2013-03-02 00:19 /user/onur/input
drwxr-xr-x   - hdfs supergroup          0 2013-03-02 03:15 /user/onur/output
[root@n1 ~]#  sudo -u hdfs hadoop fs -ls /user/onur/output
Found 3 items
-rw-r--r--   3 hdfs supergroup          0 2013-03-02 03:15 /user/onur/output/_SUCCESS
drwxr-xr-x   - hdfs supergroup          0 2013-03-02 03:15 /user/onur/output/_logs
-rw-r--r--   3 hdfs supergroup        146 2013-03-02 03:15 /user/onur/output/part-00000

[root@n1 ~]#  sudo -u hdfs hadoop fs -ls /user/onur/output
Found 3 items
-rw-r--r--   3 hdfs supergroup          0 2013-03-02 03:15 /user/onur/output/_SUCCESS
drwxr-xr-x   - hdfs supergroup          0 2013-03-02 03:15 /user/onur/output/_logs
-rw-r--r--   3 hdfs supergroup        146 2013-03-02 03:15 /user/onur/output/part-00000
[root@n1 ~]#  sudo -u hdfs hadoop fs cat /user/onur/output/part-00000 | head
cat: Unknown command
Did you mean -cat?  This command begins with a dash.
[root@n1 ~]#  sudo -u hdfs hadoop fs -cat /user/onur/output/part-00000 | head
1       dfs.blocksize
1       dfs.https.port
1       dfs.replication
1       dfs.client.use.datanode.hostname
1       dfs.datanode.hdfs
1       dfs.https.address
1       dfs.namenode.http






But when I run a query through Hive, I am still getting the same error.




ID                | Name                                                   | Status    | User | Maps  | Reduces | Queue   | Priority | Duration | Date
201303020304_0003 | SELECT sample_07.description, sample_...DESC(Stage-1) | failed    | hdfs | 0 / 1 | 0 / 1   | default | normal   | 46s      | 03/01/13 18:20:09
201303020304_0002 | grep-sort                                              | succeeded | hdfs | 2 / 2 | 1 / 1   | default | normal   | 18s      | 03/01/13 18:15:08
201303020304_0001 | grep-search                                            | succeeded | hdfs | 3 / 3 | 2 / 2   | default | normal   | 54s      | 03/01/13 18:14:12
 

Hadoop job_201303020304_0003 failures on n1

Attempt                               | Task                             | Machine        | State  | Error                  | Logs
attempt_201303020304_0003_m_000000_0 | task_201303020304_0003_m_000000  | n3.example.com | FAILED | Error: Java heap space | Last 4KB / Last 8KB / All
attempt_201303020304_0003_m_000000_1 | task_201303020304_0003_m_000000  | n1.example.com | FAILED | Error: Java heap space | Last 4KB / Last 8KB / All
attempt_201303020304_0003_m_000000_2 | task_201303020304_0003_m_000000  | n2.example.com | FAILED | Error: Java heap space | Last 4KB / Last 8KB / All
attempt_201303020304_0003_m_000000_3 | task_201303020304_0003_m_000000  | n4.example.com | FAILED | Error: Java heap space | Last 4KB / Last 8KB / All




2013-03-02 03:24:02,074 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-03-02 03:24:02,861 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /mapred/local/taskTracker/distcache/1153535521351037204_943032454_686175782/n1.example.com/tmp/hive-beeswax-hdfs/hive_2013-03-01_18-20-01_992_5419204033953410960/-mr-10003/fd47ce6a-5113-4f45-88e2-90c469b71edf <- /mapred/local/taskTracker/hdfs/jobcache/job_201303020304_0003/attempt_201303020304_0003_m_000000_0/work/HIVE_PLANfd47ce6a-5113-4f45-88e2-90c469b71edf
2013-03-02 03:24:02,870 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201303020304_0003/jars/.job.jar.crc <- /mapred/local/taskTracker/hdfs/jobcache/job_201303020304_0003/attempt_201303020304_0003_m_000000_0/work/.job.jar.crc
2013-03-02 03:24:02,872 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201303020304_0003/jars/job.jar <- /mapred/local/taskTracker/hdfs/jobcache/job_201303020304_0003/attempt_201303020304_0003_m_000000_0/work/job.jar
2013-03-02 03:24:03,010 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
2013-03-02 03:24:03,011 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2013-03-02 03:24:03,382 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2013-03-02 03:24:03,393 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3d7dc1cb
2013-03-02 03:24:03,692 WARN org.apache.hadoop.hive.conf.HiveConf: hive-site.xml not found on CLASSPATH
2013-03-02 03:24:03,955 INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07/sample_07.csv
2013-03-02 03:24:03,955 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and  BYTES_READ as counter name instead
2013-03-02 03:24:03,959 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
2013-03-02 03:24:03,968 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 50
2013-03-02 03:24:04,113 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-03-02 03:24:04,117 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:797)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:385)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
	at org.apache.hadoop.mapred.Child.main(Child.java:262)


Should I assign more heap size again (more than 1024 MiB)? I suppose my NameNode is using all of its RAM; could it be because of that? I assigned 1.5 GB of RAM to each of my cluster nodes. The hosts look like this:


Name           | IP            | Rack     | CDH Version | Health | Last Heartbeat | Cores | Load Average   | Disk Usage         | Physical Memory     | Swap Space
n1.example.com | 192.168.0.241 | /default | CDH4        | Good   | 9.61s ago      | 2     | 1,28 0,43 0,29 | 5.2 GiB / 37.2 GiB | 1.4 GiB / 1.4 GiB   | 459.6 MiB / 2.9 GiB
n2.example.com | 192.168.0.242 | /default | CDH4        | Good   | 13.45s ago     | 2     | 0,15 0,05 0,01 | 4.7 GiB / 37.2 GiB | 465.9 MiB / 1.4 GiB | 0 B / 2.9 GiB
n3.example.com | 192.168.0.243 | /default | CDH4        | Good   | 14.87s ago     | 2     | 0,03 0,01 0,00 | 4.7 GiB / 37.2 GiB | 481.7 MiB / 1.4 GiB | 0 B / 2.9 GiB
n4.example.com | 192.168.0.246 | /default | CDH4        | Good   | 12.68s ago     | 2     | 0,00 0,00 0,00 | 4.7 GiB / 37.2 GiB | 475.7 MiB / 1.4 GiB | 0 B / 2.9 G


thanks
Onur Turna

turna...@gmail.com

unread,
Mar 1, 2013, 9:54:00 PM3/1/13
to hue-...@cloudera.org, Romain Rigaux, turna...@gmail.com
Hey Abe

I checked my system limits with ulimit -a; it shows the following:
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 11543
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Is this normal?

onur turna

Abraham Elmahrek

unread,
Mar 1, 2013, 10:04:36 PM3/1/13
to turna...@gmail.com, hue-...@cloudera.org, Romain Rigaux
Hey Turna,

Your limits show small 'max user processes' and 'open files' values, I think. When HDFS is installed, it normally adds a file to /etc/security/limits.d that sets those numbers much higher.
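
For reference, the entries in those files look roughly like this (illustrative values only, not the exact file CDH ships):

# /etc/security/limits.d/hdfs.conf (sketch)
hdfs   -   nofile   32768
hdfs   -   nproc    65536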

I just read in your previous email that you got it working :). Glad to hear!

-Abe

turna...@gmail.com

unread,
Mar 1, 2013, 10:17:06 PM3/1/13
to hue-...@cloudera.org, turna...@gmail.com, Romain Rigaux
Hey,

How should I configure my limits? I installed HDFS through the CDH4 packages.
The MapReduce example works, but when I run a query through Beeswax I am still getting the error.

Onur Turna 

Romain Rigaux

unread,
Mar 1, 2013, 10:42:39 PM3/1/13
to turna...@gmail.com, hue-...@cloudera.org
Great, almost there!

How did you change the heap parameter?

If you set it in CM for MapReduce (or in /etc/hadoop/conf/mapred-site.xml without CM), it should be taken into account for all jobs (even those created by Hive):
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx1024m</value>
  </property>

Or in Beeswax, under 'SETTINGS' on the left:
  key:   mapred.child.java.opts
  value: -Xmx1024m
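
(To double-check that a job actually picks the value up, you can run a plain SET statement from the query editor; SET with no value prints the current setting in Hive, so this is not Hue-specific:)

SET mapred.child.java.opts;
-- should echo back the -Xmx value the child JVMs will get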

Romain

turna...@gmail.com

unread,
Mar 2, 2013, 1:45:46 PM3/2/13
to hue-...@cloudera.org, turna...@gmail.com
Hey,

I changed that value in Cloudera Manager as well as in /etc/hadoop/mapred-site.xml, but when I run a query I am still getting the error. Should I also change Hive's configuration?

Thanks
Onur Turna

turna...@gmail.com

unread,
Mar 2, 2013, 2:00:44 PM3/2/13
to hue-...@cloudera.org, turna...@gmail.com
Hi,

I changed the MapReduce heap parameter in CM, but when I checked Hue's heap size it still has the default value. Which mapred-site.xml should I change, and in which folder exactly, so that it applies to all tools (Hive, HBase, ...)?

Thanks
Onur Turna 

Abraham Elmahrek

unread,
Mar 2, 2013, 3:18:49 PM3/2/13
to turna...@gmail.com, hue-...@cloudera.org
Hey Onur,

Which heap size are you referring to? Beeswax heap size is managed in the "beeswax" section of CM.

Also, Beeswax has a tendency to run a little high on the number of file descriptors it opens. I'm guessing you're on a Red Hat variant... newer versions of Red Hat/CentOS/Fedora sometimes ship a hard override of the system limits. Please check all files under /etc/security/limits.d. For more information on limits, type 'man limits.conf' in your Linux shell.
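
On RHEL 6-style systems the usual culprit is 90-nproc.conf, which caps 'nproc' for ordinary users; a quick way to see what is being overridden (the file name and values below are an illustrative sketch of the typical defaults, not necessarily what your box has):

# list any nproc/nofile overrides shipped by the distro or by CDH
grep -r 'nproc\|nofile' /etc/security/limits.d/

# typical RHEL 6 default that limits ordinary users to 1024 processes
cat /etc/security/limits.d/90-nproc.conf
*          soft    nproc     1024
root       soft    nproc     unlimited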

-Abe

turna...@gmail.com

unread,
Mar 4, 2013, 5:51:09 PM3/4/13
to hue-...@cloudera.org
Hey Abe,

I assigned 1024 MiB as the Java child heap size both for MapReduce (through Cloudera's GUI) and, manually in Hue's config files, for Hue's Java child heap size. I also set Hadoop's maximum Java heap memory in /etc/hadoop/hadoop-env.sh to "-Xmx1024m". Now everything works fine.
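
For anyone following along, the hadoop-env.sh change mentioned above is along these lines (a sketch; the exact variables you need can differ between CDH releases):

# /etc/hadoop/hadoop-env.sh (illustrative)
export HADOOP_HEAPSIZE=1024             # max heap for the Hadoop daemons, in MB
export HADOOP_CLIENT_OPTS="-Xmx1024m"   # heap for client-side commands such as 'hadoop fs'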

Apart from that, I also checked the conf files in /etc/security/limits.d, which look like this:

[root@n1 limits.d]# ls
90-nproc.conf hbase.nofiles.conf impala.conf mapreduce.conf
cloudera-scm.conf hdfs.conf mapred.conf yarn.conf

There are many conf files; which one should I check, and how should I configure it?

What I actually want to do is import my tables, which are in .csv format, into HBase. I have also read some presentations on HBase/Hive integration; that's why I set up a MySQL server on one of my cluster nodes for Hive's metastore. I thought I would get a view of my HBase tables in the Beeswax GUI, but as far as I have learned that is not an option at the moment. Given that, how can somebody else (other than the person who built these tables) query HBase tables through Beeswax when they are not visible there? I also want to be sure whether SQL-like queries against HBase tables are an option through Beeswax at all.

Thank you very much for your attention,
Onur Turna


Romain Rigaux

unread,
Mar 4, 2013, 6:07:38 PM3/4/13
to turna...@gmail.com, hue-...@cloudera.org
So ulimit should not need to be touched for now; I think you are fine.

In Beeswax you can select your tables in the 'Tables' tab and view a data sample. You can also do a simple 'SELECT * from your_table' and see the content.

About using Hive on top of HBase this is possible: https://ccp.cloudera.com/display/CDH4DOC/Hive+Installation#HiveInstallation-UsingHivewithHBase
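
As a sketch of what that looks like from the Hive/Beeswax side (the table and column names below are made up for illustration), an existing HBase table can be exposed as an external Hive table and then queried like any other table:

-- assumes an HBase table 'my_hbase_table' with a column family 'cf' already exists
CREATE EXTERNAL TABLE hbase_in_hive(key string, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:value")
TBLPROPERTIES ("hbase.table.name" = "my_hbase_table");

SELECT * FROM hbase_in_hive LIMIT 10;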

Romain

turna...@gmail.com

unread,
Mar 4, 2013, 6:23:30 PM3/4/13
to hue-...@cloudera.org, turna...@gmail.com
Hi Romain,

I already made that configuration:

Using Hive with HBase

To allow Hive scripts to use HBase, add the following statements to the top of each script:

ADD JAR /usr/lib/hive/lib/zookeeper.jar;
ADD JAR /usr/lib/hive/lib/hbase.jar;
ADD JAR /usr/lib/hive/lib/hive-hbase-handler-0.10.0-cdh4.2.0.jar;
ADD JAR /usr/lib/hive/lib/guava-11.0.2.jar;
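
(If you would rather not repeat the ADD JAR lines in every script, the same jars can usually be put on Hive's auxiliary path in hive-site.xml instead; this is only a sketch assuming the CDH paths above:)

<property>
  <name>hive.aux.jars.path</name>
  <value>file:///usr/lib/hive/lib/zookeeper.jar,file:///usr/lib/hive/lib/hbase.jar,file:///usr/lib/hive/lib/hive-hbase-handler-0.10.0-cdh4.2.0.jar,file:///usr/lib/hive/lib/guava-11.0.2.jar</value>
</property>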

I also made the Hive metastore configuration changes.
Can I now get a view in Beeswax of the HBase tables I created through the hbase shell?

Thanks,
Onur Turna

turna...@gmail.com

unread,
Mar 14, 2013, 2:47:36 PM3/14/13
to hue-...@cloudera.org
Hey,
After a two-week pause I have started to figure out HBase/Hive integration again. I am reading the tutorial from the official Hive website and I have run into something I do not understand.

CREATE TABLE hbase_table_1(key int, value1 string, value2 int, value3 int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
"hbase.columns.mapping" = ":key,a:b,a:c,d:e"
);
INSERT OVERWRITE TABLE hbase_table_1 SELECT foo, bar, foo+1, foo+2
FROM pokes WHERE foo=98 OR foo=100;

(from https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration)

According to that example hbase_table_1 has 3 columns, but here (INSERT OVERWRITE TABLE hbase_table_1 SELECT foo, bar, foo+1, foo+2) there are 4 columns to overwrite? Or am I misunderstanding?

Thanks
Onur turna

Romain Rigaux

unread,
Mar 18, 2013, 1:16:17 PM3/18/13
to turna...@gmail.com, hue-...@cloudera.org
It has 4 columns: CREATE TABLE hbase_table_1(key int, value1 string, value2 int, value3 int)

I am not an HBase expert, but 'key' is just the name of the first column in the table. It references the HBase row key at the same time.
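
To spell out the correspondence in that example (the families and qualifiers come straight from the hbase.columns.mapping string):

-- SELECT expression -> Hive column -> HBase cell
-- foo               -> key         -> row key (:key)
-- bar               -> value1      -> family a, qualifier b (a:b)
-- foo+1             -> value2      -> family a, qualifier c (a:c)
-- foo+2             -> value3      -> family d, qualifier e (d:e)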

Romain