Pointing Beeswax to external hive metastore


Anki

Dec 27, 2010, 8:13:21 PM
to Hue-Users
Hi,

I am trying to install Hue on a separate machine from the ones in my
cluster. I was able to run Hue successfully. However, I am not able to
point Beeswax to the remote Hive metastore.
Here are the steps that I followed:
1. Started the remote metastore server using:
hive --service metastore
2. Copied the hive-site.xml to the Hue machine and pointed
hue-beeswax.ini at this location.
3. Modified the hive-site.xml to set:
a. hive.metastore.local = false
b. hive.metastore.uris = thrift://<hive machine>:<metastore port>
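
For reference, after step 3 the relevant part of hive-site.xml on the Hue
machine looks roughly like this (host and port here are the ones from my
log below):

<property>
  <name>hive.metastore.local</name>
  <value>false</value>
</property>
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://hpublicdnode03:9083</value>
</property>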


Then, on restarting the hue server, in the beeswax_server.log I am
seeing:
"Beeswax configured to use external metastore at hpublicdnode03:9083"

However, when I try to connect to Beeswax from Hue, it gives the
following exception:
"Exception communicating with Hive Metadata (Hive UI) Server at
localhost:8003: Could not connect to localhost:8003"

Also, in beeswax_server.out, I am seeing one more error:
" ERROR beeswax.Server: Could not create /user/hive/warehouse-hue"
I checked that /user/hive has its permissions set to 777, so I am not
sure what the problem is here.

Any pointers/ thoughts are greatly appreciated.

Thanks,
Ankita

bc Wong

Dec 29, 2010, 1:27:01 PM
to Anki, Hue-Users

Hi Ankita,

I'm on CDH3, with security turned off, and tried to reproduce the problem with
exactly the same external metastore setting. Instead, Beeswax went ahead and
created the directory:

10/12/29 10:19:51 INFO beeswax.Server: Created /user/hive/warehouse-hue with
world-writable permissions.
10/12/29 10:19:51 INFO beeswax.Server: Starting beeswaxd at port 8002

A few quick things to check:
(1) Is the Hadoop on the Hue node configured correctly? Does `hadoop fs ...`
work?
(2) Can you get the corresponding error message from the NN log to see
why it couldn't create the directory?
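
For (1), the kind of sanity check I have in mind is roughly the
following, run on the Hue node as the user Hue runs as; the second
command just mirrors the directory creation Beeswax attempts:

  hadoop fs -ls /user/hive
  hadoop fs -mkdir /user/hive/warehouse-hue

For (2), grep the NameNode log on the NN host for "warehouse-hue" around
the time Beeswax starts up.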

Cheers,
--
bc Wong
Cloudera Software Engineer

Anki

Dec 29, 2010, 6:00:04 PM
to Hue-Users
Thanks for the reply. I am curious: on the remote machine where you
installed Hive, was there any difference between its hive-site.xml and
the one you were using with Hue?

Also, I am not able to do hadoop fs -ls from my Hue machine, as this
machine is not part of the Hadoop cluster. I think this is not a
requirement, since I am able to browse my Hadoop filesystem from the Hue
file browser correctly. Am I missing something?

Also, I checked the namenode logs; they don't have any errors, warnings,
or any activity from the Hue machine.
One question: when I try to open Beeswax from the Hue desktop, why does
the error message say it is not able to communicate with the "Hive
Metadata (Hive UI) Server at localhost:8003"? I was expecting this to be
my remote Hive machine.

Here are other details about my setup:
1. I am using CDH3b2.
2. Hive is installed in one of the hadoop nodes and configured to use
Mysql as metastore.
3. Hue is installed in a separate machine.

Let me know if you need more details.

Thanks,
Ankita

bc Wong

Dec 30, 2010, 3:33:52 AM
to Anki, Hue-Users
On Wed, Dec 29, 2010 at 3:00 PM, Anki <ankita...@gmail.com> wrote:
> Thanks for the reply. I am curious, in the remote machine where you
> have installed hive, was there any difference between hive-site.xml
> than the one you were using in Hue?

It's the same. Hue runs BeeswaxServer, which is essentially a Hive
client. So it should have an identical setup.

> Also, I am not able to do hadoop fs -ls from my Hue machine as this
> machine is not part of hadoop cluster. I think this is not an
> requirement as I am able to browse my hadoop filesystem from Hue file
> browser correctly. Am I missing something?

Hue requires a properly configured Hadoop client locally. The file
upload runs Hadoop, job submission requires Hadoop, and BeeswaxServer
requires Hadoop.

In Hue, the left-hand side of the application bar should show a little
red exclamation mark. Click on it and it'll show you the common
misconfigurations.

Anki

Dec 30, 2010, 4:28:27 PM
to Hue-Users
Thanks, I configured Hadoop on the Hue machine and after that I am
able to browse my HDFS filesystem using hadoop fs -ls. However, now
when I start Hue, it is not able to load any of the app (File Browser,
Beeswax, etc.) icons.

I checked the logs; the beeswax_server.out/.log files are not written at
this time. Also, the following error message exists in supervisor.log:
[30/Dec/2010 13:06:48 +0000] supervisor ERROR Exception in
supervisor main loop
Traceback (most recent call last):
File "/usr/share/hue/desktop/core/src/desktop/supervisor.py", line
335, in main
wait_loop(sups, options)
File "/usr/share/hue/desktop/core/src/desktop/supervisor.py", line
346, in wait_loop
time.sleep(1)
File "/usr/share/hue/desktop/core/src/desktop/supervisor.py", line
198, in sig_handler
raise Exception("Signal %d received. Exiting" % signum)
Exception: Signal 15 received. Exiting
[30/Dec/2010 13:06:48 +0000] supervisor WARNING Supervisor shutting
down!
[30/Dec/2010 13:06:48 +0000] supervisor WARNING Waiting for
children to exit for 5 seconds...
[30/Dec/2010 13:06:48 +0000] supervisor INFO Command "/usr/share/
hue/build/env/bin/hue runcpserver" exited normally.
[30/Dec/2010 13:06:59 +0000] supervisor INFO Starting process /
usr/share/hue/build/env/bin/hue runcpserver
[30/Dec/2010 13:06:59 +0000] supervisor INFO Started proceses
(pid 27783) /usr/share/hue/build/env/bin/hue runcpserver


Also, I checked namenode logs, there is still no activity related to
this machine.
Let me know if you need more details.

Appreciate your help!
Ankita


bc Wong

Dec 30, 2010, 10:45:04 PM
to Anki, Hue-Users
On Thu, Dec 30, 2010 at 1:28 PM, Anki <ankita...@gmail.com> wrote:
> Thanks, I configured hadoop in the Hue machine and after that I am
> able to browse my hdfs filesystem using hadoop fs -ls. However, now
> when I start Hue, it is not able to load any of the app(filebrowser,
> beeswax etc.) icons.

So it seems that Hue starts and stays up, just that you're missing
most of the apps. Right? Can you take a look at your /etc/hue/hue.ini
and make sure the "hadoop_home" is set correctly?
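
For reference, on a 1.x install that setting should point at the local
Hadoop installation, roughly like the snippet below (the exact section
name may differ between Hue versions):

[hadoop]
# local Hadoop installation that Hue/Beeswax should use
hadoop_home=/usr/lib/hadoop-0.20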

What you mentioned below, esp. "Supervisor shutting down" would
suggest that Hue can't even start. If so, you'd need to scan the other
log files in /var/log/hue for errors. (Error reporting is badly done
in 1.0.x. In 1.1.0, there is a central error.log for all errors.)

Anki

Jan 3, 2011, 2:41:31 PM
to Hue-Users
My hadoop_home was set correctly, and I did scan all the logs.
Apparently, the only error that was reported was in the supervisor logs,
and it wasn't very helpful.
I re-installed hue, and now the hue desktop is loading properly. Also,
I am able to browse my hdfs both from hue file browser as well as from
command line using 'hadoop fs -ls'. Also, this time Beeswax went ahead
and created warehouse-hue directory in the hdfs.

However, I am getting the same error when I click Beeswax icon from
hue-desktop:
"Exception communicating with Hive Metadata (Hive UI) Server at
localhost:8003: Could not connect to localhost:8003"

From one of the log messages it seems that Beeswax is configured to use
the external metastore. I am not sure why it is trying to connect to
localhost. I am wondering if anything else is required to stop Beeswax
from connecting to localhost.

Also, from logs:
runcpserver.log:
03/Jan/2011 11:03:25 +0000] thrift_util INFO Thrift exception;
retrying: Could not connect to localhost:8003
[03/Jan/2011 11:03:25 +0000] thrift_util INFO Thrift exception;
retrying: Could not connect to localhost:8003
[03/Jan/2011 11:03:25 +0000] thrift_util WARNING Out of retries for
thrift call: get_tables
[03/Jan/2011 11:03:25 +0000] thrift_util INFO Thrift saw
exception: Could not connect to localhost:8003
[03/Jan/2011 11:03:25 +0000] middleware INFO Processing
exception: Could not connect to localhost:8003: Traceback (most recent
call last):


Beeswax.out:
11/01/03 10:49:17 INFO beeswax.Server: Created /user/hive/warehouse-
hue with world-writable permissions.
11/01/03 10:49:17 INFO beeswax.Server: Starting beeswaxd at port 8002
11/01/03 10:49:17 INFO beeswax.Server: Parsed core-default.xml
sucessfully. Learned 52 descriptions.
11/01/03 10:49:17 INFO beeswax.Server: Parsed hdfs-default.xml
sucessfully. Learned 47 descriptions.
11/01/03 10:49:17 INFO beeswax.Server: Parsed mapred-default.xml
sucessfully. Learned 101 descriptions.
11/01/03 10:49:17 INFO beeswax.Server: Parsed hive-default.xml
sucessfully. Learned 62 descriptions.
11/01/03 10:49:17 INFO beeswax.Server: Starting beeswax server on port
8002, talking back to Desktop at 10.2.40.30:8088

Beeswax.log
03/Jan/2011 10:49:13 +0000] settings INFO Welcome to Hue 1.0.1
[03/Jan/2011 10:49:13 +0000] settings WARNING secret_key should
be configured
[03/Jan/2011 10:49:14 +0000] beeswax_server INFO Beeswax
configured to use external metastore at hpublicdnode03:9083
[03/Jan/2011 10:49:14 +0000] beeswax_server INFO Executing '/usr/
share/hue/apps/beeswax/src/beeswax/../../
beeswax_server.sh' (['beeswax_server.sh', '--beeswax', '8002', '--
desktop-host', '10.2.40.30', '--desktop-port', '8088']) ({'LOGNAME':
'root', 'USER': 'root', 'HOME': '/home/stratify', 'PATH': '/usr/local/
sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin', 'LANG':
'en_US.UTF-8', 'TERM': 'xterm', 'SHELL': '/bin/bash', 'TZ': 'America/
Los_Angeles', 'SHLVL': '1', 'SUDO_USER': 'stratify', 'USERNAME':
'root', 'DESKTOP_LOG_DIR': '/var/log/hue', 'SUDO_UID': '1000',
'HIVE_CONF_DIR': '/etc/hue/conf', '_': '/sbin/start-stop-daemon',
'SUDO_COMMAND': '/etc/init.d/hue start', 'SUDO_GID': '1000',
'PYTHON_EGG_CACHE': '/tmp/.hue-python-eggs', 'PWD': '/home/stratify',
'DJANGO_SETTINGS_MODULE': 'desktop.settings', 'MAIL': '/var/mail/
stratify', 'LS_COLORS':
'rs=0:di=01;34:ln=01;36:hl=44;37:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.
7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.axa=00;36:*.oga=00;36:*.spx=00;36:*.xspf=00;36:',
'HADOOP_HOME': '/usr/lib/hadoop-0.20'})
[

Thanks for your help,
Ankita


bc Wong

Jan 6, 2011, 8:32:31 AM
to Anki, Hue-Users
On Mon, Jan 3, 2011 at 11:41 AM, Anki <ankita...@gmail.com> wrote:
> However, I am getting the same error when I click Beeswax icon from
> hue-desktop:
> "Exception communicating with Hive Metadata (Hive UI) Server at
> localhost:8003: Could not connect to localhost:8003"

Anki,

You're absolutely right. I probably did not set up my hive conf correctly when I
said I couldn't reproduce your problem. I filed
https://issues.cloudera.org/browse/HUE-393 for this problem. The patch in the
jira should work for you.

Anki

Jan 6, 2011, 8:54:13 PM
to Hue-Users
Thanks, Wong, for the quick patch! Somehow I am not able to access the
link. I tried both https and http.
To give you some background, we have installed Hive on one of our
datanodes, and we did not want to install Hue on the same machine.
This is how it all started.
Plan B is that we can always install Hue on the same machine as Hive.

Thanks again for your timely help,
Ankita


On Jan 6, 5:32 am, bc Wong <bcwal...@cloudera.com> wrote:
> On Mon, Jan 3, 2011 at 11:41 AM, Anki <ankita.bak...@gmail.com> wrote:
> > However, I am getting the same error when I click Beeswax icon from
> > hue-desktop:
> > "Exception communicating with Hive Metadata (Hive UI) Server at
> > localhost:8003: Could not connect to localhost:8003"
>
> Anki,
>
> You're absolutely right. I probably did not set up my hive conf correctly when I
> said I couldn't reproduce your problem. I filed https://issues.cloudera.org/browse/HUE-393 for this problem. The patch in the

Vinithra Varadharajan

Jan 6, 2011, 9:46:05 PM
to Anki, Hue-Users
Anki,

This morning there was an outage of https://issues.cloudera.org, that included the loss of the attached patch. You can now access https://issues.cloudera.org/browse/HUE-393 and download 
0002-HUE-393.-Beeswax-doesn-t-work-with-external-metastore.patch. Sorry for the inconvenience.

-Vinithra

Anki

Jan 7, 2011, 2:07:55 PM
to Hue-Users
Thanks Vinithra.
I was able to access the patch now. After applying the patch, Beeswax
is able to communicate with hive server :)

Cheers!
Ankita



On Jan 6, 6:46 pm, Vinithra Varadharajan <vinit...@cloudera.com>
wrote:
> Anki,
>
> This morning there was an outage of https://issues.cloudera.org, that
> included the loss of the attached patch. You can now access https://issues.cloudera.org/browse/HUE-393 and download
> 0002-HUE-393.-Beeswax-doesn-t-work-with-external-metastore.patch. Sorry for
> the inconvenience.
>
> -Vinithra
>

mtanquary

Feb 9, 2011, 5:47:24 PM
to Hue-Users
I'm not able to apply the patch. I get:

]# patch -p0 <*.patch
can't find file to patch at input line 4
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff --git a/apps/beeswax/src/beeswax/db_utils.py b/apps/beeswax/src/
beeswax/db_utils.py
|--- a/apps/beeswax/src/beeswax/db_utils.py
|+++ b/apps/beeswax/src/beeswax/db_utils.py
--------------------------
File to patch: /usr/share/hue/apps/beeswax/src/beeswax/db_utils.py
patching file /usr/share/hue/apps/beeswax/src/beeswax/db_utils.py
Hunk #2 FAILED at 297.
1 out of 2 hunks FAILED -- saving rejects to file /usr/share/hue/apps/
beeswax/src/beeswax/db_utils.py.rej

bc Wong

Feb 9, 2011, 5:54:58 PM
to mtanquary, Hue-Users
On Wed, Feb 9, 2011 at 2:47 PM, mtanquary <matt.t...@gmail.com> wrote:
> I'm not able to apply the patch. I get:
>
> ]# patch -p0 <*.patch
> can't find file to patch at input line 4
> Perhaps you used the wrong -p or --strip option?
> The text leading up to this was:
> --------------------------
> |diff --git a/apps/beeswax/src/beeswax/db_utils.py b/apps/beeswax/src/
> beeswax/db_utils.py
> |--- a/apps/beeswax/src/beeswax/db_utils.py
> |+++ b/apps/beeswax/src/beeswax/db_utils.py
> --------------------------
> File to patch: /usr/share/hue/apps/beeswax/src/beeswax/db_utils.py
> patching file /usr/share/hue/apps/beeswax/src/beeswax/db_utils.py
> Hunk #2 FAILED at 297.
> 1 out of 2 hunks FAILED -- saving rejects to file /usr/share/hue/apps/
> beeswax/src/beeswax/db_utils.py.rej

Hi Matt,

Can you try `patch -p1 ...'?
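
For example, running it from the root of the Hue install (the paths in
the patch are relative to that directory) would look something like:

  cd /usr/share/hue
  patch -p1 < 0002-HUE-393.-Beeswax-doesn-t-work-with-external-metastore.patch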

mtanquary

Feb 9, 2011, 5:59:16 PM
to Hue-Users
This is what I got:

# patch -p1 <*.patch
can't find file to patch at input line 4
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff --git a/apps/beeswax/src/beeswax/db_utils.py b/apps/beeswax/src/
beeswax/db_utils.py
|--- a/apps/beeswax/src/beeswax/db_utils.py
|+++ b/apps/beeswax/src/beeswax/db_utils.py
--------------------------
File to patch: /usr/share/hue/apps/beeswax/src/beeswax/db_utils.py
patching file /usr/share/hue/apps/beeswax/src/beeswax/db_utils.py
Reversed (or previously applied) patch detected! Assume -R? [n] y
Hunk #2 FAILED at 296.
1 out of 2 hunks FAILED -- saving rejects to file /usr/share/hue/apps/
beeswax/src/beeswax/db_utils.py.rej



bc Wong

Feb 9, 2011, 6:45:44 PM
to mtanquary, Hue-Users
On Wed, Feb 9, 2011 at 2:59 PM, mtanquary <matt.t...@gmail.com> wrote:
> This is what I got:
>
> # patch -p1  <*.patch
> can't find file to patch at input line 4
> Perhaps you used the wrong -p or --strip option?
> The text leading up to this was:
> --------------------------
> |diff --git a/apps/beeswax/src/beeswax/db_utils.py b/apps/beeswax/src/
> beeswax/db_utils.py
> |--- a/apps/beeswax/src/beeswax/db_utils.py
> |+++ b/apps/beeswax/src/beeswax/db_utils.py
> --------------------------
> File to patch:  /usr/share/hue/apps/beeswax/src/beeswax/db_utils.py
> patching file /usr/share/hue/apps/beeswax/src/beeswax/db_utils.py
> Reversed (or previously applied) patch detected!  Assume -R? [n] y
> Hunk #2 FAILED at 296.
> 1 out of 2 hunks FAILED -- saving rejects to file /usr/share/hue/apps/
> beeswax/src/beeswax/db_utils.py.rej

Hi Matt,

What's your Hue version? Could you send me your
apps/beeswax/src/beeswax/db_utils.py file?

mtanquary

Feb 10, 2011, 4:24:44 PM
to Hue-Users
My version is 1.1. I'll send the file as well. Thank you for your
assistance!


bc Wong

Feb 10, 2011, 5:45:30 PM
to mtanquary, Hue-Users
On Thu, Feb 10, 2011 at 1:24 PM, mtanquary <matt.t...@gmail.com> wrote:
> My version is 1.1. I'll send the file as well. Thank you for your
> assistance!

I see. The patch wouldn't apply directly on 1.1. (I was a bit mistaken
about what's in which version.) On your version, the patch should go
in around L305 rather than L296. It's a small change. Do you feel
comfortable hand-editing your db_utils.py?

bc Wong

Feb 11, 2011, 11:03:44 AM
to Matt Tanquary, Hue-Users
On Fri, Feb 11, 2011 at 7:51 AM, Matt Tanquary <matt.t...@gmail.com> wrote:
> Now getting this error again:

>
> Exception communicating with Hive Metadata (Hive UI) Server at
> localhost:8003: timed out
>
> Here is the content of my hue-beeswax.ini:
>
> # Configuration options for the Hive UI (Beeswax).
>
> [beeswax]
>
> #
> # Configure the port the internal metastore daemon runs on. Used only if
> # hive.metastore.local is true.
> ## beeswax_meta_server_port=8003
>
> #
> # Configure the port the beeswax thrift server runs on
> ## beeswax_server_port=8002
>
> #
> # Hive configuration directory, where hive-site.xml is located
> hive_conf_dir=/etc/hive/conf

Hi Matt,

Is hive.metastore.local set to false in your
/etc/hive/conf/hive-site.xml? Could you please attach that file?

Cheers,
bc


> I have verified that hive from cli connects to my derby server and works as
> expected.
>
> I also attached my current db_utils.py file as a sanity check.
>
> Thanks again for all of your help!
> -M@
>
> On Fri, Feb 11, 2011 at 8:34 AM, bc Wong <bcwa...@cloudera.com> wrote:
>>
>> On Fri, Feb 11, 2011 at 7:20 AM, Matt Tanquary <matt.t...@gmail.com>
>> wrote:
>> > Thanks!
>> >
>> > Now I get this error when trying to start hive from hue: An error
>> > occurred:
>> > 'module' object has no attribute 'METASTORE_CONN_TIMEOUT'
>>
>> Hi Matt,
>>
>> Your original file has this for the code block in question:
>>
>>  client = thrift_util.get_client(ThriftHiveMetastore.Client,
>>                                conf.BEESWAX_META_SERVER_HOST.get(),
>>                                conf.BEESWAX_META_SERVER_PORT.get(),
>>                                service_name="Hive Metadata (Hive UI) Server",
>>                                timeout_seconds=METASTORE_THRIFT_TIMEOUT)
>>
>> Your new file should modify two arguments:
>>
>>  client = thrift_util.get_client(ThriftHiveMetastore.Client,
>>                                host,
>>                                port,
>>                                service_name="Hive Metadata (Hive UI) Server",
>>                                timeout_seconds=METASTORE_THRIFT_TIMEOUT)
>>
>> Cheers,
>> bc

bc Wong

Feb 11, 2011, 11:58:59 AM
to Matt Tanquary, Hue-Users
On Fri, Feb 11, 2011 at 8:17 AM, Matt Tanquary <matt.t...@gmail.com> wrote:
> It's set to true. After you mentioned that, I set to false and get some
> errors in hive and hue. The hue error I get is
>
> Exception communicating with Hive Metadata (Hive UI) Server at undefined:0:
> Could not connect to undefined:0
>
> hive-site.xml attached.

[cc-ing hue-users@. Please reply-all to keep a record for future users.]

Where is your external metastore? You were using the Hive CLI
earlier, which connected to the external metastore. So your CLI
session cannot be using /etc/hive/conf/hive-site.xml. You should
point hive_conf_dir in `hue-beeswax.ini' to wherever the real
configuration is.
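
For example, if the real client configuration lived in (say)
/etc/hive/conf-real, the relevant part of hue-beeswax.ini would look
roughly like:

[beeswax]
hive_conf_dir=/etc/hive/conf-real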

The hive-site.xml you attached is misconfigured. It doesn't have
hive.metastore.uris.

bc Wong

Feb 11, 2011, 12:42:52 PM
to Matt Tanquary, Hue-Users
On Fri, Feb 11, 2011 at 9:17 AM, Matt Tanquary <matt.t...@gmail.com> wrote:
> The metastore is on the piodev02 server. I had defined the ConnectionURL to
> point to: jdbc:derby://piodev02:1527/metastore_db.
>
> This configuration I got from
> http://wiki.apache.org/hadoop/HiveDerbyServerMode
>
> I have been trying to set up the uris but haven't gotten something to work.
> What would you suggest as a URI for this?

Matt,

Please include hue-users@ in the reply. This is useful to help
other people troubleshoot.

The Hive terminology might be confusing you:
http://wiki.apache.org/hadoop/Hive/AdminManual/MetastoreAdmin
What you have is not necessarily an external (remote) metastore.
A remote metastore is a Hive metastore server daemon. That daemon
proxies all metastore requests, and the clients don't directly
talk to the metastore DB. In your case, it sounds like your
clients are connecting to the DB directly, just that the DB is on
a remote machine. That is still classified as an internal
metastore.

To go into Hive best practices a bit: you shouldn't use Derby
for a shared metastore. You'll run into concurrency problems
(and definitely performance ones). I'd recommend MySQL.
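
To make the distinction concrete, the two setups differ roughly as
follows in hive-site.xml (the metastore host is a placeholder; the JDBC
URL is the one you mentioned):

<!-- Remote (external) metastore: clients talk Thrift to a metastore
     daemon, and hive.metastore.local is false -->
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://<metastore host>:9083</value>
</property>

<!-- Internal metastore, even when the DB lives on another host:
     clients open the JDBC connection themselves, and
     hive.metastore.local stays true -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby://piodev02:1527/metastore_db</value>
</property>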

turna...@gmail.com

Feb 18, 2013, 10:05:49 AM
to hue-...@cloudera.org, Matt Tanquary, bcwa...@cloudera.com

Hi,
I am trying to figure out the Hive MySQL metastore configuration. I have implemented everything from Cloudera's website
https://ccp.cloudera.com/display/CDH4DOC/Hive+Installation#HiveInstallation-ConfiguringtheHiveMetastore
but now when I connect to the Hue GUI I am getting this error:
Exception communicating with Hive Metastore Server at localhost:8003: timed out


I installed the MySQL server on my master node, where I installed Cloudera Manager as well.
I have created a new database in MySQL named metastore, and my hive-site.xml looks like this:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://n1.example.com/metastore</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>

<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>

<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>q*****</value>
</property>

<property>
<name>datanucleus.autoCreateSchema</name>
<value>false</value>
</property>

<property>
<name>datanucleus.fixedDatastore</name>
<value>true</value>
</property>

Can anybody help me solve this error?


Chris Conner

Feb 19, 2013, 8:55:40 AM
to hue-...@cloudera.org
Hey,

When you made these changes, did you only make them in /etc/hive/conf/hive-site.xml, or did you also make them in the Beeswax configuration of the Hue service in CM?

Thanks!

turna...@gmail.com

Feb 19, 2013, 1:50:37 PM
to hue-...@cloudera.org
Hi,
I made them manually in /etc/hive/conf/hive-site.xml,
but hive-conf/hive-site.xml under the configuration files of beeswax_server (n1) still looks like this
(via the web browser link http://n1.example.com:7180/cmf/process/190/config?filename=hive-conf%2Fhive-site.xml):

<?xml version="1.0" encoding="UTF-8"?>

<!--Autogenerated by Cloudera CM on 2013-02-19T14:25:59.866Z-->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore_db?useUnicode=true&amp;characterEncoding=UTF-8</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/beeswax/warehouse</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hue</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value></value>
  </property>
</configuration>
and also
hbase-conf/hbase-site.xml
under the configuration files of beeswax_server (n1) looks like this:
<?xml version="1.0" encoding="UTF-8"?>

<!--Autogenerated by Cloudera CM on 2013-02-19T14:25:59.524Z-->
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://n1.example.com:8020/hbase</value>
  </property>
  <property>
    <name>hbase.client.write.buffer</name>
    <value>2097152</value>
  </property>
  <property>
    <name>hbase.client.pause</name>
    <value>1000</value>
  </property>
  <property>
    <name>hbase.client.retries.number</name>
    <value>10</value>
  </property>
  <property>
    <name>hbase.client.scanner.caching</name>
    <value>1</value>
  </property>
  <property>
    <name>hbase.client.keyvalue.maxsize</name>
    <value>10485760</value>
  </property>
  <property>
    <name>hbase.security.authentication</name>
    <value>simple</value>
  </property>
  <property>
    <name>zookeeper.session.timeout</name>
    <value>60000</value>
  </property>
  <property>
    <name>zookeeper.znode.parent</name>
    <value>/hbase</value>
  </property>
  <property>
    <name>zookeeper.znode.rootserver</name>
    <value>root-region-server</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>n1.example.com</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>

I installed the MySQL server on the same node (n1), then built a new database named 'metastore' and did all the configuration from Cloudera's website. The metastore database exists under var/log/mysql/metastore with all the metadata tables (which I created with the command SOURCE /usr/lib/hive/scripts/metastore/upgrade/mysql/hive-schema-0.9.0.mysql.sql;).
While creating the metastore database I configured the database user like so:
 CREATE USER hive@localhost  IDENTIFIED BY 'xxxxxxx' ;

And my hbase-conf/hdfs-site.xml under the Beeswax server in CM is:

<?xml version="1.0" encoding="UTF-8"?>

<!--Autogenerated by Cloudera CM on 2013-02-19T14:25:59.521Z-->
<configuration>
  <property>
    <name>dfs.https.port</name>
    <value>50470</value>
  </property>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>n1.example.com:50070</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
  </property>
  <property>
    <name>dfs.client.use.datanode.hostname</name>
    <value>false</value>
  </property>
  <property>
    <name>fs.permissions.umask-mode</name>
    <value>022</value>
  </property>
  <property>
    <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
    <value>true</value>
  </property>
</configuration>


I am trying to solve this problem but I have not been able to find the cause yet...

Thank you for your attention.

Chris Conner

Feb 19, 2013, 2:26:50 PM
to hue-...@cloudera.org
OK, for Beeswax to work in Hue, you have to configure the DB for the metastore warehouse in CM for Beeswax as well.  Go to the Hue Service configuration, then under Beeswax one of the configuration areas should be for the DB.  I'll try and send a screen shot shortly.

Thanks
Chris

turna...@gmail.com

Feb 19, 2013, 2:42:41 PM
to hue-...@cloudera.org
hi,
my configuration under the Beeswax configuration area looks like this:

Configuration options for the Hive UI (Beeswax).

hive_conf_dir = /var/run/cloudera-scm-agent/process/191-hue-HUE_SERVER/hive-conf
  Hive configuration directory, where hive-site.xml is located.
  Default: /var/run/cloudera-scm-agent/process/191-hue-HUE_SERVER/hive-conf

share_saved_queries = True
  Share saved queries with all users. If set to false, saved queries are
  only visible to the owner and administrators.
  Default: True

metastore_conn_timeout = 10
  Timeout in seconds for Thrift calls to the Hive metastore. This timeout
  should take into account that the metastore may be talking to an
  external database.
  Default: 10

beeswax_server_port = 8002
  Configure the port the Beeswax Thrift server runs on.
  Default: 8002

beeswax_running_query_lifetime = 604800000
  Duration in seconds for which Beeswax keeps queries in its cache.
  Default: 604800000

hive_home_dir = /usr/lib/hive
  Path to the root of the Hive installation; falls back to the
  environment variable if not set.
  Default: /usr/lib/hive

browse_partitioned_table_limit = 250
  Apply a LIMIT clause when browsing a partitioned table. A positive
  value is used as the LIMIT; if 0 or negative, no limit is applied.
  Default: 250

beeswax_server_heapsize = 53
  Maximum Java heap size (in megabytes) used by the Beeswax server. Note
  that setting HADOOP_HEAPSIZE in $HADOOP_CONF_DIR/hadoop-env.sh may
  override this setting.
  Default: 1000

beeswax_server_conn_timeout = 120
  Timeout in seconds for Thrift calls to the Beeswax service.
  Default: 120

beeswax_meta_server_port = 8003
  Configure the port the internal metastore daemon runs on. Used only if
  hive.metastore.local is true.
  Default: 8003

beeswax_meta_server_only = None
  Disable Beeswax as the query server. This is used when Beeswax is just
  used for talking to the meta store and Hue is using another query
  server. Just fill in an unused port.
  Default: None

local_examples_data_dir = /usr/share/hue/apps/beeswax/src/beeswax/../../data
  The path on the local filesystem that contains the Beeswax examples.
  Default: /usr/share/hue/apps/beeswax/src/beeswax/../../data

I am waiting for your reply.

thanks,
onur

Chris Conner

Feb 19, 2013, 2:57:40 PM
to hue-...@cloudera.org
Hey,

What version of CM are you running?

Romain Rigaux

Feb 19, 2013, 2:58:06 PM
to Chris Conner, hue-...@cloudera.org
Yes, as Chris is saying, if you use CM, manually editing '/etc/hive/conf/hive-site.xml' does not have an impact on Beeswax/Hue, since they are using '/var/run/cloudera-scm-agent/process/191-hue-HUE_SERVER/hive-conf', so the Hive metastore DB must be configured in CM.

Romain

turna...@gmail.com

Feb 19, 2013, 4:26:21 PM
to hue-...@cloudera.org
Hey,
I am using CM 4.1.3 at the moment. How can I configure the Hive metastore in CM? Is that configuration possible through the Beeswax section of the CM GUI? If so, how?

Thanks
Onur

Chris Conner

Feb 19, 2013, 5:01:35 PM
to hue-...@cloudera.org
In CM, you should have a configuration area that looks like this:


Is that where you are making the change for Beeswax?

turna...@gmail.com

Feb 19, 2013, 7:28:26 PM
to hue-...@cloudera.org
hi,

I configured it according to your screenshot, and now the Beeswax GUI works fine. Thank you a lot.
I built a table with the HBase shell. I can see it in Hue's file browser but not among the Beeswax tables. How can I see my HBase tables in Beeswax, in HBase table format?
My file browser looks like this, and the tables I built are cars and test. As far as I know, HBase holds data in the meta folder, but here they are in the hbase folder. I need a little help to figure this out. I am writing my thesis on this subject and I have to import a .csv file into HBase and then, as a first stage, query it with Hive.


thanks,
onur 

turna...@gmail.com

Feb 20, 2013, 5:19:41 AM
to hue-...@cloudera.org
hey,

I am still trying to figure out Beeswax. I installed Hue's sample database. I can see the tables, but when I run a query (including an example saved query) I am getting this error:

Driver returned: 1.  Errors: Hive history file=/tmp/hue/hive_job_log_hue_201302200154_1490517290.txt
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Job Submission failed with exception 'org.apache.hadoop.security.AccessControlException(Permission denied: user=admin, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:205)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:186)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:135)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4547)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4518)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:2880)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:2844)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:2823)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:639)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:417)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44096)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)
)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask

As far as I understand from this error, Hue's user does not have permission to access the data in HDFS.

Thanks
Onur

Chris Conner

Feb 20, 2013, 9:43:36 AM
to hue-...@cloudera.org
Hey Onur,

Unfortunately there is no way in Hue 2.1 to see HBase tables. In a future release (2.2, I think) there will be an HBase shell in Hue to see HBase tables; however, Beeswax will never have the functionality to directly see HBase tables. You can create an external Hive table that maps to the HBase table if you want to see the data through Beeswax. If that's something you'd like to know more about, I can send you the syntax for the external Hive table.
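
For reference, the mapping syntax looks roughly like the sketch below
(the Hive table, column, and column family names here are only examples,
and the hive-hbase-handler and HBase jars need to be available to Hive):

CREATE EXTERNAL TABLE hbase_cars (key STRING, val STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,vi:val")
TBLPROPERTIES ("hbase.table.name" = "cars");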

Thanks
Chris

Chris Conner

Feb 20, 2013, 9:46:10 AM
to hue-...@cloudera.org
Two things:

1.  Does the 'hive.metastore.warehouse.dir' exist already in HDFS?  I think you used "/user/beeswax/warehouse".
2.  What are the permissions on that directory?

Thanks
Chris

turna...@gmail.com

Feb 20, 2013, 12:46:19 PM
to hue-...@cloudera.org
Hey Chris,
I am a bit disappointed; I did not know that. I thought I could also see the tables I created with the HBase shell command line, and that if I create a table with Beeswax, that table would also be built like an HBase table on HDFS (eventually as HFiles) in a specific folder on each datanode. I also tried to create a table with Beeswax (I watched Cloudera's Beeswax tutorial as well) and I got the error that I posted. I want to ask how I can define column families and their related columns; as far as I can see, during the create-table process there is only an option to create columns (not column families).

I am a real rookie with HDFS. I tried to look at the HDFS files from one node (the master node); with "hadoop fs -ls" I am getting this message: ls: `.': No such file or directory
Also, in the /var/lib/hive/metastore directory, where I point my MySQL metastore, there is nothing. Is that normal?
But the default directory of Hue Beeswax, /var/lib/hive/hue_beeswax_metastore/metastore_db, has log, seg0, tmp, dblck, dbex, service.properties and similar Derby database files.
And where should that directory be, and under which directory? Under "bin" or "dfs" or "usr"?
I thought it should be in HDFS, but that command does not work out.

Under the HBase Web UI I can see my tables; it looks like this:

http://n1.example.com:60010/master-status

Master: n1.example.com:60000


Attributes

Attribute Name Value Description
HBase Version 0.92.1-cdh4.1.3, rUnknown HBase version and revision
HBase Compiled Sat Jan 26 17:11:38 PST 2013, jenkins When HBase version was compiled and by whom
Hadoop Version 2.0.0-cdh4.1.3, rdbc7a60f9a798ef63afb7f5b723dc9c02d5321e1 Hadoop version and revision
Hadoop Compiled Sat Jan 26 16:46:14 PST 2013, jenkins When Hadoop version was compiled and by whom
HBase Root Directory hdfs://n1.example.com:8020/hbase Location of HBase home directory
HBase Cluster ID eab1a97c-2d1b-4d7d-8315-dcaf1c151f8d Unique identifier generated for each HBase cluster
Load average 1 Average number of regions per regionserver. Naive computation.
Zookeeper Quorum n1.example.com:2181 Addresses of all registered ZK servers. For more, see zk dump.
Coprocessors [] Coprocessors currently loaded loaded by the master
HMaster Start Time Tue Feb 19 15:25:28 CET 2013 Date stamp of when this HMaster was started
HMaster Active Time Tue Feb 19 15:25:28 CET 2013 Date stamp of when this HMaster became active
Tasks

No tasks currently running on this node.
Tables

Catalog Table Description
-ROOT- The -ROOT- table holds references to all .META. regions.
.META. The .META. table holds references to all User Table regions
2 table(s) in set. [Details]

User Table Description
cars {NAME => 'cars', FAMILIES => [{NAME => 'vi', MIN_VERSIONS => '0'}]}
test {NAME => 'test', FAMILIES => [{NAME => 'cf1', MIN_VERSIONS => '0'}]}
Region Servers

ServerName Start time Load
n1.example.com,60020,1361283928017 Tue Feb 19 15:25:28 CET 2013 requestsPerSecond=0, numberOfOnlineRegions=1, usedHeapMB=33, maxHeapMB=65
n2.example.com,60020,1361284069894 Tue Feb 19 15:27:49 CET 2013 requestsPerSecond=0, numberOfOnlineRegions=1, usedHeapMB=27, maxHeapMB=185
n3.example.com,60020,1361284067501 Tue Feb 19 15:27:47 CET 2013 requestsPerSecond=0, numberOfOnlineRegions=0, usedHeapMB=30, maxHeapMB=185
n4.example.com,60020,1361284298009 Tue Feb 19 15:31:38 CET 2013 requestsPerSecond=0, numberOfOnlineRegions=2, usedHeapMB=27, maxHeapMB=185
Total: servers: 4 requestsPerSecond=0, numberOfOnlineRegions=4
Load is requests per second and count of regions loaded

Dead Region Servers

Regions in Transition

No regions in transition.

thank you for your attention .
Onur

turna...@gmail.com

Feb 20, 2013, 12:59:54 PM
to hue-...@cloudera.org
Hi Chris,

I tried a few things now and they worked:
[root@n1 ~]# hadoop fs -ls /hbase
Found 10 items
drwxr-xr-x - hbase hbase 0 2013-02-08 04:11 /hbase/-ROOT-
drwxr-xr-x - hbase hbase 0 2013-02-08 04:11 /hbase/.META.
drwxr-xr-x - hbase hbase 0 2013-02-08 04:12 /hbase/.corrupt
drwxr-xr-x - hbase hbase 0 2013-02-19 15:27 /hbase/.logs
drwxr-xr-x - hbase hbase 0 2013-02-19 15:28 /hbase/.oldlogs
drwxr-xr-x - hbase hbase 0 2013-02-16 22:20 /hbase/cars
-rw-r--r-- 3 hbase hbase 38 2013-02-08 04:11 /hbase/hbase.id
-rw-r--r-- 3 hbase hbase 3 2013-02-08 04:11 /hbase/hbase.version
drwxr-xr-x - hbase hbase 0 2013-02-19 15:26 /hbase/splitlog
drwxr-xr-x - hbase hbase 0 2013-02-16 22:10 /hbase/test
[root@n1 ~]# hadoop fs -ls /hbase/cars
Found 3 items
-rw-r--r-- 3 hbase hbase 509 2013-02-16 22:20 /hbase/cars/.tableinfo.0000000001
drwxr-xr-x - hbase hbase 0 2013-02-16 22:20 /hbase/cars/.tmp
drwxr-xr-x - hbase hbase 0 2013-02-18 04:39 /hbase/cars/7c91bdc9437420e2896525114c0a0499
[root@n1 ~]# hadoop fs -ls /
Found 3 items
drwxr-xr-x - hbase hbase 0 2013-02-16 22:20 /hbase
drwxrwxrwt - hdfs hdfs 0 2013-02-20 10:39 /tmp
drwxr-xr-x - hdfs supergroup 0 2013-02-08 04:14 /user


[root@n1 ~]# hadoop fs -ls /user/beeswax/warehouse
Found 2 items
drwxr-xr-x - hue hdfs 0 2013-02-20 10:38 /user/beeswax/warehouse/sample_07
drwxr-xr-x - hue hdfs 0 2013-02-20 10:39 /user/beeswax/warehouse/sample_08

[root@n1 ~]# hadoop fs -ls /user/beeswax/warehouse/sample_07
Found 1 items
-rw-r--r-- 3 hue hdfs 46055 2013-02-20 10:39 /user/beeswax/warehouse/sample_07/sample_07.csv

Now I can see all the tables I created on Hadoop, but I could not figure out these errors:
Driver returned: 1. Errors: Hive history file=/tmp/hue/hive_job_log_hue_201302191549_48687276.txt
FAILED: Error in metadata: MetaException(message:Got exception: org.apache.hadoop.security.AccessControlException Permission denied: user=admin, access=WRITE, inode="/user/beeswax/warehouse":hue:hive:drwxrwxr-x


at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:205)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:186)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:135)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4547)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4518)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:2880)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:2844)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:2823)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:639)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:417)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44096)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)
)

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask


and

thanks
Onur


Romain Rigaux

Feb 20, 2013, 1:26:14 PM
to turna...@gmail.com, hue-...@cloudera.org
Original error was because the user 'admin' did not have a home (/user/admin). This is fixed in the 2.2 version coming this month. In the meantime you can create it in the FileBrowser if you are logged in as the 'hdfs' user.

The second error is because '/user/beeswax/warehouse' should be chmoded to 1777 (cf https://ccp.cloudera.com/display/CDH4DOC/Hue+Installation#HueInstallation-HiveConfiguration).

About HBase, I think that as long as you register the jars it will work (Beeswax is similar to an embedded Hive client).

Romain

turna...@gmail.com

Feb 20, 2013, 2:42:27 PM
to hue-...@cloudera.org
Hi Romain,
Should I create a new directory, or manipulate my existing admin user? (If I just rename it to hdfs, will that not be enough?) If not, how exactly should I create the new folder with the File Browser GUI?


Thanks
Onur

Romain Rigaux

Feb 20, 2013, 2:51:01 PM
to turna...@gmail.com, hue-...@cloudera.org
The latest error will be solved if you create a 'hdfs' superuser in Hue, login as 'hdfs' in Hue and then go to FileBrowser and change the permissions of '/user/beeswax/warehouse' to 1777 (check all the boxes).

You can also do in a shell:

sudo -u hdfs chmod 1777 /user/beeswax/warehouse

Romain

turna...@gmail.com

Feb 20, 2013, 7:58:17 PM
to hue-...@cloudera.org
hey,

I created a new user 'hdfs' in the Hue GUI. Then I erased the Hue example tables in Hue, and after that I preferred to change the permission settings on the command line. When I tried to use that command in the shell:

sudo -u hdfs chmod 1777 /user/beeswax/warehouse
I got this error:
chmod: cannot access /user/beeswax/warehouse : No such file or directory

I also tried: hadoop fs -ls /user/beeswax/warehouse and got nothing back.

After I created the new hdfs user, I tried to build the Hue example tables again. Although I had erased these tables, I am getting this error: 'There was an error processing your request: Beeswax examples already installed.'


thank you for your attention

Onur

Romain Rigaux

Feb 20, 2013, 8:26:24 PM
to turna...@gmail.com, hue-...@cloudera.org
If it is not there can you create it?

sudo -u hdfs mkdir /user/beeswax/warehouse

sudo -u hdfs chmod 1777 /user/beeswax/warehouse

To reset the examples in Hue:
/usr/share/hue/bin/hue dbshell
sqlite> delete from beeswax_metainstall;

Romain

turna...@gmail.com

Feb 26, 2013, 7:14:59 AM
to hue-...@cloudera.org, turna...@gmail.com
Hello Romain,

I could not respond to your last post earlier; I was away from the keyboard.

I tried everything you said, but I am still getting more than one error.

I created a new user for Beeswax named "hdfs".

I had deleted the examples from Hue's metastore (sqlite> delete from beeswax_metainstall;).

After that I tried to create the warehouse directory, but I could not; I got errors.

Then I realized the directory already exists. I reinstalled the sample tables from Hue; while the sample tables were installing, I got an error,

but I can see the tables through Beeswax.

I tried to run an example query on the sample tables, but it did not work out:



13/02/26 03:28:15 INFO exec.HiveHistory: Hive history file=/tmp/hue/hive_job_log_hue_201302260328_1211673658.txt
13/02/26 03:28:15 INFO ql.Driver: <PERFLOG method=compile>
13/02/26 03:28:15 INFO parse.ParseDriver: Parsing command: SELECT s07.description, s07.total_emp, s08.total_emp, s07.salary
FROM
  sample_07 s07 JOIN 
  sample_08 s08
ON ( s07.code = s08.code )
WHERE
( s07.total_emp > s08.total_emp
 AND s07.salary > 100000 )
SORT BY s07.salary DESC
13/02/26 03:28:15 INFO parse.ParseDriver: Parse Completed
13/02/26 03:28:15 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
13/02/26 03:28:15 INFO parse.SemanticAnalyzer: Completed phase 1 of Semantic Analysis
13/02/26 03:28:15 INFO parse.SemanticAnalyzer: Get metadata for source tables
13/02/26 03:28:15 INFO metastore.HiveMetaStore: 6: get_table : db=default tbl=sample_07
13/02/26 03:28:15 INFO HiveMetaStore.audit: ugi=hdfs	ip=unknown-ip-addr	cmd=get_table : db=default tbl=sample_07	
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_07 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO metastore.HiveMetaStore: 6: get_table : db=default tbl=sample_08
13/02/26 03:28:15 INFO HiveMetaStore.audit: ugi=hdfs	ip=unknown-ip-addr	cmd=get_table : db=default tbl=sample_08	
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_08 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO parse.SemanticAnalyzer: Get metadata for subqueries
13/02/26 03:28:15 INFO parse.SemanticAnalyzer: Get metadata for destination tables
13/02/26 03:28:15 INFO parse.SemanticAnalyzer: Completed getting MetaData in Semantic Analysis
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_07 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_08 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for FS(12)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for OP(11)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for RS(10)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for SEL(9)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for FIL(8)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Pushdown Predicates of FIL For Alias : s07
13/02/26 03:28:15 INFO ppd.OpProcFactory: 	(_col3 > 100000)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for JOIN(7)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Pushdown Predicates of JOIN For Alias : s07
13/02/26 03:28:15 INFO ppd.OpProcFactory: 	(VALUE._col3 > 100000)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for RS(5)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Pushdown Predicates of RS For Alias : s07
13/02/26 03:28:15 INFO ppd.OpProcFactory: 	(salary > 100000)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for TS(3)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Pushdown Predicates of TS For Alias : s07
13/02/26 03:28:15 INFO ppd.OpProcFactory: 	(salary > 100000)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for RS(6)
13/02/26 03:28:15 INFO ppd.OpProcFactory: Processing for TS(4)
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_07 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_07 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_07 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_08 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_08 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO hive.log: DDL: struct sample_08 { string code, string description, i32 total_emp, i32 salary}
13/02/26 03:28:15 INFO physical.MetadataOnlyOptimizer: Looking for table scans where optimization is applicable
13/02/26 03:28:15 INFO physical.MetadataOnlyOptimizer: Found 0 metadata only table scans
13/02/26 03:28:15 INFO physical.MetadataOnlyOptimizer: Looking for table scans where optimization is applicable
13/02/26 03:28:15 INFO physical.MetadataOnlyOptimizer: Found 0 metadata only table scans
13/02/26 03:28:15 INFO parse.SemanticAnalyzer: Completed plan generation
13/02/26 03:28:15 INFO ql.Driver: Semantic Analysis Completed
13/02/26 03:28:15 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:description, type:string, comment:null), FieldSchema(name:total_emp, type:int, comment:null), FieldSchema(name:total_emp, type:int, comment:null), FieldSchema(name:salary, type:int, comment:null)], properties:null)
13/02/26 03:28:15 INFO ql.Driver: </PERFLOG method=compile start=1361878095433 end=1361878095784 duration=351>
Hive history file=/tmp/hue/hive_job_log_hue_201302260328_755410322.txt
13/02/26 03:28:15 INFO exec.HiveHistory: Hive history file=/tmp/hue/hive_job_log_hue_201302260328_755410322.txt
13/02/26 03:28:15 INFO ql.Driver: <PERFLOG method=Driver.execute>
13/02/26 03:28:15 INFO ql.Driver: Starting command: SELECT s07.description, s07.total_emp, s08.total_emp, s07.salary
FROM
  sample_07 s07 JOIN 
  sample_08 s08
ON ( s07.code = s08.code )
WHERE
( s07.total_emp > s08.total_emp
 AND s07.salary > 100000 )
SORT BY s07.salary DESC
Total MapReduce jobs = 2
13/02/26 03:28:15 INFO ql.Driver: Total MapReduce jobs = 2
13/02/26 03:28:15 INFO ql.Driver: </PERFLOG method=TimeToSubmit end=1361878095885>
Launching Job 1 out of 2
13/02/26 03:28:15 INFO ql.Driver: Launching Job 1 out of 2
13/02/26 03:28:16 INFO exec.Utilities: Cache Content Summary for hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_08 length: 46069 file count: 1 directory count: 1
13/02/26 03:28:16 INFO exec.Utilities: Cache Content Summary for hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07 length: 46055 file count: 1 directory count: 1
13/02/26 03:28:16 INFO exec.ExecDriver: BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=92124
Number of reduce tasks not specified. Estimated from input data size: 1
13/02/26 03:28:16 INFO exec.Task: Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
13/02/26 03:28:16 INFO exec.Task: In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
13/02/26 03:28:16 INFO exec.Task:   set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
13/02/26 03:28:16 INFO exec.Task: In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
13/02/26 03:28:16 INFO exec.Task:   set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
13/02/26 03:28:16 INFO exec.Task: In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
13/02/26 03:28:16 INFO exec.Task:   set mapred.reduce.tasks=<number>
13/02/26 03:28:16 INFO exec.ExecDriver: Using org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
13/02/26 03:28:16 INFO exec.ExecDriver: adding libjars: file:///usr/lib/hive/lib/hive-builtins-0.9.0-cdh4.1.3.jar
13/02/26 03:28:16 INFO exec.ExecDriver: Processing alias s07
13/02/26 03:28:16 INFO exec.ExecDriver: Adding input file hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07
13/02/26 03:28:16 INFO exec.Utilities: Content Summary hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07length: 46055 num files: 1 num directories: 1
13/02/26 03:28:16 INFO exec.ExecDriver: Processing alias s08
13/02/26 03:28:16 INFO exec.ExecDriver: Adding input file hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_08
13/02/26 03:28:16 INFO exec.Utilities: Content Summary hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_08length: 46069 num files: 1 num directories: 1
13/02/26 03:28:17 INFO exec.ExecDriver: Making Temp Directory: hdfs://n1.example.com:8020/tmp/hive-beeswax-hdfs/hive_2013-02-26_03-28-15_434_7027909062045907076/-mr-10002
13/02/26 03:28:21 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/02/26 03:28:22 INFO io.CombineHiveInputFormat: CombineHiveInputSplit creating pool for hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07; using filter path hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07
13/02/26 03:28:22 INFO io.CombineHiveInputFormat: CombineHiveInputSplit creating pool for hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_08; using filter path hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_08
13/02/26 03:28:22 INFO mapred.FileInputFormat: Total input paths to process : 2
13/02/26 03:28:22 INFO io.CombineHiveInputFormat: number of splits 2
Starting Job = job_201302210135_0001, Tracking URL = http://n1.example.com:50030/jobdetails.jsp?jobid=job_201302210135_0001
13/02/26 03:28:32 INFO exec.Task: Starting Job = job_201302210135_0001, Tracking URL = http://n1.example.com:50030/jobdetails.jsp?jobid=job_201302210135_0001
Kill Command = /usr/lib/hadoop/bin/hadoop job  -Dmapred.job.tracker=n1.example.com:8021 -kill job_201302210135_0001
13/02/26 03:28:32 INFO exec.Task: Kill Command = /usr/lib/hadoop/bin/hadoop job  -Dmapred.job.tracker=n1.example.com:8021 -kill job_201302210135_0001
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
13/02/26 03:28:49 INFO exec.Task: Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
13/02/26 03:28:49 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-02-26 03:28:49,177 Stage-1 map = 0%,  reduce = 0%
13/02/26 03:28:49 INFO exec.Task: 2013-02-26 03:28:49,177 Stage-1 map = 0%,  reduce = 0%
13/02/26 03:29:49 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-02-26 03:29:49,548 Stage-1 map = 0%,  reduce = 0%
13/02/26 03:29:49 INFO exec.Task: 2013-02-26 03:29:49,548 Stage-1 map = 0%,  reduce = 0%
13/02/26 03:30:37 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-02-26 03:30:37,517 Stage-1 map = 100%,  reduce = 100%
13/02/26 03:30:37 INFO exec.Task: 2013-02-26 03:30:37,517 Stage-1 map = 100%,  reduce = 100%
13/02/26 03:30:37 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
Ended Job = job_201302210135_0001 with errors
13/02/26 03:30:37 ERROR exec.Task: Ended Job = job_201302210135_0001 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
13/02/26 03:30:37 ERROR ql.Driver: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
13/02/26 03:30:37 INFO ql.Driver: </PERFLOG method=Driver.execute start=1361878095875 end=1361878237752 duration=141877>
MapReduce Jobs Launched: 
13/02/26 03:30:37 INFO ql.Driver: MapReduce Jobs Launched: 
Job 0: Map: 2  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
13/02/26 03:30:37 INFO ql.Driver: Job 0: Map: 2  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
13/02/26 03:30:37 INFO ql.Driver: Total MapReduce CPU Time Spent: 0 msec
13/02/26 03:30:38 ERROR beeswax.BeeswaxServiceImpl: Exception while processing query
BeeswaxException(message:Driver returned: 2.  Errors: Hive history file=/tmp/hue/hive_job_log_hue_201302260328_755410322.txt
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201302210135_0001, Tracking URL = http://n1.example.com:50030/jobdetails.jsp?jobid=job_201302210135_0001
Kill Command = /usr/lib/hadoop/bin/hadoop job  -Dmapred.job.tracker=n1.example.com:8021 -kill job_201302210135_0001
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2013-02-26 03:28:49,177 Stage-1 map = 0%,  reduce = 0%
2013-02-26 03:29:49,548 Stage-1 map = 0%,  reduce = 0%
2013-02-26 03:30:37,517 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201302210135_0001 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched: 
Job 0: Map: 2  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
, log_context:e7199542-f187-458c-b3dd-887560485a81, handle:QueryHandle(id:e7199542-f187-458c-b3dd-887560485a81, log_context:e7199542-f187-458c-b3dd-887560485a81), SQLState:     )
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState.execute(BeeswaxServiceImpl.java:319)
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1$1.run(BeeswaxServiceImpl.java:577)
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1$1.run(BeeswaxServiceImpl.java:566)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:337)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1312)
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1.run(BeeswaxServiceImpl.java:566)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
13/02/26 03:30:39 ERROR security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:BeeswaxException(message:Driver returned: 2.  Errors: Hive history file=/tmp/hue/hive_job_log_hue_201302260328_755410322.txt
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201302210135_0001, Tracking URL = http://n1.example.com:50030/jobdetails.jsp?jobid=job_201302210135_0001
Kill Command = /usr/lib/hadoop/bin/hadoop job  -Dmapred.job.tracker=n1.example.com:8021 -kill job_201302210135_0001
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2013-02-26 03:28:49,177 Stage-1 map = 0%,  reduce = 0%
2013-02-26 03:29:49,548 Stage-1 map = 0%,  reduce = 0%
2013-02-26 03:30:37,517 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201302210135_0001 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched: 
Job 0: Map: 2  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
, log_context:e7199542-f187-458c-b3dd-887560485a81, handle:QueryHandle(id:e7199542-f187-458c-b3dd-887560485a81, log_context:e7199542-f187-458c-b3dd-887560485a81), SQLState:     )
13/02/26 03:30:39 ERROR beeswax.BeeswaxServiceImpl: Caught BeeswaxException
BeeswaxException(message:Driver returned: 2.  Errors: Hive history file=/tmp/hue/hive_job_log_hue_201302260328_755410322.txt
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201302210135_0001, Tracking URL = http://n1.example.com:50030/jobdetails.jsp?jobid=job_201302210135_0001
Kill Command = /usr/lib/hadoop/bin/hadoop job  -Dmapred.job.tracker=n1.example.com:8021 -kill job_201302210135_0001
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2013-02-26 03:28:49,177 Stage-1 map = 0%,  reduce = 0%
2013-02-26 03:29:49,548 Stage-1 map = 0%,  reduce = 0%
2013-02-26 03:30:37,517 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201302210135_0001 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched: 
Job 0: Map: 2  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
, log_context:e7199542-f187-458c-b3dd-887560485a81, handle:QueryHandle(id:e7199542-f187-458c-b3dd-887560485a81, log_context:e7199542-f187-458c-b3dd-887560485a81), SQLState:     )
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState.execute(BeeswaxServiceImpl.java:319)
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1$1.run(BeeswaxServiceImpl.java:577)
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1$1.run(BeeswaxServiceImpl.java:566)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:337)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1312)
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1.run(BeeswaxServiceImpl.java:566)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Duration: 0:02:12
Ended: 02/26/13 03:30:36
ID: job_201302210135_0001
User: hdfs
Mapred Input Dir: hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07
                  hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_08
Mapred Input Format Class: org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
Mapred Mapper Class: org.apache.hadoop.hive.ql.exec.ExecMapper
Mapred Output Format Class: org.apache.hadoop.hive.ql.io.HiveOutputFormatImpl
Mapred Reducer Class: org.apache.hadoop.hive.ql.exec.ExecReducer
Maps: 0 of 2
Reduces: 0 of 1
Started: 02/26/13 03:28:24
Status: FAILED
 
I cannot import my own database until I fix these errors. On the other hand, I can now see the sample databases on HDFS.


Romain Rigaux

unread,
Feb 26, 2013, 12:30:27 PM2/26/13
to turna...@gmail.com, hue-...@cloudera.org
So the 'hadoop fs' part was missing from the command:
sudo -u hdfs hadoop fs -mkdir /user/beeswax/warehouse
sudo -u hdfs hadoop fs -chmod 1777 /user/beeswax/warehouse

But as the files are already in the warehouse, you don't need to create it. If the tables appear in the 'Tables' tab you are fine (if not, delete them with 'sudo -u hdfs hadoop fs -rmr /user/beeswax/warehouse/*', do the same in the sqlite DB, and reinstall them in Hue by clicking on the button).

However, looking at your logs, it seems that the tables are fine but the underlying MapReduce jobs are failing. You need to look at their logs by clicking on one of the IDs below "MR JOBS" while the query is running.
You can also follow this link from the logs:


Starting Job = job_201302210135_0001, Tracking URL = http://n1.example.com:50030/jobdetails.jsp?jobid=job_201302210135_0001

Romain

turna...@gmail.com

unread,
Feb 27, 2013, 2:26:13 PM2/27/13
to hue-...@cloudera.org, turna...@gmail.com
Hi Romain,

I tried the last command to change the permissions, and that works fine. Thank you very much.
I checked all the logs while the query was running. I guess MapReduce is not working properly: every time I run a query, the jobs fail. They look like this.

This is the output of the logs:
Starting Job = job_201302210135_0002, Tracking URL = http://n1.example.com:50030/jobdetails.jsp?jobid=job_201302210135_0002
Kill Command = /usr/lib/hadoop/bin/hadoop job  -Dmapred.job.tracker=n1.example.com:8021 -kill job_201302210135_0002
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2013-02-27 10:56:24,583 Stage-1 map = 0%,  reduce = 0%
2013-02-27 10:57:24,870 Stage-1 map = 0%,  reduce = 0%
2013-02-27 10:57:50,623 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201302210135_0002 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched: 
Job 0: Map: 2  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
, log_context:f8067762-4935-472e-9437-8f550b65e54a, handle:QueryHandle(id:f8067762-4935-472e-9437-8f550b65e54a, log_context:f8067762-4935-472e-9437-8f550b65e54a), SQLState:     )
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState.execute(BeeswaxServiceImpl.java:319)
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1$1.run(BeeswaxServiceImpl.java:577)
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1$1.run(BeeswaxServiceImpl.java:566)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:337)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1312)
	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1.run(BeeswaxServiceImpl.java:566)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)


Driver returned: 2.  Errors: Hive history file=/tmp/hue/hive_job_log_hue_201302271055_1265331267.txt
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201302210135_0002, Tracking URL = http://n1.example.com:50030/jobdetails.jsp?jobid=job_201302210135_0002
Kill Command = /usr/lib/hadoop/bin/hadoop job  -Dmapred.job.tracker=n1.example.com:8021 -kill job_201302210135_0002
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2013-02-27 10:56:24,583 Stage-1 map = 0%,  reduce = 0%
2013-02-27 10:57:24,870 Stage-1 map = 0%,  reduce = 0%
2013-02-27 10:57:50,623 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201302210135_0002 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched: 
Job 0: Map: 2  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec



These are from the MR job:

Failed Tasks

Task        Type
m_000000    MAP
m_000001    MAP

Recent Tasks

Task        Type
m_000002    JOB_CLEANUP
m_000003    JOB_SETUP

That's from one of the tasks above:


Attempt ID | Progress | State  | Task Tracker                                     | Start Time        | End Time          | Output Size | Phase
000000_0   | 100%     | failed | tracker_n3.example.com:localhost/127.0.0.1:36555 | 02/27/13 11:00:06 | 02/27/13 11:00:19 | -1          | CLEANUP
000000_1   | 100%     | failed | tracker_n2.example.com:localhost/127.0.0.1:46543 | 02/27/13 11:00:20 | 02/27/13 11:00:32 | -1          | CLEANUP
000000_2   | 100%     | failed | tracker_n4.example.com:localhost/127.0.0.1:41410 | 02/27/13 11:04:10 | 02/27/13 11:04:19 | -1          | CLEANUP
000000_3   | 0%       | killed | tracker_n1.example.com:localhost/127.0.0.1:59170 | 02/27/13 10:57:37 | 02/27/13 10:57:46 | -1          | MAP
(Shuffle Finish, Sort Finish and Map Finish columns were empty)



Hadoop job_201302210135_0002 on n1

User: hdfs
Job Name: SELECT s07.description, s07.salary, s...DESC(Stage-1)
Job File: hdfs://n1.example.com:8020/user/hdfs/.staging/job_201302210135_0002/job.xml
Submit Host: n1.example.com
Submit Host Address: 192.168.0.241
Job-ACLs: All users are allowed
Job Setup: Successful
Status: Failed
Failure Info:NA
Started at: Wed Feb 27 19:56:16 CET 2013
Failed at: Wed Feb 27 19:57:50 CET 2013
Failed in: 1mins, 34sec
Job Cleanup: Successful

Kind   | % Complete | Num Tasks | Pending | Running | Complete | Killed | Failed/Killed Task Attempts
map    | 100.00%    | 2         | 0       | 0       | 0        | 2      | 7 / 1
reduce | 100.00%    | 1         | 0       | 0       | 0        | 1      | 0 / 0


Counter (Job Counters)                                             | Map | Reduce | Total
Failed map tasks                                                   | 0   | 0      | 1
Launched map tasks                                                 | 0   | 0      | 8
Data-local map tasks                                               | 0   | 0      | 2
Rack-local map tasks                                               | 0   | 0      | 6
Total time spent by all maps in occupied slots (ms)                | 0   | 0      | 58,457
Total time spent by all reduces in occupied slots (ms)             | 0   | 0      | 0
Total time spent by all maps waiting after reserving slots (ms)    | 0   | 0      | 0
Total time spent by all reduces waiting after reserving slots (ms) | 0   | 0      | 0


Map Completion Graph - close 

Reduce Completion Graph - close 

I suppose I should correct the MapReduce configuration, but I do not know why this is happening.


thank you for your attention 
Onur Turna

Romain Rigaux

unread,
Feb 27, 2013, 2:43:07 PM2/27/13
to turna...@gmail.com, hue-...@cloudera.org
If MapReduce is not configured, you can test it by running an example like described in https://ccp.cloudera.com/display/CDH4DOC/Installing+CDH4+on+a+Single+Linux+Node+in+Pseudo-distributed+Mode#InstallingCDH4onaSingleLinuxNodeinPseudo-distributedMode-RunninganexampleapplicationwithMRv1
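For reference, the MRv1 example on that page boils down to roughly the following shell commands (run as a user that already has an HDFS home directory; the example jar path matches the one used later in this thread, but exact paths can differ between installs):

  hadoop fs -mkdir input
  hadoop fs -put /etc/hadoop/conf/*.xml input
  hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar grep input output 'dfs[a-z.]+'
  hadoop fs -cat output/part-00000 | head

If that grep job succeeds, MapReduce itself is fine and the problem is specific to the Hive/Beeswax jobs.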

About Beeswax, it is nice too look at the tasks of the jobs in JobBrowser but could you drill down to the 'log' level of a failed task? (the log from the MapReduce task itself should show what's going wrong)

Romain

turna...@gmail.com

unread,
Feb 27, 2013, 3:50:56 PM2/27/13
to hue-...@cloudera.org, turna...@gmail.com

Hi Romain,


I did not configure MapReduce myself; the current MapReduce configuration is Cloudera's default.
I checked where you told me to. It looks like this:


FAILED MAP task list for 0002_1361991376229_hdfs

Task Id                          | Start Time     | Finish Time            | Error
task_201302210135_0002_m_000000  | 27/02 20:00:06 | 27/02 20:00:19 (13sec) | Error: Java heap space
task_201302210135_0002_m_000000  | 27/02 20:00:20 | 27/02 20:00:32 (12sec) | Error: Java heap space
task_201302210135_0002_m_000000  | 27/02 20:04:10 | 27/02 20:04:19 (8sec)  | Error: Java heap space
task_201302210135_0002_m_000001  | 27/02 20:03:45 | 27/02 20:03:57 (12sec) | Error: Java heap space
task_201302210135_0002_m_000001  | 27/02 20:00:20 | 27/02 20:00:31 (11sec) | Error: Java heap space
task_201302210135_0002_m_000001  | 27/02 19:56:55 | 27/02 19:57:36 (41sec) | Error: Java heap space
task_201302210135_0002_m_000001  | 27/02 20:01:17 | 27/02 20:01:26 (8sec)  | Error: Java heap space


Attempt                               | Start Time     | Finish Time            | Machine        | Error                  | Logs                      | Counters
attempt_201302210135_0002_m_000000_0 | 27/02 20:00:06 | 27/02 20:00:19 (13sec) | n3.example.com | Error: Java heap space | Last 4KB / Last 8KB / All | 0
attempt_201302210135_0002_m_000000_1 | 27/02 20:00:20 | 27/02 20:00:32 (12sec) | n2.example.com | Error: Java heap space | Last 4KB / Last 8KB / All | 0
attempt_201302210135_0002_m_000000_2 | 27/02 20:04:10 | 27/02 20:04:19 (8sec)  | n4.example.com | Error: Java heap space | Last 4KB / Last 8KB / All | 0
attempt_201302210135_0002_m_000000_3 | 27/02 19:57:37 | 27/02 19:57:46 (9sec)  | n1.example.com |                        | Last 4KB / Last 8KB / All | 0





Why could there be a heap space error? Could it be because of insufficient hardware in the cluster? All my cluster nodes (a 4-VM cluster) run on VMware on my personal laptop, which has 8 GB of RAM. They look like this:



Name           | IP            | Rack     | CDH Version | Health | Last Heartbeat | Cores | Load Average   | Disk Usage         | Physical Memory     | Swap Space
n1.example.com | 192.168.0.241 | /default | CDH4        | Good   | 9.51s ago      | 2     | 0,84 0,84 0,76 | 5.1 GiB / 37.2 GiB | 1.4 GiB / 1.4 GiB   | 549.8 MiB / 2.9 GiB
n2.example.com | 192.168.0.242 | /default | CDH4        | Good   | 1.90s ago      | 2     | 0,00 0,00 0,00 | 4.7 GiB / 37.2 GiB | 462.3 MiB / 1.4 GiB | 0 B / 2.9 GiB
n3.example.com | 192.168.0.243 | /default | CDH4        | Good   | 3.33s ago      | 2     | 0,05 0,01 0,00 | 4.7 GiB / 37.2 GiB | 483.5 MiB / 1.4 GiB | 0 B / 2.9 GiB
n4.example.com | 192.168.0.246 | /default | CDH4        | Good   | 6.52s ago      | 2     | 0,00 0,00 0,00 | 4.7 GiB / 37.2 GiB | 476.3 MiB / 1.4 GiB | 0 B / 2.9 GiB




In addition, I also looked at the syslog, which looks like this:



2013-02-27 19:57:41,710 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-02-27 19:57:43,371 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /mapred/local/taskTracker/distcache/1804431985416186481_538532255_486741148/n1.example.com/tmp/hive-beeswax-hdfs/hive_2013-02-27_10-55-58_866_1132759753206554877/-mr-10004/eee2a09a-a215-43c3-aab5-1de316044d27 <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0002/attempt_201302210135_0002_m_000000_3/work/HIVE_PLANeee2a09a-a215-43c3-aab5-1de316044d27
2013-02-27 19:57:43,619 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0002/jars/.job.jar.crc <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0002/attempt_201302210135_0002_m_000000_3/work/.job.jar.crc
2013-02-27 19:57:43,622 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0002/jars/job.jar <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0002/attempt_201302210135_0002_m_000000_3/work/job.jar
2013-02-27 19:57:43,700 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
2013-02-27 19:57:43,719 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2013-02-27 19:57:46,132 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2013-02-27 19:57:46,242 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@101a0ae6
2013-02-27 19:57:46,564 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-02-27 19:57:46,568 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.IOException: Failed on local exception: java.io.IOException; Host Details : local host is: "n1.example.com/192.168.0.241"; destination host is: "n1.example.com":8020; 


Thanks,
Onur Turna

Romain Rigaux

unread,
Feb 27, 2013, 4:21:00 PM2/27/13
to turna...@gmail.com, hue-...@cloudera.org
Hmm, that's a lot of VMs for one physical machine (I personally use a pseudo-distributed cluster).

What does it say when you click on the 'Last 4KB' of a failed task with the heap space error?

Error: Java heap space Last 4KB
Last 8KB
All


Could you please try to run the Hadoop example? https://ccp.cloudera.com/display/CDH4DOC/Installing+CDH4+on+a+Single+Linux+Node+in+Pseudo-distributed+Mode#InstallingCDH4onaSingleLinuxNodeinPseudo-distributedMode-RunninganexampleapplicationwithMRv1

About this one I don't know. Maybe the JobTracker logs have more information, or a proxy user setting is missing.

2013-02-27 19:57:46,568 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.IOException: Failed on local exception: java.io.IOException; Host Details : local host is: "n1.example.com/192.168.0.241"; destination host is: "n1.example.com":8020; 
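If it does turn out to be a missing proxy-user setting, the entries usually involved live in core-site.xml on the cluster and look roughly like the sketch below (the wildcard values are only an illustration; in a real setup they should be restricted to the Hue host and group):

  <property>
    <name>hadoop.proxyuser.hue.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hue.groups</name>
    <value>*</value>
  </property>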

Romain

turna...@gmail.com

unread,
Feb 28, 2013, 5:32:34 PM2/28/13
to hue-...@cloudera.org, turna...@gmail.com
Hi,
I want to run a performance test; that's why I need as large a cluster as possible on my computer. As far as I know, performance depends a lot on the number of nodes in the cluster.


Hadoop job_201302210135_0003 failures on n1

Attempt                               | Task                             | Machine        | State  | Error                  | Logs
attempt_201302210135_0003_m_000000_0 | task_201302210135_0003_m_000000  | n3.example.com | FAILED | Error: Java heap space | Last 4KB / Last 8KB / All
attempt_201302210135_0003_m_000000_1 | task_201302210135_0003_m_000000  | n1.example.com | FAILED | Error: Java heap space | Last 4KB / Last 8KB / All
attempt_201302210135_0003_m_000000_2 | task_201302210135_0003_m_000000  | n2.example.com | FAILED | Error: Java heap space | Last 4KB / Last 8KB / All
attempt_201302210135_0003_m_000000_3 |


When I clicked "All", the first one is:

Task Logs: 'attempt_201302210135_0003_m_000000_0'



stdout logs



stderr logs



syslog logs
2013-02-28 23:01:24,052 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-02-28 23:01:24,908 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /mapred/local/taskTracker/distcache/5651432844641255173_-1826301297_584014340/n1.example.com/tmp/hive-beeswax-hdfs/hive_2013-02-28_13-57-19_791_3663977264499258484/-mr-10003/fb822c07-e95c-413f-b2b5-b6edb08f63c6 <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_0/work/HIVE_PLANfb822c07-e95c-413f-b2b5-b6edb08f63c6
2013-02-28 23:01:24,918 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/jars/.job.jar.crc <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_0/work/.job.jar.crc
2013-02-28 23:01:24,921 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/jars/job.jar <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_0/work/job.jar
2013-02-28 23:01:24,974 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
2013-02-28 23:01:24,975 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2013-02-28 23:01:25,622 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2013-02-28 23:01:25,635 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3e152f4
2013-02-28 23:01:26,156 WARN org.apache.hadoop.hive.conf.HiveConf: hive-site.xml not found on CLASSPATH
2013-02-28 23:01:26,436 INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07/sample_07.csv
2013-02-28 23:01:26,436 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and  BYTES_READ as counter name instead
2013-02-28 23:01:26,441 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
2013-02-28 23:01:26,452 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 50
2013-02-28 23:01:26,629 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-02-28 23:01:26,634 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:797)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:385)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
	at org.apache.hadoop.mapred.Child.main(Child.java:262)


The second one:

Task Logs: 'attempt_201302210135_0003_m_000000_1'



stdout logs



stderr logs



syslog logs
2013-02-28 22:57:53,287 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-02-28 22:57:54,499 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /mapred/local/taskTracker/distcache/4869933896674749018_-1826301297_584014340/n1.example.com/tmp/hive-beeswax-hdfs/hive_2013-02-28_13-57-19_791_3663977264499258484/-mr-10003/fb822c07-e95c-413f-b2b5-b6edb08f63c6 <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_1/work/HIVE_PLANfb822c07-e95c-413f-b2b5-b6edb08f63c6
2013-02-28 22:57:54,506 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/jars/.job.jar.crc <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_1/work/.job.jar.crc
2013-02-28 22:57:54,508 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/jars/job.jar <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_1/work/job.jar
2013-02-28 22:57:54,558 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
2013-02-28 22:57:54,560 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2013-02-28 22:57:55,155 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2013-02-28 22:57:55,170 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@2326a29c
2013-02-28 22:57:55,881 WARN org.apache.hadoop.hive.conf.HiveConf: hive-site.xml not found on CLASSPATH
2013-02-28 22:57:56,469 INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07/sample_07.csv
2013-02-28 22:57:56,469 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and  BYTES_READ as counter name instead
2013-02-28 22:57:56,475 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
2013-02-28 22:57:56,499 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 50
2013-02-28 22:57:56,676 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-02-28 22:57:56,680 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:797)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:385)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
	at org.apache.hadoop.mapred.Child.main(Child.java:262)

The third one:

Task Logs: 'attempt_201302210135_0003_m_000000_2'



stdout logs



stderr logs



syslog logs
2013-02-28 23:01:43,350 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-02-28 23:01:44,477 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /mapred/local/taskTracker/distcache/-6064914283392164676_-1826301297_584014340/n1.example.com/tmp/hive-beeswax-hdfs/hive_2013-02-28_13-57-19_791_3663977264499258484/-mr-10003/fb822c07-e95c-413f-b2b5-b6edb08f63c6 <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_2/work/HIVE_PLANfb822c07-e95c-413f-b2b5-b6edb08f63c6
2013-02-28 23:01:44,510 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/jars/.job.jar.crc <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_2/work/.job.jar.crc
2013-02-28 23:01:44,528 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/jars/job.jar <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_2/work/job.jar
2013-02-28 23:01:44,659 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
2013-02-28 23:01:44,661 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2013-02-28 23:01:45,147 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2013-02-28 23:01:45,169 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@7d15d06c
2013-02-28 23:01:45,535 WARN org.apache.hadoop.hive.conf.HiveConf: hive-site.xml not found on CLASSPATH
2013-02-28 23:01:45,842 INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07/sample_07.csv
2013-02-28 23:01:45,843 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and  BYTES_READ as counter name instead
2013-02-28 23:01:45,850 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
2013-02-28 23:01:45,866 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 50
2013-02-28 23:01:46,034 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-02-28 23:01:46,037 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:797)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:385)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
	at org.apache.hadoop.mapred.Child.main(Child.java:262)


And the last one:

Task Logs: 'attempt_201302210135_0003_m_000000_3'



stdout logs



stderr logs



syslog logs
2013-02-28 23:05:29,916 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-02-28 23:05:30,877 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /mapred/local/taskTracker/distcache/2824718158320928215_-1826301297_584014340/n1.example.com/tmp/hive-beeswax-hdfs/hive_2013-02-28_13-57-19_791_3663977264499258484/-mr-10003/fb822c07-e95c-413f-b2b5-b6edb08f63c6 <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_3/work/HIVE_PLANfb822c07-e95c-413f-b2b5-b6edb08f63c6
2013-02-28 23:05:30,920 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/jars/.job.jar.crc <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_3/work/.job.jar.crc
2013-02-28 23:05:30,924 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/jars/job.jar <- /mapred/local/taskTracker/hdfs/jobcache/job_201302210135_0003/attempt_201302210135_0003_m_000000_3/work/job.jar
2013-02-28 23:05:30,989 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
2013-02-28 23:05:30,990 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2013-02-28 23:05:31,447 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2013-02-28 23:05:31,463 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@67071c84
2013-02-28 23:05:31,819 WARN org.apache.hadoop.hive.conf.HiveConf: hive-site.xml not found on CLASSPATH
2013-02-28 23:05:32,099 INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07/sample_07.csv
2013-02-28 23:05:32,100 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and  BYTES_READ as counter name instead
2013-02-28 23:05:32,103 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
2013-02-28 23:05:32,111 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 50
2013-02-28 23:05:32,282 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-02-28 23:05:32,286 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:797)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:385)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
	at org.apache.hadoop.mapred.Child.main(Child.java:262)




Also:

I followed the tutorial from the Cloudera website that you pointed me to ( https://ccp.cloudera.com/display/CDH4DOC/Installing+CDH4+on+a+Single+Linux+Node+in+Pseudo-distributed+Mode#InstallingCDH4onaSingleLinuxNodeinPseudo-distributedMode-RunninganexampleapplicationwithMRv1 ). As far as I understood it, I created a user 'onur', but I could not create a new directory (input) as the user onur.

Actually, I guess I have to learn more about Hadoop commands. If you know of a hands-on tutorial for Hadoop and MapReduce, I would appreciate it.


thank you for your attention
Onur Turna



...

turna...@gmail.com

unread,
Mar 1, 2013, 5:14:13 PM3/1/13
to hue-...@cloudera.org, turna...@gmail.com
Hi,

I tried MapReduce following that manual. When I try to put the XML files from the Hadoop configuration into HDFS, I get a permission denied error, so I cannot test the MapReduce functionality. I suppose I have to create a new Hadoop user on my CentOS machine?
If I have to create a new Hadoop user on my CentOS machine, how should I do that?


thanks
onur turna

Romain Rigaux

unread,
Mar 1, 2013, 6:29:03 PM3/1/13
to turna...@gmail.com, hue-...@cloudera.org
The memory problem is a well-known one; you can find how to bump the heap size at:
http://stackoverflow.com/questions/8464048/out-of-memory-error-in-hadoop
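Your task logs hint at the usual combination: io.sort.mb = 50 while the task JVM heap is too small to allocate that sort buffer. A minimal sketch of the two knobs discussed at that link, set in mapred-site.xml (the exact values below are only examples):

  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m</value>
  </property>
  <property>
    <name>io.sort.mb</name>
    <value>32</value>
  </property>

Either raising the child heap or lowering io.sort.mb (or both) should let the map tasks start.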

For the second question, you need to create a Unix user 'onur'. e.g.

adduser onur
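If that user also needs to write to HDFS (as in the example), a typical follow-up is to give it a home directory there; a sketch using the user name from this thread:

  sudo -u hdfs hadoop fs -mkdir /user/onur
  sudo -u hdfs hadoop fs -chown onur /user/onur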


Romain

turna...@gmail.com

unread,
Mar 1, 2013, 7:35:14 PM3/1/13
to hue-...@cloudera.org, turna...@gmail.com

I tried to run the MapReduce example. I get the same error:
[root@n1 ~]# sudo -u hdfs /usr/bin/hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar grep /user/onur/input /user/onur/output 'dfs[a-z.]+'
13/03/02 01:26:55 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/03/02 01:26:55 INFO mapred.FileInputFormat: Total input paths to process : 3
13/03/02 01:26:56 INFO mapred.JobClient: Running job: job_201302210135_0006
13/03/02 01:26:57 INFO mapred.JobClient:  map 0% reduce 0%
13/03/02 01:27:17 INFO mapred.JobClient: Task Id : attempt_201302210135_0006_m_000000_0, Status : FAILED
Error: Java heap space
13/03/02 01:27:19 INFO mapred.JobClient: Task Id : attempt_201302210135_0006_m_000001_0, Status : FAILED
Error: Java heap space
13/03/02 01:27:27 INFO mapred.JobClient: Task Id : attempt_201302210135_0006_m_000002_0, Status : FAILED
Error: Java heap space
13/03/02 01:27:35 INFO mapred.JobClient: Task Id : attempt_201302210135_0006_m_000000_1, Status : FAILED
Error: Java heap space
13/03/02 01:27:35 INFO mapred.JobClient: Task Id : attempt_201302210135_0006_m_000001_1, Status : FAILED
Error: Java heap space
13/03/02 01:27:46 INFO mapred.JobClient: Task Id : attempt_201302210135_0006_m_000002_1, Status : FAILED
Error: Java heap space
13/03/02 01:27:49 INFO mapred.JobClient: Task Id : attempt_201302210135_0006_m_000001_2, Status : FAILED
Error: Java heap space
13/03/02 01:27:51 INFO mapred.JobClient: Task Id : attempt_201302210135_0006_m_000000_2, Status : FAILED
Error: Java heap space
13/03/02 01:28:03 INFO mapred.JobClient: Task Id : attempt_201302210135_0006_m_000002_2, Status : FAILED
13/03/02 01:28:10 INFO mapred.JobClient: Job complete: job_201302210135_0006
13/03/02 01:28:10 INFO mapred.JobClient: Counters: 8
13/03/02 01:28:10 INFO mapred.JobClient:   Job Counters 
13/03/02 01:28:10 INFO mapred.JobClient:     Failed map tasks=1
13/03/02 01:28:10 INFO mapred.JobClient:     Launched map tasks=12
13/03/02 01:28:10 INFO mapred.JobClient:     Data-local map tasks=9
13/03/02 01:28:10 INFO mapred.JobClient:     Rack-local map tasks=3
13/03/02 01:28:10 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=86972
13/03/02 01:28:10 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=0
13/03/02 01:28:10 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/03/02 01:28:10 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/03/02 01:28:10 INFO mapred.JobClient: Job Failed: NA
java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1322)
        at org.apache.hadoop.examples.Grep.run(Grep.java:69)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.examples.Grep.main(Grep.java:93)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

I'm going to try to figure it out, hopefully soon.


thanks
onur turna

...

Romain Rigaux

unread,
Mar 1, 2013, 7:56:41 PM3/1/13
to turna...@gmail.com, hue-...@cloudera.org
If playing with the options from the link in my previous post does not help, I would recommend just setting up a pseudo-distributed cluster.

I also guess that '/user/onur/input' is pretty small?

Romain

Abraham Elmahrek

unread,
Mar 1, 2013, 8:21:00 PM3/1/13
to Romain Rigaux, turna...@gmail.com, hue-...@cloudera.org
Hey Onur,

Please check your system limits via 'ulimit -a'... sometimes the out-of-memory exception is thrown if your system cannot create a new process or if there aren't enough file descriptors available.

-Abe

turna...@gmail.com

unread,
Mar 1, 2013, 9:50:29 PM3/1/13
to hue-...@cloudera.org, turna...@gmail.com
Hi,

I changed the child heap size just like the example on the website (1024 MiB), and it worked:
[root@n1 ~]#  sudo -u hdfs /usr/bin/hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar grep /user/onur/input /user/onur/output 'dfs[a-z.]+'
13/03/02 03:14:08 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/03/02 03:14:11 INFO mapred.FileInputFormat: Total input paths to process : 3
13/03/02 03:14:14 INFO mapred.JobClient: Running job: job_201303020304_0001
13/03/02 03:14:15 INFO mapred.JobClient:  map 0% reduce 0%
13/03/02 03:14:40 INFO mapred.JobClient:  map 33% reduce 0%
13/03/02 03:14:45 INFO mapred.JobClient:  map 66% reduce 0%
13/03/02 03:14:56 INFO mapred.JobClient:  map 100% reduce 0%
13/03/02 03:15:06 INFO mapred.JobClient:  map 100% reduce 100%
13/03/02 03:15:08 INFO mapred.JobClient: Job complete: job_201303020304_0001
13/03/02 03:15:08 INFO mapred.JobClient: Counters: 33
13/03/02 03:15:08 INFO mapred.JobClient:   File System Counters
13/03/02 03:15:08 INFO mapred.JobClient:     FILE: Number of bytes read=200
13/03/02 03:15:08 INFO mapred.JobClient:     FILE: Number of bytes written=704772
13/03/02 03:15:08 INFO mapred.JobClient:     FILE: Number of read operations=0
13/03/02 03:15:08 INFO mapred.JobClient:     FILE: Number of large read operations=0
13/03/02 03:15:08 INFO mapred.JobClient:     FILE: Number of write operations=0
13/03/02 03:15:08 INFO mapred.JobClient:     HDFS: Number of bytes read=4190
13/03/02 03:15:08 INFO mapred.JobClient:     HDFS: Number of bytes written=416
13/03/02 03:15:08 INFO mapred.JobClient:     HDFS: Number of read operations=8
13/03/02 03:15:08 INFO mapred.JobClient:     HDFS: Number of large read operations=0
13/03/02 03:15:08 INFO mapred.JobClient:     HDFS: Number of write operations=4
13/03/02 03:15:08 INFO mapred.JobClient:   Job Counters 
13/03/02 03:15:08 INFO mapred.JobClient:     Launched map tasks=3
13/03/02 03:15:08 INFO mapred.JobClient:     Launched reduce tasks=2
13/03/02 03:15:08 INFO mapred.JobClient:     Data-local map tasks=3
13/03/02 03:15:08 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=76944
13/03/02 03:15:08 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=13855
13/03/02 03:15:08 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/03/02 03:15:08 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/03/02 03:15:08 INFO mapred.JobClient:   Map-Reduce Framework
13/03/02 03:15:08 INFO mapred.JobClient:     Map input records=147
13/03/02 03:15:08 INFO mapred.JobClient:     Map output records=7
13/03/02 03:15:08 INFO mapred.JobClient:     Map output bytes=188
13/03/02 03:15:08 INFO mapred.JobClient:     Input split bytes=329
13/03/02 03:15:08 INFO mapred.JobClient:     Combine input records=7
13/03/02 03:15:08 INFO mapred.JobClient:     Combine output records=7
13/03/02 03:15:08 INFO mapred.JobClient:     Reduce input groups=7
13/03/02 03:15:08 INFO mapred.JobClient:     Reduce shuffle bytes=256
13/03/02 03:15:08 INFO mapred.JobClient:     Reduce input records=7
13/03/02 03:15:08 INFO mapred.JobClient:     Reduce output records=7
13/03/02 03:15:08 INFO mapred.JobClient:     Spilled Records=14
13/03/02 03:15:08 INFO mapred.JobClient:     CPU time spent (ms)=33400
13/03/02 03:15:08 INFO mapred.JobClient:     Physical memory (bytes) snapshot=604430336
13/03/02 03:15:08 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=7996264448
13/03/02 03:15:08 INFO mapred.JobClient:     Total committed heap usage (bytes)=267726848
13/03/02 03:15:08 INFO mapred.JobClient:   org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
13/03/02 03:15:08 INFO mapred.JobClient:     BYTES_READ=3861
13/03/02 03:15:08 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/03/02 03:15:08 INFO mapred.FileInputFormat: Total input paths to process : 2
13/03/02 03:15:08 INFO mapred.JobClient: Running job: job_201303020304_0002
13/03/02 03:15:09 INFO mapred.JobClient:  map 0% reduce 0%
13/03/02 03:15:21 INFO mapred.JobClient:  map 100% reduce 0%
13/03/02 03:15:26 INFO mapred.JobClient:  map 100% reduce 100%
13/03/02 03:15:28 INFO mapred.JobClient: Job complete: job_201303020304_0002
13/03/02 03:15:28 INFO mapred.JobClient: Counters: 33
13/03/02 03:15:28 INFO mapred.JobClient:   File System Counters
13/03/02 03:15:28 INFO mapred.JobClient:     FILE: Number of bytes read=165
13/03/02 03:15:28 INFO mapred.JobClient:     FILE: Number of bytes written=415431
13/03/02 03:15:28 INFO mapred.JobClient:     FILE: Number of read operations=0
13/03/02 03:15:28 INFO mapred.JobClient:     FILE: Number of large read operations=0
13/03/02 03:15:28 INFO mapred.JobClient:     FILE: Number of write operations=0
13/03/02 03:15:28 INFO mapred.JobClient:     HDFS: Number of bytes read=658
13/03/02 03:15:28 INFO mapred.JobClient:     HDFS: Number of bytes written=146
13/03/02 03:15:28 INFO mapred.JobClient:     HDFS: Number of read operations=7
13/03/02 03:15:28 INFO mapred.JobClient:     HDFS: Number of large read operations=0
13/03/02 03:15:28 INFO mapred.JobClient:     HDFS: Number of write operations=2
13/03/02 03:15:28 INFO mapred.JobClient:   Job Counters 
13/03/02 03:15:28 INFO mapred.JobClient:     Launched map tasks=2
13/03/02 03:15:28 INFO mapred.JobClient:     Launched reduce tasks=1
13/03/02 03:15:28 INFO mapred.JobClient:     Data-local map tasks=2
13/03/02 03:15:28 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=13984
13/03/02 03:15:28 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=4753
13/03/02 03:15:28 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/03/02 03:15:28 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/03/02 03:15:28 INFO mapred.JobClient:   Map-Reduce Framework
13/03/02 03:15:28 INFO mapred.JobClient:     Map input records=7
13/03/02 03:15:28 INFO mapred.JobClient:     Map output records=7
13/03/02 03:15:28 INFO mapred.JobClient:     Map output bytes=188
13/03/02 03:15:28 INFO mapred.JobClient:     Input split bytes=242
13/03/02 03:15:28 INFO mapred.JobClient:     Combine input records=0
13/03/02 03:15:28 INFO mapred.JobClient:     Combine output records=0
13/03/02 03:15:28 INFO mapred.JobClient:     Reduce input groups=1
13/03/02 03:15:28 INFO mapred.JobClient:     Reduce shuffle bytes=184
13/03/02 03:15:28 INFO mapred.JobClient:     Reduce input records=7
13/03/02 03:15:28 INFO mapred.JobClient:     Reduce output records=7
13/03/02 03:15:28 INFO mapred.JobClient:     Spilled Records=14
13/03/02 03:15:28 INFO mapred.JobClient:     CPU time spent (ms)=2910
13/03/02 03:15:28 INFO mapred.JobClient:     Physical memory (bytes) snapshot=365584384
13/03/02 03:15:28 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=4818108416
13/03/02 03:15:28 INFO mapred.JobClient:     Total committed heap usage (bytes)=169353216
13/03/02 03:15:28 INFO mapred.JobClient:   org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
13/03/02 03:15:28 INFO mapred.JobClient:     BYTES_READ=244
[root@n1 ~]#  sudo -u hdfs hadoop fs -ls
Found 1 items
drwx------   - hdfs hdfs          0 2013-03-02 03:15 .staging
[root@n1 ~]#  sudo -u hdfs hadoop fs -ls /user/onur
Found 2 items
drwxr-xr-x   - hdfs supergroup          0 2013-03-02 00:19 /user/onur/input
drwxr-xr-x   - hdfs supergroup          0 2013-03-02 03:15 /user/onur/output
[root@n1 ~]#  sudo -u hdfs hadoop fs -ls /user/onur/output
Found 3 items
-rw-r--r--   3 hdfs supergroup          0 2013-03-02 03:15 /user/onur/output/_SUCCESS
drwxr-xr-x   - hdfs supergroup          0 2013-03-02 03:15 /user/onur/output/_logs
-rw-r--r--   3 hdfs supergroup        146 2013-03-02 03:15 /user/onur/output/part-00000

[root@n1 ~]#  sudo -u hdfs hadoop fs -ls /user/onur/output
Found 3 items
-rw-r--r--   3 hdfs supergroup          0 2013-03-02 03:15 /user/onur/output/_SUCCESS
drwxr-xr-x   - hdfs supergroup          0 2013-03-02 03:15 /user/onur/output/_logs
-rw-r--r--   3 hdfs supergroup        146 2013-03-02 03:15 /user/onur/output/part-00000
[root@n1 ~]#  sudo -u hdfs hadoop fs cat /user/onur/output/part-00000 | head
cat: Unknown command
Did you mean -cat?  This command begins with a dash.
[root@n1 ~]#  sudo -u hdfs hadoop fs -cat /user/onur/output/part-00000 | head
1       dfs.blocksize
1       dfs.https.port
1       dfs.replication
1       dfs.client.use.datanode.hostname
1       dfs.datanode.hdfs
1       dfs.https.address
1       dfs.namenode.http






But when I run a query through Hive, I am still getting the same error.




ID                | Name                                                   | Status    | User | Maps  | Reduces | Queue   | Priority | Duration | Date
201303020304_0003 | SELECT sample_07.description, sample_...DESC(Stage-1) | failed    | hdfs | 0 / 1 | 0 / 1   | default | normal   | 46s      | 03/01/13 18:20:09
201303020304_0002 | grep-sort                                              | succeeded | hdfs | 2 / 2 | 1 / 1   | default | normal   | 18s      | 03/01/13 18:15:08
201303020304_0001 | grep-search                                            | succeeded | hdfs | 3 / 3 | 2 / 2   | default | normal   | 54s      | 03/01/13 18:14:12
 

Hadoop job_201303020304_0003 failures on n1

Attempt                               | Task                             | Machine        | State  | Error                  | Logs
attempt_201303020304_0003_m_000000_0 | task_201303020304_0003_m_000000  | n3.example.com | FAILED | Error: Java heap space | Last 4KB / Last 8KB / All
attempt_201303020304_0003_m_000000_1 | task_201303020304_0003_m_000000  | n1.example.com | FAILED | Error: Java heap space | Last 4KB / Last 8KB / All
attempt_201303020304_0003_m_000000_2 | task_201303020304_0003_m_000000  | n2.example.com | FAILED | Error: Java heap space | Last 4KB / Last 8KB / All
attempt_201303020304_0003_m_000000_3 | task_201303020304_0003_m_000000  | n4.example.com | FAILED | Error: Java heap space | Last 4KB / Last 8KB / All




2013-03-02 03:24:02,074 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-03-02 03:24:02,861 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /mapred/local/taskTracker/distcache/1153535521351037204_943032454_686175782/n1.example.com/tmp/hive-beeswax-hdfs/hive_2013-03-01_18-20-01_992_5419204033953410960/-mr-10003/fd47ce6a-5113-4f45-88e2-90c469b71edf <- /mapred/local/taskTracker/hdfs/jobcache/job_201303020304_0003/attempt_201303020304_0003_m_000000_0/work/HIVE_PLANfd47ce6a-5113-4f45-88e2-90c469b71edf
2013-03-02 03:24:02,870 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201303020304_0003/jars/.job.jar.crc <- /mapred/local/taskTracker/hdfs/jobcache/job_201303020304_0003/attempt_201303020304_0003_m_000000_0/work/.job.jar.crc
2013-03-02 03:24:02,872 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/hdfs/jobcache/job_201303020304_0003/jars/job.jar <- /mapred/local/taskTracker/hdfs/jobcache/job_201303020304_0003/attempt_201303020304_0003_m_000000_0/work/job.jar
2013-03-02 03:24:03,010 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
2013-03-02 03:24:03,011 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2013-03-02 03:24:03,382 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2013-03-02 03:24:03,393 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3d7dc1cb
2013-03-02 03:24:03,692 WARN org.apache.hadoop.hive.conf.HiveConf: hive-site.xml not found on CLASSPATH
2013-03-02 03:24:03,955 INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://n1.example.com:8020/user/beeswax/warehouse/sample_07/sample_07.csv
2013-03-02 03:24:03,955 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and  BYTES_READ as counter name instead
2013-03-02 03:24:03,959 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
2013-03-02 03:24:03,968 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 50
2013-03-02 03:24:04,113 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-03-02 03:24:04,117 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:797)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:385)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
	at org.apache.hadoop.mapred.Child.main(Child.java:262)


Should I assign more heap size again (more than 1024 MiB)? I suppose my NameNode is using all of its RAM; could it be because of that? I assigned 1.5 GB of RAM to each of my cluster nodes. The hosts look like this:


Name           | IP            | Rack     | CDH Version | Health | Last Heartbeat | Cores | Load Average   | Disk Usage         | Physical Memory     | Swap Space
n1.example.com | 192.168.0.241 | /default | CDH4        | Good   | 9.61s ago      | 2     | 1,28 0,43 0,29 | 5.2 GiB / 37.2 GiB | 1.4 GiB / 1.4 GiB   | 459.6 MiB / 2.9 GiB
n2.example.com | 192.168.0.242 | /default | CDH4        | Good   | 13.45s ago     | 2     | 0,15 0,05 0,01 | 4.7 GiB / 37.2 GiB | 465.9 MiB / 1.4 GiB | 0 B / 2.9 GiB
n3.example.com | 192.168.0.243 | /default | CDH4        | Good   | 14.87s ago     | 2     | 0,03 0,01 0,00 | 4.7 GiB / 37.2 GiB | 481.7 MiB / 1.4 GiB | 0 B / 2.9 GiB
n4.example.com | 192.168.0.246 | /default | CDH4        | Good   | 12.68s ago     | 2     | 0,00 0,00 0,00 | 4.7 GiB / 37.2 GiB | 475.7 MiB / 1.4 GiB | 0 B / 2.9 G


thanks
Onur Turna

turna...@gmail.com

unread,
Mar 1, 2013, 9:54:00 PM3/1/13
to hue-...@cloudera.org, Romain Rigaux, turna...@gmail.com
Hey Abe

I checked my system limits with ulimit -a; it shows the following:
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 11543
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Is this normal?

onur turna

Abraham Elmahrek

unread,
Mar 1, 2013, 10:04:36 PM3/1/13
to turna...@gmail.com, hue-...@cloudera.org, Romain Rigaux
Hey Turna,

Your limits show small 'max user processes' and 'open files' values, I think. When HDFS is installed, it normally adds a file to /etc/security/limits.d that sets those numbers much higher.
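
For reference, the entries in those files look roughly like this (illustrative values only, not the exact file CDH ships):

# /etc/security/limits.d/hdfs.conf (sketch)
hdfs   -   nofile   32768
hdfs   -   nproc    65536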

I just read in your previous email that you got it working :). Glad to hear!

-Abe

turna...@gmail.com

unread,
Mar 1, 2013, 10:17:06 PM3/1/13
to hue-...@cloudera.org, turna...@gmail.com, Romain Rigaux
Hey,

How should I configure my limits? I installed HDFS through the CDH4 packages.
The MapReduce example works, but when I run a query through Beeswax I am still getting the error.

Onur Turna 

Romain Rigaux

unread,
Mar 1, 2013, 10:42:39 PM3/1/13
to turna...@gmail.com, hue-...@cloudera.org
Great, almost there!

How did you change the heap parameter?

If you set it in CM for MapReduce (or in /etc/hadoop/conf/mapred-site.xml without CM), it should be taken into account for all jobs (even those created by Hive):
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx1024m</value>
  </property>

Or in Beeswax, under 'SETTINGS' on the left:
  key:   mapred.child.java.opts
  value: -Xmx1024m
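
(To double-check that a job actually picks the value up, you can run a plain SET statement from the query editor; SET with no value prints the current setting in Hive, so this is not Hue-specific:)

SET mapred.child.java.opts;
-- should echo back the -Xmx value the child JVMs will get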

Romain

turna...@gmail.com

unread,
Mar 2, 2013, 1:45:46 PM3/2/13
to hue-...@cloudera.org, turna...@gmail.com
Hey,

I changed that value in Cloudera Manager as well as in /etc/hadoop/mapred-site.xml, but when I run a query I am still getting the error. Should I also change Hive's configuration?

Thanks
Onur Turna

turna...@gmail.com

unread,
Mar 2, 2013, 2:00:44 PM3/2/13
to hue-...@cloudera.org, turna...@gmail.com
Hi,

I changed the MapReduce heap parameter in CM, but when I checked Hue's heap size it still has the default value. Which mapred-site.xml should I change, and in which folder exactly, so that it applies to all tools (Hive, HBase, ...)?

Thanks
Onur Turna 

Abraham Elmahrek

unread,
Mar 2, 2013, 3:18:49 PM3/2/13
to turna...@gmail.com, hue-...@cloudera.org
Hey Onur,

Which heap size are you referring to? Beeswax heap size is managed in the "beeswax" section of CM.

Also, Beeswax has a tendency to run a little high on the number of file descriptors it opens. I'm guessing you're on a Red Hat variant... newer versions of Red Hat/CentOS/Fedora sometimes ship a hard override of the system limits. Please check all files under /etc/security/limits.d. For more information on limits, type 'man limits.conf' in your Linux shell.
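
On RHEL 6-style systems the usual culprit is 90-nproc.conf, which caps 'nproc' for ordinary users; a quick way to see what is being overridden (the file name and values below are an illustrative sketch of the typical defaults, not necessarily what your box has):

# list any nproc/nofile overrides shipped by the distro or by CDH
grep -r 'nproc\|nofile' /etc/security/limits.d/

# typical RHEL 6 default that limits ordinary users to 1024 processes
cat /etc/security/limits.d/90-nproc.conf
*          soft    nproc     1024
root       soft    nproc     unlimited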

-Abe

turna...@gmail.com

unread,
Mar 4, 2013, 5:51:09 PM3/4/13
to hue-...@cloudera.org
Hey Abe,

I assigned 1024 MiB as the Java child heap size both for MapReduce (through Cloudera's GUI) and, manually in Hue's config files, for Hue's Java child heap size. I also set Hadoop's maximum Java heap memory in /etc/hadoop/hadoop-env.sh to "-Xmx1024m". Now everything works fine.
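
For anyone following along, the hadoop-env.sh change mentioned above is along these lines (a sketch; the exact variables you need can differ between CDH releases):

# /etc/hadoop/hadoop-env.sh (illustrative)
export HADOOP_HEAPSIZE=1024             # max heap for the Hadoop daemons, in MB
export HADOOP_CLIENT_OPTS="-Xmx1024m"   # heap for client-side commands such as 'hadoop fs'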

Apart from that, I also checked the conf files in /etc/security/limits.d, which look like this:

[root@n1 limits.d]# ls
90-nproc.conf hbase.nofiles.conf impala.conf mapreduce.conf
cloudera-scm.conf hdfs.conf mapred.conf yarn.conf

There are many conf files; which one should I check, and how should I configure it?

What I actually want to do is import my tables, which are in .csv format, into HBase. I have also read some presentations on HBase/Hive integration; that's why I set up a MySQL server on one of my cluster nodes for Hive's metastore. I thought I would get a view of my HBase tables in the Beeswax GUI, but as far as I have learned that is not an option at the moment. Given that, how can somebody else (other than the person who built these tables) query HBase tables through Beeswax when they are not visible there? I also want to be sure whether SQL-like queries against HBase tables are an option through Beeswax at all.

Thank you very much for your attention,
Onur Turna


Romain Rigaux

unread,
Mar 4, 2013, 6:07:38 PM3/4/13
to turna...@gmail.com, hue-...@cloudera.org
So ulimit should not need to be touched for now; I think you are fine.

In Beeswax you can select your tables in the 'Tables' tab and view a data sample. You can also do a simple 'SELECT * from your_table' and see the content.

About using Hive on top of HBase this is possible: https://ccp.cloudera.com/display/CDH4DOC/Hive+Installation#HiveInstallation-UsingHivewithHBase
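
As a sketch of what that looks like from the Hive/Beeswax side (the table and column names below are made up for illustration), an existing HBase table can be exposed as an external Hive table and then queried like any other table:

-- assumes an HBase table 'my_hbase_table' with a column family 'cf' already exists
CREATE EXTERNAL TABLE hbase_in_hive(key string, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:value")
TBLPROPERTIES ("hbase.table.name" = "my_hbase_table");

SELECT * FROM hbase_in_hive LIMIT 10;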

Romain

turna...@gmail.com

unread,
Mar 4, 2013, 6:23:30 PM3/4/13
to hue-...@cloudera.org, turna...@gmail.com
Hi Romain,

I already made that configuration:

Using Hive with HBase

To allow Hive scripts to use HBase, add the following statements to the top of each script:

ADD JAR /usr/lib/hive/lib/zookeeper.jar;
ADD JAR /usr/lib/hive/lib/hbase.jar;
ADD JAR /usr/lib/hive/lib/hive-hbase-handler-0.10.0-cdh4.2.0.jar;
ADD JAR /usr/lib/hive/lib/guava-11.0.2.jar;
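
(If you would rather not repeat the ADD JAR lines in every script, the same jars can usually be put on Hive's auxiliary path in hive-site.xml instead; this is only a sketch assuming the CDH paths above:)

<property>
  <name>hive.aux.jars.path</name>
  <value>file:///usr/lib/hive/lib/zookeeper.jar,file:///usr/lib/hive/lib/hbase.jar,file:///usr/lib/hive/lib/hive-hbase-handler-0.10.0-cdh4.2.0.jar,file:///usr/lib/hive/lib/guava-11.0.2.jar</value>
</property>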

I also made the Hive metastore configuration changes.
Can I now get a view in Beeswax of the HBase tables I created through the hbase shell?

Thanks,
Onur Turna

turna...@gmail.com

unread,
Mar 14, 2013, 2:47:36 PM3/14/13
to hue-...@cloudera.org
Hey,
After a two-week pause I have started to figure out HBase/Hive integration again. I am reading the tutorial from the official Hive website and I have run into something I do not understand.

CREATE TABLE hbase_table_1(key int, value1 string, value2 int, value3 int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
"hbase.columns.mapping" = ":key,a:b,a:c,d:e"
);
INSERT OVERWRITE TABLE hbase_table_1 SELECT foo, bar, foo+1, foo+2
FROM pokes WHERE foo=98 OR foo=100;

(from https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration)

According to that example hbase_table_1 has 3 columns, but here (INSERT OVERWRITE TABLE hbase_table_1 SELECT foo, bar, foo+1, foo+2) there are 4 columns to overwrite? Or am I misunderstanding?

Thanks
Onur turna

Romain Rigaux

unread,
Mar 18, 2013, 1:16:17 PM3/18/13
to turna...@gmail.com, hue-...@cloudera.org
It has 4 columns: CREATE TABLE hbase_table_1(key int, value1 string, value2 int, value3 int)

I am not an HBase expert, but 'key' is just the name of the first column in the table. It references the HBase row key at the same time.
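
To spell out the correspondence in that example (the families and qualifiers come straight from the hbase.columns.mapping string):

-- SELECT expression -> Hive column -> HBase cell
-- foo               -> key         -> row key (:key)
-- bar               -> value1      -> family a, qualifier b (a:b)
-- foo+1             -> value2      -> family a, qualifier c (a:c)
-- foo+2             -> value3      -> family d, qualifier e (d:e)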

Romain