HDFS replication factor and block size configuration


Stan Barton

Jan 11, 2011, 4:32:37 AM
to Hypertable User
Hi,

I am using Hypertable version 0.9.4.3 with HDFS. I am trying to control the
replication factor of the files that represent the database in the DFS. The
DFS is configured with replication factor 2 (I know that replication is
per-file and also depends on what the client requests), and I have tried:

1. using the predicate 'replication = 2' in the CREATE TABLE statement.
2. putting Hypertable.RangeServer.CellStore.DefaultReplication=2 in
the hypertable.cfg file.
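Spelled out, the two attempts look roughly like this (a sketch from memory; the table name and exact placement of the replication option are guesses, not verified syntax):

```
-- HQL, at table creation time (option name as given above):
CREATE TABLE mytable (cf1) replication = 2;

# hypertable.cfg:
Hypertable.RangeServer.CellStore.DefaultReplication=2
```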

Unfortunately, neither of these works. Whenever I check the files
directly in HDFS, I see a replication factor of 3 in Hadoop's web
GUI.

The same applies to the HDFS block size setting, which is at Hadoop's
default of 256, but I would like to use a larger value, since I am
aiming for a much larger table size.

Am I doing something wrong?

Stan

Doug Judd

Jan 11, 2011, 11:30:39 AM
to hyperta...@googlegroups.com
Hi Stan,

This is a known bug (see issue 385: http://code.google.com/p/hypertable/issues/detail?id=385).  The problem is that the HDFS broker currently does not read the default replication and block size from the config file in the Hadoop installation.  It should be an easy fix.  We'll try to get a fix into 0.9.5.0, but if you want to take a crack at it, the code is in:

src/java/Core/org/hypertable/DfsBroker/hadoop/HdfsBroker.java

The other two approaches that you list should work.  Can you instrument the HDFS broker (in HdfsBroker.java:Create) and print out these values right before the call to FileSystem.create?
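The instrumentation being asked for might look something like this (a hypothetical sketch; the real method signature and field names in HdfsBroker.java may differ):

```java
// Hypothetical sketch of the suggested instrumentation: format the
// replication/blockSize/bufferSize arguments so they can be printed
// just before the call into the Hadoop FileSystem. By convention here,
// -1 means "no value supplied by the client".
public class CreateLogDemo {

    static String describe(int replication, long blockSize, int bufferSize) {
        return String.format("replication: %d, blockSize: %d, bufferSize: %d",
                             replication, blockSize, bufferSize);
    }

    public static void main(String[] args) {
        // Example with all three values unset by the client.
        System.out.println("Passed parameters: " + describe(-1, -1L, -1));
    }
}
```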

To re-build the Hypertable jar file, pull the source code (http://code.google.com/p/hypertable/wiki/SourceCode?tm=4), run cmake, and then run "make java".  This is described in sections 1 and 3 of the HOW TO BUILD FROM SOURCE section of the Hypertable README file (https://github.com/nuggetwheat/hypertable/blob/master/README.md).  You don't need to install all of the dependencies, just the JDK.  This will create the jar file in a subdirectory of the build directory called java/.  Replace the jar file in your installation with this one and then restart Hypertable.  The output will appear in the DfsBroker.hadoop.log file.

- Doug


--
You received this message because you are subscribed to the Google Groups "Hypertable User" group.
To post to this group, send email to hyperta...@googlegroups.com.
To unsubscribe from this group, send email to hypertable-us...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/hypertable-user?hl=en.


Stan Barton

Jan 13, 2011, 5:00:00 AM
to Hypertable User
Hi Doug,

I have created my own Hypertable.jar that logs the parameters
passed to the Create method, and I am getting:

INFO: Passed parameters: replication: -1, blockSize: -1, bufferSize: -1
INFO: Created file: replication: 3, blockSize: 67108864, bufferSize: 4096


where the first line logs the parameters passed to the method and the
second line logs the values actually handed to HDFS. It is funny,
because I have taken a look at the getDefaultReplication() method in
Hadoop and it should return 1 as a default, but 3 is used instead, and
I have no idea where that number comes from. In any case, the -1
values being passed are not correct; they should contain the values
taken from hypertable.cfg and/or the table description, I guess, but I
was not able to find the C++ code that creates the requests passed to
the broker, so I could not check.
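For what it's worth, the fix discussed here presumably amounts to substituting the configured defaults whenever -1 comes in from the client. A hedged sketch (the helper names are made up for illustration, not the actual HdfsBroker code):

```java
// Hypothetical sketch of the fallback logic this thread is about:
// -1 from the client means "use the default", so the broker should
// substitute defaults read from the Hadoop configuration instead of
// passing -1 through to FileSystem.create (where HDFS then applies
// its own defaults, e.g. replication 3).
public class DefaultFallbackDemo {

    static int effectiveReplication(int requested, int hadoopDefault) {
        return requested > 0 ? requested : hadoopDefault;
    }

    static long effectiveBlockSize(long requested, long hadoopDefault) {
        return requested > 0 ? requested : hadoopDefault;
    }

    public static void main(String[] args) {
        // With dfs.replication=2 in the Hadoop config, a client that
        // passes -1 should end up with replication 2, not 3.
        System.out.println(effectiveReplication(-1, 2));
        System.out.println(effectiveBlockSize(-1L, 268435456L));
    }
}
```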

Stan

Doug Judd

Jan 14, 2011, 2:16:02 PM
to hyperta...@googlegroups.com
Hi Stan,

Thanks for doing this.  I took a more thorough look through the code and I see what's happening.  We'll be sure this gets fixed in the upcoming release.

- Doug
