Why does KosmosBroker use so much memory?


kuer

Jul 20, 2009, 10:06:06 PM
to Hypertable Development
Hi, all,

I run Hypertable + KFS on 4 boxes.
On each box, there are:
* 1 chunk server
* 1 KosmosBroker connected to the local chunk server
* 1 RangeServer connected to the local KosmosBroker

I found that KosmosBroker uses 1.7 GB of memory. I want to limit the
memory used by KosmosBroker. How can I do that?

thanks

-- kuer

kuer

Jul 21, 2009, 5:57:48 AM
to Hypertable Development
Hi, all,

I analyzed the dfs.log on one box:

# count files that were opened
$ grep ' open' dfs.log | grep 'fd=' | wc -l
21028

# count files that were created
$ grep ' create' dfs.log | grep 'fd=' | wc -l
10260

# count files that were closed
$ grep ' close' dfs.log | grep 'fd=' | wc -l
28519

# how many files are still open now
$ echo '21028 + 10260 - 28519' | bc
2769
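
The same three counts can be collected in a single pass over the log. This is only a sketch: it assumes each log entry is one line shaped like the samples in this thread ("open( PATH ) fd=N", "create( PATH ) fd=N", "close fd=N"), and the function name count_fd_ops is made up for illustration. It matches the operation name more strictly than the grep patterns above, so the numbers may differ slightly if other lines happen to contain " open" or " close".

```shell
# One-pass version of the three grep counts.
# count_fd_ops is a hypothetical helper name, not part of Hypertable or KFS.
count_fd_ops() {
    awk '
        / open\(/   && /fd=/ { opened++ }
        / create\(/ && /fd=/ { created++ }
        / close /   && /fd=/ { closed++ }
        END {
            # Handles still open = everything opened or created, minus closes
            printf "open=%d create=%d close=%d still_open=%d\n",
                   opened, created, closed, opened + created - closed
        }
    ' "$1"
}
```

Usage would be `count_fd_ops dfs.log`.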

I found some fds that were opened more than once before being closed:
$ grep -n 'fd=7398' dfs.log
14310:2009-07-15 03:28:06,774 1156786496 kosmosBroker [INFO] (kosmos/
KosmosBroker.cc:86) open( /hypertable/tables/storage_se/default/
75C73F17368096C166F0DC49/cs46 ) fd=7398 local_fd=1370
29715:2009-07-18 07:57:12,230 1216268608 kosmosBroker [INFO] (kosmos/
KosmosBroker.cc:86) open( /hypertable/tables/storage_se/default/
C8A2F2AC3B035CE469808D05/cs5 ) fd=7398 local_fd=1703
32123:2009-07-18 12:17:11,049 1153329472 kosmosBroker [INFO] (kosmos/
KosmosBroker.cc:148) close fd=7398

I also found some fds that were opened but never closed:
$ grep -n 'fd=7350' dfs.log
14221:2009-07-15 01:31:19,662 1261685056 kosmosBroker [INFO] (kosmos/
KosmosBroker.cc:131) create( /hypertable/tables/storage_se/default/
06825ADA6690593E88E09C91/cs53 ) fd=7350 local_fd=1372
14233:2009-07-15 01:31:21,995 1272174912 kosmosBroker [INFO] (kosmos/
KosmosBroker.cc:148) close fd=7350
29631:2009-07-18 07:42:17,139 1237248320 kosmosBroker [INFO] (kosmos/
KosmosBroker.cc:86) open( /hypertable/tables/storage_se/default/
D9AFD0F87BF44F88B676BCC1/cs77 ) fd=7350 local_fd=1358

$ grep -n 'fd=7400' dfs.log
14321:2009-07-15 03:28:07,114 1261685056 kosmosBroker [INFO] (kosmos/
KosmosBroker.cc:86) open( /hypertable/tables/storage_se/default/
00BF88DA6E4A1B18B26FB4BB/cs49 ) fd=7400 local_fd=677
29726:2009-07-18 08:04:34,106 1205778752 kosmosBroker [INFO] (kosmos/
KosmosBroker.cc:86) open( /hypertable/tables/storage_se/agPORT/
8BD9582FEBEF358E57ED1D11/cs19 ) fd=7400 local_fd=642

So I wrote a script to track the status of each fd, and it found about
1729 open handles (opened but not closed). If every KFS handle
reserves some chunk buffer for caching data, are these 1729 KFS file
handles responsible for the 1.7 GB of memory used by KosmosBroker?

In my system there is about 150 GB of data, stored in about 2400
KFS chunks (64 MB per chunk). The KFS replication factor is 3.
Does this mean that Hypertable opens every chunk file and keeps it
open (to serve queries)?
Is there a timeout parameter after which Hypertable will close
read-only file handles?
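
As a rough sanity check on the first question, here is the arithmetic using only the figures above. The idea of a fixed per-handle buffer is an assumption on my side, not something confirmed in the KFS source, and the helper names are made up for illustration.

```shell
# Back-of-the-envelope check with the numbers quoted in this thread.
per_handle_kb() {
    # $1 = broker memory in MB, $2 = number of open handles
    echo $(( $1 * 1024 / $2 ))
}

chunk_count() {
    # $1 = data size in GB, $2 = chunk size in MB
    echo $(( $1 * 1024 / $2 ))
}

per_handle_kb 1741 1729   # 1.7 GB is roughly 1741 MB -> prints 1031
chunk_count 150 64        # prints 2400
```

That works out to about 1 MB per open handle, which would fit a per-handle buffer of roughly that size, and 2400 chunks matches the chunk count above.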

Thanks.

-- kuer

kuer

Jul 21, 2009, 6:04:18 AM
to Hypertable Development
Script to track fd status in dfs.log:

#!/usr/bin/env perl
# Replay dfs.log and report broker fds that were opened or created
# but never closed.

use strict;
use warnings;

use constant OP_CREATE => 'create';
use constant OP_OPEN   => 'open';
use constant OP_CLOSE  => 'close';

my %fds;    # fd => operation (create/open) that left it open

open(my $fh, '<', 'dfs.log') or die "fail to open dfs.log -- $!\n";
while (my $line = <$fh>) {
    # \b stops "local_fd=" from matching; only the broker fd is tracked.
    next unless $line =~ /\bfd=(\d+)/;
    my $fd = $1;

    # Match the operation name itself ("open(", "create(", "close fd=")
    # so that a path containing one of these words is not misclassified.
    if ($line =~ /\b(create|open)\(/) {
        my $op = $1 eq 'create' ? OP_CREATE : OP_OPEN;
        if (exists $fds{$fd}) {
            # fd reused without a close in between; keep the first state.
            print "ERROR -- FD=$fd is $fds{$fd} before $op.\n";
        }
        else {
            $fds{$fd} = $op;
        }
    }
    elsif ($line =~ /\bclose fd=/) {
        if (exists $fds{$fd}) {
            print "fd=$fd $fds{$fd} ", OP_CLOSE, "\n";
            delete $fds{$fd};
        }
        else {
            print "ERROR -- FD=$fd is closed before opened.\n";
        }
    }
}
close($fh);

print "===== opened file handles : \n";
foreach my $fd (sort { $a <=> $b } keys %fds) {
    print "open-fd:$fd\n";
}


-- kuer

Doug Judd

Jul 21, 2009, 9:27:39 AM
to hyperta...@googlegroups.com
Hi kuer,

Comments inline...

2009/7/21 kuer <kue...@gmail.com>
[...]

I found some files opened many times before close :
$ grep -n 'fd=7398' dfs.log
14310:2009-07-15 03:28:06,774 1156786496 kosmosBroker [INFO] (kosmos/
KosmosBroker.cc:86) open( /hypertable/tables/storage_se/default/
75C73F17368096C166F0DC49/cs46 ) fd=7398 local_fd=1370
29715:2009-07-18 07:57:12,230 1216268608 kosmosBroker [INFO] (kosmos/
KosmosBroker.cc:86) open( /hypertable/tables/storage_se/default/
C8A2F2AC3B035CE469808D05/cs5 ) fd=7398 local_fd=1703
32123:2009-07-18 12:17:11,049 1153329472 kosmosBroker [INFO] (kosmos/
KosmosBroker.cc:148) close fd=7398

If you have cronolog installed, each successive run of Hypertable will append log data to the same set of log files.  I suspect that these two open lines were from two different runs of the RangeServer.

I also found some files opened but never close :
$ grep -n 'fd=7350' dfs.log
14221:2009-07-15 01:31:19,662 1261685056 kosmosBroker [INFO] (kosmos/
KosmosBroker.cc:131) create( /hypertable/tables/storage_se/default/
06825ADA6690593E88E09C91/cs53 ) fd=7350 local_fd=1372
14233:2009-07-15 01:31:21,995 1272174912 kosmosBroker [INFO] (kosmos/
KosmosBroker.cc:148) close fd=7350
29631:2009-07-18 07:42:17,139 1237248320 kosmosBroker [INFO] (kosmos/
KosmosBroker.cc:86) open( /hypertable/tables/storage_se/default/
D9AFD0F87BF44F88B676BCC1/cs77 ) fd=7350 local_fd=1358

Currently when the RangeServer opens a cell store, it leaves it open.  This seems to work fine on HDFS.  You should check on the KFS mailing list to see if there is any limitation on the number of file handles that can be open or if there is some penalty to having an open file handle.
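
One way to look for such a penalty locally is to sample the broker's resident memory next to its descriptor count over time and see whether they grow together. This is only a sketch: it assumes Linux with /proc mounted, the helper names are made up, and OS-level descriptors may not map one-to-one onto the KFS handles shown as fd/local_fd in the log.

```shell
# Sketch: sample a process's resident memory and descriptor count on Linux.
# rss_kb and open_fd_count are hypothetical helper names.
rss_kb() {
    # VmRSS is reported in kB in /proc/<pid>/status
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

open_fd_count() {
    # Each entry under /proc/<pid>/fd is one open OS-level descriptor
    ls "/proc/$1/fd" | wc -l | tr -d ' '
}

# Demonstrated on the current shell; substitute the broker's PID.
echo "rss_kb=$(rss_kb $$) open_fds=$(open_fd_count $$)"
```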

- Doug

kuer

Jul 21, 2009, 8:32:08 PM
to Hypertable Development
Hi, Doug,

> If you have cronolog installed, each successive run of Hypertable will
> append log data to the same set of log files. I suspect that these two open
> lines were from two different runs of the RangeServer.

I have no cronolog running. The log files were generated by one process
only. There may be multiple threads in it, but the files were
opened by one process.


> Currently when the RangeServer opens a cell store, it leaves it open. This
> seems to work fine on HDFS. You should check on the KFS mailing list to see
> if there is any limitation on the number of file handles that can be open or
> if there is some penalty to having an open file handle.

I think the chunk-buffer overhead is something of a penalty if too many
files are left open in KFS.


-- kuer

On Jul 21, 9:27 PM, Doug Judd <nuggetwh...@gmail.com> wrote:
> Hi kuer,
>
> Comments inline...
>
> 2009/7/21 kuer <kuer...@gmail.com>

xuya...@gmail.com

Apr 8, 2013, 4:24:52 AM
to hyperta...@googlegroups.com
Hi, I have the same problem. Did you find a clue?
Thanks.

On Tuesday, July 21, 2009 at 10:06:06 AM UTC+8, kuer wrote: