Can I modify the chunk size?

Bo Fu

Mar 22, 2016, 3:59:15 PM
to QFS Development
Hi all,

I'm wondering how I can change the chunk size from 64MB down to something like 6MB (I didn't find a way to modify this value in the configuration file).
Will that reshape the data placement in QFS?

Thanks
Bo

mcan...@quantcast.com

Mar 22, 2016, 5:39:19 PM
to qfs-...@googlegroups.com
Hi Bo,

The chunk size parameter is defined in common/kfstypes.h by the following line:


const size_t CHUNKSIZE = 64u << 20; //!< (64MB)

Currently, the API doesn't offer a function to set chunk size to an arbitrary value.

You can change that line as you like, but this has not been attempted before.

We have good reason to believe that changing this value to a size that isn't a power of two
can cause problems. One potential problem is that B+tree keys assume that a certain number of
low-order bits in the chunk position are 0. Another is that a chunk size smaller than some
internal parameters (such as the checksum block size) is unlikely to work. Note that this is
not an exhaustive list of things that might go wrong.

Also, reducing the chunk size should carry further performance penalties, such as increased
meta server RAM usage and chunk budgets, hence more overhead.
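
The two constraints above can be checked before touching CHUNKSIZE. The following is a minimal sketch, not QFS code: the helper names are hypothetical, the 64KB checksum block size is assumed, and the chunk count ignores striping and replication. It only illustrates the power-of-two check and how the number of chunks the meta server must track grows as the chunk size shrinks.

```cpp
#include <cstdint>

// Assumed checksum block size (64KB); not read from the QFS sources.
constexpr std::uint64_t kChecksumBlockSize = 64ull << 10;

// True when n is a non-zero power of two, so the low-order bits of every
// chunk-aligned position are 0.
constexpr bool IsPowerOfTwo(std::uint64_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Hypothetical sanity check for a candidate chunk size: a power of two that
// is at least one checksum block and a whole number of checksum blocks.
constexpr bool IsPlausibleChunkSize(std::uint64_t chunkSize,
                                    std::uint64_t checksumBlockSize) {
    return IsPowerOfTwo(chunkSize) &&
           chunkSize >= checksumBlockSize &&
           chunkSize % checksumBlockSize == 0;
}

// Number of chunks needed to hold fileSize bytes (rounded up). Fewer bytes
// per chunk means more chunks for the meta server to keep in RAM.
constexpr std::uint64_t ChunksNeeded(std::uint64_t fileSize,
                                     std::uint64_t chunkSize) {
    return (fileSize + chunkSize - 1) / chunkSize;
}

// 8MB passes both checks; 6MB is not a power of two.
static_assert(IsPlausibleChunkSize(8ull << 20, kChecksumBlockSize),
              "8MB should be a plausible chunk size");
static_assert(!IsPlausibleChunkSize(6ull << 20, kChecksumBlockSize),
              "6MB is not a power of two");
```

Under these assumptions a 1GB file needs 16 chunks at 64MB and 128 chunks at 8MB, which is where the extra meta server overhead comes from.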


Hope this helps,


Mehmet

Bo Fu

Mar 23, 2016, 7:07:33 PM
to QFS Development
Hi Mehmet,

Thank you for your detailed answer. I'm going to try 8 MB and get back to you if it doesn't work.

Bo


mcan...@quantcast.com

Mar 23, 2016, 7:29:45 PM
to QFS Development
You're welcome. We would love to hear back about your experience
with changing this parameter.

Best,

Bo Fu

Mar 30, 2016, 11:36:49 AM
to QFS Development
If my understanding is correct, the QFS client tries to read 1 MB of data at a time per request/response. I just realized that for my experiment I need to shrink that value, too. Is it also configurable?

Thanks

mcan...@quantcast.com

Mar 30, 2016, 12:38:48 PM
to QFS Development
Hi Bo,

The number of bytes read each time is directly related to the read-ahead buffer size parameter.
For RS 6+3, the QFS client divides the read-ahead buffer size by 6 (the number of data stripes) and
reads that many bytes from each chunk at a time. By default, the read-ahead buffer size is 6MB. You can
change this value for a file by calling KfsClient::SetReadAheadSize(int fd, size_t size)
after the file is created/opened. However, if the total number of bytes you want to read (the number you specified in the read call)
exceeds the read-ahead buffer size, you'll probably end up with a larger value.
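
As a rough illustration of the arithmetic described above, the sketch below divides a read-ahead buffer size by the number of data stripes. The function name is hypothetical and this is not the client's actual code; it only reproduces the division as described, and the smaller buffer sizes are example values.

```cpp
#include <cstdint>

// Number of data stripes for RS 6+3, as stated in the message above.
constexpr unsigned kDataStripes = 6;

// Hypothetical helper: bytes requested from each chunk per read when the
// read-ahead buffer is split evenly across the data stripes.
// Precondition: dataStripes > 0.
constexpr std::uint64_t PerStripeReadSize(std::uint64_t readAheadSize,
                                          unsigned dataStripes) {
    return readAheadSize / dataStripes;
}

// Default 6MB read-ahead buffer: 1MB from each of the 6 data stripes.
static_assert(PerStripeReadSize(6ull << 20, kDataStripes) == (1ull << 20),
              "6MB / 6 stripes = 1MB");
// A 1.5MB buffer gives 256KB per stripe.
static_assert(PerStripeReadSize(1536ull << 10, kDataStripes) == (256ull << 10),
              "1.5MB / 6 stripes = 256KB");
// A 0.75MB buffer gives 128KB per stripe.
static_assert(PerStripeReadSize(768ull << 10, kDataStripes) == (128ull << 10),
              "0.75MB / 6 stripes = 128KB");
```

This matches the 1 MB per request seen with the default settings: 6MB spread over 6 data stripes.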

Alternatively, you can cap the size of each read by setting the QFS_CLIENT_CONFIG
environment variable, like this:
export QFS_CLIENT_CONFIG=client.maxReadSize=<value>
This should effectively limit the number of bytes read at a time.
Note that the value you specify should be larger than the checksum block size (64KB).
Otherwise, the client defaults this value to max(4MB, targetDiskIoSize), where targetDiskIoSize
is another I/O related parameter.
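
The fallback rule just described can be written out as a small function. This is a sketch of the rule as stated in this message, not the client's implementation: the function name is hypothetical, and treating a value exactly equal to 64KB as "not larger" is an assumption.

```cpp
#include <cstdint>

constexpr std::uint64_t kChecksumBlockSize = 64ull << 10;  // 64KB, per the message
constexpr std::uint64_t kFallbackFloor     = 4ull << 20;   // 4MB, per the message

// Hypothetical helper: the read size limit that would apply for a requested
// client.maxReadSize, given some targetDiskIoSize.
constexpr std::uint64_t EffectiveMaxReadSize(std::uint64_t requested,
                                             std::uint64_t targetDiskIoSize) {
    return requested > kChecksumBlockSize
        ? requested                                   // accepted as-is
        : (targetDiskIoSize > kFallbackFloor          // max(4MB, targetDiskIoSize)
               ? targetDiskIoSize
               : kFallbackFloor);
}

// 256KB is above the checksum block size, so it is used unchanged.
static_assert(EffectiveMaxReadSize(256ull << 10, 1ull << 20) == (256ull << 10),
              "value above 64KB is kept");
// 32KB is too small, so the limit falls back to max(4MB, 1MB) = 4MB.
static_assert(EffectiveMaxReadSize(32ull << 10, 1ull << 20) == (4ull << 20),
              "too-small value falls back to 4MB");
```

Note that under this rule a too-small value makes reads larger, not smaller, so it is worth checking the value against 64KB before exporting it.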

We'll soon release a document that explains each I/O related parameter
and how one affects another.
 
Let me know if this solution works for you,

Best,

Bo Fu

Mar 30, 2016, 9:19:42 PM
to QFS Development
Thanks for the quick answer!

setReadAheadSize works when I set the buffer size to 1.5MB. However, when I decrease it further to 0.75MB, 1 of the 2 map tasks fails. Here is the error message:

pure virtual method called

terminate called without an active exception


Right now I can't tell which virtual method it is calling.

mcan...@quantcast.com

Apr 7, 2016, 6:10:03 PM
to QFS Development
Sorry for the late response.

I don't recognize that error message. Do you see any errors coming directly from QFS?
Also, can you write a simple C++ program that makes QFS C++ API calls directly?
For example, write some data to a file and then read from that file.
Call setReadAheadSize before you read and see if you get any errors.

Let me know if you need any help with that.

Mehmet

minahi...@gmail.com

Nov 18, 2017, 8:19:21 AM
to QFS Development

Hello, I am a student of software engineering. I am doing research on GFS (Google File System). Can you tell me whether we can modify the chunk size in GFS? Can we exceed that size if our file is large? And what is the meaning of chunk padding? Please reply.