Many thanks, I finally made it work: adding a new coding scheme!

peng li

May 27, 2016, 12:35:20 AM

to qfs-...@googlegroups.com

Hello guys, thanks for all your help with my questions here. I finally got it working: I have added a new coding scheme to the QFS source code.

These are some of the questions I have posted:

How to configure the parameters of <rs k, m+n>

how to start all servers using one command

file system id mismatch and hello authentication error

I have been working on QFS for about two weeks.

During the first week, I kept reading the wiki pages. These documents are wonderful for a newcomer like me, who had never heard of QFS before.

I am from China, and reading English is a little difficult for me, so I translated some of the documents, such as the Overview, Introduction, Deployment, and Developer sections. The Chinese version is posted on my blog, and I keep adding to it. If you also read Chinese, it may be useful to you. In the first week I also downloaded the source code, compiled it, and deployed it on a single-node machine. I also tried some useful commands such as qfsshell, qfs-put, cptoqfs, and so on.

During the second week, I kept reading the source code. I wanted to add a new coding scheme to QFS, so my focus was on ECMethod.cc and the classes derived from ECMethod.

This part was not difficult for me. I quickly added a derived class that inherits from ECMethod and made it work. The difficult part is that the coding scheme needs one more parameter in addition to "numStripeCount, numStripeRecoveryCount, StripeSize". The parameter is called "numShorten"; it is specific to the coding scheme and must be passed to the Encoding and Decoding methods. It can be set differently for different files, which means those files are encoded differently. Therefore, I made this parameter an attribute of a file, just like "numStripeCount".
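As a rough illustration of carrying numShorten as a per-file attribute next to the existing striping parameters, here is a minimal C++ sketch. The struct StripedFileAttrs, its field names, and the IsValid check are hypothetical and are not taken from the QFS source. The rule that only file type 4 may use a non-zero numShorten is an assumption made for this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical per-file striping attributes; the names are illustrative
// only and do not mirror the real QFS structures.
struct StripedFileAttrs {
    int stripeSize;          // bytes per stripe, e.g. 65536
    int numStripes;          // data stripes
    int numRecoveryStripes;  // recovery stripes
    int numShorten;          // extra parameter of the new scheme; 0 = no shortening
    int fileType;            // 4 stands for the new scheme in this post
};

// Assumed sanity check: only the new scheme (type 4) may set numShorten != 0.
inline bool IsValid(const StripedFileAttrs& a) {
    if (a.stripeSize <= 0 || a.numStripes <= 0 || a.numRecoveryStripes < 0) {
        return false;
    }
    if (a.numShorten < 0) {
        return false;
    }
    return a.fileType == 4 || a.numShorten == 0;
}

// Formats the attribute in the style of the "Num shorten" line of stat.
inline std::string DescribeShorten(const StripedFileAttrs& a) {
    return "Num shorten: " + std::to_string(a.numShorten);
}
```

Keeping the value inside the same attribute record as the stripe counts means every code path that already forwards those counts to the encoder can forward numShorten as well.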

Modifying the source code was really time-consuming and challenging. I modified about 200-300 places in the source code to finally make it work. The details will not be listed here. I will post the problems I met, my ideas, and my analysis of the source code on my blog. Sorry, I will not translate it because of my poor English.

This is how the new coding scheme works:

When I use qfsshell to view the file attributes, it shows "Num shorten: 1", which is the field I added.

QfsShell> stat f1

File: f1

ctime: 1464584934

mtime: 1464584948

Size: 398458880

Id: 4

Replication: 1

Chunks: 14

Files: 0

Dirs: 0

Owner: 500

Group: 500

Mode: 664

MinTier: 15

MaxTier: 15

Stripe size: 65536

Data stripes : 4

Recovery stripes: 3

Num shorten: 1

Type: 4

Of course, the parameter can also be passed through cptoqfs, as shown here:

qfs-put.sh --help

-s -- meta server name or ip

-p -- meta server port

-d -- source path; "-" means stdin

-k -- destination (qfs) path

[-v] -- verbose debug trace

[-r] -- replication factor; default 3

[-W] -- testing -- number test rewrites

[-n] -- dry run

[-a] -- append

[-b] -- input buffer size in bytes; default is 8MB

[-w] -- qfs write buffer size in bytes; default is 4MB, or 1MB per stripe

[-t] -- truncate destination files if exist

[-x] -- delete destination files if exist

[-u] -- stripe size

[-y] -- data stripes count

[-z] -- recovery stripes count (0 or 3 with file type 2)

[-S] -- 6+3 RS 64KB stripes 1 replica

[-R] -- op retry count, default -1 -- qfs client default

[-D] -- op retry delay, default -1 -- qfs client default

[-T] -- op timeout, default -1 -- qfs client default

[-X] -- create exclusive

[-m] -- min storage tier

[-l] -- max storage tier

[-B] -- write from this position

[-f] -- configuration file name

[-F] -- file type -- default 1 or 2 if stripe count not 0. 1 is NONE, 2 is RS(N,3), 3 is JerasureRS(N,K), 4 is MyCoding(N,K) , k is the recovery stripe count.

[-L] -- shortened num of lines, when -F is 4. Default is 0, no shorten

I use a command like this:

bin/qfs-put.sh -S -F 4 -y 4 -z 3 -L 1 -k f1 -d 384MB.dat

It creates a file encoded with MyCoding(4,3): -F sets the file type to 4 and -L sets numShorten to 1.

The following are some ideas I posted several days ago:

In ECMethod.cc, the Decode interface is as follows:

virtual int Decode(
    int        inStripeCount,
    int        inRecoveryStripeCount,
    int        inLength,
    void**     inBuffersPtr,
    int const* inMissingStripesIdxPtr)

Here inStripeCount = 6, inRecoveryStripeCount = 3, inLength = 64KB (the stripe size), inBuffersPtr holds 9 buffer pointers, and inMissingStripesIdxPtr holds the indexes of the missing chunks.

It seems that inBuffersPtr stores the available data that can be used to recover the lost chunk, and that all the buffers have the same size. In my opinion, that size should be equal to the stripe size. This differs from my new coding scheme, in which the recovery chunk has a different size.

Another question is about inMissingStripesIdxPtr, which has a size of "recovery count + 1" (for example, 3+1 for RS(6,3)). It stores the indexes of the missing chunks. So even if only one data chunk is lost, inMissingStripesIdxPtr still seems to store 3 missing chunks. I have doubts about this.
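To make the roles of these parameters concrete, here is a toy decoder with the same parameter shape as the Decode signature quoted above. It is only a sketch: it implements a single XOR parity stripe, not Reed-Solomon and not the QFS implementation, and the names ToyXorDecode and ToyXorDecodeDemo are made up. It also assumes that unused slots of inMissingStripesIdxPtr hold -1, which is a guess about one way a "recovery count + 1" array could be used and is not confirmed by the source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Toy decoder: inStripeCount data stripes plus exactly one XOR parity stripe.
// Assumption: unused slots of inMissingStripesIdxPtr hold -1.
int ToyXorDecode(
    int        inStripeCount,
    int        inRecoveryStripeCount,
    int        inLength,
    void**     inBuffersPtr,
    int const* inMissingStripesIdxPtr)
{
    if (inRecoveryStripeCount != 1 || inStripeCount <= 0 || inLength <= 0) {
        return -1; // this sketch handles a single parity stripe only
    }
    const int theTotal   = inStripeCount + inRecoveryStripeCount;
    const int theMissing = inMissingStripesIdxPtr[0];
    if (theMissing < 0) {
        return 0; // nothing to recover
    }
    if (theMissing >= theTotal || inMissingStripesIdxPtr[1] >= 0) {
        return -1; // out of range, or more losses than XOR parity can repair
    }
    // Every buffer has the same length; the lost one is the XOR of the rest.
    unsigned char* const theDst =
        static_cast<unsigned char*>(inBuffersPtr[theMissing]);
    std::memset(theDst, 0, static_cast<std::size_t>(inLength));
    for (int i = 0; i < theTotal; i++) {
        if (i == theMissing) {
            continue;
        }
        const unsigned char* const theSrc =
            static_cast<const unsigned char*>(inBuffersPtr[i]);
        for (int k = 0; k < inLength; k++) {
            theDst[k] ^= theSrc[k];
        }
    }
    return 0;
}

// Builds 3 data stripes plus parity, wipes stripe 1, and recovers it.
bool ToyXorDecodeDemo() {
    const int kData = 3;
    const int kLen  = 8;
    std::vector<std::vector<unsigned char> > theBufs(
        kData + 1, std::vector<unsigned char>(kLen, 0));
    for (int i = 0; i < kData; i++) {
        for (int k = 0; k < kLen; k++) {
            theBufs[i][k] = static_cast<unsigned char>(17 * i + k + 1);
            theBufs[kData][k] ^= theBufs[i][k]; // accumulate parity
        }
    }
    const std::vector<unsigned char> theExpected = theBufs[1];
    std::memset(theBufs[1].data(), 0, kLen); // simulate the lost stripe
    void* thePtrs[kData + 1];
    for (int i = 0; i <= kData; i++) {
        thePtrs[i] = theBufs[i].data();
    }
    const int theMissingIdx[2] = {1, -1}; // size is recovery count + 1
    return ToyXorDecode(kData, 1, kLen, thePtrs, theMissingIdx) == 0 &&
        theBufs[1] == theExpected;
}
```

In this sketch all buffers share inLength, which matches the observation that the buffers passed to Decode have the same size. A scheme whose recovery chunks differ in size would need extra information beyond these five parameters.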

What I have done so far:

I have read the source code of ECMethod.h, ECMethod.cc, QCECMethod.cc, and RSStriper.cc, and I now understand the basics of how to add a new scheme.

1. Add a new type, KFS_STRIPED_FILE_TYPE_ANOTHER, in qfs.h and kfstypes.h.

2. Create another file called ANOTHERECMethod.cc with the same structure as QCECMethod.cc. What remains is to implement the Encode and Decode methods in detail.

Some other thoughts:

I have to say that the architecture of the coding layer is wonderful. It separates definition from implementation by having each scheme inherit from the abstract class ECMethod, which is defined in ECMethod.h. All I need to do to add another coding scheme is to implement another concrete class that inherits from ECMethod. It also uses a registration mechanism to register every coding type and coding scheme, which makes it very easy to extend.
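The pattern described here (an abstract base class, one concrete class per scheme, and a registry keyed by file type) can be sketched in a few lines of C++. This is an illustration of the general idea only: CodingMethod, CodingRegistry, MyCodingMethod, and kTypeMyCoding are hypothetical names, and none of this is the real ECMethod declaration or the real QFS registration code.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an abstract coding interface.
class CodingMethod {
public:
    virtual ~CodingMethod() {}
    virtual std::string Name() const = 0;
    virtual bool Validate(int inStripeCount, int inRecoveryStripeCount) const = 0;
};

// Registry that maps a file type id to the scheme that handles it.
class CodingRegistry {
public:
    static CodingRegistry& Instance() {
        static CodingRegistry sInstance;
        return sInstance;
    }
    // Returns false if the type id is already taken.
    bool Register(int inType, std::unique_ptr<CodingMethod> inMethod) {
        return mMethods.emplace(inType, std::move(inMethod)).second;
    }
    // Returns nullptr for an unknown type id.
    const CodingMethod* Find(int inType) const {
        const auto theIt = mMethods.find(inType);
        return theIt == mMethods.end() ? nullptr : theIt->second.get();
    }
private:
    std::map<int, std::unique_ptr<CodingMethod> > mMethods;
};

const int kTypeMyCoding = 4; // illustrative id, matching type 4 in this post

// Concrete scheme: the only code a new scheme has to supply.
class MyCodingMethod : public CodingMethod {
public:
    std::string Name() const override { return "MyCoding"; }
    bool Validate(int inStripeCount, int inRecoveryStripeCount) const override {
        return inStripeCount > 0 && inRecoveryStripeCount > 0;
    }
};

// Registers the new scheme; callers then look it up by type id.
inline bool RegisterMyCoding() {
    return CodingRegistry::Instance().Register(
        kTypeMyCoding, std::unique_ptr<CodingMethod>(new MyCodingMethod()));
}
```

With this layout, code that reads or writes striped files only needs the type id from the file attributes and a call to Find, so adding a scheme does not require touching the callers.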

lipeng

mcan...@quantcast.com

Jun 13, 2016, 5:48:55 PM

to QFS Development

Hello again,

Thanks for the detailed description. I'm sure the steps here will be helpful for other people who want to add a new coding scheme.

We'd be happy to hear about how exactly your coding scheme is different than the already supported ones. I believe you said the unit of recovery is different.

Are there any other major differences?

Mehmet
