QFS Start and Stop Scripts


Michael Kamprath

Jul 31, 2016, 5:07:48 PM
to QFS Development
I installed QFS onto a small single-board-computer cluster I built out of ODROID XU4 boards (read more here). In the process, I wrote a couple of shell scripts to easily start and stop QFS on my cluster. You can view them here:


Just sharing in case anybody else might find them useful. Note that they expect an additional configuration file that contains the chunk server list. 
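Roughly, the start script boils down to something like this (a simplified sketch with placeholder paths and file names, not the scripts verbatim):

#!/bin/bash
# Sketch of a QFS start script; install paths and config file names below are placeholders.
QFS_HOME=/opt/qfs
CHUNKSERVER_LIST="$QFS_HOME/conf/chunkservers.list"   # one chunk server hostname per line

# Start the metaserver on this node.
"$QFS_HOME/bin/metaserver" "$QFS_HOME/conf/MetaServer.prp" > "$QFS_HOME/logs/metaserver.log" 2>&1 &

# Start a chunk server on every host in the list (ssh -n so ssh does not consume the list on stdin).
while read -r host; do
    ssh -n "$host" "$QFS_HOME/bin/chunkserver $QFS_HOME/conf/ChunkServer.prp > $QFS_HOME/logs/chunkserver.log 2>&1 &"
done < "$CHUNKSERVER_LIST"

The stop script is the mirror image: kill the metaserver locally and ssh to each host in the same list to kill its chunkserver.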

I will work on putting together a pull request to see if they could be added to the normal distribution.

Michael Kamprath 

Michael Ovsiannikov

Jul 31, 2016, 11:29:53 PM
to <qfs-devel@googlegroups.com>
Michael,

Thank you for your post about QFS on XU4 cluster.

If I remember right, the word count example only reads from the distributed file system. A 15% run time improvement looks rather noticeable, given that the read time is most likely a small fraction of the execution time.

The parallel build should work (make -j8 on the octa-core A7), and might speed up the build.

The Reed-Solomon encoding would work on a 3-node cluster, as an experiment only, of course. It appears that the binaries you’ve built have NEON support enabled. If you have time to try it, it would be interesting to find out if the RS NEON code works, and how performance compares with replication 3 on these boards.
There is also an RS unit and performance test program:
build/release/src/cc/qcrs/rstest 6 655360 10

— Mike.


Michael Kamprath

Aug 1, 2016, 11:42:01 AM
to QFS Development


On Sunday, July 31, 2016 at 8:29:53 PM UTC-7, movsiannikov wrote:
If I remember right, the word count example only reads from the distributed file system. A 15% run time improvement looks rather noticeable, given that the read time is most likely a small fraction of the execution time.

When I get a chance, I will dig deeper into the difference to see if it can be better explained.
 
The Reed-Solomon encoding would work on a 3-node cluster, as an experiment only, of course. It appears that the binaries you’ve built have NEON support enabled. If you have time to try it, it would be interesting to find out if the RS NEON code works, and how performance compares with replication 3 on these boards.

Similarly, I will try this out. I thought you had to have 9 nodes for RS, but I guess you only need 9 nodes to take full advantage of the fault tolerance (that is, if more than one stripe is on a given node and you lose that node, then you likely exceed the fault tolerance)?
 
Thanks!

Michael Kamprath

mcan...@quantcast.com

Aug 1, 2016, 11:50:10 AM
to qfs-...@googlegroups.com
Hi Michael,
 
The Reed-Solomon encoding would work on a 3-node cluster, as an experiment only, of course. It appears that the binaries you’ve built have NEON support enabled. If you have time to try it, it would be interesting to find out if the RS NEON code works, and how performance compares with replication 3 on these boards.

Similarly, I will try this out. I thought you had to have 9 nodes for RS, but I guess you only need 9 nodes to take full advantage of the fault tolerance (that is, if more than one stripe is on a given node and you lose that node, then you likely exceed the fault tolerance)?

That's right, you don't have to have 9 nodes for RS (in fact, you can run it even on a single node). However, ideally each chunk in a chunk group should be placed on a different node in order to take full advantage of it, as you pointed out. In other words, you can tolerate up to 3 failures only if each of the 9 chunks is on a different node. So your 3-node ODROID XU4 cluster would be able to survive a single node failure if RS were employed.
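As a quick back-of-envelope check (assuming the 9 chunks of a group spread evenly over the nodes, which the placement layer does not strictly guarantee):

# RS 6+3: each chunk group has 9 chunks and tolerates the loss of any 3 of them.
nodes=3; chunks=9; parity=3
per_node=$(( chunks / nodes ))   # 3 chunks land on each node
echo "losing one node loses $per_node chunks (recoverable as long as <= $parity)"
# With 9 nodes, per_node would be 1, so up to 3 whole nodes could fail.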

Best,

Mehmet 

Michael Kamprath

Aug 3, 2016, 1:03:34 AM
to QFS Development
On Sunday, July 31, 2016 at 8:29:53 PM UTC-7, movsiannikov wrote:
If I remember right, the word count example only reads from the distributed file system. A 15% run time improvement looks rather noticeable, given that the read time is most likely a small fraction of the execution time.


Mike, 

You are right. I realized I had my HDFS instance configured with a smaller block size than QFS's default 64MB block size. That meant there were more blocks and thus more tasks, each with its fixed overhead. I reconfigured my HDFS instance to use a 64MB block size, and QFS and HDFS then performed about the same (within a few seconds of each other).
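For anyone reproducing this, one way to get the 64MB block size is to set it at upload time rather than cluster-wide (dfs.blocksize is the Hadoop 2.x property name, dfs.block.size in older releases; the input path here is just a placeholder):

# Re-upload the test data with a 64MB block size (64 * 1024 * 1024 = 67108864 bytes).
hdfs dfs -D dfs.blocksize=67108864 -put wordcount-input.txt /data/wordcount-input.txt

Existing files keep the block size they were written with, so the data set has to be re-uploaded either way.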

Michael Kamprath

Michael Ovsiannikov

Aug 3, 2016, 1:25:56 AM
to <qfs-devel@googlegroups.com>
Michael,

Makes sense, thank you for the update. Perhaps JVM start-up time is to blame. Though, assuming a small data set that fits in memory, over 2 minutes of run time on a 3-node cluster seems like a lot. I recall small sort jobs completing in under half a minute on a production cluster with Quantcast map reduce.

— Mike.


Michael Kamprath

Aug 6, 2016, 1:12:58 AM
to QFS Development
Mike,

I ran the RS tests. First, I ran the simple word count program against the same data set, once on QFS with simple replication of 3, and once with RS 6+3 (the -S option to cptoqfs). The RS run was slower, at 2.5 minutes, compared to 2.2 minutes with straight replication.
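For the record, the two uploads looked roughly like this (the -S flag is the one mentioned above; the remaining flag spellings are from memory, and the host, port, and paths are placeholders, so check cptoqfs -h for the exact options on your build):

# Upload with 3-way replication.
cptoqfs -s metaserver-host -p 20000 -r 3 -d wordcount-input.txt -k /user/odroid/input-rep3
# Upload the same data with Reed-Solomon 6+3 encoding.
cptoqfs -s metaserver-host -p 20000 -S -d wordcount-input.txt -k /user/odroid/input-rs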

For the performance tests, this is what I get:

odroid@master:~/qfs$ ./build/release/src/cc/qcrs/rstest 6 655360 10
encode 1.012e+05 clocks 1.012e-01 sec 3.886e+08 bytes/sec
decode missing: 3,4,5 1.431e+05 clocks 1.431e-01 sec 2.748e+08 bytes/sec
decode missing: 4,5,6 1.110e+05 clocks 1.110e-01 sec 3.542e+08 bytes/sec
decode missing: 4,5,7 9.608e+04 clocks 9.608e-02 sec 4.093e+08 bytes/sec
decode missing: 4,5,8 8.549e+04 clocks 8.549e-02 sec 4.600e+08 bytes/sec
decode missing: 5,6,7 7.498e+04 clocks 7.498e-02 sec 5.244e+08 bytes/sec
decode missing: 5,6,8 5.981e+04 clocks 5.981e-02 sec 6.574e+08 bytes/sec
decode missing: 5,7,8 4.725e+04 clocks 4.725e-02 sec 8.322e+08 bytes/sec
decode average:       6.177e+05 clocks 6.177e-01 sec 4.456e+08 bytes/sec
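For what it's worth, my read of the three rstest arguments, inferred from these numbers rather than from the rstest source:

# Guessed meaning of the arguments (an inference from the output, not verified against the source):
#   6      - number of data stripes (the data side of RS 6+3)
#   655360 - bytes per stripe buffer
#   10     - iterations
# 6 * 655360 * 10 = 39,321,600 bytes, which matches e.g. 3.886e+08 bytes/sec * 1.012e-01 sec.
./build/release/src/cc/qcrs/rstest 6 655360 10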


Michael Kamprath


Michael Ovsiannikov

Aug 6, 2016, 1:47:25 AM
to <qfs-devel@googlegroups.com>
Michael,

Thank you for your testing.

The performance test numbers look good: encode is only 2.4 times slower than on a 2.6GHz Intel i7, and it appears that the NEON code works!

How big is the test file / set? If it is bigger than 64KB, and the test only reads it, the first things that come to mind that would likely explain the difference are fetching the data from more than one network node, and possibly a different number of map reduce tasks. With RS files, get data location would return 64KB stripe locations [at least with reasonably recent code], and this could be equivalent to setting a 64KB HDFS block size. If the stripe size determines the number of MR tasks, then increasing it (by explicitly specifying striping parameters via cptoqfs command line parameters) would make a difference.
 
The 10-second difference seems rather big. Do you know how long cpfromqfs of this file / set into /dev/null takes?
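Something along these lines should give the raw read throughput (host, port, and path are placeholders, and the flags mirror cptoqfs, so double-check with cpfromqfs -h):

# Time a straight copy of the test data out of QFS, discarding the bytes.
time cpfromqfs -s metaserver-host -p 20000 -k /user/odroid/input-rs -d /dev/null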

— Mike.

