Redis-based filesystem


Steve Kemp

Mar 2, 2011, 9:50:13 AM
to Redis DB
This is just a brief message to let you all know I put together a
quick redis & FUSE-based filesystem.

You can see the overview and the code here:

http://www.steve.org.uk/Software/redisfs/

It uses the hiredis C-client library for communication and is both
stable and simple.

Steve

Dvir Volk

Mar 2, 2011, 10:32:37 AM
to redi...@googlegroups.com, Steve Kemp
cool! I was just thinking how come no one has done this :)
do you have any benchmarks?


--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.




--
Dvir Volk
System Architect, Do@, http://doat.com

Steve Kemp

Mar 2, 2011, 10:46:57 AM
to Redis DB
I'm not 100% sure what kind of benchmarks will be useful. I see a
little lag when traveling across the internet & back - but over a
fast LAN there is very little delay.

A simple test is:

$ time rsync -vazr /etc /tmp/local/
real 0m3.577s
user 0m0.352s
sys 0m0.152s

Compared to my filesystem:

$ time rsync -vazr /etc /mnt/redis/
real 0m26.820s
user 0m0.836s
sys 0m0.672s

That's actually slower than I expected - but I suspect a lot of that
is the errors being output as my filesystem supports everything rsync
needs *except* for symlinks ..

Steve

Dvir Volk

Mar 2, 2011, 10:56:29 AM
to redi...@googlegroups.com, Steve Kemp
it also might be because fuse adds its own overhead compared to a kernel filesystem (not sure how much).
maybe try "cp -r" on a LOT of files instead of rsync, to demonstrate how this can be useful (I suspect the O(1) behavior will become useful here, no?)



Javier Guerra Giraldez

Mar 2, 2011, 11:37:06 AM
to redi...@googlegroups.com, Dvir Volk, Steve Kemp
On Wed, Mar 2, 2011 at 10:56 AM, Dvir Volk <dvi...@gmail.com> wrote:
> I suspect the O(1) behavior will become useful here, no?

most modern filesystems are O(1) on path->data operations. it's the
wildcard-globbing process that kills huge flat directories

--
Javier

Didier Spezia

Mar 2, 2011, 11:46:35 AM
to Redis DB
Hi,

nice work!

Looking at the implementation, it appears that all accesses to
Redis are synchronous and non pipelined. Each time a key is
get/set, redisfs has to pay for the network latency. I suspect
it is the current performance bottleneck.

With recent versions of hiredis, you now have a clean API to
pipeline your queries. Should you want to optimize it,
this would be the very first low-hanging fruit IMO.
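To put rough numbers on that, here is a back-of-the-envelope model in C. It is only a sketch: the function names are hypothetical, and it assumes the time is dominated by network round trips, ignoring server time and bandwidth. In hiredis itself the pipelined path is the redisAppendCommand / redisGetReply pair rather than one redisCommand call per key.

```c
#include <assert.h>

/* Hypothetical model, not part of redisfs or hiredis.
 * Round trips needed to send n_cmds commands when up to `batch` of them
 * are pipelined per round trip. batch == 1 models the current
 * synchronous, one-command-per-round-trip behaviour. */
unsigned long round_trips(unsigned long n_cmds, unsigned long batch)
{
    assert(batch > 0);
    return (n_cmds + batch - 1) / batch; /* ceiling division */
}

/* Estimated time spent waiting on the network, in milliseconds,
 * assuming every round trip costs rtt_ms. */
double network_wait_ms(unsigned long n_cmds, unsigned long batch,
                       double rtt_ms)
{
    return (double)round_trips(n_cmds, batch) * rtt_ms;
}
```

For example, network_wait_ms(1000, 1, 0.5) models 1000 synchronous commands at an assumed 0.5 ms round trip, and network_wait_ms(1000, 10, 0.5) models the same commands sent ten per round trip.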

Regards,
Didier.

Steve Kemp

Mar 2, 2011, 12:02:07 PM
to Redis DB

> nice work!

Thanks.

> Looking at the implementation, it appears that all accesses to
> Redis are synchronous and non pipelined. Each time a key is
> get/set, redisfs has to pay for the network latency. I suspect
> it is the current performance bottleneck.

Agreed. I put this together a few months ago, and recently
remembered it. I've been cleaning up the code for the past night
or two - and just now committed it to mercurial.

There will be a few updates later tonight when I get the chance
to work on my development system again - but as you say switching to
an async model will help a lot.

Initially I was going to abstract out the storage to a pair
of "get" + "set" primitives to make this easier, but since I
started using sets to store directory entries that plan fell
by the wayside. It might make sense to change the storage
of directory entries from:

SADD skx:/ [INODE:]1
SADD skx:/ [INODE:]2
SADD skx:/ [INODE:]3

To:

SET skx:DIRENT:/ 1,2,3

That way a lookup of directory entries becomes a single operation.
Adding a new entry just becomes appending a new inode number to the
value - and using strtok to parse for lookups should be sane.
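A minimal sketch of the lookup side, assuming the value really is a plain comma-separated list of inode numbers such as "1,2,3". The helper name parse_dirent is hypothetical and is not taken from the redisfs source:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split a directory-entry value such as "1,2,3"
 * into inode numbers. Returns how many inodes were stored in `inodes`
 * (at most `max`). The input is copied first because strtok modifies
 * the buffer it scans. */
size_t parse_dirent(const char *value, long *inodes, size_t max)
{
    assert(value != NULL && inodes != NULL);

    char *copy = malloc(strlen(value) + 1);
    if (copy == NULL)
        return 0;
    strcpy(copy, value);

    size_t count = 0;
    for (char *tok = strtok(copy, ","); tok != NULL && count < max;
         tok = strtok(NULL, ",")) {
        inodes[count++] = strtol(tok, NULL, 10);
    }

    free(copy);
    return count;
}
```

Since strtok keeps hidden state between calls, a filesystem that serves requests from several threads would probably want strtok_r here instead.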


Steve
--

Malformed Username

Mar 10, 2011, 5:09:29 PM
to Redis DB
Just as a final closure on this - the filesystem is now a little more
robust and uses zlib to compress the file contents which are stored in
redis.

Additionally I've implemented a simple filesystem-snapshot tool, which
does nothing more than copy every key belonging to the filesystem
under a new prefix. By default the filesystem is mounted at
/mnt/redis and all keys have a prefix of "skx", but a snapshot may be
made with a different prefix and that can be mounted too:

redisfs-snapshot --prefix=copy
redisfs --mount /mnt/copy --prefix=copy

Because there is no "copy" primitive for keys & sets I had to use the
"keys" command, which is suboptimal because it scans the whole
keyspace, but robust & simple to implement.
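The renaming step of such a copy can be sketched like this, assuming keys have the shape "<prefix>:<rest>" as in the skx:/ example earlier in the thread. The helper rename_key is hypothetical and is not the actual redisfs-snapshot code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: rewrite "<old_prefix>:<rest>" into
 * "<new_prefix>:<rest>". Returns 0 on success, or -1 if the key does
 * not start with "<old_prefix>:" or the result does not fit in `out`. */
int rename_key(const char *key, const char *old_prefix,
               const char *new_prefix, char *out, size_t out_len)
{
    assert(key && old_prefix && new_prefix && out);

    size_t plen = strlen(old_prefix);
    if (strncmp(key, old_prefix, plen) != 0 || key[plen] != ':')
        return -1;

    /* key + plen points at ":<rest>", so the separator is kept. */
    int n = snprintf(out, out_len, "%s%s", new_prefix, key + plen);
    if (n < 0 || (size_t)n >= out_len)
        return -1;
    return 0;
}
```

Each key returned for the old prefix would be passed through a helper like this, and its value (or its set members) written back under the new name.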

Find redisfs at http://www.steve.org.uk/Software/redisfs/

Steve
--
