
making a virtual file system


budden

Jan 5, 2009, 2:47:33 AM
Hi!
I'm interested in making a virtual file system. I know two ways to
make it:
1. fuse. There is a http://common-lisp.net/project/cl-fuse/ project,
but it is empty
2. plan9. There are references to (non-existent) http://common-lisp.net/project/ninep/
projects.
Any comments?

Pascal J. Bourguignon

Jan 5, 2009, 8:17:50 AM

Fuse works on Linux and MacOSX AFAIK. It seems to be nice.

But since you're asking this on cll, last year I started to implement
a virtual-fs in lisp, storing the data _inside_ the lisp image
(providing a CL-like file and stream API). If you're interested I may
share this GPL code...

--
__Pascal Bourguignon__

budden

Jan 5, 2009, 12:27:39 PM
Hi!

> But since you're asking this on cll, last year I started to implement
> a virtual-fs in lisp, storing the data _inside_ the lisp image
> (providing a CL-like file and stream API). If you're interested I may
> share this GPL code...
I'm interested in a solution which would make filesystem visible for
non-lisp clients. That is, data should be accessible not only from
lisp itself, but from any program which can read OS's filesystem. Lisp
should serve it, but some link to OS provided mechanisms is necessary
too. Is that what you've done?

Pascal J. Bourguignon

Jan 5, 2009, 3:16:46 PM
budden <budde...@mail.ru> writes:

No. For that, you will have to use FUSE. I see no reason why you
couldn't implement it in Lisp, using CFFI. On the other hand, what
characteristic feature of your file system justifies using lisp to
implement it?

--
__Pascal Bourguignon__

budden

Jan 5, 2009, 4:44:23 PM
> No. For that, you will have to use FUSE. I see no reason why you
> couldn't implement it in Lisp, using CFFI.
I see no such reason either. But I also see no reason why I shouldn't
check whether this has already been done by someone else...
Life's too short to do everything by myself. I don't have experience
with CFFI. But currently I have fuse's "hello world" running with file
contents generated from lisp via CFFI (with a little help from SWIG).
That's nice, but much work is ahead. E.g., there is a problem: I
cannot use SWANK to communicate with the lisp process once I've called
fuse_main. Lisp simply does not reply to a slime-connect attempt. It is
nasty, and solving it seems to be beyond my abilities today.

> characteristic feature of your file system justify using lisp to
> implement it?

It is a feature of me, not of lisp. Lisp is just the universal
language I know rather well.

Pascal J. Bourguignon

Jan 5, 2009, 5:23:55 PM
budden <budde...@mail.ru> writes:

>> No. For that, you will have to use FUSE. I see no reason why you
>> couldn't implement it in Lisp, using CFFI.
> I see no such reason too. But I also see no reason why I shouldn't
> check if this was done already by someone else...
> Life's too short to do everything by myself. I don't have experience
> with CFFI. But currently I have fuse's "hallo world" running with file
> contents generated from lisp via CFFI (with the little help of SWIG).
> That's nice, but many work is ahead. E.g., there is a problem: I can
> not use SWANK to communicate to lisp process once I've called
> fuse_main. Lisp simply do not reply to slime-connect attempt. It is
> nasty and solving it seems to be out of my abilities today.

What lisp implementation do you use?
You would need an implementation with thread support, and launch the
swank server in a separate thread from the one serving the FUSE
requests.


>> characteristic feature of your file system justify using lisp to
>> implement it?
> It is a feature of me, not of lisp. Lisp is just the universal
> language I know rather well.

That's good enough a reason :-)

--
__Pascal Bourguignon__

Andy Chambers

Jan 5, 2009, 7:24:41 PM

This article might provide you with a few ideas.

http://www.xach.com/lisp/lispvan-2008-02-28.pdf

--
Andy

budden

Jan 6, 2009, 5:34:47 AM
> This article might provide you with a few ideas.
> http://www.xach.com/lisp/lispvan-2008-02-28.pdf
Parsing debugging info is extremely interesting. I thought of it some
time ago, but I do not know executable files in depth. I'd consider
participating, but I've got too much work now.

What do you think about a hybrid approach to your task: you might parse
debugging info directly from the binary but interface with gdb instead of
writing your own lispy debugger? It should save lots of effort. I came
from the Windows world and have never used gdb, but I've heard it supports
runtime evaluation. All you need to add is support for lisp
callbacks to gdb itself, not to the program you debug. Interfacing can be
done safely with sockets/pipes.

I've seen reports that linking into lisp is generally bad practice:
C code being debugged is often unreliable, and lisp is relatively
heavyweight. If you debug a small C library linked into a huge lisp image and
it crashes, you need to restart the entire lisp image and lose all lisp
session context. Much worse is the situation where lisp gets hurt but
_doesn't_ crash. Interfacing with gdb should solve this issue.

What is VFS, and could it help with my task? Is it easier than fuse? Can
I work with it safely?

Pascal, thanks. I use sbcl with threads enabled and run fuse_main in
a separate thread. The problem was with daemonizing. Fuse daemonizes itself
unless the -f switch is passed, and then swank refuses to work (I don't know
why). Now I pass the -f switch and all works fine: I start sbcl from
Emacs as usual and the filesystem is mounted by just

(bordeaux-threads:make-thread
 (lambda () (my_main (callback readstringfromlisp))))

Then I continue working from slime as usual, trace calls to my code
from fuse, etc. For the first 5 minutes it seems to work fine :) I can
even unmount the filesystem, rebuild and reload my fuse example hello.so,
and all still works.

A few words about what I'm doing. Over the years, I've tried several
approaches to metaprogramming (MP) in other languages. I've used m4, custom
preprocessors written in lisp, and code generators written in lisp. But I
was never completely satisfied and I could never approach the ease of
use I had in lisp. Now I am trying to make a reversible macroexpander. It
would consist of:
- a lexical analyser for a target language which converts a file to a list
of lexemes (I've got one for Delphi now; other languages can be added
easily)
- a lispy toolbox for operating on lexemes in memory, e.g. for search and
replace, or parsing of some constructs. Maybe a complete parser. I have
some bits of it.
- a printer which writes the converted lexemes back to a text file. This is
very easy
The main idea to explore is to make macroexpansion partially
reversible, so that I could edit the macroexpanded file and propagate some
of the changes I've made from the macroexpanded file back to the initial
"macroimploded" file. This is rather easy to do by adding special comments to
the macroexpanded text which show the correspondence between expanded and
imploded code. I came to the conclusion that this is really required for
pragmatic use of macros in languages other than lisp. This way we get
two views of semantically the same code tree, and this is why
we need a virtual filesystem. Another problem is that macroexpanding is
time consuming. Not the macroexpanding itself, but touching every file. In
the case of C++ it is a kind of disaster, and it in fact prevents using
macros directly. So I need to make sure I do not touch macroexpanded
files that were not modified at the macroexpansion stage relative to their
previous macroexpanded version. A filesystem is not strictly required to solve
this, but it is useful here too.

Pascal J. Bourguignon

Jan 6, 2009, 6:55:07 AM
budden <budde...@mail.ru> writes:

It seems to me it would be easier to work at the level of the editor.

When you macroexpand, it's rather trivial to mark the expanded code (be
it a syntactic tree like in lisp, or text like in other macro systems), to
be able to distinguish the "template" parts from the "hook" parts (the
parts provided by the user). Then the editor could present the expanded
template parts in some read-only face, while the hook parts would still
be editable. When imploding the macro, you could also easily use this
marking to recover the macro and its arguments (the "hook" parts).


Also, notice that it's hardly possible in general with lisp macros,
since they're turing complete. For example:

(defmacro m (a)
  (if (and (listp a) (member (first a) '(+ - * /)))
      `(print ,a)
      (let ((r (gensym)))
        `(let ((,r (multiple-value-list ,a)))
           (mapcar 'print ,r)
           (values-list ,r)))))

would expand to different forms depending on the expression a.
If you change the expression a in the expanded form, then you would have
to go back to the imploded form, and reexpand it.

Also the expansion could duplicate the arguments or parts of them:

(defmacro m (a)
  `(progn (print ',a)
          ,a))

Expanding (m (+ 3 4)) you'd get:

(progn (print '(+ 3 4))
       (+ 3 4))

if you edit it to have:

(progn (print '(+ 31 42))
       (+ 3 4))

then how can you implode it back, since the two occurrences of the
argument are now different?


And no part of the argument may even be present in the expanded form:

(defmacro m (a)
  (if (consp a)
      '42
      '(do-something-else)))


Expanded you'd have either:

42

or:

(do-something-else)

both being marked as "template" and non editable.

So what you're trying to do may be possible with lesser macro
systems, but if they're as powerful as Lisp's then it won't be possible
to do it systematically. You may implement some heuristics working
sometimes, in some circumstances, but nothing general.

--
__Pascal Bourguignon__

budden

Jan 6, 2009, 7:24:03 AM
> It seems to me it would be easier to work at the level of the editor.
Yeah, but fuse allows me to use the Delphi IDE freely to edit expanded
files. That is extremely important.

> When you macroexpand, it's rather trivial to mark the extended code (be
> it syntactic tree like in lisp, or text like in other macro systems),

It will be at the lexical level. It is problematic to parse C/C++/Delphi.
But working with lexemes is much easier than with text. Some partial
parsing will be introduced, but full parsing seems to be impractical.

> When imploding the macro, you could also easily use this
> marking to recover the macro and its arguments (the "hook" parts).

Yes, this is what I intend to do.

> Also, notice that it's hardly possible in general with lisp macros,
> since they're turing complete.

Yes, and this is not my intention at all.

> And no part of the argument may even be present in the expanded form:

Yes

> both being marked as "template" and non editable.

In lisp we could expand to something like this:

#wme 42 (m (foo bar))

where the first form is the result and the second is the original form (which
should be skipped by the reader). In Delphi it would be something like

42(*me m(foo(bar))*)

But making fuse available from lisp has some other (unexpected)
advantages. E.g. we are now working with legacy software which uses dbf
files and has problems with locking. With fuse I could map the dbf files
to an SQL database and solve the locking problems.

Pascal J. Bourguignon

Jan 6, 2009, 7:34:13 AM
budden <budde...@mail.ru> writes:
> But making fuse available from lisp have some other (unexpected)
> advantages. E.g. now we working with heritage software which uses dbf
> files and have problems with locking. With fuse I could map dbf files
> to SQL database and solve locking problems.

Cool! :-)

--
__Pascal Bourguignon__

budden

Jan 10, 2009, 8:30:49 AM
Hi all!
There is some small success. I have mapped every package to a
directory, every symbol to a file, and each symbol's documentation to the
contents of its file:

deb:~/fuse-2.5.3/example/foo/SWANK-LOADER# ls -l
total 0
-r--r--r-- 1 root root 0 1970-01-01 03:00 DUMP-IMAGE
-r--r--r-- 1 root root 48 1970-01-01 03:00 *FASL-DIRECTORY*
-r--r--r-- 1 root root 0 1970-01-01 03:00 INIT
-r--r--r-- 1 root root 43 1970-01-01 03:00 *SOURCE-DIRECTORY*
deb:~/fuse-2.5.3/example/foo/SWANK-LOADER# grep directory *
*FASL-DIRECTORY*:The directory where fasl files should be placed.
*SOURCE-DIRECTORY*:The directory where to look for the source.

Pascal J. Bourguignon

Jan 10, 2009, 12:01:14 PM
budden <budde...@mail.ru> writes:

> Hi all!
> There is some small success. I have mapped every package to a
> directory, every symbol to a file and symbol documentation to a
> contents of a file:

Great!

> deb:~/fuse-2.5.3/example/foo/SWANK-LOADER# ls -l
> total 0
> -r--r--r-- 1 root root 0 1970-01-01 03:00 DUMP-IMAGE
> -r--r--r-- 1 root root 48 1970-01-01 03:00 *FASL-DIRECTORY*
> -r--r--r-- 1 root root 0 1970-01-01 03:00 INIT
> -r--r--r-- 1 root root 43 1970-01-01 03:00 *SOURCE-DIRECTORY*

  ^^^^^^^^^^                ^^^^^^^^^^
  Read only?                |
                            Symbols coming from the COMMON-LISP package
                            could be timestamped 1994-04-11 ;-)

> deb:~/fuse-2.5.3/example/foo/SWANK-LOADER# grep directory *
> *FASL-DIRECTORY*:The directory where fasl files should be placed.
> *SOURCE-DIRECTORY*:The directory where to look for the source.

For symbols not from 'locked' packages, you could let the file be
read-write to be able to easily edit the documentation.
Creating new files would intern the corresponding symbol.
Deleting would unintern them.

You could have other trees for symbol values, symbol plists, and symbol
functions. Entries for symbol functions would be executable, and you
could call lisp functions from the shell:

~/fuse-2.5.3/functions/COMMON-LISP/DIRECTORY "/tmp/*.lisp" RET

would run (CL:DIRECTORY "/tmp/*.lisp")

~/fuse-2.5.3/functions/COMMON-LISP/+ 1 2 3 4 RET

would print 10.


--
__Pascal Bourguignon__

budden

Jan 10, 2009, 1:02:13 PM
> Read only?
Yes.

>
> For symbols not from 'locked' packages, you could let the file be
> read-write to be able to easily edit the documentation.
Yeah, this is what I'm going to do. But do not forget that all this is
just a sample. In fact, there is no new functionality here that is
otherwise unavailable. Very soon I'm going to switch to doing
reversible macroprocessing, which is what I started this whole thing
for.

> Entries for symbol functions would be executable

This looks really useful. E.g. it allows lisp to be incorporated into
makefiles efficiently, or CGI to be done efficiently. Instead of loading
a new lisp image every time, just call a lisp function. But I think this
could also be achieved with a lightweight client written in C
and a lisp server. I'm almost sure this has been done already, though I was
unable to find one. Swank/telnet doesn't seem to work for me.

Btw, I'll share the code once I am able to host it.

Rob Warnock

Jan 10, 2009, 9:22:58 PM
budden <budde...@mail.ru> wrote:
+---------------

| > Entries for symbol functions would be executable
| This looks really useful. E.g. it allows to incorporate lisp into
| makefiles efficiently or to do a CGI efficiently. Instead of loading
| new lisp image every time, just call a lisp function. But I think this
| could also be achieved by writing a lightweight client written in C
| and a lisp server. I'm almost sure it is done already though I was
| unable to find a one. Swank/telnet seem not to work for me.
+---------------

Feel free to mine this one for ideas:

http://rpw3.org/hacks/lisp/cgi_sock.c

It was originally written to provide a rough equivalent to the
"mod_lisp" Apache module but as a separate external CGI program.
However, I've also hacked up versions on occasion to pass command
line argument to the persistent CL application server "as if"
they'd come from the Web.

The biggest problem with the "C client/Lisp server" model is
securing the connection without compromising the lightweight
nature of the protocol. [That is, you don't want to run a whole
OpenSSL suite, which would be more expensive than fork/exec'ing
a new Lisp image!] On Linux/Unix systems, that can be easily
accomplished by (1) using only local domain sockets (AF_UNIX,
AF_LOCAL, or AF_POSIX, depending on your operating system's stack),
*not* TCP, and (2) securing access to the directory *above* the
socket, since some operating systems ignore the file permissions
on the socket file itself.


-Rob

p.s. There's also a matching stub server in C that might be helpful
while debugging your Lisp server:

http://rpw3.org/hacks/lisp/cgi_sockd.c

-----
Rob Warnock <rp...@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607


budden

Jan 12, 2009, 2:55:56 AM
> Feel free to mine this one for ideas: http://rpw3.org/hacks/lisp/cgi_sock.c

Thanks, but I think it is now easier for me to stay with fuse.
BTW, I've implemented a writable documentation-to-file mapping: if you
alter the file, the documentation changes.
Some filesystem functions are still missing, and the most
important of them is file creation (create and mknod).
But now I'm short on time again. I think the next thing to do is to
share my code somewhere on common-lisp.net or some other place.

budden

Jan 14, 2009, 4:57:51 PM
Not sure anyone cares, but it took me about 20 minutes today to fix
yesterday's attempt to add a file creation function.

Now I can intern a symbol into my lisp image straight from the command shell:
# echo My new symbol docs > foo/ASDF/MY-NEW-SYMBOL

budden

Feb 1, 2009, 8:10:27 AM
Hi, everybody!
I'm preparing my fuse experiments for publication. Currently the
main problem is the FFI. I'm interfacing struct stat from the kernel
headers. The problem is that the types of its slots are hard to determine: they
depend on many #define's in multiple kernel headers. CFFI-grovel was
unable to help, as it can't extract C structure slot type info (and it
looks like this is impossible with the CFFI-grovel approach at all).
I tried to use SWIG, but this was a failure too: you need to copy
portions of your headers manually. Finally, I did what
I needed manually and it works, but it is not portable, it is
dirty, and it is not what I want to publish.
After communicating with Andy Chambers I have made some gdb
experiments. It is easy to print type info with gdb:
1. Make a C program like this:
    //gdb-grovel.c
    #include <sys/stat.h>

    int main(int argc, char **argv) {
        struct stat test;
        return 0;
    }
    //eof
2. Build it with
    $ gcc -g -m32 -fPIC -o gdb-grovel gdb-grovel.c
3. Start gdb:
    $ gdb gdb-grovel
4. In gdb, issue the following commands
    (gdb) break main
    (gdb) r
    ; gdb starts the program and breaks
    (gdb) ptype test
    type = struct stat {
        __dev_t st_dev;
        short unsigned int __pad1;
        ...
    }
If we capture gdb's output, we get the list of structure slots.
Then we can parse it and iterate (automatically) over its slots:
    (gdb) whatis test.st_dev
    type = __dev_t
    (gdb) whatis __dev_t
    type = __u_quad_t
    (gdb) whatis __u_quad_t
    type = long long unsigned int
Thus we can get full type information of structure.
If we grab it, it seems we can do FFI easily and reliably.

Questions are:
1. Is this approach new?
2. If not, is it really viable? Some pitfalls?
3. If all is OK, is there a place where I can find a complete
implementation (not necessarily in lisp,
but I'd like it to understand all C subtleties rather well)?

D Herring

Feb 1, 2009, 11:19:36 AM
budden wrote:

... use gdb to query dwarf/stabs info ...

> Thus we can get full type information of structure.
> If we grab it, it seems we can do FFI easily and reliably.
>
> Questions are:
> 1. Is this approach new?
> 2. If not, is it really viable? Some pitfalls?
> 3. If all is Ok, is there a place where I can find its complete
> implementation (not necessary in lisp,
> but I like it to understand all C subtleties rather well).

1. Using the compiler's debug output is not new; but nobody has
finished a good interface for it (myself included).

2. Highly viable, especially for C. The only pitfalls are getting the
right compiler and compilation flags (true of any grovelling method).

3. CL is one of the few systems to allow direct FFI binding at runtime
(as opposed to compiling C stubs for the API translation). As such,
most systems take a SWIG-style approach instead of using the debug info.


Since you probably only need a few fields out of struct stat, and you
already know their names, Autoconf provides a more portable (not
gcc/gdb-specific) way to grovel information about the structure.

# cat <<'_EOF' > configure.ac
AC_INIT
AC_PROG_CC

AC_COMPUTE_INT(SIZEOF_stat, [sizeof(struct stat)],
               [#include <sys/stat.h>])
AC_COMPUTE_INT(SIZEOF_ctime, [sizeof(((struct stat *)0)->st_ctime)],
               [#include <sys/stat.h>])
AC_COMPUTE_INT(ADDROF_ctime, [offsetof(struct stat, st_ctime)],
               [#include <stddef.h>
#include <sys/stat.h>])

echo sizeof stat $SIZEOF_stat
echo sizeof st_ctime $SIZEOF_ctime
echo addrof st_ctime $ADDROF_ctime
_EOF

# autoconf
# ./configure
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
sizeof stat 144
sizeof st_ctime 8
addrof st_ctime 104


So here we see that "struct stat" has a total size of 144 chars, and
that chars 104 to 104+8=112 form the st_ctime field...

Autoconf compiles at least one program per value found; but this
generally isn't noticeable for a small number of values. When
Autoconf detects a cross-compile, it uses a slower bit-banging
approach to finding the values.


If you don't care about the cross-compile capability, it is easier to
ask C to report the values directly.

# cat <<'_EOF' > test.c
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    /* sizeof and offsetof yield size_t, so print them with %zu */
    printf("sizeof stat %zu\n", sizeof(struct stat));
    printf("sizeof st_ctime %zu\n", sizeof(((struct stat *)0)->st_ctime));
    printf("addrof st_ctime %zu\n", offsetof(struct stat, st_ctime));
    return 0;
}
_EOF

# gcc -o test test.c
# ./test
sizeof stat 144
sizeof st_ctime 8
addrof st_ctime 104


Once you know the sizes and addresses of each field of interest, a
CFFI binding (with filler for unused values) should be easy to write.

For example, on my system, struct stat could be imitated as
struct pseudo_stat
{
    char filler0[104];
    uint64_t st_ctime;
    char filler1[32];
};

- Daniel

P.S. The above values are from a 64-bit linux machine.

budden

Feb 1, 2009, 4:51:48 PM
Thanks, Daniel, I'm encouraged by your assessment of the approach's
viability.

Autoconf is very much like cffi-grovel, and I think cffi-grovel is more
natural to use with lisp:
it produces ready-made cffi bindings. But cffi-grovel needs the names of
structure slot types too.

The most problematic case is

#ifndef __USE_FILE_OFFSET64
__blkcnt_t st_blocks; /* Number 512-byte blocks allocated. */
#else
__blkcnt64_t st_blocks; /* Number 512-byte blocks allocated. */
#endif

I do not know the name of the type, so I can't give it to cffi-grovel.

But I think I found the solution for that just now :) Solution is to
have my own

typedef
#ifndef __USE_FILE_OFFSET64
__blkcnt_t
#else
__blkcnt64_t
#endif
my_st_blocks_type;

in a file which I'll #include in cffi-grovel's test program.

Then I'll tell cffi-grovel that st_blocks is of type
my_st_blocks_type.
The case of

#ifndef __USE_FILE_OFFSET64
__ino_t st_ino; /* File serial number. */
#else
__ino_t __st_ino; /* 32bit file serial number. */
#endif

remains uncovered, but I do not need it just now, and it should be
solved not only on the C side but on the lisp side too: the solution should be
able to know the value of __USE_FILE_OFFSET64 and issue different FFI
declarations depending on it.
