[Sbcl-help] Logical pathnames as source locations for user defined functions


Shubhamkar Ayare via Sbcl-help

Jun 26, 2023, 5:26:39 AM
to sbcl...@lists.sourceforge.net

I want to deliver a compiled lisp image along with its source code. But I also
want the end user to be able to locate the definition locations of the functions
and other objects in the delivered lisp image.

SBCL seems to use logical pathnames to achieve this; most of the functions
provided with SBCL seem to have their source file specified in the form of
logical pathnames. Eg:

(DESCRIBE 'FDEFINITION)
; ...
; Source file: SYS:SRC;CODE;FDEFINITION.LISP

However, user defined functions have their source file specified as an absolute
pathname. Eg:

(DESCRIBE 'ALEXANDRIA:TYPE=)
; ...
; Source file: /path/to/quicklisp/dists/quicklisp/software/alexandria-20220707-git/alexandria-1/types.lisp

Is there a way to let the user defined functions have their source files
specified as a logical pathname?

Shubhamkar


_______________________________________________
Sbcl-help mailing list
Sbcl...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sbcl-help

Stas Boukarev

Jun 26, 2023, 8:27:46 AM
to Shubhamkar Ayare, sbcl...@lists.sourceforge.net
If you pass a logical pathname to compile-file it will be recorded as such.

Richard M Kreuter via Sbcl-help

Jun 26, 2023, 8:59:30 AM
to Shubhamkar Ayare, sbcl...@lists.sourceforge.net
Shubhamkar Ayare via Sbcl-help <sbcl...@lists.sourceforge.net> wrote:

> I want to deliver a compiled lisp image along with its source
> code. But I also want the end user to be able to locate the definition
> locations of the functions and other objects in the delivered lisp
> image.

ISTR some folks mentioning how they achieve the desired result, either
here or on sbcl-devel@. Have you had a look at the list archives? (I
might be misremembering.)

> However, user defined functions have their source file specified as an
> absolute pathname. Eg:
>
> (DESCRIBE 'ALEXANDRIA:TYPE=)
> ; ...
> ; Source file: /path/to/quicklisp/dists/quicklisp/software/alexandria-20220707-git/alexandria-1/types.lisp
>
> Is there a way to let the user defined functions have their source files
> specified as a logical pathname?

tl;dr maybe, but if you're serious about logical pathnames as part of
a delivery feature, it might require considerable and ongoing effort on
your part.

0. IME, there are a large number of larger and smaller incompatible
details among different implementations' logical pathnames. If you
care about the portability of your approach across more than one or
two current and future implementations, I'd suggest avoiding logical
pathnames.

1. SBCL's COMPILE-FILE will record the (defaulted) pathname designated
by the argument. So the "primitive" has the capability you're asking
about.

* (with-open-file (f "/tmp/program.lisp" :direction :output)
    (print '(defun program () 'program) f))
(DEFUN PROGRAM () 'PROGRAM)
* (setf (logical-pathname-translations "program")
        '(("source;*.*.*" "/tmp/*.*")))
(("SOURCE;*.*.*" "/tmp/*.*"))
* (compile-file "program:source;program")
; compiling file "PROGRAM:SOURCE;PROGRAM.LISP.NEWEST" (written 26 JUN 2023 05:30:26 AM):

; wrote PROGRAM:SOURCE;PROGRAM.FASL.NEWEST
; compilation finished in 0:00:00.004
#P"/tmp/program.fasl"
NIL
NIL
* (load *)
T
* (describe 'program)
COMMON-LISP-USER::PROGRAM
[symbol]

PROGRAM names a compiled function:
Lambda-list: ()
Derived type: (FUNCTION NIL
(VALUES (MEMBER COMMON-LISP-USER::PROGRAM) &OPTIONAL))
Source file: PROGRAM:SOURCE;PROGRAM.LISP.NEWEST

2. I'm pretty sure ASDF is willing to work with logical pathnames, i.e.,
to supply them (or their namestrings) to COMPILE-FILE, but it's been
years since I've tried. Let's assume that can be made to work.

3. It looks like you're using Quicklisp for package management. I've no
idea whether it's possible to ask Quicklisp to use logical pathnames.

4. Packages themselves, however, are liable to be the sticking point,
and where you may end up in a world of pointless work, possibly quite
a bit, depending on how "conservatively" all your dependencies'
authors name their source files.

(It takes about a hundred lines of English to explain the limitations
around LPNs that I'm alluding to here. Let me know if you're not
already familiar with that?)

Pretty much any way you might address that problem will require you
to "vendor" at least some of your dependencies (maybe all? I don't know
about Quicklisp's capabilities for different things).

For instance, you might rename certain files within source trees, to
make it easier to refer to them via LPNs. Or you might leave the
files alone, and do some sneaky tricks in the LPN layer. Either way,
you'll need to maintain modifications to packages' system definition
files, and possibly to source files too (in case a package's
source files refer to source files in the package, say).

It may all be more work than it's worth, I'm afraid. Depending on how
you expect your users to work with your deliverable, it might well be
easier to solve the problem within their development environments, e.g.,
post-processing your build-time pathnames into ones that are right for
your users via Slime or something.
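
The post-processing idea can be pictured as a plain prefix rewrite on
the recorded namestrings. The following is only a minimal sketch in
portable Common Lisp: the function name RELOCATE-SOURCE-NAMESTRING and
the example directories are hypothetical, and how the rewritten string
gets fed back into Slime or the image is deliberately left out.

```lisp
;; Hypothetical helper: swap a build-time directory prefix for the
;; prefix that is correct on the user's machine.
(defun relocate-source-namestring (namestring build-root user-root)
  "Return NAMESTRING with leading BUILD-ROOT replaced by USER-ROOT.
If NAMESTRING does not start with BUILD-ROOT, return it unchanged."
  (let ((mismatch (mismatch build-root namestring)))
    ;; MISMATCH is NIL when the strings are equal, and equals the
    ;; length of BUILD-ROOT when BUILD-ROOT is a proper prefix.
    (if (or (null mismatch) (= mismatch (length build-root)))
        (concatenate 'string
                     user-root
                     (subseq namestring (length build-root)))
        namestring)))

;; Example call with made-up directories:
(relocate-source-namestring "/build/ci-1234/src/foo.lisp"
                            "/build/ci-1234/"
                            "/home/me/project/")
```

Names that were not compiled under the build root pass through
untouched, so the rewrite can be applied blindly to every recorded
source location.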

Regards,
Richard

Shubhamkar Ayare via Sbcl-help

Jun 26, 2023, 1:41:22 PM
to Richard M Kreuter, sbcl...@lists.sourceforge.net

> ISTR some folks mentioning how they achieve the desired result, either
> here or on sbcl-devel@. Have you had a look at the list archives? (I
> might be misremembering.)

I took a look at sbcl-help archives, and did find some relevant discussions,
although not exactly about this.

> if you're serious about logical pathnames as part of
> a delivery feature, it might require considerable and ongoing effort on
> your part.

Thanks for the suggestions on avoiding logical pathnames! I'm certainly checking
out LPNs for the first time, but after some testing, it seems that one good way
to solve my problem would be to define the sources relative to the core image.
However, LPNs seem to translate only into absolute pathnames. A translation
definition such as the following

(setf (logical-pathname-translations "LOCAL")
      `(("local:*.*.*" "local/test/*")))

does not translate into what I'd want

(translate-logical-pathname (logical-pathname "LOCAL:source-file.lisp"))
;=> #P"/local/test/source-file.lisp"
; I think a relative pathname would have solved my problem here.

So, even if I use LPNs, I'd need to wrap my dumped image so that the appropriate
LPN hosts are set up on the user's machine - or I could use an init hook.
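
A minimal sketch of what such an init hook might look like, assuming
the sources ship in a "src/" directory next to the core file. The
function INSTALL-SOURCE-HOST, the host name SRCDEMO and the "src/"
layout are all made up for illustration; SB-EXT:*INIT-HOOKS* and
SB-EXT:*CORE-PATHNAME* are SBCL extensions.

```lisp
;; Hypothetical helper: point a logical host at an absolute directory.
(defun install-source-host (host root-directory)
  "Make every pathname on logical HOST translate below ROOT-DIRECTORY."
  (setf (logical-pathname-translations host)
        (list (list "**;*.*.*"
                    (merge-pathnames "**/*.*" root-directory)))))

;; Stand-alone demonstration with a made-up root directory:
(install-source-host "SRCDEMO" #p"/opt/app/src/")
(translate-logical-pathname "SRCDEMO:util;source-file.lisp")

;; In a delivered image, the root could be derived from the location
;; of the core file each time the image starts (assumed layout: the
;; sources live in "src/" beside the core).
#+sbcl
(push (lambda ()
        (install-source-host
         "SRCDEMO"
         (merge-pathnames
          "src/"
          (make-pathname :name nil :type nil :version nil
                         :defaults sb-ext:*core-pathname*))))
      sb-ext:*init-hooks*)
```

The translation itself is still absolute; it is merely recomputed from
wherever the core happens to live when the image starts.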

The other idea I was thinking about was to simply rewrite the source code
locations of the functions (and maybe other objects users might be interested
in) to relative pathnames before dumping the image itself. So, something like

(setf
 (sb-introspect::debug-source-namestring
  (sb-introspect::debug-info-source
   (sb-introspect::function-debug-info (fdefinition 'my-package:my-function))))
 "relative/pathname/to/my-package/my-function.lisp")

At the moment, I'm not that concerned about portability; so I'm okay with SBCL
specific solutions. Given the issues you mentioned about LPNs (ASDF manual too
seems to suggest avoiding LPNs), I might go with this approach of rewriting the
source locations.

Thanks a lot for the pointers! And also thanks to both you and Stas for pointing
out that COMPILE-FILE supports LPNs.

Shubhamkar

Douglas Katzman via Sbcl-help

Jun 26, 2023, 2:29:09 PM
to Shubhamkar Ayare, sbcl...@lists.sourceforge.net
Might I suggest you take a look at the comments starting at line 493 of ir1-translators.lisp and decide if it applies to you.  Then maybe use the internal variable and hope that its behavior remains stable. And/or suggest a way to do this without hacking the internals.

Shubhamkar Ayare via Sbcl-help

Jun 26, 2023, 2:52:44 PM
to Douglas Katzman, sbcl...@lists.sourceforge.net

> the comments starting at line 493 of ir1-translators.lisp

https://github.com/sbcl/sbcl/blob/54efc42e69256570f0bcbdef69b7a7921fd35416/src/compiler/ir1-translators.lisp#L493-L533

Ah, thanks! This would be exactly what I want. However, it seems that while
this plays nice with COMPILE-FILE, it does not play nice with ASDF out of the box.

Richard M Kreuter via Sbcl-help

Jun 27, 2023, 6:08:45 AM
to Shubhamkar Ayare, sbcl...@lists.sourceforge.net
Shubhamkar Ayare writes:

> However, LPNs seem to only translate into absolute pathnames.
> A translation definition such as the following
>
> (setf (logical-pathname-translations "LOCAL")
> `(("local:*.*.*" "local/test/*")))
>
> does not translate into what I'd want
>
> (translate-logical-pathname (logical-pathname "LOCAL:source-file.lisp"))
> ;=> #P"/local/test/source-file.lisp"
> ; I think a relative pathname would have solved my problem here.

Hi Shubhamkar,

Although nothing in ANSI explicitly states "logical pathnames only
translate into absolute pathnames", I believe that this behavior is what
the designers of the logical pathname facility intended; for instance,
every example showing a logical pathname translation in ANSI ends up
with an absolute physical pathname (modulo that the last example in
LOGICAL-PATHNAME-TRANSLATIONS stipulates a file system with no
directories, so "absolute" isn't meaningful there). Designers'
intentions and ANSI examples are not "binding", so I suppose
implementations could do things differently, so long as that doesn't
lead to any other violation of the standard.

Moving on from ANSI, it seems to me the observed behavior can be viewed
as one arbitrary convention. Other conventions would be possible, they'd
simply imply different usage.

If you're a Unix person, consider that

$ FOO=/home/me/foo-1.0.3; export FOO

lets you refer to a file under that directory using, say,
$FOO/bin/foo-helper, no matter what your cwd is over time. That is, if
you establish the convention, "FOO is always absolute", then
$FOO-references are decoupled from the cwd.

You could, of course, establish a different convention, say, "FOO is
always an absolute pathname ending in slash or empty". In this
convention, a $FOO-reference names a file relative to FOO when FOO is
defined, otherwise it's relative to the cwd.

Both seem to me like plausible conventions, just different. In the
first, you can 'cd' independently of how you set FOO; in the second, you
can 'cd' around freely while FOO is set, or leave FOO unset in order to
get ${FOO}bin/foo-helper to be a cwd-relative name. Some things might be
easier in the first convention, other things in the second.

(ISTM the difference here is a bit like the difference between countries
where people drive on the left vs ones where they drive on the right:
left-hand-turns are easier in one, right-hand-turns in the other, but
either way, you must absolutely know which convention you're dealing
with!)

Anyhow, I'd say this observed behavior of SBCL's logical pathnames
(which agrees, in this detail, with what I believe were the intentions
of the LPN facility's designers) is like the first environment variable
convention: they always translate to absolute physical pathnames, and so
they're decoupled from *DEFAULT-PATHNAME-DEFAULTS* and/or the process's
current working directory. So you can vary *D-P-D* and/or cwd without
affecting what "LOCAL:source-file.lisp" refers to.
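
A small sketch of that decoupling, using a made-up host LPNDEMO with a
made-up absolute translation; TRANSLATE-UNDER is a hypothetical helper
that only exists to rebind *DEFAULT-PATHNAME-DEFAULTS* around the
translation.

```lisp
;; Made-up host whose translation has an absolute right-hand side.
(setf (logical-pathname-translations "LPNDEMO")
      '(("**;*.*.*" "/home/me/src/lpndemo/**/*.*")))

;; Hypothetical helper: translate the same logical pathname while
;; *DEFAULT-PATHNAME-DEFAULTS* is bound to DEFAULTS.
(defun translate-under (defaults)
  (let ((*default-pathname-defaults* (pathname defaults)))
    (translate-logical-pathname "LPNDEMO:test;source-file.lisp")))

;; Both calls are expected to name the same absolute physical file,
;; whatever the defaults are bound to.
(equal (translate-under "/tmp/") (translate-under "/var/"))
```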

So yeah, there isn't currently a way in SBCL to get logical pathname
translation to produce a physical pathname that refers to a filename
relative to a dynamic notion of working directory. There could be, but
probably only by extending something somewhere, if backward
compatibility is to be preserved for public interfaces.

For my part, ISTM that once I got used to the convention SBCL's LPNs
implement, it's never bothered me. I'd just use an absolute pathname for
the right-hand-side of the translation:

(setf (logical-pathname-translations "LOCAL")
      '(("local:*.*.*" "/home/me/src/local/test/*"))) ;;or wherever

And if I should need to move the source tree, or move development over
to a host where my home isn't /home/me, I'll just use a different
translation,

(setf (logical-pathname-translations "LOCAL")
      '(("local:*.*.*" "/Users/me/source/local/test/*")))

But honestly, even though I do occasionally use logical pathnames
myself, I think of that usage as a personal preference that I really
can't recommend. Like a strong-smelling cheese or something; nobody
strictly needs the stuff, some people like it, some people hate it, and
it may not be good for you to use every day. :-)

Regards,
Richard

Fábio Almeida

Jun 27, 2023, 11:13:43 AM
to Shubhamkar Ayare, sbcl...@lists.sourceforge.net
Hello,

We do have the same problem.

Building on CI servers with "random" paths doesn't play all that well with
local setups. We use a mix of ASDF to load libraries and DEFSYSTEM to load
our code (some legacy from the TI Explorer, I think...).

Up until now, our questionable approach has been to advise or destructively
set a bunch of internals, both from ASDF and SBCL, before dumping a core.
This works, for a definition of working, because we don't upgrade SBCL that
often, we don't use Quicklisp (we "vendor" our own libraries), and our
developers use the same setup as far as source location goes.

In this regard, ACL is a bit more helpful with its "source file frobbers".
They do provide a couple of special variables to help in this regard, allowing
users to define functions that will "frob" sources before being returned.
They also seem to centralize all source finding in a single point which is
also nicer for advising it and doing this sort of stuff.
https://franz.com/support/documentation/current/doc/source-file-recording.htm

In this regard, SBCL (or maybe SLIME?) is a bit more annoying because there
are more places that need to be "translated". Maybe defining that single
point and using it throughout would be an approach to deal with these cases?


Anyway, I've been exploring some alternatives to this brute force approach.

I've managed to get ASDF to load libraries with Logical Pathnames; it does
support them. Not only does FIND-DEFINITION (and company) work as expected,
but loading new libraries on top of a dumped core also works. (Why do we need
this? We don't load testing libraries, for example, on a core by default.)

Unfortunately, I also have no idea if Quicklisp supports them.

One of the problems is the restricted set of characters allowed in Logical
Pathnames. As Richard pointed out, many libraries available on Quicklisp do
not conform to the restricted set of characters, and SBCL doesn't really
support extending the set of legal characters beyond what the standard
defines.

I've implemented a simple extension on top of SBCL that would allow an
extended set of characters to be used with Logical Pathnames (but still with
restrictions, especially regarding dots). I have no idea if this is desirable
or not, but I'll leave it attached so that maintainers can have a look.

Even with this extension, not all of the libraries available on Quicklisp
would be loadable. And again, this only works for us because of all I
mentioned above.


Best Regards,
Fábio Almeida
Software Engineer

SISCOG - Sistemas Cognitivos, SA
A Campo Grande, 378 - 3º, 1700-097 Lisboa, Portugal
T +351 217 529 100
W www.siscog.pt

Optimising the resources of the world

DISCLAIMER This message may contain confidential information. You should
not copy or address
this message to third parties. If you are not the appropriate recipient
we
kindly ask you to delete
the message and notify the sender.
The contents of this message and its attachments are the sole
responsibility of the sender and under
no circumstances can SISCOG - Sistemas Cognitivos, SA be liable for any
resulting consequences.
0001-Extend-logical-pathnames-to-support-non-standard-cha.patch

Richard M Kreuter via Sbcl-help

Jun 27, 2023, 1:43:56 PM
to Fábio Almeida, sbcl...@lists.sourceforge.net
Fábio Almeida <fabio....@siscog.pt> wrote:

> I've implemented a simple extension on top of SBCL that would allow an
> extended set of characters to be used with Logical Pathnames (but
> still with restrictions, specially regarding dots). I have no idea if
> this is desirable or not, but I'll leave it attached so that
> maintainers can have a look.

Hi Fábio,

A few questions:

- as far as I can see, nothing in that patch would compensate for the
lettercase folding that happens when translating, e.g.,

(translate-logical-pathname "FOO:SOURCE;BAR.LISP")
=> #P"/home/me/src/foo/bar.lisp"

So this prompts a question: do you build/develop/run exclusively on
case-insensitive file systems (the defaults on MacOS and Windows)? Or
do you (sometimes) use a case-sensitive file system, but mixed-case or
solid-uppercase filenames don't occur in your stack?

- Have you got a sense for how frequently the "offending" filenames
occur, say, as a percentage of all your source files?

- Do you happen to find many non-ASCII characters in filenames? Many
non-European characters in filenames? Emojis? Flags? [ No joke, it
might matter for different approaches! ]

If these things are rare, there might be some "lightweight" approaches
available, even without altering SBCL at all.
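
To get a rough answer to the frequency questions, one could survey the
directory and file name components for characters outside the standard
logical pathname word set (letters, digits, hyphen). A minimal sketch;
the function names and the sample component names are made up, and it
deliberately ignores the lettercase question, since lowercase letters
are accepted here.

```lisp
;; ANSI logical pathname words consist of letters, digits and hyphens.
(defun standard-word-char-p (char)
  (or (char= char #\-)
      (and (standard-char-p char) (alphanumericp char))))

;; Hypothetical helper: the distinct characters of COMPONENT that a
;; standard logical pathname word could not contain.
(defun offending-characters (component)
  (coerce (remove-duplicates (remove-if #'standard-word-char-p component)
                             :from-end t)
          'list))

;; Hypothetical helper: keep only the components that have offenders.
(defun survey-components (components)
  "Return an alist of (component . offending-characters)."
  (loop for component in components
        for bad = (offending-characters component)
        when bad collect (cons component bad)))

;; Example with made-up directory names:
(survey-components '("alexandria-1" "cl+ssl" "some-lib_1.2.3"))
```

The components are meant to be single directory or file names without
the type, so the dot separating name and type is not counted.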

Although I am not an active maintainer, I have spent more time studying
and thinking about pathnames than any human ever should. I would be
concerned about using a special variable for the job: it seems like one
of those variables that must never never vary after you've configured
it, right? If it changes dynamically, then you'll get errors at
unpredictable moments in a program: during parsing, during reading,
during any pathname operation that happens to validate LPN components.

So one option that occurs to me would be to make a list of "extra
allowed characters" be a slot in the logical pathname host, so that, say

foo:source;bar+baz.lisp

would be valid for the host FOO only if FOO were configured that way,
and not affect other hosts at all. This way, reading or parsing FOO's
namestrings, or constructing pathnames on FOO couldn't go wrong over
time due to changes in the dynamic environment, only if you modified FOO
itself. This would be pretty similar to the effect of your patch, just
different semantics and data flow within the pathnames implementation.
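
A toy model of the per-host idea, kept entirely outside SBCL's
pathname implementation: a table maps a host name to its extra allowed
characters, and validation consults only that host's entry. All names
here (*EXTRA-WORD-CHARACTERS*, ALLOW-EXTRA-CHARACTERS,
VALID-WORD-FOR-HOST-P) are hypothetical, and a real implementation
would store the characters in the host object itself.

```lisp
;; Hypothetical stand-in for a slot in the logical host: host name
;; (compared case-insensitively) -> string of extra allowed characters.
(defvar *extra-word-characters* (make-hash-table :test #'equalp))

(defun allow-extra-characters (host characters)
  "Configure HOST to accept CHARACTERS in addition to the standard set."
  (setf (gethash host *extra-word-characters*) characters))

(defun word-char-for-host-p (host char)
  (or (char= char #\-)
      (and (standard-char-p char) (alphanumericp char))
      (find char (gethash host *extra-word-characters* ""))))

(defun valid-word-for-host-p (host word)
  "True if every character of WORD is acceptable on HOST."
  (and (plusp (length word))
       (every (lambda (char) (word-char-for-host-p host char)) word)))

;; Only FOO is configured to accept a plus sign:
(allow-extra-characters "FOO" "+")
(valid-word-for-host-p "FOO" "bar+baz")    ; accepted on FOO
(valid-word-for-host-p "OTHER" "bar+baz")  ; rejected on other hosts
```

Because the configuration is looked up through the host, rebinding
some dynamic variable cannot change what is valid on FOO; only
reconfiguring FOO can.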

Another option could be to add an "escape character" to LPN namestring
syntax, either on a per-host basis or just generally. Let's say it was
#\^ (circumflex). Then you could write

"foo:source;bar^+baz.lisp"

and move on with life. (Prior art: an escape character was a feature of
logical pathnames on the MIT-descended Lisp Machines, though their
escape character was something else.) An escape character would get very
ugly if an LPN needed a lot of them, though, say

"foo:source;^Ba^R^_^Ba^Z.^Li^Sp" ;; but who names a file BaR_BaZ.LiSp?

This is why I've asked the questions above: if the "offenders" are rare
in fact, maybe an escape character would be okay in practice, if
not in principle.
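
One possible reading of the escape-character idea, as a sketch only:
unescaped characters are folded to lowercase (mirroring the folding in
the TRANSLATE-LOGICAL-PATHNAME example above), while a character that
follows the escape is kept exactly as written. The function name
PHYSICAL-NAME-FROM-ESCAPED-WORD is hypothetical and nothing like it
exists in SBCL.

```lisp
;; Hypothetical helper: turn one escaped logical pathname word into
;; the physical name it is meant to denote.
(defun physical-name-from-escaped-word (word &optional (escape #\^))
  (with-output-to-string (out)
    (let ((escaped nil))
      (loop for char across word
            do (cond (escaped
                      ;; Escaped character: keep it literally.
                      (write-char char out)
                      (setf escaped nil))
                     ((char= char escape)
                      (setf escaped t))
                     (t
                      ;; Ordinary character: fold to lowercase.
                      (write-char (char-downcase char) out))))
      (when escaped
        (error "Dangling escape character at end of ~S" word)))))

(physical-name-from-escaped-word "bar^+baz")      ; => "bar+baz"
(physical-name-from-escaped-word "^Ba^R^_^Ba^Z")  ; => "BaR_BaZ"
```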

Regards,
Richard

(I'm being deliberately vague about how to set & get an extra slot in
an LPN host, and/or the precise effect of an "escape character"; the
details can be worked out, but no sense spending cycles on it unless
people would want it.)

Fábio Almeida

Jun 28, 2023, 7:52:01 AM
to Richard M Kreuter, sbcl...@lists.sourceforge.net
Yes, this is a very simplistic approach that works on my machine™.
The expertise, knowledge and questions you are asking are exactly what
I was looking for.

> So this prompts a question: do you build/develop/run exclusively on
> case-insensitive file systems (the defaults on MacOS and Windows)? Or
> do you (sometimes) use a case-sensitive file system, but mixed-case or
> solid-uppercase filenames don't occur in your stack?

Yes, we do build/develop/run exclusively on Windows which, by default, is
case insensitive (although it does support case sensitive folders).

Case folding was definitely an oversight.

> - Have you got a sense for how frequently the "offending" filenames
> occur, say, as a percentage of all your source files?

I'd say less than 1% are problems with file names inside the projects
(there was only one instance of a file with a dot in the name, that I
can recall). Out of the 60-something libraries we are using, the majority
of offending pathnames were related to dots in folder names (quite common
for semantic versioning), an odd underscore in the CFFI folder name and,
of course, 'cl+ssl' 😅.

> - Do you happen to find many non-ASCII characters in filenames? Many
> non-European characters in filenames? Emojis? Flags? [ No joke, it
> might matter for different approaches! ]

Not on libraries, not that I am aware. But definitely on user facing
"stuff". On that front, at least UTF-8 support is mandatory. But in
those cases I'd strongly recommend against Logical Pathnames (even for
temporary files/paths).

> (...) I would be
> concerned about using a special variable for the job: it seems like one
> of those variables that must never never vary after you've configured
> it, right?

That seems very sensible, and I totally agree.

The escaping you suggest was something that we also discussed internally,
but this small hackathon took me somewhere else.

But I think I like the idea of using logical hosts more.
Some prototyping might be in order (no promises though).

On a final note: as far as I am concerned, I don't really feel a necessity
to use Logical Pathnames (or expand them for that matter), but there seems
to be a loophole on this front for Common Lisp + ASDF + Quicklisp:
what if I want to build on a different system/folder structure from the one
on which I'll be developing? Logical Pathnames do seem to be the way to
answer this, but alas, it doesn't really work, does it?

From my exploration and experimentation, there isn't really a de facto way
to address this, is there?


Best regards,
Fábio Almeida
Software Engineer
