Hi all,
I've been thinking about making libraries that generate submodules when they're used. However, submodules exist in a flat namespace, so I'm a bit afraid of conflicts if I choose the same name as some other library does. I also don't really want users to have to supply their own local choices of names (`rename-in` style), since I'm thinking of these submodules as an implementation detail.
To be more specific about my higher-level goals, I'm thinking of experimenting with a system of modules that have *optional compile-time arguments*, which make them somewhat like ML functors. If a user requires the module the usual way, they get the default arguments, but they can use a special require spec and a system of extended module path indexes to supply arguments. For instance, an extended module path could represent "apply module X to the arguments 1 and 2, and then access the resulting module's Y submodule." Since the default way to require a module just gets its no-argument version, I'm thinking of hiding away the argument-processing logic in a submodule of its own.
When someone supplies these arguments to a module, what's really going to happen is that they're defining a local submodule and requiring it on the spot. After all, the compilation of that module with those arguments has to happen sometime, and it couldn't have happened already, so it must be compiled alongside the current module. A submodule represents this situation well.
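As a minimal sketch of "define a local submodule and require it on the spot", the following uses a hand-written submodule in place of a generated one. The name `x-applied-to-1-2` and its contents are hypothetical and only stand in for whatever the library would actually generate:

```racket
#lang racket/base

;; Hypothetical stand-in for "module X applied to the arguments 1 and 2".
;; In the real design this declaration would be produced by a macro.
(module x-applied-to-1-2 racket/base
  (provide result)
  (define result (+ 1 2)))

;; Require the freshly declared submodule from its enclosing module.
(require (submod "." x-applied-to-1-2))

result ; prints 3 when this module is run
```

The submodule is compiled together with the enclosing module, which is the property the paragraph above relies on.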
A subtler design challenge with this idea is that a library with compile-time arguments probably needs to stop using "generative" definitions of structure types, so that its types can remain stable across various choices of module arguments. So I'd probably supply a type definition mechanism that associates the defined type with a stable module path, similar to the way `serializable-struct` creates a submodule called `deserialize-info`.
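To make "generative" concrete, here is a small sketch in which the same `struct` form is evaluated twice and yields two incompatible types. The function `make-point-type` is a hypothetical stand-in for a module body that gets instantiated once per choice of arguments:

```racket
#lang racket/base

;; Each call evaluates the `struct` form again, creating a fresh type.
(define (make-point-type)
  (struct point (x y))
  (values point point?))

(define-values (make-point-a point-a?) (make-point-type))
(define-values (make-point-b point-b?) (make-point-type))

(define p (make-point-a 1 2))

(point-a? p) ; #t
(point-b? p) ; #f, even though both types came from the same definition
```

A value built by one instantiation is rejected by the predicate of the other, which is the instability a non-generative type definition mechanism would need to avoid.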
As you can see, if I proceed the way I'm imagining, my library is going to be generating submodules for several reasons. These submodules would exist mostly as a means to an end, so I'm not immediately inclined to expose them to users the way `serializable-struct` does. I probably could stabilize them if I put in some extra thought, but my first choice, especially early in development, would be to keep these details private. At least, private to anyone who isn't using reflective tools like `module-compiled-submodules` or `current-module-name-resolver`.
In the past, I've guarded against accidental namespace conflicts by using gensyms as my variable names. That approach seems viable here too.
It's a little tricky to do. The name of a submodule being defined or required must be known at compile time, but due to Racket's separate compilation guarantee, different clients using my library at compile time will be using different instantiations of it. If my library just calls `(gensym)`, those clients will all end up using different gensyms, and it won't work. So every instantiation of my library needs to obtain the same gensym, and I do that by generating a gensym one phase up and embedding it in a quotation, like #`(... '#,(gensym) ...). While the Racket compiler can't marshal every kind of 3D syntax into the compiled code, gensyms are one thing it actually can marshal. The gensym's unique identity seems to be generated again at the time it's unmarshaled, which is exactly what I want. Since a (non-reflective) program will unmarshal my library only once, the gensym will be unique to my library but shared across all my library's instantiations.
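The reason a plain `(gensym)` call per instantiation can't work is that every call returns a distinct uninterned symbol. A short sketch of that behavior (the names `g1` and `g2` are only for illustration):

```racket
#lang racket/base

;; Two calls with the same base name still produce different symbols.
(define g1 (gensym 'badlang-submodule-name))
(define g2 (gensym 'badlang-submodule-name))

(symbol-interned? g1) ; #f
(eq? g1 g2)           ; #f

;; Round-tripping through the printed name yields an interned symbol,
;; which is not the same object as the gensym.
(eq? g1 (string->symbol (symbol->string g1))) ; #f
```

This only demonstrates the run-time identity of gensyms; it does not exercise the marshaling behavior described above.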
(See the end of this email for example code.)
Using that technique, everything works fine... at least on the command line. Unfortunately, in DrRacket, the submodule simply isn't found when I try to require it:
require: unknown module
module name: #<resolved-module-path:(submod "/path/to/badlibrary.rkt" badlang-submodule-name12021)>
As far as I can tell, this error is specific to DrRacket. I tried compiling my code with more instrumentation at the command line using "racket -e '(compile-context-preservation-enabled #t)' -l errortrace -t client.rkt", but even that works.
Am I simply running into a bug in DrRacket, or is this gensym technique obscure enough that I shouldn't rely on it? Is there a more stable technique that would give me a similar guarantee that my names aren't collision-prone? My goal with using a gensym was to avoid accidental incompatibilities with other code, so of course an immediate incompatibility with DrRacket is a sign I might want to take a different approach.
Here are the three files of code I prepared to try this out (badlang.rkt, badlibrary.rkt, and client.rkt):
#lang racket
; badlang.rkt

(require (for-meta 2 syntax/parse))
(require (for-syntax racket))
(require (for-syntax syntax/parse))

(provide (all-defined-out))

; As long as we define `badlang-submodule-name` as an interned symbol
; like this, it works everywhere.
#;
(define-for-syntax badlang-submodule-name
  'private/generated-by-badlang/submodule)

; As long as we define `badlang-submodule-name` as a quoted uninterned
; symbol like this, it works at the command line but not in DrRacket.
(begin-for-syntax
  (define-syntax (define-quoted-gensym stx)
    (syntax-parse stx
      [(_ var:id)
       #`(define var '#,(gensym (syntax-e #'var)))]))
  (define-quoted-gensym badlang-submodule-name))

(define-syntax (define-badlang-submodule-here stx)
  #`(module #,badlang-submodule-name racket))

(define-syntax (require-badlang-submodule-from stx)
  (syntax-parse stx
    [(_ parent-module)
     #`(require (submod parent-module #,badlang-submodule-name))]))


#lang racket
; badlibrary.rkt

(require "badlang.rkt")

(define-badlang-submodule-here)


#lang racket
; client.rkt

(require "badlang.rkt")

(require-badlang-submodule-from "badlibrary.rkt")

-Nia