Two very basic and very intentional elements of Crack's design are:
1) declaration order is significant, allowing for a single pass compile.
2) Module dependencies are DAGs. If module A imports module B, module B
cannot directly or indirectly import module A. This lets us preserve rule
1 (because we can safely compile B knowing that it won't be looking for
anything that hasn't been defined yet in A) and also greatly simplifies
module initialization and cleanup: we can assume that all imported
modules have been initialized prior to a module's initialization.
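The initialization guarantee in rule 2 can be illustrated with a toy model. This is a sketch in Python, not Crack's actual implementation: the module names and the `init_order` helper are made up for illustration. It computes an order in which every imported module comes before its importer, and it fails if the import graph is not a DAG.

```python
# Toy model (not Crack's implementation): module name -> modules it imports.
IMPORTS = {
    "main": ["a", "b"],
    "a": ["b"],
    "b": [],
}

def init_order(imports):
    """Return modules in an order where every import precedes its importer.

    Raises ValueError if the import graph contains a cycle.
    """
    order = []
    state = {}  # module -> "visiting" or "done"

    def visit(name, path):
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError("import cycle: " + " -> ".join(path + [name]))
        state[name] = "visiting"
        for dep in imports[name]:
            visit(dep, path + [name])
        state[name] = "done"
        order.append(name)

    for name in imports:
        visit(name, [])
    return order

print(init_order(IMPORTS))
```

Because the graph is acyclic, a depth-first walk is enough: a module is appended only after everything it imports has been appended.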
These rules break down when it comes to generics. In order to generate a
single copy of the executable representation of a generic instantiation, Crack
produces an "ephemeral module." Ephemeral modules are modules that do not
directly correspond to a top-level source file. So for example, consider the
following simple script:
    class A[T] {}
    A[int] ai = {};
If you run this with caching enabled, you'll generate two sets of
cache files: one for the module itself, and one for the instantiation of
A[int]. If you use your script as a module, and another script imports it and
instantiates A[int], it won't have to recompile A[int]: it can just use the
cached representation produced by the original file.
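The reuse described above amounts to keying the cache on the instantiation rather than on the importing script. A rough Python sketch of that idea follows; the `InstantiationCache` class, the key format, and the in-memory dictionary standing in for on-disk cache files are all assumptions for illustration, not how Crack stores its cache.

```python
class InstantiationCache:
    """Toy stand-in for the on-disk cache of generic instantiations."""

    def __init__(self):
        self._entries = {}
        self.compile_count = 0  # how many times we actually "compiled"

    def get(self, generic, type_args):
        # Hypothetical key format: "<generic>[<type args>]".
        key = f"{generic}[{','.join(type_args)}]"
        if key not in self._entries:
            # Cache miss: pretend to compile the instantiation once.
            self.compile_count += 1
            self._entries[key] = f"<compiled {key}>"
        return self._entries[key]

cache = InstantiationCache()
from_original_script = cache.get("script.A", ["int"])
from_importing_script = cache.get("script.A", ["int"])
print(cache.compile_count, from_original_script is from_importing_script)
```

The second lookup finds the entry produced by the first, so the instantiation is compiled only once.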
Unfortunately, this lets you break the "Modules are DAGs" rule:
    int g;
    class A[T] {
        int f() { return g; }
    }
    A[int]().f();
The module above generates what we call a "cyclic ephemeral." A.f() (and
therefore, all instantiations of A) uses the "g" variable from its enclosing
module. The enclosing module then creates an instance of A. So we have two
modules with a cyclic relationship: the script's module, and the ephemeral
module for A[int].
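To make the cycle concrete, here is a small Python sketch that models the two modules as a dependency graph and searches it for a cycle. The module names (in particular "script.A[int]" for the ephemeral module) and the `find_cycle` helper are hypothetical; the real compiler does not necessarily detect cycles this way.

```python
# Hypothetical module names; an edge means "needs a definition from".
DEPS = {
    "script": ["script.A[int]"],   # the script instantiates A[int]
    "script.A[int]": ["script"],   # A[int].f() reads the script's global g
}

def find_cycle(deps, start):
    """Return a list of modules forming a cycle reachable from start, or None."""
    path = []         # modules on the current depth-first path, in order
    on_path = set()   # same modules, for fast membership checks
    finished = set()  # modules fully explored without finding a cycle

    def walk(name):
        if name in on_path:
            # Back edge: the cycle is the path from the earlier visit to here.
            return path[path.index(name):] + [name]
        if name in finished:
            return None
        path.append(name)
        on_path.add(name)
        for dep in deps.get(name, []):
            cycle = walk(dep)
            if cycle:
                return cycle
        on_path.discard(name)
        path.pop()
        finished.add(name)
        return None

    return walk(start)

print(find_cycle(DEPS, "script"))
```

For an acyclic graph the function returns None, which is the situation rule 2 is meant to guarantee for ordinary modules.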
A substantial portion of the time I've spent developing Crack has been spent
dealing with the implications of this problem. I would be surprised if it
hasn't consumed more than half of my development time. I have often
considered that we would be much better off in terms of being able to deliver
a 1.0 version of Crack if I had decided early on to not support this pattern,
but if I had, I don't think the language would be as good.
When dealing with this problem in the context of getting caching to work, I
was faced with two options:
1) Work around it using stubs. When an ephemeral module requires a dependency
on its master module (which hasn't been deserialized yet) just obtain the
dependency from a stub for the master module. When the master module is
finally fully deserialized, go through its ephemerals and fix all of the
stub references.
2) Treat cyclic ephemerals as part of the same module for purposes of caching.
The entire set of cyclic modules would be persisted in the same set of
meta-data and bitcode files, then the problem of cyclic dependencies
collapses to the solved problem of intra-module cyclic dependencies.
I made a mistake and chose option 1. It seemed like the easiest thing to do
at the time, and there were certain design elements in play then ("JIT
on import" behavior, for example) that seemed to favor it.
Unfortunately, the use of stubs proved non-viable. After a huge amount of
effort implementing them, I discovered that the approach broke down when a
recompile of an ephemeral was necessary during deserialization of the master
module. In that case, the compiler requires deep information on the symbols
used from the stub: more than you could in any way hope to provide with
stubbing.
So I've spent the past month converting the system to use option 2, copersistence.
There were several other fundamental changes that preceded copersistence and
made it more feasible. These were things like "late-jitting" (we now only jit
code when we absolutely need to run it) and the collapse of all modules to a
single module that greatly simplified the codebase and laid the groundwork for
the special treatment that copersistence requires.
During compilation, when an ephemeral module is created with potential cycles
with its source module, we designate it as a "slave" of the source module,
which is now referred to as the slave's "master." (Some of the nomenclature is
still wrong in the source code: this used to be the "owner," but that conflicts
with the term used to describe the namespace that owns any kind of variable.
Going forward, we'll use "master/slave" to define the cyclic module
relationship, and "owner" only for the more common variable/namespace
situation.)
The master module is serialized with all of the information in all of its
slaves: the meta-data file contains meta-data for the entire group and the
bitcode file contains bitcode for the entire group. There are still separate
meta-data files for the slaves, but these are just stubs that reference the
master module. It is questionable whether these need to be generated at all.
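The layout just described can be sketched roughly as follows. This is a Python toy model under stated assumptions: the ".meta" and ".bc" file names, the dictionary shapes, and the `persist_group` and `load_metadata` helpers are invented for illustration and are not the actual cache format.

```python
def persist_group(master, slaves, metadata, bitcode):
    """Build the cache 'files' for a master module and its slaves."""
    group = [master] + slaves
    files = {
        # One meta-data file and one bitcode file hold the entire group.
        master + ".meta": {"modules": {m: metadata[m] for m in group}},
        master + ".bc": b"".join(bitcode[m] for m in group),
    }
    for slave in slaves:
        # A slave's own meta-data file is only a stub pointing at the master.
        files[slave + ".meta"] = {"master": master}
    return files

def load_metadata(files, module):
    """Look up a module's meta-data, following a slave stub if there is one."""
    entry = files[module + ".meta"]
    if "master" in entry:
        entry = files[entry["master"] + ".meta"]
    return entry["modules"][module]

metadata = {"script": {"defs": ["g", "A"]}, "script.A[int]": {"defs": ["f"]}}
bitcode = {"script": b"<script bitcode>", "script.A[int]": b"<A[int] bitcode>"}
files = persist_group("script", ["script.A[int]"], metadata, bitcode)
print(sorted(files))
print(load_metadata(files, "script.A[int]"))
```

In this model the slave stub carries no information of its own, which is consistent with the open question of whether such files need to be written at all.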
In playing with the system since last night, I have thus far been unable to
discover any problems with it (I'm sure I will). Even with very complicated
generics like Array, the system seems to behave as I expect. It is also
interesting to note that in implementing this, I removed more than two-thirds
as much code as I added, and I think I still have a lot of dead code to
remove. Usually when you end up with less code providing the same
functionality, it's a good sign that you've done the right thing.
I'm going to do some more testing and code removal, but at this point I plan
to release version 0.10 of the code at the end of the month. I still won't
have caching enabled by default, but hopefully it will be at the point where
the daring can finally use it for most of their work.
Caching does improve performance. It generally seems to reduce startup time
by about 40%. Unfortunately, jitting is still quite expensive.
After 1.0, I intend to start implementing some alternate, non-LLVM backends
to try to address this.
Meanwhile, thank you all for your patience. The caching effort has been way
bigger than expected, but it looks like there's finally some light at the end
of the tunnel.
=============================================================================
michaelMuller =
mmu...@enduden.com |
http://www.mindhog.net/~mmuller
-----------------------------------------------------------------------------
There is no concept that is more demeaning to the human spirit than the
notion that our freedom must be limited in the interests of our own
protection.
=============================================================================