Milan asked for more details about document serialization. Below are some thoughts on the matter. They are preliminary. I haven't implemented the scheme below, and so it is subject to change.
The context here is that I would like a solution that allows me to save and load documents, including their dependencies, which include resources like images (or audio or video media etc), classes they use in their amplets, and transcluded documents.
Moreover, if we are to fulfill the vision of documents as full-blown applications, we need to support scenarios where the app should run with specific data. For example, Telescreen is a presentation manager written in Newspeak, and one wants to save and load not just the Telescreen application, but one or more specific presentations. The plan is to revise Telescreen to be itself a document. Today the Telescreen app has its own scheme for saving presentations as zip files, and this does not cope with dependencies.
Now let me explain how I've been looking at this problem.
We want a serialization solution that preserves a document as completely as possible. There is a spectrum here, from fully preserving document state, to preserving just document "source" with or without its dependencies. Currently, we only do the latter. If object serialization worked properly, it would subsume all of these, but I need something I can implement quickly.
As noted above, my focus here is on an intermediate case - preserving the document source, its dependencies, and the data needed to start it up.
One perspective is that we have a player and content (think a media player and an audio/image/video file). The player is a program that may in fact have dependencies.
Therefore, we shift perspective and consider a program, its dependencies and the data it needs to start up. We can represent it as a main program (the entry point of which is standardized), a list of dependencies (programs again) and some data the program will read at launch. The dependencies may be named symbolically within the program. This leads us to a representation that is a map of names to code (which embeds names symbolically referencing code or programs) paired with data.
We can also view the program as an interpreter of its initial data. Especially if the program is indeed an interpreter, in which case the data or content is itself a program. In the case where the data itself is an interpreter, we have a recursion.
Now we can think about a document as a program. Its dependencies include resources, classes, but also transcluded documents. So we can use a map of named entities, that are either classes or documents or resources (media), paired with data (which may be empty). In the case of Telescreen, the data includes the Telescreen slides, each of which is itself a document, which may have its own dependencies such as classes or transcluded documents.
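To make the shape of this structure concrete, here is a minimal sketch in Python rather than Newspeak. Nothing here is implemented in Ampleforth; the type names (ClassSource, Resource, Program), the field names and the example entries are all made up for illustration. It only shows a map of named entities paired with data, where an entity may itself be a document with its own map.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Union


@dataclass
class ClassSource:
    """Source text of a class a document depends on (hypothetical type)."""
    source: str


@dataclass
class Resource:
    """Raw bytes of a media resource such as an image (hypothetical type)."""
    payload: bytes


@dataclass
class Program:
    """A document or app: named dependencies paired with startup data."""
    main: str                                   # source of the entry point
    deps: Dict[str, "Entry"] = field(default_factory=dict)
    data: Any = None                            # may be empty


Entry = Union[ClassSource, Resource, Program]


def dependency_names(program: Program) -> set:
    """Collect dependency names, descending into transcluded documents."""
    names = set()
    for name, entry in program.deps.items():
        names.add(name)
        if isinstance(entry, Program):
            names |= dependency_names(entry)
    return names


# A transcluded document with a class dependency of its own.
footer = Program(main="footer source",
                 deps={"FooterStyle": ClassSource("class FooterStyle ...")})

# A slide is itself a document with a resource and a transcluded document.
slide = Program(main="slide source",
                deps={"logo.png": Resource(b"\x89PNG"), "Footer": footer})

# Telescreen as a document whose startup data is its list of slides.
telescreen = Program(main="Telescreen source",
                     deps={"SlideLayout": ClassSource("class SlideLayout ...")},
                     data=[slide])
```

Note that in this sketch the slides live in the data, not in the dependency map, so walking the dependencies of telescreen does not reach the dependencies of its slides. Whether that is the right split is exactly the kind of thing that may change.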
This is actually general enough to describe apps that are not documents, as these would consist of the application class and its dependencies - with empty data or not.
A delicate question we've glossed over is whether the map is flat or not. That is, when we name a dependency which is a program (say, a transcluded document) is it a recursive instance of the same structure, or are we sharing dependencies in the top level map? Obviously, sharing is important both to prevent redundancy and to ensure correct semantics. Yet one must account for namespacing (name conflicts); reusing the recursive structure is also elegant and attractive.
One idea is to identify and remove redundant dependencies when serializing. This is what one would do with object serialization. Assuming we have a consistent IDE Root namespace when saving, we should be fine. If we load documents, we may get problems when a loaded document includes classes that are already in the system. In that case, we must decide on a policy: load subsumes existing or vice versa. The former is consistent with current practice. If the result is undesirable, one can load the correct piece I suppose. Likewise with the latter - one can always load the desired version. Might be good to warn of overwrites and give options.
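The two load policies can be sketched as a merge over a flat namespace. This is Python, not Newspeak, and purely illustrative: the function merge_loaded, the loaded_wins flag and the use of plain strings for class definitions are assumptions of mine, not anything that exists in the IDE. The returned list of conflicting names is what one would use to warn about overwrites.

```python
from typing import Dict, List, Tuple


def merge_loaded(existing: Dict[str, str],
                 loaded: Dict[str, str],
                 loaded_wins: bool = True) -> Tuple[Dict[str, str], List[str]]:
    """Merge loaded definitions into a copy of the existing namespace.

    Returns the merged namespace and the names whose definitions differed,
    so the caller can warn the user or offer a choice.
    """
    merged = dict(existing)
    conflicts = []
    for name, definition in loaded.items():
        if name in existing and existing[name] != definition:
            conflicts.append(name)
            if not loaded_wins:
                continue          # keep the definition already in the system
        merged[name] = definition
    return merged, conflicts


# Hypothetical namespace and loaded document contents.
namespace = {"Shape": "v1", "Point": "v1"}
incoming = {"Shape": "v2", "Point": "v1", "Circle": "v1"}

load_subsumes, overwritten = merge_loaded(namespace, incoming)
existing_subsumes, skipped = merge_loaded(namespace, incoming, loaded_wins=False)
```

With loaded_wins set, Shape ends up at the loaded version; without it, the existing Shape is kept. In both cases Circle is added, and Shape is reported as a conflict.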
Now we can consider how to realize these ideas. Extending the Telescreen scheme of zip files seems the most direct possibility. It can handle media files (unlike our object serializers) and is likely to be efficient and easy to implement. It naturally consists of a dictionary of names, and we use conventions to handle the main entry point and the contents as a subfile.
This should be enough to make Ampleforth a universal app-builder that allows you to construct apps as simple as pure text documents or as complex as Telescreen or more, and to save and load them as standalone docu-apps or in the context of the general editor.
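As a rough sketch of what such a zip layout might look like, here is a round trip using Python's standard zipfile module. The entry names "main" and "data" and the "deps/" prefix are conventions I invented for this example; the existing Telescreen format is not shown here and may well differ.

```python
import io
import zipfile
from typing import Dict, Tuple

# Hypothetical naming conventions inside the archive.
MAIN_ENTRY = "main"
DATA_ENTRY = "data"
DEPS_PREFIX = "deps/"


def save_archive(main_source: str,
                 deps: Dict[str, bytes],
                 data: bytes = b"") -> bytes:
    """Write the entry point, its dependencies and startup data to a zip."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(MAIN_ENTRY, main_source)   # str is stored as UTF-8
        archive.writestr(DATA_ENTRY, data)
        for name, payload in deps.items():
            archive.writestr(DEPS_PREFIX + name, payload)
    return buffer.getvalue()


def load_archive(blob: bytes) -> Tuple[str, Dict[str, bytes], bytes]:
    """Read back the entry point, the dependency map and the startup data."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        main_source = archive.read(MAIN_ENTRY).decode("utf-8")
        data = archive.read(DATA_ENTRY)
        deps = {
            name[len(DEPS_PREFIX):]: archive.read(name)
            for name in archive.namelist()
            if name.startswith(DEPS_PREFIX)
        }
    return main_source, deps, data


# Round trip with made-up contents: a class source and an image.
blob = save_archive("main source",
                    {"Helper": b"class Helper ...", "logo.png": b"\x89PNG"},
                    data=b"slides")
restored = load_archive(blob)
```

Because zip entries are just named byte sequences, media files and class sources sit side by side in the same dictionary of names, and a transcluded document could be stored as a nested archive under its own name if the recursive structure is kept.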