An idea I've been toying with for a while involves two commands that would be added to the metamath (or other) program, EXTRACT and MERGE. Conceptually, these are very sophisticated extensions of WRITE SOURCE and READ respectively.
EXTRACT <label-list> <out-file>
MERGE <inp-file> <inp-file>
EXTRACT would be given a theorem or list of theorems, recursively determine what they depend on, and produce a
set.mm subset with exactly and only the axioms, definitions, and theorems needed. Little details would need to be paid attention to like keeping section headers whenever a theorem from that section is used, keeping comments intact, cleanly breaking out theorems sharing common $e's, keeping relevant htmldefs, and overall maintaining the formatting conventions used in
set.mm. For example,
MM> read
set.mmMM> extract ax-resscn~cju
complex.mmwould create the file "
complex.mm" with exactly and only the
set.mm subset needed for basic complex number theory i.e. for statements ax-resscn through cju. This file will be much smaller that the whole of
set.mm, and work could continue on it independently of the rest of
set.mm. If we also want to make sure that all prop calc theorems are available in
complex.mm, just in case useful ones were missed because they weren't needed, we can do
MM> extract ax-1~trud,ax-resscn~cju
complex.mmMERGE would take two .mm's and reassemble a .mm containing both. The two .mm's could be two extractions, an edited extraction and the original
set.mm, or even two
set.mm's that were independently edited.
The first .mm would be "primary", meaning its labels, comments, and tokens would dominate (take precedence). In the ideal case, theorems would be matched by content, not labels (ideally even tolerating variable renaming), and in that case the primary file's label and variable names would be used. When there is a label conflict (same label for different theorems), the label in the secondary file, say
abc.mm, would be suffixed with the file name, say "_
abc.mm", which can be cleaned up by hand later. Conflicting tokens (a token with a different definition in each file) could be handled similarly.
If two theorems are the same with different proofs, we would use the proof requiring fewer axioms, or if the same axioms are used, the one with the shortest proof.
Sections would be matched up as best possible, based on the section name and section identifying conventions ("=-=-=-" etc.). The final output would be a valid .mm file no matter what, and a warning list would guide the user for what needed to be cleaned up by hand. For example,
MM> erase
MM> merge
complex.mm set.mmwould read and internally merge the two files, and the result can be written out with the usual 'write source' (or even 'extract'). In the above example,
complex.mm would take precedence in case of renamed labels, changed comments, etc., whereas in
MM> merge
set.mm complex.mmthe input file
set.mm would take precedence.
Note that (ideally) 'merge
set.mm set.mm' would be equivalent to 'read
set.mm', and 'extract *
set.mm' would be equivalent to 'write source
set.mm'.
I think these commands could make a significant difference because people can work with smaller
set.mm pieces or even the whole
set.mm independently, without much concern of clashes with what others are doing. No changes to the Metamath language are required.
I can envision a time in the future when mathematicians can share theorems by extracting and merging what they are interested in rather than depending on a huge master
set.mm, most of which would not be of interest to them.
Norm