The `id` vs `num` issue goes deeper and touches the XML indexer as well. The core indexing machinery was designed to handle this, but the two or three layers of abstraction I wrote on top of it don't expose a way to parameterize that machinery accordingly. I work around this in `mzxml` by deriving two classes, but proper handling could be moved into the main classes so the problem is dealt with more cleanly in the future.
Since the attribute used for looking up elements may differ from tag to tag, we'd need to change `_find_by_id_no_reset` so that the attribute to match on can be passed as an argument, with each class providing its own default.
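As a rough illustration of that idea, here is a minimal sketch using only the standard library. The class names, the `find_by_id` method, the `id_key` argument and the `_default_id_attr` class attribute are all hypothetical and are not taken from the actual code base; the real `_find_by_id_no_reset` works against the file index, not an in-memory tree.

```python
import xml.etree.ElementTree as ET


class IndexedReaderSketch:
    """Hypothetical stand-in for an indexed XML reader (not the real class)."""

    # Assumed class-level default for the lookup attribute;
    # subclasses override it when their format uses something else.
    _default_id_attr = "id"

    def __init__(self, source):
        self._root = ET.fromstring(source)

    def find_by_id(self, elem_id, id_key=None):
        # Use the caller's attribute if given, else the class default.
        key = id_key if id_key is not None else self._default_id_attr
        for elem in self._root.iter():
            if elem.get(key) == elem_id:
                return elem
        raise KeyError(elem_id)


class MzXMLReaderSketch(IndexedReaderSketch):
    # mzXML scans are identified by `num` instead of `id`.
    _default_id_attr = "num"


# Tiny made-up documents, just enough to exercise the lookup.
MZML_LIKE = '<run><spectrum id="scan=1"/><spectrum id="scan=2"/></run>'
MZXML_LIKE = '<msRun><scan num="1"/><scan num="2"/></msRun>'

spectrum = IndexedReaderSketch(MZML_LIKE).find_by_id("scan=2")
scan = MzXMLReaderSketch(MZXML_LIKE).find_by_id("2")
```

With a default per class plus an optional argument, the subclass only has to set one attribute, and callers can still override the attribute for tags that don't follow the class-wide convention.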
There's also no support for automatic schema deduction. The version extractor doesn't work with mzXML, and the schema extractor retrieves nothing but the schema for the offset index, because the actually useful components live in xsd files pulled in via XInclude, which lxml doesn't expand by default for security reasons. Since mzXML is a dead format and will never be updated again, I'm not too concerned about this part.