Hi Charlie,
I have not yet documented or explained the technical details because not everything is decided, and things keep evolving as I go further with these developments. It's not a very rigorous methodology, but it's the way I usually work: I first focus on a small issue, then code, test, experiment, go back to the first step and then extend the scope of the work. I would call it an 'experimental and incremental' approach, which is quite easy to conduct using python. When these experiments are over and have demonstrated the feasibility of the project, a global refactoring/optimization will certainly be necessary.
So far, here is the current state of the work:
1. I've been implementing the ISO 10303-11:1994 standard (the EXPRESS language reference) in python while trying to stay as close as possible to the concepts defined in this document. For instance, something like 'ARRAY[1:8] OF UNIQUE REAL' is implemented as 'ARRAY(1,8,REAL,UNIQUE=True)' in python. ARRAY is a class: this is not very pythonic, since the python recommendation is rather to use the CamelCase convention to name classes, but the priority is IMO to have the best matching between EXPRESS and the python implementation. The rules and restrictions of the ARRAY data type, as described in section 8.2.1 of the standard, are unit tested in the SCL_unittest.py script. Some mappings are quite obvious (the LIST data type and the python list type, entity and class), whereas others are not (python has no built-in type similar to BAG). This implementation of ISO 10303-11:1994 is what I called the SCL python package.
2. fedex_python takes an EXPRESS file as a parameter, and generates a python module that uses the SCL package. This translation of EXPRESS to python aims at being semantically conservative: I'd like to have all the semantics of the EXPRESS schema made available from the python module, including rules/functions/INVERSE attributes etc.
3. Importing the generated python module into a python session enables what I called in the wiki page a kind of 'EXPRESS interpreter'. It's possible to create instances, attributes etc. I have so far managed to do this with some unitary schemas, but the goal is of course to deal with standard schemas (ap203, ifc etc.).
4. I've been fixing the Part21.py module, in order to be able to import p21 files with python.
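To make the mapping in item 1 more concrete, here is a minimal sketch of what an 'ARRAY(1,8,REAL,UNIQUE=True)' class could look like. It is only an illustration and not the actual SCL code: the attribute names, the mapping of REAL to the python float type and the choice of exceptions are all assumptions.

```python
# Hypothetical sketch, NOT the actual SCL implementation.
REAL = float  # assumption: EXPRESS REAL mapped onto the python float type


class ARRAY(object):
    """Fixed-size aggregate indexed from bound_1 to bound_2 (inclusive)."""

    def __init__(self, bound_1, bound_2, base_type, UNIQUE=False):
        if bound_1 > bound_2:
            raise ValueError("bound_1 must not exceed bound_2")
        self._lo = bound_1
        self._hi = bound_2
        self._base_type = base_type
        self._unique = UNIQUE
        # One slot per index; slots that are not set yet hold None.
        self._items = [None] * (bound_2 - bound_1 + 1)

    def _position(self, index):
        # Translate an EXPRESS-style index into a 0-based list position.
        if not (self._lo <= index <= self._hi):
            raise IndexError("index %d outside [%d:%d]"
                             % (index, self._lo, self._hi))
        return index - self._lo

    def __getitem__(self, index):
        return self._items[self._position(index)]

    def __setitem__(self, index, value):
        pos = self._position(index)
        if not isinstance(value, self._base_type):
            raise TypeError("expected %s" % self._base_type.__name__)
        if self._unique:
            # Compare against every other slot, ignoring the one overwritten.
            others = self._items[:pos] + self._items[pos + 1:]
            if value in others:
                raise ValueError("UNIQUE constraint violated")
        self._items[pos] = value


# 'ARRAY[1:8] OF UNIQUE REAL' in EXPRESS
a = ARRAY(1, 8, REAL, UNIQUE=True)
a[1] = 1.5
```

With this sketch, 'a[2] = 1.5' is rejected because of the UNIQUE flag, 'a[9] = 3.0' because the index is out of bounds, and 'a[3] = "x"' because the value is not a REAL.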
Beyond the technical details of this implementation, the question may be: what is the interest of such a project? What is the added value compared to the C++ library generated by fedex_plus? In my opinion, there are two major benefits from using python:
- python is portable. The same python module generated by fedex_python can be imported on any machine that provides python>2.6.
- python is known to be suitable for integrating many external libraries in an easy, consistent and portable way. This feature is of special interest to me; let me explain that further.
My main concern is actually: how to deal with *huge* sets of entity instances? By 'deal with', I mean store and transfer. Indeed, serializing entity instances into a (set of) file(s) (whether it is ascii/plain text/Part21 or xml/Part28) is IMO not the best way to store product data: parsing those big files (hundreds of megabytes or more) consumes too many resources (computing time and/or memory footprint). Nor is it the best way to exchange product information: transferring a 1 GB Part21 file requires that it is first written to disk by the exporter, and then parsed by the importer. I'd like to experiment with:
- the use of a distributed db for the storage of EXPRESS data. I already tested CouchDB; the idea would be to develop some kind of db_connector that would allow plugging any db controller into the SCL python package. python is good at that kind of abstraction;
- the use of a network based product data transfer. I'd like to be able to make two or more distant dbs (whatever they are) exchange some models with the help of remote services. python provides many excellent web frameworks for such services (whether they are SOAP, REST or anything else). This would solve what is IMO an inconsistency of the ISO-10303 standard: whereas STEP is intended to be independent of any particular system, SDAI is strongly dependent on a particular system. Loose coupling of the service definition and its implementation would make the standard more consistent and up-to-date with current technologies (p21 and p28 could be considered outdated at some point).
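As a rough illustration of the db_connector idea in the first point, the abstraction could be sketched as below. Nothing like this exists in the SCL package yet: DBConnector, InMemoryConnector and Repository are hypothetical names, and the in-memory backend only stands in for a real document store such as CouchDB.

```python
import abc
import json


class DBConnector(abc.ABC):
    """Hypothetical interface that every storage backend would implement."""

    @abc.abstractmethod
    def save(self, instance_id, data):
        """Store the dict `data` under `instance_id`."""

    @abc.abstractmethod
    def load(self, instance_id):
        """Return the dict previously stored under `instance_id`."""


class InMemoryConnector(DBConnector):
    """Stand-in backend keeping JSON documents in a python dict."""

    def __init__(self):
        self._docs = {}

    def save(self, instance_id, data):
        # Serialize to JSON, as a document database would expect.
        self._docs[instance_id] = json.dumps(data)

    def load(self, instance_id):
        return json.loads(self._docs[instance_id])


class Repository(object):
    """Stores entity instances through whichever connector is plugged in."""

    def __init__(self, connector):
        self._connector = connector

    def store(self, instance_id, entity_name, attributes):
        self._connector.save(
            instance_id, {"entity": entity_name, "attributes": attributes})

    def fetch(self, instance_id):
        return self._connector.load(instance_id)


repo = Repository(InMemoryConnector())
repo.store("#1", "cartesian_point", {"coordinates": [0.0, 1.0, 2.0]})
point = repo.fetch("#1")
```

The calling code only knows about Repository, so swapping the in-memory backend for another one would mean writing a new DBConnector subclass and nothing else.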
There are certainly ongoing projects I'm not aware of regarding these two issues (what is the HDF5 status?); please let me know if you have any information.
I hope this makes things a bit clearer. I will take the time to write a more structured document once the progress and results are significant.
Regards,
Thomas