XMI file parsing in Python.


Gelli Ravikumar

Jan 5, 2014, 1:42:25 AM
to wncc...@googlegroups.com
Does anyone know of a Python program for parsing an XMI (XML Metadata Interchange) file? There are many Python libraries for parsing XML, and yes, we could build an XMI parser on top of them. But does anyone know of a Python library that lets me import an XMI file and access all the objects inside it?

--
with Regards
----------------------------------------------------
Gelli Ravikumar
Research Scholar (Ph.D),
Field Computations Lab,
Dept. of Electrical Engineering,
IIT Bombay, Powai, Mumbai 400076
Ph: 022-2576 4424, 089 765 983 96

Gandhian Young Technological Innovation Award - 2013
POSOCO Power System Award: PPSA - 2013
IITB PhD Executive Member - 2013

Dilawar

Jan 5, 2014, 2:16:11 AM
to wncc...@googlegroups.com
Actually there are many, from very simple (xml) to very complex (pyxb). A decent one is lxml, which is well supported. For almost all applications (where namespaces are not included in the schema and XML) it is possibly the best bet. It is a binding to the C libraries libxml2 and libxslt, which makes it pretty fast (check out http://lxml.de/tutorial.html).

If you have a large XML file, serializing it is always a good idea (i.e. mapping the XML to Python objects). For small XML files (a few thousand lines), XPath queries are good enough. To serialize, you can use generateDS (https://pypi.python.org/pypi/generateDS/), a script that generates Python classes from a given schema. I use it often and have had no problems so far. lxml.etree also has similar capabilities.
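To make the XPath suggestion concrete, here is a minimal sketch of a namespace-aware query. It uses the standard library's xml.etree.ElementTree rather than lxml so that it runs without extra installs; lxml.etree accepts the same findall call. The sample document, its namespace URI and its element names are made up for illustration and are not taken from the file discussed in this thread.

```python
import xml.etree.ElementTree as ET

# Made-up, XMI-like document. The namespace URI and element names are
# illustrative assumptions, not the poster's actual model.
SAMPLE = """<XMI xmi.version="1.1" xmlns:UML="omg.org/UML1.3">
  <XMI.content>
    <UML:Model name="Demo">
      <UML:Class name="Bus" xmi.id="c1"/>
      <UML:Class name="Line" xmi.id="c2"/>
    </UML:Model>
  </XMI.content>
</XMI>"""

# Prefix-to-URI map used to resolve 'UML:' inside the path expression.
NS = {"UML": "omg.org/UML1.3"}


def class_names(xml_text):
    """Return the name attribute of every UML:Class element, in document order."""
    root = ET.fromstring(xml_text)
    return [el.get("name") for el in root.findall(".//UML:Class", NS)]


print(class_names(SAMPLE))
```

Passing the prefix map explicitly keeps the query readable; without it, each step in the path would have to spell out the full `{namespace-uri}tag` form.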

If you are looking for some code to get started with lxml, here is one example (https://github.com/dilawar/moose/blob/buildQ/python/moose/multiscale/parser/parser.py).

Hope it helps.

- Dilawar

Dilawar Singh

Jan 6, 2014, 10:06:45 AM
to wncc...@googlegroups.com
https://pypi.python.org/pypi/xmiparser

--
Dilawar
Dept. of EE, IIT Bombay
Bhalla Lab, NCBS Bangalore

Gelli Ravikumar

Jan 6, 2014, 5:00:38 AM
to wncc...@googlegroups.com
Thanks for your effort. I tried this XMI parser earlier, but somehow it does not work for my .xmi files.
It works quite well for the .xmi file provided as a demo, which is attached here as sample2.xmi.
I have also shared my own file, model_1_1.xmi, through Google Drive because of its size. When I parse this file with the parser, I get the following error:
=========
Traceback (most recent call last):
  File "test.py", line 13, in <module>
    model = XMIparser.parse(modelFile)
  File "/home/gelli/gelliMain/gelliFullBackup/pythonXMI/parsers/XMIparser.py", line 2875, in parse
    root = buildHierarchy(doc, packages, profile_docs=profile_docs)
  File "/home/gelli/gelliMain/gelliFullBackup/pythonXMI/parsers/XMIparser.py", line 2761, in buildHierarchy
    buildDataTypes(doc)
  File "/home/gelli/gelliMain/gelliFullBackup/pythonXMI/parsers/XMIparser.py", line 2724, in buildDataTypes
    XMI.collectTagDefinitions(doc, prefix=prefix)
TypeError: collectTagDefinitions() got an unexpected keyword argument 'prefix'
=========

Can you resolve the above problem? Let me know if you need any other information.


sample2.xmi

Dilawar Singh

Jan 6, 2014, 8:52:16 AM
to wncc_iitb
Can you share your test.py file also?
Dilawar
NCBS Bangalore

Dilawar Singh

Jan 6, 2014, 9:48:48 AM
to wncc_iitb
Download the package from PyPI and change line 406 in xmiparser/xmiparse.py from

    def collectTagDefinitions(self, el):

to

    def collectTagDefinitions(self, el, prefix=''):

Be warned that your problems are not over yet. There are more problems with this package.
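Why the one-line change removes the TypeError can be shown with a small stand-alone sketch. The two classes below are hypothetical stand-ins, not code from the xmiparser package; only the method name and the `prefix` keyword come from the traceback above.

```python
class OldXMI(object):
    # Stand-in for the unpatched method: it accepts no 'prefix' argument.
    def collectTagDefinitions(self, el):
        return "collected %s" % el


class PatchedXMI(object):
    # Same method with the extra keyword given a default, as in the edit above.
    def collectTagDefinitions(self, el, prefix=''):
        return "collected %s%s" % (prefix, el)


def call_with_prefix(xmi):
    """Mimic the caller in the traceback, which always passes prefix=..."""
    try:
        return xmi.collectTagDefinitions("doc", prefix="UML:")
    except TypeError as exc:
        return "TypeError: %s" % exc


print(call_with_prefix(OldXMI()))      # fails: unexpected keyword argument
print(call_with_prefix(PatchedXMI()))  # succeeds
```

Giving `prefix` a default value keeps any existing callers that pass only `el` working as before.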

--
Dilawar
NCBS Bangalore

Saket Choudhary

Jan 6, 2014, 11:03:50 AM
to wncc...@googlegroups.com
Ravi:

Can you give BS4 [1] a try? What exactly are you trying to extract from the XML files?


[1] http://www.crummy.com/software/BeautifulSoup/bs4/doc/

Saket Choudhary

Jan 6, 2014, 11:49:55 AM
to wncc...@googlegroups.com
$ ipython

In [1]: from xml.etree import ElementTree as ET

In [2]: parser = ET.iterparse("model_1_1.xmi")

In [3]: for event, element in parser:
   ...:     print(element.tag)
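For a large file the raw list of tags gets long quickly, so a tag count can be a more useful first look. The following is a minimal sketch along the same lines as the session above; it reads a made-up in-memory document instead of model_1_1.xmi, and the namespace URI and element names in it are assumptions for illustration only.

```python
import io
from collections import Counter
from xml.etree import ElementTree as ET

# Made-up stand-in for a real .xmi file, held in memory so the sketch is
# self-contained. Replace the BytesIO object with a file name to use it.
SAMPLE = b"""<XMI xmi.version="1.1" xmlns:UML="omg.org/UML1.3">
  <XMI.content>
    <UML:Model name="Demo">
      <UML:Class name="Bus"/>
      <UML:Class name="Line"/>
    </UML:Model>
  </XMI.content>
</XMI>"""


def count_tags(source):
    """Count elements by local tag name while streaming through the document."""
    counts = Counter()
    for event, element in ET.iterparse(source, events=("end",)):
        # Namespaced tags arrive as '{namespace-uri}local'; keep the local part.
        local = element.tag.rsplit("}", 1)[-1]
        counts[local] += 1
        # Drop the element's children and attributes once it has been counted.
        element.clear()
    return counts


counts = count_tags(io.BytesIO(SAMPLE))
print(counts.most_common())
```

Calling `element.clear()` after each "end" event is what keeps memory use low on big models; leave it out if the elements are needed later.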




Gelli Ravikumar

Jan 6, 2014, 11:29:54 AM
to wncc...@googlegroups.com
Hi Saket,

I have gone through the BeautifulSoup package; yes, it is an XML parser. What I need is to parse an XMI (XML Metadata Interchange) file, which represents a UML diagram in an XML dialect. Although XMI is an XML dialect, it is not as straightforward to parse as an ordinary XML file, because an XMI file has a complex tree structure. I'll share the .xmi file I am working with through Google Drive, but as far as I know BS4 is not well suited to this. Maybe you can tell me more about it. Please try the example file I am sharing with you.


Saket Choudhary

Jan 6, 2014, 4:31:16 PM
to wncc...@googlegroups.com
Ravi:

Sorry, I mixed up XMI and XML. BS4 should still work, however.
Could you try out the snippet I pasted earlier?

You might want to play around with the 'parser' object itself: parser.next, the elements' .getchildren, root etc.
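A minimal sketch of that kind of exploration, starting from the root element, is given below. It iterates over an element directly to get its children, because Element.getchildren() was removed in Python 3.9. The sample document is made up; the `xmi.id` attribute follows XMI 1.x conventions, but the rest of the structure is an assumption and a real model will look different.

```python
from xml.etree import ElementTree as ET

# Made-up XMI-like document used only to illustrate the calls.
SAMPLE = """<XMI xmi.version="1.1" xmlns:UML="omg.org/UML1.3">
  <XMI.header/>
  <XMI.content>
    <UML:Class name="Bus" xmi.id="c1"/>
    <UML:Class name="Line" xmi.id="c2"/>
  </XMI.content>
</XMI>"""


def child_tags(element):
    """Return the tags of the direct children of an element."""
    return [child.tag for child in element]


def index_by_id(root):
    """Map each xmi.id value to its element, for quick lookups."""
    return {el.get("xmi.id"): el for el in root.iter() if el.get("xmi.id")}


root = ET.fromstring(SAMPLE)  # for a file on disk: ET.parse(path).getroot()
print(root.tag, root.get("xmi.version"))
print(child_tags(root))

by_id = index_by_id(root)
print(by_id["c2"].get("name"))
```

An id index like this is handy because XMI elements usually refer to each other by id rather than by nesting.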

Saket

Dilawar Singh

Jan 7, 2014, 2:36:00 AM
to wncc...@googlegroups.com
Unfortunately I can't spend much time on this; here are some notes if you want
to salvage it.

- The XMI file you are using is version 1.1. For this version, class
XMI_V_1.1 is instantiated, which does not have its own
`collectTagDefinitions` function. It uses the function defined in class
XMI_1.0, which is a dummy. I'd suggest copying the function from XMI_1.2 into
class XMI_1.1 and seeing what happens. I also tried changing the version of
the XMI document from 1.1 to 1.2. That produces a runtime error, which you can
easily reproduce by changing version="1.1" to version="1.2" in your XMI file.
The rest will take time to debug.
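The inheritance issue described in this note can be sketched as follows. The class names, the method bodies and the exact hierarchy are hypothetical stand-ins written for Python 3, not the package's real code; they only mirror the situation where the 1.1 class silently inherits a do-nothing method from the 1.0 class.

```python
class XmiV10(object):
    def collectTagDefinitions(self, el, prefix=''):
        # Placeholder that does no real work, like the dummy described above.
        return []


class XmiV11(XmiV10):
    # Defines no collectTagDefinitions of its own, so lookups fall back
    # to the placeholder in XmiV10.
    pass


class XmiV12(XmiV11):
    def collectTagDefinitions(self, el, prefix=''):
        # Stand-in for a real implementation.
        return [prefix + name for name in el]


class XmiV11Patched(XmiV11):
    # The suggested experiment: reuse the 1.2 function in the 1.1 class.
    collectTagDefinitions = XmiV12.collectTagDefinitions


tags = ["stereotype", "documentation"]
print(XmiV11().collectTagDefinitions(tags, prefix="UML:"))         # placeholder result
print(XmiV11Patched().collectTagDefinitions(tags, prefix="UML:"))  # 1.2 behaviour
```

Whether the 1.2 function actually works on a 1.1 document depends on how much the two formats differ, which is exactly what the experiment is meant to find out.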

As far as parsing is concerned, a simple module is available at
http://pyxmi.sourceforge.net/tutorial.html . It parses your file without any
glitch. The function you might be interested in is `classSociety` (Example 4).
Attached is a file with some trivial code to get you started. You can easily
fetch the `DOM` for this XMI.

The last two lines in this file start the IPython shell; you can play with
the classes object to see what can be done with it.

--
Dilawar
Dept. of EE, IIT Bombay
Bhalla Lab, NCBS Bangalore

test.py