Hi, I am trying to parse a brand-new LAF resource (Hebrew Text Database, a CLARIN-NL curation-demonstrator project) with graf-python, as a kind of validation.
Now graf-python gives me a stack trace. In io.py I have inserted some print statements to show which part of the annotation file causes the error.
Here is the head of the annotation file:
<?xml version="1.0" encoding="UTF-8"?>
<graphHeader>
<labelsDecl>
<labelUsage label="clause" occurs="88387" />
<labelUsage label="clause_atom" occurs="90061" />
<labelUsage label="mother" occurs="75304" />
<labelUsage label="parents" occurs="178466" />
<labelUsage label="ft" occurs="178448" />
</labelsDecl>
<annotationSpaces>
<annotationSpace as.id="shebanq" default="true"/> </annotationSpaces>
</graphHeader>
<node xml:id="nl28737"><link targets="r_1 r_2 r_3 r_4 r_5 r_6 r_7 r_8 r_9 r_10 r_11"/></node>
<a xml:id="al28737" label="clause" ref="nl28737"/>
<a xml:id="alf1" label="ft" ref="nl28737"><fs>
<f name="clause_constituent_relation" value="none"/>
<f name="clause_type" value="xQtl"/>
<f name="domain" value="Unknown"/>
<f name="embedding_domain" value="none"/>
<f name="levels_of_embedding" value="0"/>
<f name="number_within_sentence" value="1"/>
<f name="text_type" value="?"/>
</fs></a>
<edge xml:id="el1" from="nl28737" to="nl88917"/>
<a xml:id="ale1" label="parents" ref="el1"/>
And here is the stack trace:
dirk:~/Dropbox/DANS/current/demos/apps/shebanq/graftest > python graftest.py
<graf.io.GraphParser object at 0x1011169d0>
PARSING bhs3_lingo.c.xml
ELEMENT START graph
ELEMENT START graphHeader
ELEMENT START labelsDecl
ELEMENT START labelUsage
ELEMENT START labelUsage
ELEMENT START labelUsage
ELEMENT START labelUsage
ELEMENT START labelUsage
ELEMENT START annotationSpaces
ELEMENT START annotationSpace
ELEMENT START node
ELEMENT START link
ELEMENT START a
Traceback (most recent call last):
  File "graftest.py", line 19, in <module>
    graph = gparser.parse(datafile)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/graf_python-0.3.0-py2.7.egg/graf/io.py", line 883, in parse
    do_parse(stream, graph)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/graf_python-0.3.0-py2.7.egg/graf/io.py", line 841, in do_parse
    parser.parse(filename)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/sax/expatreader.py", line 210, in feed
    self._parser.Parse(data, isFinal)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/sax/expatreader.py", line 304, in start_element
    self._cont_handler.startElement(name, AttributesImpl(attrs))
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/graf_python-0.3.0-py2.7.egg/graf/io.py", line 625, in startElement
    fn(attrs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/graf_python-0.3.0-py2.7.egg/graf/io.py", line 747, in annot_start
    aspace = self._aspace_stack[-1]
IndexError: list index out of range
dirk:~/Dropbox/DANS/current/demos/apps/shebanq/graftest >
It seems that the stack of annotation spaces is still empty when the first <a> element is reached. Yet I have declared a default annotation space, so it should be on the stack, shouldn't it?
Is this a bug or am I doing something wrong?
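To narrow this down, a check along the following lines might help. It is only a sketch built on the standard-library xml.sax module, not graf-python's own code: the class name SpaceCheck and the inline sample are made up for illustration, the sample is a cut-down version of the file head wrapped in a <graph> element, and I am assuming that an annotation may name its annotation space through an "as" attribute. It lists the declared annotation spaces, the declared default, and the <a> elements that do not name a space themselves:

```python
import xml.sax

# Cut-down, hypothetical sample modelled on the head of the annotation file.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<graph>
<graphHeader>
<annotationSpaces>
<annotationSpace as.id="shebanq" default="true"/>
</annotationSpaces>
</graphHeader>
<node xml:id="nl28737"/>
<a xml:id="al28737" label="clause" ref="nl28737"/>
</graph>
"""


class SpaceCheck(xml.sax.ContentHandler):
    """Collects declared annotation spaces and <a> elements without one."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.declared = []    # every as.id seen in the header
        self.default = None   # the as.id flagged default="true", if any
        self.without_as = []  # xml:id of each <a> lacking an "as" attribute

    def startElement(self, name, attrs):
        if name == "annotationSpace":
            self.declared.append(attrs.get("as.id"))
            if attrs.get("default") == "true":
                self.default = attrs.get("as.id")
        elif name == "a" and attrs.get("as") is None:
            # Assumption: "as" is the attribute that names the space.
            self.without_as.append(attrs.get("xml:id"))


def check(xml_bytes):
    handler = SpaceCheck()
    xml.sax.parseString(xml_bytes, handler)
    return handler


result = check(SAMPLE)
print("declared=%s default=%s without_as=%s"
      % (result.declared, result.default, result.without_as))
```

If the real file shows a declared default together with <a> elements that carry no space of their own, then the parser would have to fall back on the default for those annotations, and an empty stack at that point would suggest the default is never pushed.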
Comments would be appreciated.
Cheers, Dirk