graf-python: annotation space

Dirk Roorda

Sep 12, 2013, 3:30:29 AM
to poio-d...@googlegroups.com
Hi, I am trying to parse a brand-new LAF resource (Hebrew Text Database, a CLARIN-NL curation-demonstrator project) with graf-python, as a kind of validation.
Now graf-python gives me a stack trace. In io.py I have inserted some print statements to show which part of the annotation file causes the error.
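
For anyone who wants the same kind of trace without editing io.py, here is a minimal sketch using only the standard library's xml.sax module (Python 3 syntax). The names TracingHandler and trace_elements are hypothetical and are not part of graf-python; the sketch only reports which elements the SAX parser opens, in document order.

```python
import xml.sax


class TracingHandler(xml.sax.ContentHandler):
    """Hypothetical helper: prints and records every element as it opens."""

    def __init__(self):
        super().__init__()
        self.seen = []  # element names in document order

    def startElement(self, name, attrs):
        self.seen.append(name)
        print("ELEMENT START", name)


def trace_elements(xml_bytes):
    """Parse xml_bytes with SAX and return the opened element names."""
    handler = TracingHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.seen
```

The last name printed before an exception is the element whose handler failed, which is how the trace further down points at the first <a>.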

Here is the head of the annotation file:

<?xml version="1.0" encoding="UTF-8"?>
    <graphHeader>
        <labelsDecl>
            <labelUsage        label="clause" occurs="88387"       />
            <labelUsage        label="clause_atom" occurs="90061"       />
            <labelUsage        label="mother" occurs="75304"       />
            <labelUsage        label="parents" occurs="178466"      />
            <labelUsage        label="ft" occurs="178448"      />
        </labelsDecl>
        <annotationSpaces>
            <annotationSpace   as.id="shebanq" default="true"/>
        </annotationSpaces>
    </graphHeader>
<node xml:id="nl28737"><link targets="r_1 r_2 r_3 r_4 r_5 r_6 r_7 r_8 r_9 r_10 r_11"/></node>
<a xml:id="al28737" label="clause" ref="nl28737"/>
<a xml:id="alf1" label="ft" ref="nl28737"><fs>
<f name="clause_constituent_relation" value="none"/>
<f name="clause_type" value="xQtl"/>
<f name="domain" value="Unknown"/>
<f name="embedding_domain" value="none"/>
<f name="levels_of_embedding" value="0"/>
<f name="number_within_sentence" value="1"/>
<f name="text_type" value="?"/>
</fs></a>
<edge xml:id="el1" from="nl28737" to="nl88917"/>
<a xml:id="ale1" label="parents" ref="el1"/>

And here is the stack trace:

dirk:~/Dropbox/DANS/current/demos/apps/shebanq/graftest > python graftest.py 
<graf.io.GraphParser object at 0x1011169d0>
PARSING bhs3_lingo.c.xml
ELEMENT START graph
ELEMENT START graphHeader
ELEMENT START labelsDecl
ELEMENT START labelUsage
ELEMENT START labelUsage
ELEMENT START labelUsage
ELEMENT START labelUsage
ELEMENT START labelUsage
ELEMENT START annotationSpaces
ELEMENT START annotationSpace
ELEMENT START node
ELEMENT START link
ELEMENT START a
Traceback (most recent call last):
  File "graftest.py", line 19, in <module>
    graph = gparser.parse(datafile)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/graf_python-0.3.0-py2.7.egg/graf/io.py", line 883, in parse
    do_parse(stream, graph)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/graf_python-0.3.0-py2.7.egg/graf/io.py", line 841, in do_parse
    parser.parse(filename)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/sax/expatreader.py", line 210, in feed
    self._parser.Parse(data, isFinal)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/sax/expatreader.py", line 304, in start_element
    self._cont_handler.startElement(name, AttributesImpl(attrs))
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/graf_python-0.3.0-py2.7.egg/graf/io.py", line 625, in startElement
    fn(attrs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/graf_python-0.3.0-py2.7.egg/graf/io.py", line 747, in annot_start
    aspace = self._aspace_stack[-1]
IndexError: list index out of range
dirk:~/Dropbox/DANS/current/demos/apps/shebanq/graftest > 

It seems that the stack of annotation spaces is still empty. Yet I have declared a default annotation space, so it should be on the stack, shouldn't it?
Is this a bug, or am I doing something wrong?

Comments would be appreciated.
Cheers, Dirk

pbouda

Sep 12, 2013, 5:42:11 AM
to poio-d...@googlegroups.com
Hi Dirk,

Can you send or attach the complete XML file? Then I can run a test here on my computer. At first sight everything looks OK in your example. Either attach the file here or send me a mail:

http://www.cidles.eu/about/team/peter-bouda/

Best,
Peter

Dirk Roorda

Sep 12, 2013, 6:17:03 AM
to poio-d...@googlegroups.com
Hi Peter,
here are links to the primary data header file and a truncated version of the annotation file (the full one is 130 MB and the exception occurs on the very first <a>).

pbouda

Sep 12, 2013, 7:02:22 AM
to poio-d...@googlegroups.com
Hi Dirk,

That was a bug in graf-python: default annotation spaces were not supported until now. Can you check with the current master on GitHub? I just committed a new version with support for defaults.
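
The actual change lives in the graf-python repository. As an illustration only, and not the code that was committed, here is a sketch of how a SAX handler could honour a default annotation space. It assumes that an <a> element may name its space explicitly in an "as" attribute and that a header entry marked default="true" applies otherwise; AnnotationSpaceHandler and assign_annotation_spaces are hypothetical names.

```python
import xml.sax


class AnnotationSpaceHandler(xml.sax.ContentHandler):
    """Hypothetical handler: resolves the annotation space of each <a>."""

    def __init__(self):
        super().__init__()
        self.default_aspace = None  # from <annotationSpace default="true">
        self.assigned = {}          # annotation xml:id -> annotation space id

    def startElement(self, name, attrs):
        if name == "annotationSpace":
            # Remember the space that the header declares as the default.
            if attrs.get("default") == "true":
                self.default_aspace = attrs.get("as.id")
        elif name == "a":
            # Assumption: an explicit "as" attribute wins; otherwise
            # fall back to the declared default instead of failing.
            aspace = attrs.get("as", self.default_aspace)
            if aspace is None:
                raise ValueError("annotation has no annotation space "
                                 "and no default was declared")
            self.assigned[attrs.get("xml:id")] = aspace


def assign_annotation_spaces(xml_bytes):
    """Return a dict mapping each annotation id to its annotation space."""
    handler = AnnotationSpaceHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.assigned
```

With a header like the one in the first message, every <a> that lacks an explicit space would resolve to "shebanq" rather than hitting an empty stack.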

Best,
Peter

Dirk Roorda

Sep 13, 2013, 1:32:58 AM
to poio-d...@googlegroups.com
Indeed. Solved.
Thanks.