problem with writing cxtm files

Christian Wittern

Nov 2, 2010, 4:12:00 AM
to ma...@googlegroups.com
Hi Lars, hi everybody,

Still kicking the tires and trying to get a grip on topicmapping with
mappa:-) Today I tried to write a map I generated to a file.
I first wrote to an XTM file, which worked (after I fixed a few bugs in my
thinking and program). However, when I try to write to a CXTM file, I keep
getting errors which I can't really debug:

2.6.egg/mappa/writer/cxtm/cxtm10.py", line 111, in write
    write_topic(topic)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/Mappa-0.1.6-py2.6.egg/mappa/writer/cxtm/cxtm10.py", line 169, in _write_topic
    write_name(name, pos)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/Mappa-0.1.6-py2.6.egg/mappa/writer/cxtm/cxtm10.py", line 221, in _write_name
    self._write_type(name)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/Mappa-0.1.6-py2.6.egg/mappa/writer/cxtm/cxtm10.py", line 310, in _write_type
    writer.emptyElement('type', self._topic_ref(typed.type))
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/Mappa-0.1.6-py2.6.egg/mappa/writer/cxtm/cxtm10.py", line 387, in _topic_ref
    return {'topicref': self._index(topic)}
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/Mappa-0.1.6-py2.6.egg/mappa/writer/cxtm/cxtm10.py", line 131, in _index
    return self._tmc2id[construct]
KeyError: <mappa.backend.mem.model.Topic object at 0x1020dc450>


To me this looks like something falls over while writing a topic name. When
I look at the generated file, it is in the middle of a topic like this:

<topic number="512">
  <subjectIdentifiers>
    <locator>http://psi.iba-net.org/exp/hnt_0001</locator>
  </subjectIdentifiers>
  <itemIdentifiers>
    <locator>#hnt_0001</locator>
  </itemIdentifiers>
  <name number="1">
    <value>(Chang a han jing)</value>

This same topic is written without problems to the XTM file:

<topic id="hnt_0001">
  <itemIdentity href="ontology.ltm#hnt_0001"/>
  <subjectIdentifier href="http://psi.iba-net.org/exp/hnt_0001"/>
  <name>
    <scope>
      <topicRef href="#ja-rom"/>
    </scope>
    <value>JŌAGON GYŌ</value>
  </name>
  <name>
    <scope>
      <topicRef href="#zh"/>
      <topicRef href="#hobogirin"/>
    </scope>
    <value>長阿含經</value>
  </name>
  <name>
    <scope>
      <topicRef href="#zh-Latn-x-pinyin"/>
    </scope>
    <value>(Chang a han jing)</value>
  </name>
  <name>
    <scope>
      <topicRef href="#sa"/>
    </scope>
    <value>Dīrghāgama</value>
  </name>
</topic>

Now I would like to know primarily how I can debug this, so that I know if
it is a problem with the way I generate the topic map or if there is
something wrong with the writer. If you could give me some hints how to
deal with this, that would be great.

All the best,

Christian

--
Christian Wittern, Kyoto

Lars Heuer

Nov 2, 2010, 5:29:58 AM
to Christian Wittern
Hi Christian,

> Still kicking the tires and trying to get a grip on topicmapping with
> mappa:-)

:)

[...]


> line 169, in _write_topic
> write_name(name, pos)

[...]
> self._write_type(name)
[...]


> line 131, in _index
> return self._tmc2id[construct]
> KeyError: <mappa.backend.mem.model.Topic object at 0x1020dc450>

The error says that the topic cannot be found in the index. It could
be a bug in the CXTM writer or a bug in Mappa.

The CXTM writer generates an internal index which maps all
topics/associations to a number and self._tmc2id[construct] should be
a safe operation since all topics and associations should be part of
that dict.
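
To make the failure mode concrete, here is a minimal sketch of such a construct-to-number index. It is not Mappa's code: the Topic class, build_index and index_of are hypothetical stand-ins, and it is written for Python 3 rather than the Python 2.6 in the traceback. It shows how a topic that is referenced (for example as a name type) but was never numbered produces the bare KeyError, and how a wrapper could report it more helpfully.

```python
class Topic:
    """Stand-in for a topic; NOT Mappa's model class."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "Topic(%r)" % self.name


def build_index(constructs):
    """Number the given constructs 1..n, in the order they are passed in."""
    return {construct: number for number, construct in enumerate(constructs, 1)}


def index_of(index, construct):
    """Look up a construct's number, with a readable error if it is missing."""
    try:
        return index[construct]
    except KeyError:
        raise LookupError(
            "%r is referenced but was never indexed" % (construct,)
        ) from None


zh = Topic("zh")
sutra = Topic("hnt_0001")
stray = Topic("default-name-type")  # referenced by a name, but not indexed

index = build_index([zh, sutra])

try:
    index_of(index, stray)
except LookupError as error:
    print(error)
```

If the real writer behaves like this, the interesting question is why the referenced topic was not among the constructs that were numbered in the first place.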

[...]


> To me this looks like something falls over while writing a topic name. When
> I look at the generated file, it is in the middle of a topic like this:

> <topic number="512">
[...]


> <value>(Chang a han jing)</value>

That's strange. Given the error above, the CXTM writer cannot find the
index number of the default name type. But I guess that topic number
512 is not the only topic in the map which has a name with the default
name type and the CXTM tests also have names with the default name
type and I've never encountered that error with the CXTM tests.

Do you get that error always with topic number 512 or does the number
vary?

> Now I would like to know primarily how I can debug this, so that I know if
> it is a problem with the way I generate the topic map or if there is
> something wrong with the writer.

That's a good question. The XTM result seems to be fine. So it could
be a bug in Mappa (i.e. a problem with __eq__ and __hash__ and the
internally generated construct id) or a bug in the writer. It could
also be a bug in the remove_duplicates code since it is automatically
invoked by the CXTM writer. And remove_duplicates may merge
duplicates, so it could be a bug in the merging code.
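
As an illustration of the __eq__/__hash__ possibility only, the following sketch uses a hypothetical Construct class (not Mappa's model) whose hash is derived from a mutable id. If something such as a merge rewrites that id after the object has been stored as a dict key, the lookup fails with KeyError even though the very same object is still in the dict.

```python
class Construct:
    """Hypothetical construct whose equality and hash follow a mutable id."""

    def __init__(self, ident):
        self.ident = ident

    def __eq__(self, other):
        return isinstance(other, Construct) and self.ident == other.ident

    def __hash__(self):
        return hash(self.ident)


name_type = Construct(1)
index = {name_type: 1}      # stored under hash(1)

name_type.ident = 2         # assumption: the id is rewritten, e.g. by a merge

try:
    index[name_type]        # now looked up under hash(2)
    found = True
except KeyError:
    found = False

# The object is still one of the keys; only the hash-based lookup misses it.
still_a_key = any(key is name_type for key in index)
print(found, still_a_key)
```

Whether Mappa's constructs can change their hash in this way is exactly what would need checking; the sketch only shows that the symptom would look like the traceback above.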

If you could check if the topic number changes we should be able to
find the bug. You could import the topic map 10 times and try to
serialize the topic map after each import as CXTM.
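
A small harness for that experiment could look like the sketch below. It assumes nothing about Mappa's API: run_once is a hypothetical callable that you would write yourself to import the map and serialize it as CXTM into the stream it is given. The harness only records the last topic number found in the partial output of each failing run.

```python
import io
import re

TOPIC_NUMBER = re.compile(r'<topic number="(\d+)">')


def last_topic_number(partial_cxtm):
    """Return the number of the last <topic> started in the output, if any."""
    numbers = TOPIC_NUMBER.findall(partial_cxtm)
    return int(numbers[-1]) if numbers else None


def collect_failures(run_once, runs=10):
    """Call run_once(stream) repeatedly; None means the run succeeded."""
    results = []
    for _ in range(runs):
        stream = io.StringIO()
        try:
            run_once(stream)
            results.append(None)
        except KeyError:
            results.append(last_topic_number(stream.getvalue()))
    return results


def fake_run(stream):
    """Stand-in for 'import the map, then write CXTM'; always fails."""
    stream.write('<topic number="511"></topic>\n<topic number="512">\n')
    raise KeyError("missing topic")


print(collect_failures(fake_run, runs=3))
```

If every run reports the same number, the failure is tied to the data; varying numbers would point at ordering or hashing.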

Best regards,
Lars
--
Semagia
<http://www.semagia.com>

Christian Wittern

Nov 2, 2010, 7:15:44 AM
to ma...@googlegroups.com
On 2010-11-02 18:29, Lars Heuer wrote:
>> To me this looks like something falls over while writing a topic name. When
>> I look at the generated file, it is in the middle of a topic like this:
>> <topic number="512">
> [...]
>> <value>(Chang a han jing)</value>
> That's strange. Given the error above, the CXTM writer cannot find the
> index number of the default name type. But I guess that topic number
> 512 is not the only topic in the map which has a name with the default
> name type and the CXTM tests also have names with the default name
> type and I've never encountered that error with the CXTM tests.
>
> Do you get that error always with topic number 512 or does the number
> vary?
No, the error is always at the same place. However, the previous topics are
of a different type and usually have only one name, while this one has a few
different names in different scopes. So it could well be something that
occurs here for the first time and is legitimately wrong (although not giving
a better error could still be considered a bug:-). What I am doing is not
just reading and writing, but the map I am trying to generate is constructed
based on some data and another map, which means that it is quite possible
that I am doing something unorthodox here.

> If you could check if the topic number changes we should be able to
> find the bug. You could import the topic map 10 times and try to
> serialize the topic map after each import as CXTM.
As I said, it is always the same place.

All the best,

Christian

Christian Wittern

Nov 2, 2010, 8:19:50 PM
to ma...@googlegroups.com
Lars,

I have investigated the problem further. It seems to occur
whenever a reference like this is in the file:
<topicRef href="#t-230246228842210489405376139533456743740"/>
and this has no topic it refers to, so there is no topic with this ID. This
seems to happen for example with scoping topics that are not otherwise
declared. In these cases, the XTM file is produced just fine, but the CXTM
writing process aborts with the error mentioned yesterday.
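
For anyone who wants to check their own XTM output for such references, here is a rough sketch using only the Python 3 standard library. It matches elements by local name so that it does not depend on the exact XTM namespace, and it only considers local references of the form href="#id". The sample document is made up for illustration, and dangling_topic_refs is a hypothetical helper, not part of Mappa.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def dangling_topic_refs(xtm_text):
    """Return ids used in <topicRef href="#id"/> that no <topic id="id"> declares."""
    root = ET.fromstring(xtm_text)
    declared = set()
    referenced = set()
    for element in root.iter():
        name = local_name(element.tag)
        if name == "topic" and element.get("id"):
            declared.add(element.get("id"))
        elif name == "topicRef":
            href = element.get("href", "")
            if href.startswith("#"):
                referenced.add(href[1:])
    return sorted(referenced - declared)


SAMPLE = """<topicMap xmlns="http://www.topicmaps.org/xtm/" version="2.0">
  <topic id="hnt_0001">
    <name>
      <scope><topicRef href="#zh"/><topicRef href="#hobogirin"/></scope>
      <value>長阿含經</value>
    </name>
  </topic>
  <topic id="zh"/>
</topicMap>"""

print(dangling_topic_refs(SAMPLE))
```

Running it over the generated XTM file gives a list of the topic ids that are worth looking at first.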

All the best,

Christian

--
Christian Wittern
Institute for Research in Humanities, Kyoto University
47 Higashiogura-cho, Kitashirakawa, Sakyo-ku, Kyoto 606-8265, JAPAN

Lars Heuer

Nov 3, 2010, 4:55:18 AM
to Christian Wittern
Hi Christian,

> I have investigated the problem further. It seems to occur
> whenever a reference like this is in the file:
> <topicRef href="#t-230246228842210489405376139533456743740"/>
> and this has no topic it refers to, so there is no topic with this ID.

Thanks for investigating. Does the topic reference mentioned above
occur in the source? Is it an XTM 2.0 or an XTM 2.1 source? Or do you
create the topic via the API?

Usually a "real" topic declaration like

<topic id="t-230246228842210489405376139533456743740">
[...]
</topic>

shouldn't be necessary. Mappa should create a topic with the
identifier even if it is "just" mentioned in a topic reference.

> This seems to happen for example with scoping topics that are not
> otherwise declared.

That shouldn't be necessary.

Lars Heuer

Nov 5, 2010, 12:06:51 PM
to Lars Heuer
Hi Christian,

[...]


> Thanks for investigating. Does the topic reference mentioned above
> occur in the source? Is it an XTM 2.0 or an XTM 2.1 source? Or do you
> create the topic via the API?

I tried to reproduce the error but failed. I created XTM sources with
name types / themes that do not have a <topic/> element. Works. I did
the same via the API. Works.

Do you have any hints how I may be able to reproduce the failure?

Christian Wittern

Dec 13, 2010, 12:15:27 AM
to ma...@googlegroups.com
Hi Lars,

I had to spend a few weeks on completely different projects, but hope to be
able to return to topicmapping in the next few days. I did find a
workaround for the problem mentioned here, so I did not investigate it
further, but I will keep my eyes open.

In the meantime, I have a completely different question: Are there any
plans to add backends to Mappa? It would be nice to have some kind of
persistent storage for maintaining the topic maps managed through
Mappa. Would a JSON-based store like CouchDB be an easy solution, maybe
superior to the RDBMS-based systems offered e.g. by Ontopia? Or do you have
any other plans for adding a backend?

All the best,

Christian

