dkpro-cassis is not recognizing the POS of the tokens that I have stored

SD

Feb 28, 2020, 10:33:15 AM

to dkpro-core-user

I have the following code:

from cassis import *

with open('typesystem.xml', 'rb') as f:
    typesystem = load_typesystem(f)

with open('cas.xml', 'rb') as f:
    cas = load_cas_from_xmi(f, typesystem=typesystem)


for sentence in cas.select('de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence'):
    for token in cas.select_covered('de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token', sentence):
        print(token.get_covered_text())
        print(token.pos)

which prints the following output:

This
None
is
None
a
None
simple
None
test
None
sentence
None
.
None
Here
None
comes
None
another
None
one
None
.
None
I
None
went
None
to
None
the
None
park
None
yesterday
None
,
None
it
None
was
None
beautiful
None
.
None

I have tried many variations and I can't figure out how to make it print the POS values. I have attached the typesystem.xml and cas.xml files. I'm sure it's easy, but I have been stuck on this for a long time. Any help is greatly appreciated!

Attachments: cas.xml, typesystem.xml

Jan-Christoph Klie

Feb 29, 2020, 4:32:17 AM

to dkpro-core-user

Hello,

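The problem is that your XMI contains two views. The tokens in one view have POS annotations and the tokens in the other do not; calling select directly on the loaded CAS uses the initial view, which is why token.pos is None there. What works for me is using the TargetView: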

from cassis import *

with open('typesystem.xml', 'rb') as f:
    typesystem = load_typesystem(f)

with open('cas.xml', 'rb') as f:
    cas = load_cas_from_xmi(f, typesystem=typesystem)

view = cas.get_view("TargetView")

for sentence in view.select('de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence'):
    for token in view.select_covered('de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token', sentence):
        print(token.get_covered_text(), token.pos.PosValue)
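Since token.pos can be None in a view without POS annotations, accessing token.pos.PosValue there would raise an AttributeError. A minimal sketch of a defensive lookup follows; the helper name pos_value is hypothetical and not part of cassis, the attribute names pos and PosValue are taken from the code above, and the stand-in tokens and the tag "DT" are illustrative only.

```python
from types import SimpleNamespace


def pos_value(token, default=None):
    """Return token.pos.PosValue, or `default` when no POS is attached.

    Hypothetical helper, not part of cassis; it only relies on the
    attribute names `pos` and `PosValue` used in the thread.
    """
    pos = getattr(token, "pos", None)
    if pos is None:
        return default
    return getattr(pos, "PosValue", default)


# Stand-ins for cassis tokens, used only to demonstrate the helper.
tagged = SimpleNamespace(pos=SimpleNamespace(PosValue="DT"))
untagged = SimpleNamespace(pos=None)

print(pos_value(tagged), pos_value(untagged, default="-"))
```

With real data, the same call could replace token.pos.PosValue inside the loop, so that the loop also runs on a view whose tokens have no POS.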

You can see the views in the XMI here:

<cas:Sofa xmi:id="12" sofaNum="1" sofaID="_InitialView" mimeType="text"
          sofaString="This is a simple test sentence.&#10;Here comes another one.&#10;&#10;I went to the park yesterday, it was beautiful.&#10;"/>
<cas:Sofa xmi:id="300" sofaNum="2" sofaID="TargetView" mimeType="text"
          sofaString="This is a simple test sentence.&#10;Here comes another one.&#10;&#10;I went to the park yesterday, it was beautiful.&#10;"/>

It looks like the tokens and sentences are duplicated in both.
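To check which views an XMI file contains without loading it into cassis, the cas:Sofa and cas:View elements can be read with the standard library. This is only a rough sketch: it assumes the usual UIMA XMI layout, in which each cas:View element references its Sofa by xmi:id and lists the ids of its indexed annotations in a members attribute. The function name list_views and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

CAS_NS = "http:///uima/cas.ecore"  # namespace of cas:Sofa and cas:View
XMI_NS = "http://www.omg.org/XMI"  # namespace of the xmi:id attribute


def list_views(xmi_text):
    """Return {sofaID: number of annotation ids listed for that view}."""
    root = ET.fromstring(xmi_text)
    # Map each Sofa's xmi:id to its readable view name (sofaID).
    sofa_names = {
        sofa.get(f"{{{XMI_NS}}}id"): sofa.get("sofaID")
        for sofa in root.iter(f"{{{CAS_NS}}}Sofa")
    }
    counts = {name: 0 for name in sofa_names.values()}
    # Each cas:View points at its Sofa and lists member ids, space-separated.
    for view in root.iter(f"{{{CAS_NS}}}View"):
        name = sofa_names.get(view.get("sofa"))
        if name is not None:
            counts[name] = len(view.get("members", "").split())
    return counts


# Tiny made-up document with the same two view names as in the thread.
SAMPLE_XMI = (
    '<xmi:XMI xmlns:xmi="http://www.omg.org/XMI" '
    'xmlns:cas="http:///uima/cas.ecore" xmi:version="2.0">'
    '<cas:Sofa xmi:id="12" sofaNum="1" sofaID="_InitialView" '
    'mimeType="text" sofaString="Hi."/>'
    '<cas:Sofa xmi:id="300" sofaNum="2" sofaID="TargetView" '
    'mimeType="text" sofaString="Hi."/>'
    '<cas:View sofa="12" members="20 21"/>'
    '<cas:View sofa="300" members="301 302 303"/>'
    '</xmi:XMI>'
)

print(list_views(SAMPLE_XMI))
```

For a real file, the text passed to list_views would be read from cas.xml instead of the sample string.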

Thanks for using cassis!

Best,

Jan

SD

Mar 2, 2020, 8:34:58 AM

to dkpro-core-user

Thank you so much! It works. I must admit that I could not find this anywhere in the DKPro cassis documentation. Where can I read about TargetView and other similar options?

Jan-Christoph Klie

Mar 2, 2020, 8:56:26 AM

to dkpro-core-user

Views are a UIMA concept. I do not know which tool you used to create the XMI, but that tool created the views for you. The cassis documentation only covers how to handle them; see, for example, https://github.com/dkpro/dkpro-cassis#managing-views
