No Token annotations when using the SemCorXMLReader

Torsten Zesch

Nov 15, 2013, 4:15:51 PM
to dkpro-w...@googlegroups.com
Hi,

I am trying to use the SemCorXMLReader a bit outside of its specification, i.e. not for WSD; I just want to read the SemCor corpus in a DKPro Core pipeline.
It seems it creates Sentence annotations, but no Tokens.
Is this the expected behaviour?

-Torsten

Tristan Miller

Nov 18, 2013, 4:50:54 AM
to dkpro-w...@googlegroups.com
Greetings.
Yes, this reader produces Paragraph and Sentence annotations, but not
Token ones. I don't recall why; perhaps we should update it so that it
produces Tokens as well. Should be a very easy change to make.
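
Until the reader emits Tokens itself, token offsets can be derived from the Sentence spans it already provides. What follows is only a rough sketch in plain Java, assuming the sentence boundaries are available as character offsets; it does not use the DKPro or UIMA API, and the class and method names (`SentenceTokenSketch`, `tokenSpans`) are made up for illustration. In a real pipeline one would more likely place a tokenizer component after the reader.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of DKPro Core or DKPro WSD.
public class SentenceTokenSketch {

    /**
     * Returns [begin, end) character offsets of the tokens found inside
     * each given sentence span. Offsets are relative to the full text.
     */
    static List<int[]> tokenSpans(String text, int[][] sentenceSpans) {
        List<int[]> tokens = new ArrayList<>();
        BreakIterator words = BreakIterator.getWordInstance(Locale.ENGLISH);
        for (int[] sentence : sentenceSpans) {
            int sentBegin = sentence[0];
            String sentText = text.substring(sentBegin, sentence[1]);
            words.setText(sentText);
            int start = words.first();
            for (int end = words.next(); end != BreakIterator.DONE;
                    start = end, end = words.next()) {
                // BreakIterator also reports whitespace runs as segments; skip them.
                if (!sentText.substring(start, end).trim().isEmpty()) {
                    tokens.add(new int[] {sentBegin + start, sentBegin + end});
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Hi there. Bye now.";
        // Assumed sentence offsets, standing in for existing Sentence annotations.
        int[][] sentences = {{0, 9}, {10, 18}};
        for (int[] t : tokenSpans(text, sentences)) {
            System.out.println(t[0] + "-" + t[1] + " " + text.substring(t[0], t[1]));
        }
    }
}
```

Tokenizing sentence by sentence keeps every token inside an existing sentence span, which is what downstream components usually expect.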

Regards,
Tristan

--
Tristan Miller, Research Scientist
Ubiquitous Knowledge Processing Lab (UKP-TUDA)
Department of Computer Science, Technische Universität Darmstadt
Tel: +49 6151 16 6166 | Web: http://www.ukp.tu-darmstadt.de/

Torsten Zesch

Nov 18, 2013, 6:57:27 AM
to dkpro-w...@googlegroups.com
As I am using that outside of WSD anyway, would it make sense to migrate the new version to DKPro Core and let the WSD version use the one from there + WSD specific parts?
Or maybe there is little need to have a generic reader for that format anyway?

-Torsten

Tristan Miller

Nov 18, 2013, 11:45:49 AM
to dkpro-w...@googlegroups.com
Greetings.

On 18/11/13 12:57 PM, Torsten Zesch wrote:
> As I am using that outside of WSD anyway, would it make sense to migrate the new version to DKPro Core and let the WSD version use the one from there + WSD specific parts?
> Or maybe there is little need to have a generic reader for that format anyway?

Per our subsequent conversation I've logged a feature request for this
at <https://code.google.com/p/dkpro-wsd/issues/detail?id=49>.