sentence detection & line return

Franck D

Apr 29, 2019, 10:08:03 AM
to webanno-user
Hello

We work with texts written by young pupils who have not yet mastered punctuation.
It seems WebAnno detects sentence boundaries from the pattern ". Upper" (a period followed by an uppercase letter). Is this right?

So is there a way to change that?

With scripts? (Could you please point me to documentation on how to use scripts in annotation?)

Thanks, regards,

Richard Eckart de Castilho

Apr 29, 2019, 10:21:43 AM
to Franck D, webanno-user
Hi,

WebAnno uses the Java BreakIterator. The exact rules are determined by the Java runtime that you are using since this is a platform class. You can look at [1] for what the rules may look like.
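
To check how the sentence BreakIterator of a given Java runtime splits a particular text, a small test program can help. The following is a minimal sketch, not WebAnno's actual code: the class name `BreakIteratorDemo`, the helper `sentences`, and the choice of `Locale.ENGLISH` are assumptions made for illustration, and WebAnno may configure the iterator differently.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class BreakIteratorDemo {

    /** Splits text with the JDK sentence BreakIterator (hypothetical helper). */
    static List<String> sentences(String text, Locale locale) {
        BreakIterator it = BreakIterator.getSentenceInstance(locale);
        it.setText(text);

        List<String> result = new ArrayList<>();
        int start = it.first();
        // Walk from boundary to boundary until the iterator reports DONE.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                result.add(sentence);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample pupil-like text; replace it with your own data.
        String text = "The dog ran. It was fast. then it stopped and we went home";
        for (String s : sentences(text, Locale.ENGLISH)) {
            System.out.println("[" + s + "]");
        }
    }
}
```

Running this on a few sample texts shows where the runtime in use places boundaries, which is a quick way to confirm whether the ". Upper" behaviour applies to your data.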

If you need control over the segmentation process, it is best to segment the text before importing it into WebAnno. In your case, you could run some tool (or manually edit your files) to put one sentence on each line and then import using the "Plain text (sentence)" format (or a similarly named one), which treats every line as a sentence.
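
The pre-segmentation step could be sketched as follows. This is only an illustration under stated assumptions: the class name `OneSentencePerLine`, the helper `toLines`, and above all the boundary rule (break after `.`, `!` or `?` followed by whitespace, whatever the case of the next word) are hypothetical placeholders to be replaced by a rule that fits your pupils' texts. It uses `Files.readString` and `Files.writeString`, which need Java 11 or later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class OneSentencePerLine {

    // Hypothetical rule: a boundary is whitespace preceded by '.', '!' or '?'.
    // Unlike the ". Upper" behaviour, it does not look at the next word's case.
    static final Pattern BOUNDARY = Pattern.compile("(?<=[.!?])\\s+");

    /** Returns the text with exactly one sentence per line. */
    static String toLines(String text) {
        List<String> lines = new ArrayList<>();
        for (String part : BOUNDARY.split(text)) {
            // Fold line breaks inside a sentence into single spaces.
            String sentence = part.replaceAll("\\s+", " ").trim();
            if (!sentence.isEmpty()) {
                lines.add(sentence);
            }
        }
        return String.join("\n", lines);
    }

    public static void main(String[] args) throws IOException {
        // Usage: java OneSentencePerLine.java input.txt output.txt
        Path in = Path.of(args[0]);
        Path out = Path.of(args[1]);
        Files.writeString(out, toLines(Files.readString(in)) + "\n");
    }
}
```

The output file can then be imported with the line-per-sentence format, so the segmentation WebAnno shows is exactly the one produced here.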

Cheers,

-- Richard

[1] http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/sun/text/resources/BreakIteratorRules.java#l289