DKPro Core 1.2.0 released


Richard

Oct 4, 2011, 4:28:09 PM
to dkpro-core-announce, dkpro-c...@googlegroups.com
We are pleased to announce the release of DKPro Core 1.2.0 - a
collection of software components for natural language processing
(NLP) based on the Apache UIMA framework.

http://www.ukp.tu-darmstadt.de/software/dkpro/

The highlights in this release are:

* treetagger - improved stability and memory consumption
* io.wikipedia - completely new set of readers now also
supporting revisions and discussions
* io.annis - support for writing the RelAnnis format
* io.negra - support for reading the Negra export format

In addition there have been a number of bug fixes and
enhancements. For a more complete overview see:

http://code.google.com/p/dkpro-core-asl/issues/list?can=1&q=label:Milestone-1.2.0
http://code.google.com/p/dkpro-core-gpl/issues/list?can=1&q=label:Milestone-1.2.0

DKPro Core consists of a number of pre-processing components for NLP
tasks, often wrapping existing libraries or tools for easy use in a
UIMA pipeline.

* tokenization/segmentation
* compound splitting (Banana Split, JWordSplitter)
* stemming (Snowball)
* part-of-speech tagging (TreeTagger)
* parsing (Stanford Parser)
* language identification (TextCat)
* spelling correction (Jazzy)
* IO support for various data types (text, XML, PDF, WSDL,
Wikipedia, ...)

A basic UIMA type system is provided with which all of the components
work out-of-the-box. Some components can be configured for use with
other type systems.

DKPro Core builds heavily on uimaFIT, making use of features such as
injection of configuration parameters and automatic type detection.
Because using DKPro in Java code with uimaFIT is so easy, we do not
provide traditional UIMA XML descriptors for our analysis engines,
readers and consumers - only for the type systems.
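
To illustrate what "injection of configuration parameters" means in
practice, here is a minimal sketch that uses only the Java standard
library. It is not uimaFIT code: the @Param annotation, the ToyTagger
component and the ParamInjector helper are hypothetical names invented
for this example. The sketch only assumes the general idea that a
component declares its parameters as annotated fields and a factory
fills them in from name/value pairs.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Hypothetical annotation marking a field as a configuration parameter.
// It stands in for the idea only; it is NOT part of uimaFIT.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Param {
    String name();
}

// Hypothetical component that declares two parameters as annotated fields.
class ToyTagger {
    @Param(name = "language")
    String language;

    @Param(name = "maxTokens")
    int maxTokens;
}

// Hypothetical factory: creates a component and injects name/value pairs.
class ParamInjector {
    static <T> T create(Class<T> type, Object... nameValuePairs) {
        if (nameValuePairs.length % 2 != 0) {
            throw new IllegalArgumentException("Expected name/value pairs");
        }
        // Collect the pairs into a lookup table keyed by parameter name.
        Map<String, Object> settings = new HashMap<>();
        for (int i = 0; i < nameValuePairs.length; i += 2) {
            settings.put((String) nameValuePairs[i], nameValuePairs[i + 1]);
        }
        try {
            T instance = type.getDeclaredConstructor().newInstance();
            // Copy each supplied value into the field annotated with that name.
            for (Field field : type.getDeclaredFields()) {
                Param param = field.getAnnotation(Param.class);
                if (param != null && settings.containsKey(param.name())) {
                    field.setAccessible(true);
                    field.set(instance, settings.get(param.name()));
                }
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not configure " + type.getName(), e);
        }
    }
}

class InjectionDemo {
    public static void main(String[] args) {
        ToyTagger tagger = ParamInjector.create(ToyTagger.class,
                "language", "de", "maxTokens", 100);
        System.out.println(tagger.language + " " + tagger.maxTokens);
    }
}
```

With this style, a component is configured in one line of Java, for
example ParamInjector.create(ToyTagger.class, "language", "de"), and no
separate XML descriptor has to be kept in sync with the code.
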

We offer two sets of components with DKPro Core:

* DKPro Core ASL provides components under the Apache Software
License 2.0
http://code.google.com/p/dkpro-core-asl
* DKPro Core GPL provides components under the GNU General Public
License 3.0
http://code.google.com/p/dkpro-core-gpl

Currently DKPro Core is meant to be used with Apache Maven. We host a
public Maven repository containing DKPro Core ASL, DKPro Core GPL and
all their dependencies. You can also obtain JARs for individual
components from that repository. With future releases of DKPro, we
may add the option of a downloadable archive.

This project was initiated by the Ubiquitous Knowledge Processing Lab
(UKP) at the Technische Universität Darmstadt, Germany under the
auspices of Prof. Dr. Iryna Gurevych. All former and current members
of the UKP Lab have contributed in code, as testers or in spirit to
this project. It constitutes an essential cornerstone for our research
environment at the UKP Lab.

DKPro Core requires Java 1.6, UIMA 2.3.1 and uimaFIT 1.2.0 (amongst
other component-specific dependencies).

An introduction to DKPro Core can be found at

http://code.google.com/p/dkpro-core-asl/wiki/MyFirstDKProProject

Please direct any questions or suggestions to

dkpro-c...@googlegroups.com

Best,

Richard

--
-------------------------------------------------------------------
Richard Eckart de Castilho
Technical Lead
Ubiquitous Knowledge Processing Lab
FB 20 Computer Science Department
Technische Universität Darmstadt
Hochschulstr. 10, D-64289 Darmstadt, Germany
phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
ecka...@tk.informatik.tu-darmstadt.de
www.ukp.tu-darmstadt.de
Web Research at TU Darmstadt (WeRC) www.werc.tu-darmstadt.de
-------------------------------------------------------------------