DKPro WSD


DKPro WSD is a framework for word sense disambiguation (WSD). It provides UIMA components that encapsulate corpus readers, linguistic annotators, lexical semantic resources, WSD algorithms, and evaluation and reporting tools. You configure the components, or write new ones, and arrange them into a data processing pipeline. The design is modular: components that provide the same functionality can be freely swapped, so you can run the same algorithm on different data sets or test several algorithms on the same data set.
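The idea of swapping algorithms over a fixed data set can be illustrated with a small, language-neutral sketch. This is not the DKPro WSD API (which is Java and UIMA based). Every name below (`INVENTORY`, `first_sense`, `gloss_overlap`, `accuracy`, the sense IDs, and the toy data) is hypothetical and exists only to show two interchangeable disambiguators being scored against the same gold standard.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical toy types; none of these names come from DKPro WSD.
Instance = Tuple[str, str]                        # (target word, context sentence)
Algorithm = Callable[[str, str], Optional[str]]   # returns a sense ID, or None

# Toy sense inventory: word -> {sense ID: gloss}.
# Assumption: the first listed sense is treated as the most frequent one.
INVENTORY: Dict[str, Dict[str, str]] = {
    "bank": {
        "bank%finance": "an institution that accepts money deposits",
        "bank%river": "sloping land beside a river or stream",
    },
}


def first_sense(word: str, context: str) -> Optional[str]:
    """Baseline: ignore the context and return the first listed sense."""
    senses = INVENTORY.get(word)
    return next(iter(senses)) if senses else None


def gloss_overlap(word: str, context: str) -> Optional[str]:
    """Simplified Lesk-style choice: the sense whose gloss shares the most
    words with the context. Returns None when no gloss overlaps at all."""
    context_words = set(context.lower().split())
    best, best_score = None, 0
    for sense_id, gloss in INVENTORY.get(word, {}).items():
        score = len(context_words & set(gloss.lower().split()))
        if score > best_score:
            best, best_score = sense_id, score
    return best


def accuracy(algorithm: Algorithm, data: List[Instance], gold: List[str]) -> float:
    """Fraction of instances for which the algorithm returns the gold sense."""
    correct = sum(1 for (word, ctx), g in zip(data, gold) if algorithm(word, ctx) == g)
    return correct / len(data)


# Toy data set and gold-standard senses (illustrative only).
DATA: List[Instance] = [
    ("bank", "she deposits money at the bank"),
    ("bank", "they fished from the bank of the river"),
]
GOLD: List[str] = ["bank%finance", "bank%river"]

# The evaluation loop is identical for every algorithm; only the component changes.
for algorithm in (first_sense, gloss_overlap):
    print(algorithm.__name__, accuracy(algorithm, DATA, GOLD))
```

Because both functions share one signature, the evaluation code does not change when an algorithm is replaced. In the same way, passing a different `DATA` and `GOLD` pair would test one algorithm on another data set.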

Recent releases

  •  16 Jun 2014 16:19

    Release Notes: Evaluators now permit chaining of backoff algorithms. There are now annotators that allow for disambiguating the complete text collectively. There is now a weighted MFS baseline. The sense cluster evaluator now computes McNemar's test. The sense cluster evaluator now handles the case where there are multiple gold-standard senses, and includes undisambiguated instances in the confusion matrix. Bugs were fixed.

  •  29 Nov 2013 17:01

    Release Notes: New features include support for the IMS disambiguator, a new sense inventory wrapping the GermaNet Java API, and a new wrapper module for easy disambiguation of text strings. The WebCAGe reader now works with the official release of WebCAGe. The SemCor reader optionally writes Token, Lemma, and POS annotations. Readers of XML-based data sets can now optionally ignore the DTD. The cluster evaluator's output is more verbose and informative. There are also a few bugfixes and API changes.

  •  14 Oct 2013 14:21

    Release Notes: Upgraded to DKPro Core 1.5.0, uimaFIT 2.0.0, UBY 0.4.0, and TWSI 1.0.1. Adds a module for word sense induction. Moves Wikipedia-specific graph algorithms to a separate module.

  •  30 Sep 2013 10:30

    Release Notes: This is the initial public release.

