libTextCat


libTextCat is a library of functions that implement the classification technique described in Cavnar & Trenkle, "N-Gram-Based Text Categorization". It was developed primarily for language guessing, a task on which it is known to perform with near-perfect accuracy. Considerable effort went into making this implementation fast and efficient: the language guesser processes over 100 documents per second on a simple PC, which makes it practical for many uses.
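
To illustrate the general idea behind the Cavnar & Trenkle technique, here is a minimal Python sketch. It is not libTextCat's code or API: the function names (ngram_profile, out_of_place, classify) are hypothetical, and the n-gram length limit (5) and profile size (300) are assumed defaults chosen for illustration. Each text is reduced to a list of its most frequent character n-grams, ranked by frequency, and a document is assigned to the language whose profile has the smallest "out-of-place" rank distance.

```python
import re
from collections import Counter


def ngram_profile(text, max_n=5, top_k=300):
    """Return the top_k most frequent character n-grams, most frequent first."""
    counts = Counter()
    # Treat runs of letters as words; digits and punctuation are discarded.
    for word in re.findall(r"[^\W\d_]+", text.lower()):
        padded = "_" + word + "_"  # mark word boundaries
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending count; break ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams missing from lang_profile get a fixed penalty."""
    rank = {gram: r for r, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(
        abs(r - rank[gram]) if gram in rank else max_penalty
        for r, gram in enumerate(doc_profile)
    )


def classify(text, lang_profiles):
    """Return the key of the language profile closest to the text's profile."""
    doc = ngram_profile(text)
    return min(lang_profiles, key=lambda lang: out_of_place(doc, lang_profiles[lang]))


# Toy training data (assumption: real language models are built from far larger corpora).
profiles = {
    "en": ngram_profile("the quick brown fox jumps over the lazy dog"),
    "nl": ngram_profile("de snelle bruine vos springt over de luie hond"),
}
guess = classify("de snelle bruine vos springt over de luie hond", profiles)
```

Ranking n-grams instead of comparing raw counts keeps the comparison independent of document length, which is one reason the approach works on short inputs.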

Recent releases

  •  05 Dec 2003 19:03

    Release Notes: A long-overdue autoconfig script has been added.

  •  20 May 2003 13:35

    Release Notes: The distribution now contains Gertjan van Noord's language models for the automatic recognition of over 70 languages. The makefiles were cleaned up to make them more portable.

