Weka: Data Mining with Java
Weka is a collection of
machine learning algorithms for data mining tasks. Weka includes tools for data
pre-processing, classification, regression, clustering, association rules and
visualization.
MinorThird
MinorThird is an open-source
collection of Java classes for storing, categorizing and annotating text, and
for learning to extract entities.
MinorThird offers a toolkit of learning methods which are tightly integrated
with other tools for annotating text, both manually and programmatically. It
also supports visualizing both the training data and the performance of the
various classifiers.
SecondString
SecondString is another
open-source package from CMU Professor William W. Cohen
that provides a collection of approximate string matching techniques.
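
To make "approximate string matching" concrete, here is a minimal sketch of one classic technique, Levenshtein edit distance, written in plain Java. It does not use SecondString and is not its API: the class name EditDistance and the methods levenshtein and similarity are hypothetical names chosen for this illustration, and the normalization used in similarity is just one simple convention.

```java
// Illustrative only: a standalone edit-distance sketch, NOT SecondString's API.
// EditDistance, levenshtein and similarity are hypothetical names.
final class EditDistance {
    private EditDistance() {}

    /**
     * Levenshtein distance: the minimum number of single-character
     * insertions, deletions and substitutions that turn a into b.
     */
    static int levenshtein(String a, String b) {
        // Only two rows of the dynamic-programming table are kept.
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Turning the empty prefix of a into b[0..j) takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            // Turning a[0..i) into the empty string takes i deletions.
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // The row just computed becomes the previous row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Similarity in [0, 1]: 1.0 for identical strings, lower as more
     * edits are needed. Dividing by the longer length is an assumption
     * made for this sketch, not a standard imposed by any library.
     */
    static double similarity(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 1.0; // two empty strings are identical
        }
        return 1.0 - (double) levenshtein(a, b) / maxLen;
    }
}
```

For example, EditDistance.levenshtein("William Cohen", "William W. Cohen") returns 3, since the second string is the first with three characters inserted. A score like this is what lets a matcher treat two spellings of a name as probable duplicates rather than distinct values.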