The FreeDiscovery Project
Machine Learning for Everyone
FreeDiscovery is an open-source web service tier that gives GUI developers an easy way to access complex analytics without having to deal with the complexities of a machine learning or artificial intelligence package. It is built on top of existing machine learning libraries (scikit-learn) and exposes a REST API for information retrieval applications.
FreeDiscovery Engine
provides a REST API for information retrieval applications
FreeDiscovery Core
a Python package that aims to extend scikit-learn
Powerful Features
Easy document categorization
Put sets of documents into categories, e.g., “medical” or “legal.” If you already have some documents categorized, you can use FreeDiscovery to teach the machine to categorize new ones.
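To make the idea of teaching the machine from already-categorized documents concrete, here is a minimal sketch using only the Python standard library. It is not FreeDiscovery's API or its actual algorithm: the function names (tokenize, train, categorize), the sample documents, and the choice of a naive Bayes classifier are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(labeled_docs):
    """Count words per category from (text, category) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, category in labeled_docs:
        word_counts[category].update(tokenize(text))
        doc_counts[category] += 1
    return word_counts, doc_counts


def categorize(text, word_counts, doc_counts):
    """Return the category with the highest naive Bayes log-probability."""
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category, counts in word_counts.items():
        # Start from the prior: the share of training documents in this category.
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(counts.values())
        for word in tokenize(text):
            # Add-one smoothing keeps unseen words from zeroing out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical training set: documents that are already categorized.
labeled = [
    ("The patient was prescribed antibiotics after surgery", "medical"),
    ("The nurse monitored blood pressure and heart rate", "medical"),
    ("The court granted the motion to dismiss the lawsuit", "legal"),
    ("The attorney filed a contract dispute with the judge", "legal"),
]
word_counts, doc_counts = train(labeled)

print(categorize("The nurse checked the patient after surgery", word_counts, doc_counts))
print(categorize("The judge dismissed the lawsuit in court", word_counts, doc_counts))
```

With this toy training set, the first new document lands in "medical" and the second in "legal". A real collection would need far more labeled examples and a more robust model.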
Document clustering
FreeDiscovery organizes a collection into natural clusters of related documents. Many clustering algorithms require you to decide in advance how many clusters exist; FreeDiscovery simply clusters your collection and generates a logical name for each cluster. This is useful when you have a large collection and want to know what’s in it.
Duplicate detection
Duplicate detection identifies duplicates in your collection, including near-duplicates: two documents don’t have to be 100% identical to be flagged.
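One common way to find documents that are almost, but not exactly, the same is to compare overlapping word sequences ("shingles") with Jaccard similarity. The sketch below illustrates that general technique in plain Python. It does not claim to be the method FreeDiscovery uses; the function names, the shingle size, and the 0.5 threshold are assumptions chosen for the example.

```python
from itertools import combinations


def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(max(len(words) - size + 1, 1))}


def jaccard(a, b):
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(docs, threshold=0.5):
    """Return index pairs of documents whose similarity meets the threshold."""
    sets = [shingles(d) for d in docs]
    return [
        (i, j)
        for i, j in combinations(range(len(docs)), 2)
        if jaccard(sets[i], sets[j]) >= threshold
    ]


# Hypothetical collection: the first two documents differ by a single word.
docs = [
    "Please review the attached contract before the meeting on Friday",
    "Please review the attached contract before the meeting on Monday",
    "Lunch is served in the cafeteria at noon",
]
print(near_duplicates(docs))  # [(0, 1)]
```

The first two documents share 7 of 9 distinct shingles, so they are reported as a pair even though they are not identical. Comparing every pair grows quadratically with collection size, so large collections typically rely on approximate techniques such as MinHash instead.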
E-mail threading
If you have a group of e-mails that includes various conversations, this algorithm will identify the conversations.
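As a rough illustration of what identifying conversations can mean, the sketch below groups messages by following the standard In-Reply-To header back to the first message of each thread. It uses only Python's standard email module. The function name thread_emails and the sample messages are hypothetical, and this is a simplified approach rather than FreeDiscovery's own threading algorithm; production threading usually also considers the References header and handles missing messages.

```python
from collections import defaultdict
from email import message_from_string


def thread_emails(raw_messages):
    """Group raw e-mail messages into threads keyed by the root Message-ID."""
    parent = {}
    for raw in raw_messages:
        msg = message_from_string(raw)
        # In-Reply-To is None for a message that starts a conversation.
        parent[msg["Message-ID"]] = msg["In-Reply-To"]

    def root(message_id):
        # Follow In-Reply-To links until reaching a message with no known parent.
        seen = set()
        while parent.get(message_id) in parent and message_id not in seen:
            seen.add(message_id)
            message_id = parent[message_id]
        return message_id

    threads = defaultdict(list)
    for message_id in parent:
        threads[root(message_id)].append(message_id)
    return dict(threads)


# Hypothetical mailbox: one three-message conversation and one standalone message.
raw_messages = [
    "Message-ID: <1@example.com>\nSubject: Budget\n\nDraft attached.",
    "Message-ID: <2@example.com>\nIn-Reply-To: <1@example.com>\nSubject: Re: Budget\n\nLooks good.",
    "Message-ID: <3@example.com>\nSubject: Lunch\n\nNoon?",
    "Message-ID: <4@example.com>\nIn-Reply-To: <2@example.com>\nSubject: Re: Budget\n\nApproved.",
]
threads = thread_emails(raw_messages)
for root_id, members in threads.items():
    print(root_id, members)
```

Here the three "Budget" messages end up in one thread rooted at the first message, while the "Lunch" message forms a thread of its own.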

Free white paper – find out why most popular e-Discovery algorithms fail
E-Discovery vendors often use tools that apply only one method for categorizing text. We investigated these tools to find out how well they work and found wide variations in performance. Learn more by requesting the free white paper.
Download the FREE White Paper:
Effectiveness Results for Popular e-Discovery Algorithms
Originally presented at the 2017 International Conference on Artificial Intelligence & Law at King’s College in London.