Digital Humanities

This guide provides an introduction to digital humanities (DH) theory and practice and an overview of DH methods, tools, and resources.

What is Text Mining?

Text Tools for Cleaning & Processing Data - Free

Free integrated workflow of pre-processing, analysis, and visualization tools for finding and exploring patterns in texts.

OpenRefine is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data.

This package is a Java implementation of probabilistic natural language parsers.

Assigns parts of speech to words or tokens.

Text & Data Mining Tools - Free

The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text.

Open source e-book library management application developed by users of e-books for users of e-books. It has a multitude of features including eBook format conversion.

Ngram Viewer lets you find and visualize how the use of words and phrases has changed over time. Its dataset is the 30 million print books Google has scanned in partnership with libraries around the world.
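To make the idea behind an n-gram chart concrete, here is a minimal Python sketch that computes the relative frequency of a phrase per year. The toy corpus, the helper names (`tokenize`, `ngrams`, `relative_frequency`), and the normalization are illustrative assumptions, not Google's data or implementation.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a space-joined string."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def relative_frequency(corpus_by_year, phrase):
    """Map each year to the share of same-length n-grams that equal `phrase`."""
    target = " ".join(tokenize(phrase))
    n = len(target.split())
    result = {}
    for year, texts in sorted(corpus_by_year.items()):
        counts = Counter()
        for text in texts:
            counts.update(ngrams(tokenize(text), n))
        total = sum(counts.values())
        result[year] = counts[target] / total if total else 0.0
    return result

# Hypothetical toy corpus: a few sentences per year, made up for illustration.
toy_corpus = {
    1900: ["the steam engine changed the city", "a steam engine is loud"],
    2000: ["the search engine changed the web"],
}

freqs = relative_frequency(toy_corpus, "steam engine")
for year, share in freqs.items():
    print(year, round(share, 3))
```

Dividing by the total number of n-grams in each year keeps years with more text from dominating the chart.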

Text Analyzer is a beta tool built by JSTOR Labs. With it, researchers can search for content on JSTOR just by uploading a document.

Text analysis, sometimes referred to as text mining, is the automated process of understanding and sorting unstructured text, making it easier to manage. Text analysis tools are often used to gain valuable insights from social media comments, survey responses, and online reviews.

Screengrab from MonkeyLearn
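As a rough illustration of the definition above, the following Python sketch counts the most frequent content words in a handful of made-up reviews, which is one of the simplest forms of text analysis. The sample reviews, the short stopword list, and the `top_terms` helper are assumptions for demonstration only. Dedicated tools use much larger stopword lists and more careful tokenization.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption; real tools use longer lists).
STOPWORDS = {"the", "a", "an", "and", "is", "was", "it", "to", "of", "but", "i", "very"}

def top_terms(documents, k=3):
    """Count non-stopword tokens across all documents and return the k most common."""
    counts = Counter()
    for doc in documents:
        tokens = re.findall(r"[a-z']+", doc.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(k)

# Hypothetical customer reviews, invented for this example.
reviews = [
    "The delivery was slow but the support team was helpful.",
    "Slow delivery, and the packaging was damaged.",
    "Support was helpful and the delivery arrived.",
]

terms = top_terms(reviews)
print(terms)
```

Even this simple count shows which topics recur across the reviews, which is the kind of pattern the tools listed in this guide find at much larger scale.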


Free software for building your own search engine: an explorer for discovering large document collections and a platform for media monitoring, text analytics, document analysis, and text mining.

Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.

PLOS provides access to its article corpus and article metadata (data about the article) in multiple ways. The preferred method of access depends on the use case.

Free open source software to analyze and process your texts visually.

Voyant Tools is an open-source, web-based application for performing text analysis.