
DPDS Social Sciences Workshop Resources


Working with OpenRefine

Objectives:

1. Create a new OpenRefine project from a CSV file.
2. Understand potential problems with file headers.
3. Use facets to summarize data from a column.
4. Use clustering to detect possible typing errors.
5. Understand that different clustering algorithms may give different results.
6. Use the column drop-down menus to trim leading and trailing whitespace from cells.
7. Revisit and reverse earlier steps using undo/redo.
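
To make objectives 3 and 6 concrete: a text facet is essentially a count of each distinct value in a column, and trimming whitespace merges values that differ only by stray spaces. The following is a minimal Python sketch of that idea outside OpenRefine, not OpenRefine's own code. The village names and the helper names text_facet and trim are made up for illustration.

```python
from collections import Counter

# Hypothetical column values; some carry stray leading/trailing spaces.
villages = ["Chirodzo", "Chirodzo ", " Ruaca", "Ruaca", "God"]


def text_facet(values):
    """Count how often each distinct value occurs, as a text facet would."""
    return Counter(values)


def trim(values):
    """Strip leading and trailing whitespace from every cell."""
    return [v.strip() for v in values]


before = text_facet(villages)
after = text_facet(trim(villages))

print(len(before), "distinct values before trimming")
print(len(after), "distinct values after trimming")
for value, count in after.most_common():
    print(f"{value!r}: {count}")
```

Values that look identical on screen but differ by a trailing space show up as separate facet entries until they are trimmed.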

Key Points:

1. OpenRefine can import a variety of file types.
2. OpenRefine can be used to explore data using facets and filters.
3. Clustering in OpenRefine can help to identify different values that might mean the same thing.
4. OpenRefine can transform the values of a column.
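
The clustering idea in key point 3 can be sketched in Python. The code below is a simplified approximation of two key-collision methods, "fingerprint" and "n-gram fingerprint"; OpenRefine's actual implementations differ in detail. Each value is reduced to a normalized key, and values that share a key form a cluster. The sample names and all function names here are assumptions made for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def to_ascii(text):
    """Drop accents by decomposing characters and discarding non-ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def fingerprint(value):
    """Key: trimmed, lowercased, punctuation removed, unique sorted words."""
    text = to_ascii(value.strip().lower())
    text = re.sub(r"[^\w\s]", "", text)  # remove punctuation
    return " ".join(sorted(set(text.split())))


def ngram_fingerprint(value, n=2):
    """Key: unique sorted character n-grams, ignoring spaces and punctuation."""
    text = re.sub(r"[^a-z0-9]", "", to_ascii(value.lower()))
    grams = {text[i:i + n] for i in range(len(text) - n + 1)}
    return "".join(sorted(grams))


def cluster(values, key):
    """Group values by key and keep only groups with more than one member."""
    groups = defaultdict(set)
    for value in values:
        groups[key(value)].add(value)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


# Hypothetical spellings of the same place name, plus one unrelated value.
names = [
    "Ruaca-Nhamuenda",
    "Ruaca Nhamuenda",
    "Nhamuenda, Ruaca",
    "RuacaNhamuenda",
    "Chirodzo",
]

fp_clusters = cluster(names, fingerprint)
ng_clusters = cluster(names, ngram_fingerprint)
print("fingerprint:", fp_clusters)
print("n-gram fingerprint:", ng_clusters)
```

On this sample the two methods group the names differently: the word-based key ignores word order but is sensitive to missing spaces, while the character n-gram key tolerates missing spaces but is sensitive to word order. This is why it is worth trying more than one method and reviewing each proposed cluster before merging.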

File Types

OpenRefine can import a variety of file types:

  • Tab separated (tsv)
  • Comma separated (csv)
  • Excel (xls, xlsx)
  • JSON
  • XML
  • RDF as XML
  • Google Spreadsheets
