In this tutorial you will use Google Refine to clean our dataset and create some files that may be useful for the upcoming assignments. We will perform some cleaning together in class.
Getting Started
- Install Google Refine. You can download and install it from here:
http://openrefine.org/download.html
You should install the version called "OpenRefine 3.0" at the top of the page (not the beta version).
- Download the following dataset: UniversityData.csv
References
The documentation for Google Refine / OpenRefine is available here.
There is also a set of nice introductory tutorials available on YouTube: Part 1, Part 2, Part 3.
Here are some helpful pointers to the OpenRefine Expression Language.
If you missed class, watch the videos above first and then follow this tutorial: http://enipedia.tudelft.nl/wiki/OpenRefine_Tutorial