Many of us find the need to do serious data refining, so I’m always happy to find a new tool to have a play with. This time it’s Google who have released code free of charge that will help polish, refine and tidy your data.
You can split data, export in a multitude of formats, reconcile errors and change formats, all within the one program. It’s particularly powerful when it comes to dividing CSV files and making new columnar data. Google’s Refine program also allows export to the Freebase Open Data Store, which can pull all sorts of data into your project and cleverly append it.
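To give a flavour of what "making new columnar data" means, here is a rough sketch in plain Python. It is not Refine's own code or API, just an illustration of the kind of column split you would otherwise do with a few clicks; the function name `split_column` and the sample data are made up for the example.

```python
import csv
import io

def split_column(rows, column, separator, new_names):
    """Split one column of each row into several new columns (hypothetical helper)."""
    out = []
    for row in rows:
        # Split at most len(new_names) - 1 times so extra separators stay in the last part.
        parts = row[column].split(separator, len(new_names) - 1)
        # Pad with empty strings so every row gets all of the new columns.
        parts += [""] * (len(new_names) - len(parts))
        # Keep every other column, drop the one being split.
        new_row = {k: v for k, v in row.items() if k != column}
        new_row.update(zip(new_names, (p.strip() for p in parts)))
        out.append(new_row)
    return out

# Made-up sample data: a "name" column holding "Surname, Forename".
SAMPLE = 'name,city\n"Smith, Jane",Leeds\n"Doe, John",York\n'

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
result = split_column(rows, "name", ",", ["surname", "forename"])
```

Each row in `result` now has separate `surname` and `forename` columns alongside the untouched `city` column, which is the sort of tidy-up Refine lets you do without writing any code.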
Have a peek; there are a few useful videos to get you started:
https://code.google.com/p/google-refine/
It runs nicely in your browser from a single executable, and the extracted files are only 48 MB in size.