Google Refine: Power tool for working with messy data, cleaning it, transforming it and extending it
Google recently announced the release of Google Refine 2.0, the tool that was called Gridworks before Google's acquisition of Metaweb. We've been using Gridworks for a while now, and it's a great power tool for working with messy data: cleaning it up, transforming it and extending it. Best of all, it's free, and although the interface is web-based, it runs on your own desktop and you access it through a web browser.
For example, you can import Excel or CSV files into Google Refine and the tool will help you merge similar values to keep the data consistent. You can also use it to split a column into multiple columns based on a separator, or even write expressions to manipulate the data. Sure, you can probably get a lot of these things done in Excel, but Google Refine will definitely save you a lot of time once you get the hang of the interface.
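To give a feel for what "merging similar values" and "splitting columns" mean in practice, here is a rough Python sketch of the two ideas. It is not Google Refine's own code or API: the function names (fingerprint, cluster_values, split_column) and the sample data are made up for illustration, and the normalisation rule is just one simple assumption about what counts as "similar".

```python
import re
from collections import defaultdict


def fingerprint(value):
    """Reduce a value to a normalized key so near-duplicates collide."""
    # Lowercase, drop punctuation, then sort and de-duplicate the words.
    cleaned = re.sub(r"[^\w\s]", "", value.strip().lower())
    return " ".join(sorted(set(cleaned.split())))


def cluster_values(values):
    """Group values sharing a key; return only groups with several variants."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


def split_column(rows, column, separator, new_columns):
    """Replace one column with several new ones by splitting on a separator."""
    result = []
    for row in rows:
        # Split at most len(new_columns) - 1 times so extra separators
        # stay inside the last new column.
        parts = [p.strip() for p in row[column].split(separator, len(new_columns) - 1)]
        # Pad with empty strings when a value has fewer parts than expected.
        parts += [""] * (len(new_columns) - len(parts))
        new_row = {k: v for k, v in row.items() if k != column}
        new_row.update(zip(new_columns, parts))
        result.append(new_row)
    return result


if __name__ == "__main__":
    # Hypothetical sample data, not taken from any real dataset.
    companies = ["Acme Inc.", "acme inc", "Inc. Acme", "Widgets Ltd"]
    print(cluster_values(companies))

    people = [{"name": "Smith, John", "age": "30"}]
    print(split_column(people, "name", ",", ["last", "first"]))
```

In a real clean-up you would review each suggested group before merging it, since a simple key like this can also lump together values that are genuinely different.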
And since Google Refine is an open source project, developers can create their own extensions to make the tool even more powerful. That means more and more extensions should be available on the web for Google Refine users soon, so watch this space!
