Amazon opens their real NoSQL database, awesome potential for data mining and business intelligence
Amazon have turned on access to DynamoDB, a hosted database built on the ideas behind Dynamo, the internal system that runs much of Amazon's own infrastructure and provided the inspiration for Cassandra. All your data is hosted on SSDs (solid-state drives), which means it should be pretty damn fast.
In tech terms this is a key-value store with tuneable consistency and throughput. That means you can dial the resources (and cost) up or down depending on how much performance you need and how consistent your reads must be, i.e. whether or not a query is guaranteed to see the very latest data.
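As a rough sketch of what those two dials look like in practice, the helpers below build the arguments for a read and for a throughput change. This assumes you talk to the service through the boto3 Python client; the table name `page_hits` and the key attribute `page` are made up for illustration. The functions only build dictionaries, so nothing here touches AWS.

```python
def read_request(table, key, strong=False):
    """Build arguments for a GetItem call.

    strong=False asks for an eventually consistent read (cheaper, may lag);
    strong=True asks for a read that reflects the latest successful writes.
    """
    return {
        "TableName": table,
        "Key": key,
        "ConsistentRead": strong,
    }


def throughput_request(table, read_units, write_units):
    """Build arguments for an UpdateTable call that changes provisioned capacity."""
    return {
        "TableName": table,
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


# Hypothetical table and key, purely for illustration.
home_page = {"page": {"S": "/home"}}
cheap_read = read_request("page_hits", home_page)
latest_read = read_request("page_hits", home_page, strong=True)
scale_up = throughput_request("page_hits", read_units=50, write_units=100)
```

With boto3 installed and credentials configured, these would be passed along as `boto3.client("dynamodb").get_item(**latest_read)` and `boto3.client("dynamodb").update_table(**scale_up)`.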
This kind of thing is useful in applications where unknown and changing volumes of data turn up that you want to mine in real time, and where that data will eventually grow to enormous volumes that you still need to be able to store and mine quickly. Web analytics is a good example: you don't know how many people will be hitting your web site, so you start with the default "5 writes/second" and dial it up and back down when there are peaks of load. If you use something like NodeDB to do the data collection and inserting, any backlog of writes will just queue up until it gets through when there are momentary peaks in demand.
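To make the queueing behaviour concrete, here is a small simulation of a write buffer sitting in front of a store with fixed write capacity. It is a toy model, not how any particular collector works: the function name, the one-second ticks, and the traffic numbers are all invented for illustration.

```python
from collections import deque


def simulate_backlog(arrivals, writes_per_second):
    """Return the number of writes still queued at the end of each second.

    arrivals: how many new writes turn up in each one-second tick.
    writes_per_second: the provisioned write capacity (held fixed here).
    """
    queue = deque()
    backlog = []
    for tick, count in enumerate(arrivals):
        # New writes join the back of the queue.
        queue.extend((tick, i) for i in range(count))
        # The store accepts at most the provisioned number of writes per second.
        for _ in range(min(writes_per_second, len(queue))):
            queue.popleft()
        backlog.append(len(queue))
    return backlog


if __name__ == "__main__":
    # A short burst of traffic against the default 5 writes/second.
    print(simulate_backlog([3, 12, 9, 2, 0, 0], writes_per_second=5))
    # prints [0, 7, 11, 8, 3, 0]
```

The backlog builds during the burst and drains once traffic drops off. If it never drains, that is the signal to dial the write capacity up.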
Querying these key-value stores isn't as easy as querying relational databases, which have had over 40 years of development and can be queried very flexibly. The advantage key-value stores hold over relational databases is that they scale linearly. If a query on 100 records takes a second, a query on 100,000 records will also take a second, provided you throw 1000x the hardware at the database. Relational databases are very hard to scale this way once you grow past small clusters of machines. With relational database clusters, keeping each replica of the data consistent takes up an ever-increasing fraction of the resources, meaning the scaling curve is anything but linear.
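The arithmetic behind that claim can be sketched with a toy model. Everything here is an assumption made for illustration: the per-node rate of 100 records per second, and especially the `sync_cost` figure, which is an arbitrary stand-in for replica-consistency overhead and not a measurement of any real database.

```python
def kv_query_seconds(records, nodes, records_per_node_second=100):
    """Idealised key-value store: the work splits evenly across nodes."""
    return records / (nodes * records_per_node_second)


def replicated_query_seconds(records, nodes, records_per_node_second=100,
                             sync_cost=0.02):
    """Toy relational cluster: every extra node costs each node a slice of
    its capacity for keeping replicas consistent (floored at 5% useful work)."""
    useful_fraction = max(1.0 - sync_cost * (nodes - 1), 0.05)
    return records / (nodes * records_per_node_second * useful_fraction)


if __name__ == "__main__":
    for records, nodes in [(100, 1), (1_000, 10), (100_000, 1_000)]:
        kv = kv_query_seconds(records, nodes)
        rel = replicated_query_seconds(records, nodes)
        print(f"{records:>7} records on {nodes:>4} nodes: "
              f"key-value {kv:.2f}s, toy relational cluster {rel:.2f}s")
```

In the key-value model the query time stays at one second as data and hardware grow together by the same factor. In the toy relational model the time climbs as the cluster grows, because a growing share of each node goes to synchronisation rather than answering queries.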




