Do you know Riak? A decentralized, internet-scale database
Fast, Reliable, and Scalable
Many people probably know Cassandra, HBase, CouchDB, and MongoDB, but far fewer know Riak, a relatively new NoSQL database.
1. Expected minimum users: 1 million. Design to accommodate 10 million by the end of the year and have a plan for scaling out to tens of millions. (This is the 1x, 10x, 100x rule of estimation, of which I am a fan.)
2. Expected amount of data stored per experiment: 1.2 TB
3. Expected peak traffic: approximately 75 GB per hour for two 8-hour periods following the conclusion of an experiment window. This two-day period will result in the collection of approximately 90% of the total data.
4. Remain highly available under load.
5. Provide necessary validation and security constraints to prevent bad data from polluting the experiment or damaging the application.
6. Provide a flexible and easy-to-use way for data analysts to explore the data. While all of these guys are great with statistics and thinking about data, not all of them have a programming background, so higher-level APIs are a plus.
7. Do it fast.
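The sizing figures in the list above can be sanity-checked with a little arithmetic. The following is only a rough sketch, not part of the original requirements: it assumes decimal units (1 TB = 1000 GB, 1 GB = 1000 MB), and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the sizing requirements.
# Assumption (not stated in the requirements): decimal units,
# i.e. 1 TB = 1000 GB and 1 GB = 1000 MB.

GB_PER_TB = 1000
MB_PER_GB = 1000
SECONDS_PER_HOUR = 3600

DATA_PER_EXPERIMENT_TB = 1.2   # expected data stored per experiment
PEAK_GB_PER_HOUR = 75          # expected peak traffic
PEAK_HOURS = 2 * 8             # two 8-hour periods
PEAK_SHARE = 0.90              # about 90% of the data arrives during the peak


def peak_volume_gb():
    """Total data received during the two peak periods, in GB."""
    return PEAK_GB_PER_HOUR * PEAK_HOURS


def peak_throughput_mb_per_s():
    """Sustained write rate the store must absorb at peak, in MB/s."""
    return PEAK_GB_PER_HOUR * MB_PER_GB / SECONDS_PER_HOUR


def expected_peak_share_gb():
    """90% of the per-experiment data set, in GB."""
    return DATA_PER_EXPERIMENT_TB * GB_PER_TB * PEAK_SHARE


def user_targets(minimum_users):
    """1x, 10x, 100x rule: plan for the minimum, ten times, and a hundred times it."""
    return [minimum_users * factor for factor in (1, 10, 100)]


if __name__ == "__main__":
    print("peak volume (GB):", peak_volume_gb())
    print("peak throughput (MB/s):", round(peak_throughput_mb_per_s(), 1))
    print("90% of data set (GB):", expected_peak_share_gb())
    print("user targets:", user_targets(1_000_000))
```

Under these assumptions, 75 GB per hour over 16 hours comes to 1200 GB, which is in the same range as 90% of 1.2 TB (1080 GB), so the stated figures are roughly consistent with one another. The sustained peak rate works out to about 21 MB per second, before any replication overhead.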
After carefully comparing HBase, Cassandra, and Riak, they chose Riak. That’s the simple answer. There is also an open source application called Luwak, dedicated to reading and writing large blocks of data to Riak.