There are three characteristics that a database can be “goodest” at. The three are:
1) Consistency - Can every read against your DB be relied upon to show the most up-to-date write?
2) Availability - Does your DB reliably respond to every request, even when some nodes are down?
3) Partition Tolerance - Does your DB keep working when the nodes of a cluster lose contact with each other?
For reasons that I don’t have time to explain (i.e. I don’t know the proof), any one database can only be really good at 2 of the 3. This is known as the CAP theorem, and it makes DB selection very critical depending on the requirements of your project. It’s sort of like Heisenberg's uncertainty principle in quantum mechanics, where you can’t know both the position and momentum of a particle… ah never mind.
Since we’re talking about big data here, number 3 (Partition Tolerance) has to be a given: your data won’t fit on one machine, so nodes will inevitably lose contact with each other now and then. The choice is really only between the other two.
I find examples easier to grasp:
Facebook: Will the world come to an end if you don’t get to see your brother’s new baby pictures for 5 seconds after they’re posted? Probably not. So in this case one could give up a little on Consistency.
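That tradeoff is usually called eventual consistency. Here’s a toy sketch of the idea in Python (the replica names, the `write`/`replicate` helpers, and the whole setup are made up for illustration, not any real datastore’s API): the write lands on one replica right away, and a read from the other replica is briefly stale but still gets an answer.

```python
class Replica:
    """A toy replica: just a dict of key/value pairs."""
    def __init__(self):
        self.data = {}

    def read(self, key):
        # Always answers (available), even if the value is stale.
        return self.data.get(key)

def write(primary, key, value):
    # The write lands on one replica immediately...
    primary.data[key] = value
    return (key, value)

def replicate(update, replica):
    # ...and reaches the other replica only some time later.
    key, value = update
    replica.data[key] = value

us_east = Replica()
eu_west = Replica()

update = write(us_east, "baby_photos", ["photo1.jpg"])
print(eu_west.read("baby_photos"))   # None: a stale read, but the site stayed up
replicate(update, eu_west)
print(eu_west.read("baby_photos"))   # ['photo1.jpg']: consistent, eventually
```

For baby pictures, that few-second window of staleness is a perfectly fine price to pay for a site that never refuses to load.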
E*Trade: If you buy or sell a stock online, it’s pretty imperative that the trade goes through. In this case Availability would be critical, while Consistency (seeing the very latest quote the instant it changes) could take a bit less of a priority.
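To make the two ends of the dial concrete, here’s a contrived sketch (the `Node` class, quorum rule, and function names are all invented for illustration) of how an availability-first system and a consistency-first system each react to the same network partition:

```python
class Node:
    """A toy cluster node that is either reachable or partitioned away."""
    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable

def place_trade_ap(nodes, order):
    """Availability-first: accept on any reachable node, reconcile later."""
    up = [n for n in nodes if n.reachable]
    if not up:
        raise RuntimeError("no nodes reachable at all")
    return f"trade '{order}' accepted on {up[0].name}, will sync later"

def place_trade_cp(nodes, order):
    """Consistency-first: require a majority quorum or refuse the trade."""
    up = [n for n in nodes if n.reachable]
    if len(up) <= len(nodes) // 2:
        raise RuntimeError("partition: cannot reach quorum, rejecting trade")
    return f"trade '{order}' committed on {len(up)}/{len(nodes)} nodes"

# Two of three nodes have dropped off the network.
cluster = [Node("a"), Node("b", reachable=False), Node("c", reachable=False)]

print(place_trade_ap(cluster, "BUY 10 XYZ"))   # still takes your order
try:
    place_trade_cp(cluster, "BUY 10 XYZ")
except RuntimeError as err:
    print(err)   # would rather fail the request than risk disagreement
```

Real systems sit somewhere between these two extremes, and many let you tune the dial per operation rather than picking one side for the whole database.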
The choice is yours my pretties...