April 10, 2010
NoSQL is About…
To my fellow NoSQL-istas, present and future as well as everybody else who is interested:
Every time I read about somebody “having learned about NoSQL” at a conference or meet-up and taking away that “NoSQL is about X”, boy do I have a low opinion of the presenter (not pointing fingers here, I’m sure I’m just as guilty).
NoSQL is not about any one feature of any of the projects. NoSQL is not about scaling, NoSQL is not about performance, NoSQL is not about hating SQL, NoSQL is not about ease of use, NoSQL is not about sharding, NoSQL is not about throughput, NoSQL is not about speed, NoSQL is not about dropping ACID, NoSQL is not about Eventual Consistency, NoSQL is not about CAP, NoSQL is not about open standards, NoSQL is not about Open Source and NoSQL is most likely not about whatever else you want NoSQL to be about.
The NoSQL world is tiny with a great future ahead (yeah, yeah, I have a vested interest, but I wouldn’t be here if I didn’t believe in it). The worst we can do is confuse our users, people who have yet to learn about the benefits of NoSQL. We all need to find a consistent message and relentlessly promote this message as loudly as we can. If we don’t do that, NoSQL might as well be called off and we all can go home.
Let me repeat: We need to make sure people understand what NoSQL is about in the simplest terms possible.
And we’re doing a pretty bad job.
But, But…Bwahaahha, Mooommy, the Uncle Says I’m Doing a Bad Job!
All NoSQL projects have their respective features. Their developers will do best to promote their project by highlighting what’s awesome about the project (and boy are there awesome things going on). But every time an audience takes away that NoSQL is about any one of the features of a single project, we as a movement lose. Big time.
At this point Cassandra and HBase lead the pack with managing huge clusters, Riak is a great runner-up. In comparison, MongoDB’s and CouchDB’s capabilities pale. And that is okay, they have different priorities. MongoDB drops ACID for speed. The CouchDB and Cassandra people grow grey hair over such statements; on the other hand MongoDB seems crazy fast. CouchDB trumps everyone with peer-to-peer replication and a reliable simple storage model, and boy does it handle concurrency well. Scalaris keeps transactions in NoSQL-land; distributed transactions (you heard me). CAP says we’re all fucked (except Mike Stonebraker, apparently). Redis rocks the world with remote data structures and atomic operations over them plus being one of the kick-assest Open Source projects out there. Graph traversal? Hello Neo4j. CouchDB laughs at the rest for not being fully based on open web standards, and they laugh right back because HTTP has more overhead than traditional binary database protocols.
I could go on forever. All pro-points are the result of somebody choosing a trade-off. This is not about listing all trade-offs or listing all NoSQL projects or claiming anybody’s choices are better.
This brings me to the point about what NoSQL is. For me, NoSQL is about choice: the possibility to choose a storage system that fits your project’s needs instead of sticking to a one-size-fits-all solution that really isn’t one.
NoSQL is about choice.
I’m not saying this should be the message the NoSQL community agrees on, although this is my vote for it. Instead, I want to invite and encourage anyone with any stake in the NoSQL game to think about what NoSQL means for you and how we as a community can agree on a message that gets across what NoSQL is about without losing ourselves in turf wars. Offering choice is about demonstrating open-mindedness. We’ll never get there if we each confuse our respective users. We need to get a lot better at educating our potential users about our ideals.
Thanks, Jan