The term “NoSQL” is very ill-defined. It’s generally applied to a number of recent nonrelational databases such as Cassandra, Mongo, Neo4J, and Riak. They embrace schemaless data, run on clusters, and have the ability to trade off traditional consistency for other useful properties. Advocates of NoSQL databases claim that they can build systems that are more performant, scale much better, and are easier to program with. — NoSQL Distilled, by Martin Fowler
RethinkDB is built to store JSON documents, and scale to multiple machines with very little effort. It has a pleasant query language that supports really useful queries like table joins and group by, and is easy to setup and learn.
Some characteristics of this new database system:
- every document must contain only JSON data;
- every document can be manipulated on the filesystem: the idea is to allow someone to edit a document using vim, for instance, without breaking anything. It also lets you easily backup, move or replicate your data without creating a heavy load on the database;
- the underlying filesystem can be changed: changing the underlying filesystem offers numerous possibilities, like easy distribution (using OpenAFS, XTREEMFS or others), replication (using finefs or others), and even attaching different backends (using FUSE, for example);
- the system must be available through an HTTP REST interface: this makes it very easy to immediately integrate the system into any existing applications. Bonus points if the API is compatible with CouchDB;
- queries must be performed using a MapReduce approach: this should make it very easy to perform queries on a large dataset.
So, right now I’m half way through it. I managed to write the backend and I’m on my way to the HTTP REST interface. I’ll create a github repo after I have something functional that can be easily built and tested.
What do you think of a system like this? Any features you’d like to add or change?