
RethinkDB

Written by: Martin Fietkiewicz

RethinkDB is a free, open-source, scalable JSON database management system written in C++. It is designed for realtime web applications that require continuously updated query results.

RethinkDB uses a custom query language called ReQL, which offers a way of manipulating JSON documents and supports table joins, aggregation functions, and mixing queries with JavaScript expressions and map-reduce functions.

Potential use cases for RethinkDB:

  • Streaming analytics applications

  • Multiplayer games

  • Social networks

RethinkDB is NOT a good choice if: 

  • You need full ACID (Atomicity, Consistency, Isolation, Durability) support.

  • You’re doing deep, computationally intensive analytics.

RethinkDB data management GUI with ReQLPro

RethinkDB was founded in 2009. Its first product was an SSD-optimized storage engine for MySQL, which was later changed to a document DBMS similar to MongoDB and open-sourced in 2012. The first production-ready release of RethinkDB came in 2015 and supported the JSON data model, immediate consistency, sharding, and failover, among other features.

On October 5, 2016, the company behind RethinkDB announced it was shutting down and would no longer offer production support because it could not build a sustainable business. On February 6, 2017, the Cloud Native Computing Foundation purchased the rights to the source code and relicensed it under the Apache License 2.0.

Currently, RethinkDB is the second most popular database on GitHub, and it has lots of interest and support from the developer community.


RethinkDB Core Components

RethinkDB indexes every document added to a table by the table's primary key attribute, which defaults to id if no primary key is specified. If a document is inserted without a value for the primary key, a random unique ID is generated for it automatically. RethinkDB uses the primary key to place the document in the appropriate shard and to index it within that shard using a B-tree data structure. Fetching data by primary key is efficient because the query can be routed directly to the right shard, where the document is then looked up in the B-tree.

Note: Sharding is the process of breaking up large tables into smaller chunks.
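To make the routing idea concrete, here is a minimal sketch of range sharding on a primary key. It is a toy model, not RethinkDB's actual implementation: the class name ToyShardedTable, the split points, and the use of a plain dictionary in place of each shard's B-tree are all assumptions made for illustration.

```python
import bisect
import uuid


class ToyShardedTable:
    """Toy model of a table range-sharded by primary key (illustrative only)."""

    def __init__(self, split_points, primary_key="id"):
        # Sorted boundaries between shards; N split points give N + 1 shards.
        self.split_points = sorted(split_points)
        self.primary_key = primary_key
        # One dict per shard stands in for that shard's B-tree index.
        self.shards = [{} for _ in range(len(self.split_points) + 1)]

    def shard_for(self, key):
        # Binary search over the split points picks the shard for a key.
        return bisect.bisect_right(self.split_points, key)

    def insert(self, doc):
        doc = dict(doc)
        # Generate a random unique ID when no primary key value is supplied.
        if self.primary_key not in doc:
            doc[self.primary_key] = str(uuid.uuid4())
        key = doc[self.primary_key]
        self.shards[self.shard_for(key)][key] = doc
        return key

    def get(self, key):
        # A primary-key lookup touches exactly one shard.
        return self.shards[self.shard_for(key)].get(key)


table = ToyShardedTable(split_points=["h", "p"])
table.insert({"id": "mallory", "email": "mallory@example.com"})
generated_id = table.insert({"email": "anonymous@example.com"})
```

A lookup such as table.get("mallory") only consults the shard covering the range between "h" and "p", which is why fetching by primary key avoids scanning the other shards.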


RethinkDB Client Drivers

The RethinkDB client drivers are responsible for:

  • Opening a connection.

  • Performing a handshake.

  • Serializing the queries.

  • Sending the message to the server using the ReQL protocol.

  • Receiving the response and returning it to the calling application.

RethinkDB has client drivers for several languages. Some are maintained by the RethinkDB team, and the others are supported by the community.
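The serialize-and-send steps in the list above can be sketched as follows. This is a simplified illustration and not the actual ReQL wire format: the helper names frame_query and unframe_message, the 8-byte token plus 4-byte length header, and the shape of the example query are assumptions chosen to show how a driver might frame a JSON-serialized query so the server can match a response to its request.

```python
import json
import struct

# Assumed header layout: 8-byte query token, then 4-byte payload length,
# both little-endian with no padding (12 bytes in total).
HEADER = struct.Struct("<QI")


def frame_query(token, query):
    """Serialize a query to JSON and prefix it with a token and length."""
    payload = json.dumps(query, separators=(",", ":")).encode("utf-8")
    return HEADER.pack(token, len(payload)) + payload


def unframe_message(data):
    """Split a framed message back into its token and decoded JSON body."""
    token, length = HEADER.unpack_from(data, 0)
    body = data[HEADER.size:HEADER.size + length]
    return token, json.loads(body.decode("utf-8"))


# Hypothetical query shape, used only to exercise the framing helpers.
message = frame_query(1, ["START", {"table": "employees"}])
token, query = unframe_message(message)
```

The token lets a driver keep several queries in flight on one connection and pair each response with the query that produced it.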


RethinkDB Query Language

The RethinkDB query language (ReQL) offers a way of manipulating JSON documents. It was built on three principles:

  • It embeds into your programming language.

  • It is chainable.

  • It executes on the server.

Example of a chainable query:

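from rethinkdb import RethinkDB

r = RethinkDB()
# Connect to a local server on the default client driver port.
conn = r.connect(host='localhost', port=28015)
# Each command chains onto the previous one; nothing is sent to the server until run() is called.
r.table('employees').pluck('email').distinct().count().run(conn)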
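To illustrate what "chainable" and "executes on the server" mean in practice, here is a minimal sketch that needs no running database. ToyQuery and ToyServer are hypothetical stand-ins, not part of the RethinkDB driver: each chained call only records a step, and the recorded steps are evaluated in one go when run() hands them to the server object.

```python
class ToyQuery:
    """Hypothetical stand-in for a ReQL term: records steps, runs them later."""

    def __init__(self, steps=()):
        self.steps = tuple(steps)

    def _chain(self, name, *args):
        # Return a new query with one more step, leaving this one unchanged.
        return ToyQuery(self.steps + ((name, args),))

    def table(self, name):
        return self._chain("table", name)

    def pluck(self, *fields):
        return self._chain("pluck", *fields)

    def distinct(self):
        return self._chain("distinct")

    def count(self):
        return self._chain("count")

    def run(self, server):
        # Nothing is evaluated until the whole chain reaches the server.
        return server.execute(self.steps)


class ToyServer:
    """Hypothetical in-memory server that evaluates the recorded steps."""

    def __init__(self, tables):
        self.tables = tables

    def execute(self, steps):
        result = None
        for name, args in steps:
            if name == "table":
                result = list(self.tables[args[0]])
            elif name == "pluck":
                result = [{f: d[f] for f in args if f in d} for d in result]
            elif name == "distinct":
                unique = []
                for item in result:
                    if item not in unique:
                        unique.append(item)
                result = unique
            elif name == "count":
                result = len(result)
        return result


server = ToyServer({"employees": [
    {"id": 1, "email": "ann@example.com"},
    {"id": 2, "email": "bob@example.com"},
    {"id": 3, "email": "ann@example.com"},
]})
query = ToyQuery().table("employees").pluck("email").distinct().count()
unique_emails = query.run(server)
```

Because the chain is just data until run() is called, the client can build it with ordinary language constructs while all of the actual work happens next to the data.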

Concurrency Control

RethinkDB uses block-level multi-version concurrency control (MVCC). If a write operation arrives while a read operation is in progress, RethinkDB takes a snapshot of the B-tree for each relevant shard and temporarily maintains different versions of the affected blocks, so that read and write operations can execute concurrently.
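The snapshot idea can be sketched with a toy block store. This is an illustration under simplifying assumptions, not RethinkDB's storage engine: ToyBlockStore is a hypothetical name, blocks are immutable tuples, and a snapshot is simply a copy of the block map that keeps pointing at the block versions that existed when it was taken.

```python
class ToyBlockStore:
    """Toy block-level versioning: writers swap blocks, readers keep snapshots."""

    def __init__(self, blocks):
        # Maps block id -> immutable block contents.
        self._blocks = dict(blocks)

    def snapshot(self):
        # Copy only the map; the block contents themselves are shared.
        return dict(self._blocks)

    def write(self, block_id, contents):
        # Install a new version of one block. Earlier snapshots still
        # reference the old version, so in-flight reads are unaffected.
        self._blocks[block_id] = contents

    def read(self, block_id):
        return self._blocks[block_id]


store = ToyBlockStore({1: ("alice",), 2: ("bob",)})
reader_view = store.snapshot()        # a long-running read starts here
store.write(1, ("alice", "carol"))    # a write arrives while the read is running
old_version = reader_view[1]          # the reader still sees the old block
new_version = store.read(1)           # new reads see the updated block
```

Only the block that was written exists in two versions; untouched blocks are shared between the snapshot and the live store, which is what keeps this kind of snapshot cheap.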

RethinkDB query execution flow

