The latest release of CockroachDB from Cockroach Labs focuses on making it simpler for developers to implement and manage distributed databases. It provides new SQL syntax, which can be entered on the command line, for controlling the latency and availability of data in multi-region deployments. This is the 21.1 release, and it is being announced for general availability today. The company issues two major releases a year, in spring and fall.
In January this year, the company discussed how it would use its USD 160 million Series E funding round. The new release follows up on a promise that CEO Spencer Kimball made to Big Data Pro Andrew Brust in a post at the time.
There are many approaches to relational transactional databases in the cloud, but most rely on a single write master. In the last few years, however, new entrants have arrived, such as Yugabyte and, more recently, TiDB.
The backstory is that distributed transactional databases are inherently complex. Committing a transaction is straightforward when there is a single master. Global reads and writes, by contrast, require extraordinary measures to maintain ACID guarantees. On top of that, the data has to be laid out according to the expected usage patterns.
For instance, if a database is deployed across two or more world regions, where do you persist the data? Do you segregate the data by region and keep local data there, or do you replicate all records worldwide? There is a case for either approach, depending on the usage. If most of the writes or updates are confined to data linked to specific regions, then for performance that data should be partitioned and kept in its own region.
The same applies if countries in a region require data to reside within their national borders. In the remaining cases, where read and write patterns are expected to be global, data should be replicated to some or all global regions.
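The rule of thumb above can be sketched as a small decision function. This is only an illustration of the reasoning, not anything CockroachDB provides: the `TableUsage` type, its fields, and the 0.8 threshold are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class TableUsage:
    """Hypothetical description of how a table is expected to be used."""
    name: str
    regional_write_fraction: float  # share of writes confined to one region (0.0 to 1.0)
    residency_required: bool        # True if data must stay within national borders


def choose_layout(usage: TableUsage, threshold: float = 0.8) -> str:
    """Return 'partition' or 'replicate' following the rule of thumb in the text.

    The threshold is an arbitrary illustrative cut-off, not a recommended value.
    """
    # Data residency rules force the data to stay in its own region.
    if usage.residency_required:
        return "partition"
    # Mostly region-local writes: keep the data close to where it is written.
    if usage.regional_write_fraction >= threshold:
        return "partition"
    # Otherwise reads and writes are global, so replicate across regions.
    return "replicate"
```

For example, a table whose writes are 95 percent region-local would be partitioned, while a product catalog that is read and written from everywhere would be replicated.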
In the new release, developers and DBAs (database administrators) can handle this with SQL statements. The first specifies the clusters and regions in which the database is to operate. A second SQL command then identifies the region(s) the database will work in and where the data resides geographically.
Finally, the developer specifies survivability, which designates regions and clusters for disaster recovery. With these new features, developers who know SQL can lay out a CockroachDB deployment without having to learn specialized configuration statements.
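The article does not show the statements themselves, so the following is a minimal sketch of what the three steps (regions, data locality, survivability) might look like. It is a Python helper that only builds SQL strings and does not connect to a database. The statement forms reflect the multi-region syntax as commonly shown for CockroachDB 21.1 and should be checked against the official documentation before use. The database name, table name, and region names are illustrative assumptions, and the helper functions are hypothetical.

```python
from typing import List

# Locality clauses this sketch knows about, mapped to the text it emits.
_LOCALITIES = {
    "GLOBAL": "GLOBAL",
    "REGIONAL BY ROW": "REGIONAL BY ROW",
    "REGIONAL BY TABLE": "REGIONAL BY TABLE IN PRIMARY REGION",
}


def multi_region_statements(database: str, primary_region: str,
                            other_regions: List[str],
                            survive_region_failure: bool = False) -> List[str]:
    """Build the region and survivability statements for one database."""
    # Assumption: surviving a whole-region failure needs at least three regions.
    if survive_region_failure and 1 + len(other_regions) < 3:
        raise ValueError("region survivability needs at least three regions")
    stmts = [f'ALTER DATABASE {database} PRIMARY REGION "{primary_region}";']
    for region in other_regions:
        stmts.append(f'ALTER DATABASE {database} ADD REGION "{region}";')
    goal = "REGION" if survive_region_failure else "ZONE"
    stmts.append(f"ALTER DATABASE {database} SURVIVE {goal} FAILURE;")
    return stmts


def table_locality_statement(table: str, locality: str) -> str:
    """Build the statement that says where a table's data should live."""
    key = locality.strip().upper()
    if key not in _LOCALITIES:
        raise ValueError(f"unknown locality: {locality!r}")
    return f"ALTER TABLE {table} SET LOCALITY {_LOCALITIES[key]};"


if __name__ == "__main__":
    # Illustrative names only.
    for stmt in multi_region_statements("movr", "us-east1",
                                        ["us-west1", "europe-west1"],
                                        survive_region_failure=True):
        print(stmt)
    print(table_locality_statement("users", "REGIONAL BY ROW"))
```

The generated statements could then be pasted into the SQL shell. Building them as plain strings is acceptable for a sketch with trusted inputs, but identifiers from untrusted sources would need proper quoting.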
Another developer-friendly feature is the adoption of standard JSON formats for CockroachDB log data, so that it can be easily ingested into observability tools such as Datadog or New Relic. The new release also simplifies query optimization and debugging with EXPLAIN statements generated by CockroachDB’s query optimizer, which developers can use to fix or tweak queries.
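To illustrate why JSON-formatted logs are easy to ingest, here is a minimal sketch that reads newline-delimited JSON entries and counts them by severity. The field names (`timestamp`, `severity`, `message`) and the sample lines are assumptions made for the example, not the documented CockroachDB log schema.

```python
import json
from collections import Counter

# Hypothetical sample lines; field names are assumed for illustration only.
SAMPLE_LOG = """\
{"timestamp": "1620000000.0", "severity": "INFO", "message": "node started"}
{"timestamp": "1620000001.0", "severity": "WARNING", "message": "slow query"}
this line is not JSON
{"timestamp": "1620000002.0", "severity": "INFO", "message": "range split"}
"""


def count_by_severity(log_text: str) -> Counter:
    """Count log entries per severity, skipping lines that are not JSON objects."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not valid JSON
        if not isinstance(entry, dict):
            continue  # ignore JSON values that are not objects
        counts[entry.get("severity", "UNKNOWN")] += 1
    return counts
```

Because each entry is a self-describing JSON object, the same loop works regardless of which extra fields an entry carries, which is what makes the format convenient for observability pipelines.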
Google Cloud Spanner came before CockroachDB. Until then, distributed databases were mainly the domain of NoSQL key-value stores, because of their more relaxed requirements for database consistency.
The acceleration of digital business during the pandemic has led to an increase in new apps and use cases that demand efficient global reads and writes. The new features in CockroachDB 21.1 are a step toward lowering the complexity barriers for developers.
The next move to watch for is Cockroach Labs bringing these new SQL capabilities for governing latency and availability into visual low-code/no-code tools.