
CockroachDB database wagered on for scalability boost, GDPR

CockroachDB is an open source distributed database designed for processing on a global scale, a need that online gambling company Kindred Group hopes it can meet.

Fans of the simple life found relief of late as a once torrential stream of new distributed databases seemed to slow. But the spigot is still on. As new requirements percolate, additional databases continue to bubble up. Those requirements include global transaction support, compliance with the General Data Protection Regulation and more.

Take CockroachDB from Cockroach Labs as an example. Early implementer Kindred Group plc has been testing out this whimsically named distributed SQL database system -- open source software fashioned on the model of Google's Spanner technology -- as it seeks to achieve the high levels of scalability required for an online gambling business that spans the planet.

Like other companies, Kindred is pursuing a global business strategy. That can mean, for example, taking bets in real time on sundry wagers made by punters in Australia who are watching tennis matches in France. The need to support that distant user has implications for cloud architecture and database choices.

"If you are betting on the next point in a tennis match, it's not possible to run that from a single data center," said Will Mace, head of U.K.-based Kindred Futures, an innovation unit within Kindred Group. Instead, he said, you tend to move the database closer to the user.
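Mace's point about distance can be made concrete with some back-of-the-envelope arithmetic. The sketch below is only an illustration: the city coordinates, the roughly 200,000 km/s signal speed in optical fiber and the function names are assumptions, not figures from Kindred. It estimates the best-case round trip for a request traveling between Sydney and Paris over a perfectly straight fiber path.

```python
import math

# Assumed figures, not from the article: approximate city coordinates and
# the commonly cited ~200,000 km/s signal speed in optical fiber.
FIBER_KM_PER_S = 200_000.0
SYDNEY = (-33.8688, 151.2093)
PARIS = (48.8566, 2.3522)


def great_circle_km(a, b, radius_km=6371.0):
    """Haversine distance between two (lat, lon) points, in kilometers."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


def min_round_trip_ms(a, b):
    """Lower bound on round-trip time over a straight fiber path, in ms."""
    return 2 * great_circle_km(a, b) / FIBER_KM_PER_S * 1000.0


print(f"{great_circle_km(SYDNEY, PARIS):.0f} km, "
      f"at least {min_round_trip_ms(SYDNEY, PARIS):.0f} ms per round trip")
```

Under these assumptions the floor is on the order of 170 milliseconds per round trip, before any routing detours, queuing or database work, and a transaction that needs several round trips multiplies that cost.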

When Mace came on board four years ago, Kindred was reviewing its technology platform to see how it fit in with the company's global growth plans. The study confirmed the need for new approaches to data handling.

"The business was based out of a single data center," he said. "It became clear it was not going to support the commercial targets."

Mace and his team were aware of globe-spanning Google Spanner, one of a new breed of distributed SQL database systems grouped together under the NewSQL technology umbrella. Initially described by Google in a 2012 research paper, Spanner was built to support large-scale online transaction processing, SQL and transactional consistency across multiple cloud availability regions.

High-speed data engine swap

Google's database seemed like a means to move transactions closer to the users, wherever they might be. But Spanner wasn't commercially available when Mace's team set out to replace an existing relational database, which he declined to identify. Moreover, Spanner's use is limited to the Google Cloud, where it's offered as the Google Cloud Spanner service that was launched in May 2017.


So, Kindred began considering CockroachDB instead. Cockroach Labs built its database with some similarities to Spanner, open sourced it and has gone about supporting multiple cloud and on-premises implementations.

"It allows us to run our services and optimize the performance for our customers wherever they are in the world," Mace said. "It's not useful having transactions run around the world and back again. It's better to put the data center closer to the customer while still being able to replicate the data across different data centers, and keep the database in sync in the different places."

Mace said his team has been working very closely with the Cockroach Labs crew in something of a partnership to roll out a transactional system that can handle online bets. But the move to a new operational database is a delicate one.

"It's like swapping out the engine of a car while it's going 100 miles per hour," he said, adding that the effort is currently in a test environment.

GDPR, data and jurisdictions

CockroachDB has been in production for over a year, according to Cockroach Labs CEO Spencer Kimball. He described it as a cloud-native technology, and said CockroachDB's development team views the distributed SQL database as an elastically expandable system with a shared-nothing, symmetric architecture that can "heal itself autonomously."


According to Kimball, CockroachDB takes its moniker from the irrepressible members of the phylum Arthropoda that are often found in urban places like New York City, which is Cockroach Labs' home base. The gritty species' enduring resilience is seen as a plus -- thus, the name.

CockroachDB's dialect of SQL is based on the one in PostgreSQL, a widely used open source relational database management system. In addition, CockroachDB uses the PostgreSQL wire protocol to manage communication between front-end clients and back-end servers. That lets developers use PostgreSQL client drivers to connect applications to the distributed database, according to Cockroach Labs. The company said it chose PostgreSQL's protocol over the protocol built into MySQL, another prominent open source database technology that's now owned by Oracle, because the PostgreSQL one is better documented and available under a more liberal open source license.
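In practice, wire-protocol compatibility means an application can point an ordinary PostgreSQL driver at a CockroachDB node. The following is a minimal sketch, not Cockroach Labs' documented setup: the host name, user name and `cockroach_dsn` helper are hypothetical, and the port and database defaults are assumptions based on CockroachDB's usual out-of-the-box settings. It only builds the connection URI that such a driver would accept.

```python
from urllib.parse import quote, urlencode, urlunsplit


def cockroach_dsn(host, user, database="defaultdb", port=26257, sslmode="require"):
    """Build a PostgreSQL-style connection URI for a CockroachDB node.

    Hypothetical helper: port 26257 and the "defaultdb" database are
    assumed defaults, and sslmode is passed through unchanged.
    """
    netloc = f"{quote(user, safe='')}@{host}:{port}"
    query = urlencode({"sslmode": sslmode})
    return urlunsplit(("postgresql", netloc, f"/{quote(database, safe='')}", query, ""))


# Hypothetical host and user names, for illustration only.
dsn = cockroach_dsn("db.example.internal", "bets_app")
print(dsn)

# With a PostgreSQL driver installed (for example psycopg2, not imported here),
# the same URI would be handed to the driver as usual:
#   conn = psycopg2.connect(dsn)
```

The application code stays the same as it would be for a PostgreSQL server; only the address in the URI changes.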

Kimball said geo-replication, which reduces latency by letting database architects keep data closer to users, was a key feature of CockroachDB 1.0. The recently released CockroachDB 2.0 offers geo-partitioning, which enables developers to create policies that assign the location of a user's data. It also aligns nicely with one of the hot topics of the day: General Data Protection Regulation (GDPR) compliance requirements in the European Union (EU).

Geo-partitioning of a distributed database system has benefits in this regard, according to Kindred's Mace. It's important as the EU starts to enforce the GDPR requirements that will affect the collection and processing of personal data not only in Europe, but also throughout much of the world.

"We have a desire to keep data in the jurisdiction within which it is given to us," Mace said. With the updated version of CockroachDB, he added, Kindred is able to apply a particular location's specific data restrictions to the relevant data.
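The policy idea behind geo-partitioning can be pictured as a lookup from a row's jurisdiction to the place where that row should be stored. The sketch below is a conceptual illustration in plain Python, not CockroachDB syntax or Kindred's configuration; the region names, country list and function names are all hypothetical.

```python
# Hypothetical policy table: country code -> region where the data must live.
# Region names and country assignments are illustrative only.
REGION_BY_COUNTRY = {
    "DE": "eu-west",
    "FR": "eu-west",
    "SE": "eu-west",
    "AU": "ap-southeast",
    "US": "us-east",
}
DEFAULT_REGION = "us-east"  # assumed fallback for unlisted countries


def region_for(country_code):
    """Return the region that should hold data for the given country."""
    return REGION_BY_COUNTRY.get(country_code.upper(), DEFAULT_REGION)


def partition(users):
    """Group user IDs by the region their country maps to."""
    placement = {}
    for user in users:
        placement.setdefault(region_for(user["country"]), []).append(user["id"])
    return placement


users = [
    {"id": 1, "country": "fr"},
    {"id": 2, "country": "AU"},
    {"id": 3, "country": "SE"},
]
print(partition(users))
```

In a geo-partitioned database, the equivalent rule is declared once as a policy and the system places the rows itself, rather than the application routing each record by hand.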
