ScyllaDB set to improve NoSQL database performance

ScyllaDB has been helpful for big organizations such as Comcast, which was able to trim from nearly 1,000 database nodes to 78 ScyllaDB nodes and now has more room to grow.

Database performance is one of the reasons some large organizations choose the open source ScyllaDB database platform. The startup database vendor introduced new features to accelerate performance and optimize the open source database platform.

While ScyllaDB develops its own technology, one of its primary use cases is as a drop-in replacement for the open source Apache Cassandra NoSQL database, which is used in large-scale data deployments.

Since its inception in 2015, ScyllaDB has been offering the promise of better performance at scale, while remaining compatible with Cassandra. The need for improved performance is particularly important as organizations scale out data but want to do it without adding more server or cloud resources. ScyllaDB is also riding the wave of increasing open source database use.

The vendor revealed the new features Nov. 5 at its Scylla Summit 2019 conference in San Francisco.

One of ScyllaDB's users is Comcast, which has used the NoSQL database to replace existing Cassandra deployments, with some dramatic efficiency gains.

Philip Zimich, senior director of development and engineering at Comcast, said his group went from having about 1,000 Cassandra servers to only 78 ScyllaDB servers, while improving overall availability and performance.

"We evaluated a lot of databases over the last few years," Zimich said. "What we found is that ScyllaDB is the best fit for our real-time operations."

Zimich's team at Comcast is responsible for the X2 DVR scheduler for recording media content, supporting 15 million accounts across the Comcast X1 network.

When a user wants to watch a recording, the listing for everything the user has recorded is saved in the cloud with all its associated metadata. Zimich's team is responsible for maintaining that data and serving it to users as needed. The platform processes 2.4 billion transactions per day, so every incremental performance gain matters to Comcast and its users.
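As a rough illustration of that scale, the figures quoted in this article can be turned into an average request rate and a node-count reduction. This is a back-of-the-envelope sketch only: it assumes load is spread evenly over the day (real traffic has peaks), and it treats "about 1,000" Cassandra servers as exactly 1,000.

```python
# Back-of-the-envelope arithmetic using only figures quoted in the article.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_per_second(transactions_per_day: float) -> float:
    """Average rate, assuming load is spread evenly across the day."""
    return transactions_per_day / SECONDS_PER_DAY


def reduction_fraction(before: int, after: int) -> float:
    """Fraction of nodes removed when going from `before` to `after`."""
    return 1 - after / before


daily_transactions = 2_400_000_000          # 2.4 billion per day
avg_tps = average_per_second(daily_transactions)
node_cut = reduction_fraction(1_000, 78)    # "about 1,000" taken as 1,000

print(f"average load: {avg_tps:,.0f} transactions/second")
print(f"node count reduction: {node_cut:.1%}")
```

On those assumptions, the average works out to roughly 27,800 transactions per second, served by about 92% fewer nodes than before the migration.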

"The faster I can reduce the process time, the more snappy the UI feels to the end user," he said.

ScyllaDB incremental compaction

Among the new features ScyllaDB unveiled at Scylla Summit 2019 is Incremental Compaction Strategy, which reduces storage requirements. The capability could be useful to Comcast in the future, but Zimich said it's not something he needs in the short term.

With the Cassandra deployment, the nodes were at full capacity, to the point that the media company would have needed to add many more nodes in the coming year to support its growing user base, Zimich said.

"With this migration, we have headroom for years," he said.

[Image: Interface for Scylla Cloud, from ScyllaDB]

New ScyllaDB features

In addition to the compaction feature, ScyllaDB introduced new Lightweight Transactions (LWT) capabilities. LWT can help support privacy and security, but its use cases are much broader than that, ScyllaDB CTO and co-founder Avi Kivity said.

"LWT essentially ensures that data gets recorded simultaneously on all nodes of a Scylla cluster," Kivity said. "This ensures that any application querying the database receives the latest copy of the data, no matter which node it hits."

Without LWT, or something equivalent, different nodes in a cluster may temporarily disagree about the value of a particular record while updates propagate through the system. Kivity noted that LWT provides a stronger guarantee that applications querying the database will always receive the latest information.
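In Cassandra-compatible CQL, lightweight transactions are written as conditional statements, such as `INSERT ... IF NOT EXISTS` or `UPDATE ... IF`, and the result reports whether the write was applied. The following is a minimal sketch of that compare-and-set contract only, modeled in a single Python process. It does not show how a cluster reaches agreement across nodes, and `ToyConditionalStore` and the key names are hypothetical, not part of any ScyllaDB API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ToyConditionalStore:
    """Single-process stand-in for a table that accepts conditional writes."""

    rows: Dict[str, Any] = field(default_factory=dict)

    def insert_if_not_exists(self, key: str, value: Any) -> bool:
        # Same shape as `INSERT ... IF NOT EXISTS`: write only if no row exists.
        if key in self.rows:
            return False
        self.rows[key] = value
        return True

    def update_if(self, key: str, expected: Any, new_value: Any) -> bool:
        # Same shape as `UPDATE ... IF col = expected`: write only if the
        # current value still matches what the caller last saw.
        if key not in self.rows or self.rows[key] != expected:
            return False
        self.rows[key] = new_value
        return True


store = ToyConditionalStore()
first = store.insert_if_not_exists("recording:42", "scheduled")
second = store.insert_if_not_exists("recording:42", "scheduled")
swapped = store.update_if("recording:42", expected="scheduled", new_value="recording")
stale = store.update_if("recording:42", expected="scheduled", new_value="cancelled")

print(first, second, swapped, stale)  # True False True False
```

The second insert and the final update are rejected because their conditions no longer hold, which is the behavior that keeps two writers from silently overwriting each other.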

Another new feature coming to ScyllaDB is Change Data Capture (CDC), which makes it easier for applications to record changes to the database and for streaming technologies such as Kafka to access those changes. CDC records changes within the database itself, in a standard table readable with Cassandra Query Language (CQL), which consumers such as Kafka can subscribe to using standard CQL queries.

"Without CDC, an application would have to write an update twice: once to Kafka and once to the database," Kivity said.
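The idea Kivity describes can be sketched as a table that appends every write to a change log, which a downstream consumer then reads from its last position. This is an illustrative toy under stated assumptions: `ToyTableWithChangeLog`, its fields, and the log layout are hypothetical and do not reflect ScyllaDB's actual CDC table schema or any Kafka connector.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

# A log entry here is (sequence number, key, new value); a real CDC log
# would carry more detail. This layout is an assumption for illustration.
ChangeEntry = Tuple[int, str, Any]


@dataclass
class ToyTableWithChangeLog:
    rows: Dict[str, Any] = field(default_factory=dict)
    change_log: List[ChangeEntry] = field(default_factory=list)

    def write(self, key: str, value: Any) -> None:
        # One application write updates the data and records the change,
        # so the application does not have to write to two systems.
        self.rows[key] = value
        self.change_log.append((len(self.change_log), key, value))

    def changes_since(self, position: int) -> List[ChangeEntry]:
        # A consumer polls for entries it has not yet seen.
        return self.change_log[position:]


table = ToyTableWithChangeLog()
table.write("account:1", "recording A")
table.write("account:2", "recording B")

position = 0
batch = table.changes_since(position)       # consumer's first read
position += len(batch)

table.write("account:1", "recording C")
next_batch = table.changes_since(position)  # only the new change

print(len(batch), next_batch)
```

The application calls `write` once per update, and the consumer tracks its own position in the log, which is the dual-write burden the quote says CDC removes.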

ScyllaDB provides multiple versions of its software, including open source, enterprise (on-premises), and cloud editions.

The Incremental Compaction feature will be available only in Scylla Enterprise and Scylla Cloud, but LWT and CDC will be released first in Scylla Open Source, then later in Scylla Enterprise.

"Lightweight Transactions, Incremental Compaction and CDC are already committed to the main branch of our codebase, and once they're ready, we'll provide more detail on the specific release versions they'll appear in," Kivity said.
