Apache Cassandra 4.0 advances open source database

Among the new features in the open source Cassandra database update are virtual tables that help operators improve performance and optimize operations.

After three years of development, the open source Apache Cassandra 4.0 database is now generally available.

Apache Cassandra is a distributed NoSQL database that was originally developed at Facebook. In recent years, the open source technology has received support from multiple vendors that have also built commercial services for the database.

Cloud service providers that have Apache Cassandra database-as-a-service offerings include AWS and Microsoft, the latter of which entered the Cassandra segment in March 2021. Beyond the big public cloud providers, DataStax provides a database-as-a-service that is based on Cassandra. Other vendors that provide Cassandra as a service include Aiven, which raised $100 million in March, and Instaclustr.

Among the new features in Cassandra 4.0, unveiled July 26, are virtual tables: database tables that are defined by an API rather than by data stored in a regular database table.

One of the applications for virtual tables is to show configuration information or database metrics.
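
As a rough illustration of that use, the sketch below reads node settings from the `system_views.settings` virtual table that ships with Cassandra 4.0. The helper name `read_node_settings` and its `prefix` parameter are assumptions made for this example, not part of Cassandra or its drivers. The `session` argument is assumed to be any object with an `execute` method that returns rows with `name` and `value` attributes, such as a session from the DataStax Python driver.

```python
from typing import Any, Dict, Optional

# Virtual table in Cassandra 4.0 that exposes the node's configuration settings.
SETTINGS_QUERY = "SELECT name, value FROM system_views.settings"


def read_node_settings(session: Any, prefix: Optional[str] = None) -> Dict[str, str]:
    """Return the settings of the node that answers the query.

    Hypothetical helper: `session` only needs an `execute(query)` method
    whose rows expose `name` and `value` attributes.
    """
    rows = session.execute(SETTINGS_QUERY)
    settings = {row.name: row.value for row in rows}
    if prefix is not None:
        # Filter on the client side to keep the CQL statement simple.
        settings = {k: v for k, v in settings.items() if k.startswith(prefix)}
    return settings


# Possible usage with the DataStax Python driver (requires a running 4.0 node):
#   from cassandra.cluster import Cluster
#   session = Cluster(["127.0.0.1"]).connect()
#   print(read_node_settings(session, prefix="compaction"))
```

Because virtual tables are local to each node, the result describes only the node that served the query, not the whole cluster.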

Full Query Logging (FQL) is another new feature; it gives Cassandra database administrators a live log of Cassandra Query Language (CQL) requests. A key benefit of FQL is performance management, because the log helps administrators tune the database.
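
In practice, FQL is switched on per node with `nodetool enablefullquerylog`, and the resulting log can be read with `fqltool dump`. The sketch below only assembles those two command lines; the helper names and the log directory in the usage comment are assumptions for illustration, and the commands would have to be run on a host where the Cassandra 4.0 tools are installed.

```python
from typing import List


def enable_fql_command(log_dir: str) -> List[str]:
    """Build the nodetool command that starts Full Query Logging on a node.

    Hypothetical helper: `log_dir` is the directory where the node writes its log.
    """
    return ["nodetool", "enablefullquerylog", "--path", log_dir]


def dump_fql_command(log_dir: str) -> List[str]:
    """Build the fqltool command that prints the logged queries in readable form."""
    return ["fqltool", "dump", log_dir]


# Possible usage (the directory is an assumption, not a Cassandra default):
#   import subprocess
#   subprocess.run(enable_fql_command("/var/log/cassandra/fql"), check=True)
#   subprocess.run(dump_fql_command("/var/log/cassandra/fql"), check=True)
```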

Cassandra 4.0 with real-world users

Real estate valuation company Clear Capital, based in Sacramento, Calif., currently uses Cassandra managed by Instaclustr as a critical data layer component for its property valuation platform.

Larry Robinson, CTO at Clear Capital, explained that the company's platform is a data-intensive system, storing about 2 billion valuations in Cassandra that must be indexed and analyzed on demand. A single new financial services customer might mean 100,000 more loans in a month, and that comes with a high volume of data that needs to be constantly queried.

"Cassandra, in its pure open source version, is plenty capable of achieving our database performance goals," Robinson said. "We've found we have the scalability and performance needed to reliably deliver analytical insights about the real estate market to our customers and continue to scale with confidence."

Looking specifically at Cassandra 4.0, Robinson said he is particularly keen on trying out virtual tables. He noted that instead of requiring Java Management Extensions to access settings and metrics in Cassandra, virtual tables in the 4.0 update enable settings and metrics to be queried directly, which improves management and database operations.

Apache Cassandra 4.0 features improve security

The virtual tables feature is also a highlight of the Cassandra 4.0 update for Stefan Miklosovic, a code committer to the Cassandra project and a senior software engineer at Instaclustr. Miklosovic said he expects virtual tables to be popular because they are specific to each database node and provide better visibility into the runtime internals of that particular node.

[Screenshot of Cassandra 4.0 virtual tables. Caption: Admins can use the virtual tables feature in Cassandra 4.0 to create a table for database configuration settings.]

Miklosovic said he is also interested in the FQL and Audit Logging capabilities in Apache Cassandra 4.0, as they bring new visibility to database operations that can also be important for security.

"Auditing and Full Query Logging are crucial and inseparable features to have in order to comply with security auditing and security policies for a lot of organizations," Miklosovic said.
