FaunaDB distributed cloud database hunts transactional NoSQL
Startup Fauna has built a "relational NoSQL" database for distributed transaction processing. Data consistency and speedy builds attracted early user ShiftX.
As cloud platforms leader AWS pushes deeper into distributed cloud databases through added support for transaction processing workloads in DynamoDB, its NoSQL software, smaller vendors such as Fauna are also vigorously pursuing transactional NoSQL.
Led by a group that includes ex-Twitter engineers, Fauna has produced what it calls a "relational NoSQL database" -- FaunaDB.
FaunaDB joins such recent database entrants as Spanner-influenced CockroachDB, Cassandra-compatible ScyllaDB, and Cassandra-, Redis- and PostgreSQL-compatible YugaByte. These all differ in details but share a common mission to provide support for what could be called transactional NoSQL, something missing in many of the first-generation cloud-oriented databases.
Flaws in NoSQL
Relational databases are still the dominant data platforms, particularly for transaction processing. But in many cloud and web applications, developers were ready to jettison Oracle and other relational mainstays. One reason, Fauna CEO Evan Weaver said in an interview, is that relational systems "didn't offer scale" for distributed applications. Cost and development complexity can be challenging as a result, he claimed.
Some of the top NoSQL vendors have tried to make their database technologies more transaction-friendly by embracing the ACID properties -- atomicity, consistency, isolation and durability. For example, MongoDB improved ACID compliance in its MongoDB 4.0 release earlier this year. Amazon similarly enabled transactions in DynamoDB as part of the database updates at AWS re:Invent 2018.
But Weaver also said he sees flaws amid the NoSQL legion seeking to replace SQL databases. Weaver, who formerly was director of infrastructure at Twitter, termed NoSQL a "failure" in the sense that "none of the systems really progressed in terms of use."
He said the search for a usable cloud-scale database was constant at Twitter, and this led him, in part, to found Fauna with others.
NoSQL designers of systems like Cassandra or MongoDB made some tradeoffs to achieve traits such as high scalability or consistency, Weaver maintained. FaunaDB, he continued, effectively blends transactional consistency and resilient horizontal scaling.
Fauna CEO questions Google
Weaver said FaunaDB is intended to support transactions that remain consistent despite the latencies that global data distribution presents.
The customer for this type of operation "wants to move an existing system fully into the cloud," he said, or is "one that wants to create a whole new [compute] stack that takes them where the cloud is going."
Weaver noted that FaunaDB accomplishes its data consistency without using special hardware. That is, in a way, a mild censure of one of the more prominent transactional NewSQL models: Google's Spanner.
Still relatively new in commercial terms, the Google Cloud Spanner system that's based on the Spanner software uses atomic clocks, GPS receivers and timestamps to help achieve its availability and consistency.
In a white paper on the Google Cloud website, Google credits Spanner's combination of hardware and TrueTime clock synchronization for data precision and accuracy. The approach has drawn some reproach from people who say the use of the specific hardware locks users into setups as provided in Google Cloud data centers.
Weaver said FaunaDB, in contrast, does not rely on precise timing for ordering transactions. Instead, it uses a global transaction log that is ordered by consensus in brief 10-millisecond increments.
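To make the idea of a batched, consensus-ordered log more concrete, here is a minimal Python sketch. It is not FaunaDB's implementation: the function names, the tuple layout and the tie-breaking rule (sorting by transaction ID within a batch) are assumptions made for illustration, and the consensus step itself is assumed to have already happened. Only the 10-millisecond window comes from Weaver's description.

```python
from collections import defaultdict

EPOCH_MS = 10  # batch window cited in the article; all else is illustrative


def build_log(submissions):
    """Turn (arrival_ms, txn_id, (key, value)) tuples into one ordered log.

    Assumption: replicas have already agreed, via some consensus protocol
    not modeled here, on which transactions belong to which epoch.
    """
    epochs = defaultdict(list)
    for arrival_ms, txn_id, op in submissions:
        epochs[arrival_ms // EPOCH_MS].append((txn_id, op))

    log = []
    for epoch in sorted(epochs):
        # Within an epoch, order depends only on txn_id, not on exact
        # arrival times, so small clock differences do not matter.
        for txn_id, op in sorted(epochs[epoch], key=lambda t: t[0]):
            log.append((epoch, txn_id, op))
    return log


def apply_log(log):
    """Replay the log in order; each op overwrites a key with a value."""
    state = {}
    for _epoch, _txn_id, (key, value) in log:
        state[key] = value
    return state


# Two replicas see the same transactions at slightly different times.
replica_a = [(3, "t2", ("x", 2)), (7, "t1", ("x", 1)), (12, "t3", ("y", 5))]
replica_b = [(8, "t1", ("x", 1)), (1, "t2", ("x", 2)), (15, "t3", ("y", 5))]

log_a = build_log(replica_a)
log_b = build_log(replica_b)
state_a = apply_log(log_a)
state_b = apply_log(log_b)
```

In this sketch both replicas build the same log and reach the same final state even though their arrival timestamps differ, which is the property that lets an ordered log stand in for tightly synchronized clocks.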
Atomic clock timing and other aspects of Spanner have come in for pointed criticism by Daniel Abadi, noted computer science professor and data researcher at the University of Maryland, College Park, who is also a Fauna Inc. advisor.
Abadi, whose work lies behind such database technologies as Vertica, VoltDB and HadoopDB, has described an alternative to Spanner approaches, which he and his collaborators dubbed Calvin, after the 16th-century theologian and religious reformer. FaunaDB's design incorporates Calvin principles.
'DB nerd' rides the cloud
Eigil Sagafos is a self-proclaimed "DB nerd," but the intricacies of Calvin-inspired data architecture are not what led him to implement FaunaDB as part of a new commercial cloud and on-premises process modeling tool set.
Sagafos, CTO and co-founder at Oslo, Norway-based software startup ShiftX, said he favors FaunaDB because of its ease of development, especially when it comes to querying capabilities.
He said FaunaDB fits well with GraphQL, a Facebook-originated open source query language for APIs that ShiftX is using to build out process modeling services that it intends to offer globally.
He added that FaunaDB's native temporal query capabilities, which show developers how data has changed over time, provide ShiftX developers what amounts to a "time machine mode" that gives end users views of their processes as they appear over time.
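A rough sense of what a temporal, "as of" read involves can be had from the following Python sketch. It does not use FaunaDB's query language or API; the `TemporalStore` class and its `write` and `read_at` methods are hypothetical names invented for this example, and timestamps are plain integers.

```python
import bisect


class TemporalStore:
    """Hypothetical in-memory store that keeps every version of a key."""

    def __init__(self):
        # key -> (sorted list of timestamps, parallel list of values)
        self._history = {}

    def write(self, key, ts, value):
        """Record that `key` took on `value` at time `ts`."""
        times, values = self._history.setdefault(key, ([], []))
        idx = bisect.bisect_right(times, ts)
        times.insert(idx, ts)
        values.insert(idx, value)

    def read_at(self, key, ts):
        """Return the value of `key` as of time `ts`, or None if unset."""
        times, values = self._history.get(key, ([], []))
        idx = bisect.bisect_right(times, ts)
        return values[idx - 1] if idx else None


store = TemporalStore()
store.write("process", 10, "draft")
store.write("process", 20, "approved")

before_any_write = store.read_at("process", 5)
mid_history = store.read_at("process", 15)
latest = store.read_at("process", 25)
```

Because old versions are retained instead of overwritten, a reader can ask for the state at any past timestamp, which is the kind of view Sagafos describes as a "time machine mode."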
Rethink this
Sagafos' personal database journey has included work with PostgreSQL, CouchDB and MongoDB. More recently, he worked with RethinkDB, a distributed document database whose originating company, which went by the same name, closed down in 2016.
RethinkDB, he said, experienced errors in production that were hard to decipher, and working through those problems was challenging. When the company behind the database bowed out, that called for a completely new approach.
When he set out to forge the data infrastructure at ShiftX, Sagafos deeply studied FaunaDB and the work of the support team behind it.
For offering the consistency of SQL and the ease of development of NoSQL, Sagafos credited FaunaDB as "definitely the best of both worlds." To that, he added the database's potential to provide distributed data processing around the world.