Open source cloud databases battle software 'strip mining'

Cloud giants like AWS have adopted open source databases, causing Confluent, MongoDB and others to guard their assets the best way they know how: licensing.

Cloud computing has proved to be good ground for SQL and NoSQL open source databases, making new technologies readily available as services to users, while benefiting the cloud providers and database makers as well.

But the future of open source cloud databases is in question. That's because the bonds between cloud providers and database makers have been tested of late, with software licensing for cloud services coming into conflict. As a result, users are looking closely at their licenses' fine print.

The most strident conflict has centered on MongoDB and Amazon. Open source tensions surfaced earlier this month as AWS introduced DocumentDB, a NoSQL database that, like Microsoft's Azure Cosmos DB, has MongoDB API compatibility.

Amazon DocumentDB is a homegrown effort that implements the Apache 2.0-licensed open source MongoDB 3.6 API. As with Azure Cosmos DB's support of MongoDB, DocumentDB uses emulation to reproduce the responses MongoDB clients would expect from a MongoDB server. Migrating MongoDB databases is one of DocumentDB's goals, as is better runtime data management.
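To make the idea of API emulation concrete, here is a minimal toy sketch in Python. It does not show how DocumentDB or Cosmos DB is built; it only illustrates the general pattern of a separate back end answering calls in the shape a client expects. The class name `EmulatedCollection`, the in-memory list and the reply fields are all hypothetical, chosen for illustration.

```python
from __future__ import annotations

from typing import Any


class EmulatedCollection:
    """Hypothetical in-memory stand-in that mimics a small part of a
    document-store API. Not based on any vendor's actual implementation."""

    def __init__(self) -> None:
        self._docs: list[dict[str, Any]] = []
        self._next_id = 1

    def insert_one(self, doc: dict[str, Any]) -> dict[str, Any]:
        # Copy the document and assign an _id if the caller did not supply
        # one, since document-store clients expect every record to have one.
        stored = dict(doc)
        stored.setdefault("_id", self._next_id)
        self._next_id += 1
        self._docs.append(stored)
        # Reply shape is an assumption made for this illustration.
        return {"ok": 1, "n": 1, "insertedId": stored["_id"]}

    def find(self, query: dict[str, Any]) -> list[dict[str, Any]]:
        # Only exact-match equality filters are supported here; a real
        # query language covers far more operators.
        return [
            dict(d)
            for d in self._docs
            if all(d.get(key) == value for key, value in query.items())
        ]


users = EmulatedCollection()
users.insert_one({"name": "Ada", "role": "admin"})
users.insert_one({"name": "Lin", "role": "dev"})
print(users.find({"role": "admin"}))
```

The calling code never sees what sits behind `insert_one` and `find`. That is why an emulating service can be written without the original database's source code, and also why it can lag behind features added to later versions of the API.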

MongoDB Inc., the driving force behind its namesake open source technology, contends that Amazon's approach mires DocumentDB's MongoDB APIs in earlier MongoDB versions, and leaves it on the outside of any improvements going forward. That is because the NoSQL database maker changed some terms in its licensing contract recently.

[Figure: Timeline of recent open source licensing incidents. Some cloud providers and open source database and tools makers are caught in a wrestling match over licensing.]

MongoDB machinations

Amazon product news along these lines had been widely anticipated for AWS' re:Invent 2018, in November 2018. Instead, NoSQL news at the time focused on MongoDB's licensing for new versions of the database. The reason to change was to spur "international cloud providers" like Amazon to contribute to open source versions of the database, MongoDB said at the time.

In recent years, MongoDB has been at the head of NoSQL entries associated with rapid application development. Its document-oriented store is coupled to tools that speed programming, and it has increasingly been featured among options available for data development on the cloud.

MongoDB's machinations followed earlier shifts in licensing by Redis Labs, which in October altered its license for Redis modules. The reason the vendor gave: to restrict the modules' resale as part of database-as-a-service offerings using the open source Redis in-memory key-value store database.

Tooling for data streaming on the cloud is also getting rejiggered in the wake of MongoDB's salvos. In December, Confluent relicensed some of its Kafka-related components, moving them from the Apache 2.0 license to the Confluent Community License. Confluent stated that its goal was to control use of its streaming SQL engine, KSQL, and other components as part of cloud services.

"This is an interesting time for open source software," said Neha Narkhede, Confluent CTO, in an email during re:Invent. "There is a new and unique dynamic with public cloud providers, particularly AWS, who take open source software and sell it as a service, often without contributing much of anything back to the community. There is even a name given to this: strip mining."

AWS rejects such criticism. Responding to a request for comment, an AWS spokesperson said that AWS is a significant contributor to and supporter of the open source community, noting contributions to Xen, Linux, KVM, Java, Kubernetes, Lucene and other projects. In a statement, the spokesperson also said it's "entirely inaccurate to claim that AWS has received any benefit from MongoDB's code" because DocumentDB doesn't use any of the licensed code. (Editor's note: The full statement from AWS can be found at the end of this story.)

Recent signs may show further impetus for AWS efforts in the open source realm. During the week of Jan. 28, the cloud vendor upped its sponsorship support for the Apache Software Foundation to the Platinum level.

It's important to prosper

While clouds like AWS have enabled wider use of NoSQL databases such as Cassandra, MongoDB and Redis, and Apache Hadoop components like MapReduce, Spark and Hive, the cloud provider's growing interest in offering open source cloud databases and tools as AWS-branded cloud services is causing some software companies to rethink their licensing.

Cloud services were not much of a consideration when software licenses like GPL, AGPL or even the Apache License came into being. Now, such services may lead to changes, according to Ashish Thusoo, co-founder and CEO at Qubole, which runs a data platform on clouds from AWS, Microsoft, Oracle and, soon, Google.

Cloud providers in effect offer open source as a service, said Thusoo, who traces his own open source background back about 10 years, to when he was engineering manager for data infrastructure at Facebook. At that time, with colleagues, he began to lead open source development of the Hadoop Hive data warehouse project under the auspices of the Apache Software Foundation.

Now, leading a software company, Thusoo said he sees both sides of the issue.

"What is important for open source companies to prosper is a model where they can monetize open source. Many of those models were an open core model, such as Red Hat with Linux," he said. "In the cloud, the model has changed to some degree."

Open source cloud databases at your service

Cloud's role is vital, as more database innovation begins to occur on the cloud first. The database vendors are acutely aware of this.

"The whole nature of apps had changed," said Eliot Horowitz, CTO and co-founder of MongoDB. "People don't build and run their own stacks in data centers."

Horowitz's view, through the lens of an open source cloud database provider, is skewed, though he is correct in that fewer companies build and support apps in data centers today.

In TechTarget's 2019 IT Priorities Survey, 29.1% of respondents said they planned to deploy hardware or software on premises this year -- down from about 43% in 2018. By comparison, 29.9% said they planned to use an IaaS deployment model in 2019, up from 26.5% in 2018. In that same survey, just over 20% of respondents said they will likely deploy databases in the cloud this year.

Not surprisingly, MongoDB -- the vendor -- has its own cloud database as a service. It is called Atlas, and its infrastructure support was broadened last year when MongoDB purchased cloud database specialist MLab.

Horowitz declined to cite Amazon as the sole culprit when it comes to fairness in database as a service. "AWS is not unique," he said.

Horowitz confirmed MongoDB's overall commitment to open source as a business model.

"We want to adapt [licenses] to the modern world," he said. "A database should be open source at its heart."

Cloud database harmony

For his part, Qubole's Thusoo hopes to see a new set of ethics, adding, however, that it is too early to conclude whether open source vendors will be able to force that type of change.

"It is important to contribute things back, and licensing should be maintained in a way that people can use the software to create something bigger," he said. "But swinging the pendulum to the direction where licenses are changed to the degree that no cloud provider can offer it as a service -- that, too, becomes an issue."

Thusoo said he looks forward to a day, as do many users and vendors, when problems with licensing for open source cloud databases find better resolution.

"Some middle ground needs to be found," he said.

Here is the full statement from the AWS spokesperson:

"The longevity and viability of open source is very important to our customers and some AWS services, which is why we are a significant contributor and supporter of the open source community. Over the years, we've made significant contributions to a myriad open source projects, including Xen, Linux, KVM, Java, Kubernetes, Chromium, Robot Operating System, and Lucene, which underpins Elasticsearch, Hadoop, Spark, and Hive. We've also taken a leadership role in important open source projects like s2n, FreeRTOS, AWS Amplify, Apache MXNet, AWS SageMaker NEO, and Firecracker. With regards to MongoDB, it is entirely inaccurate to claim that AWS has received any benefit from MongoDB's code because Amazon DocumentDB does not use any of MongoDB's AGPLv3 or SSPL licensed code."
