MongoDB Atlas cloud service adds data lake, touts multi-cloud
MongoDB released an S3-compatible data lake its developer legions can quickly query. But, word of MongoDB Atlas use on Google's cloud shows there are clouds to sow beyond AWS.
NoSQL database maker MongoDB continued its push into cloud, today launching Atlas Data Lake, a cloud database service that taps into data stored in AWS S3 buckets.
The vendor unveiled the data lake service in the form of a public beta at its MongoDB World 2019 conference in New York.
Atlas itself has been a multiyear effort by MongoDB to move its data capabilities from the data center to the cloud.
By opening cloud object stores to its Atlas querying capabilities, MongoDB effectively has chosen to compete with cloud data warehousing alternatives such as Hadoop, a technology that has been reeling as its leading independent proponents -- MapR and Cloudera -- struggle financially.
The move to support AWS S3 is a tacit admission that cloud computing goes first through AWS. MongoDB and other open source database vendors have criticized the cloud giant for using open source database software on its cloud. The MongoDB move also is a sign that enterprises have a growing demand to analyze diverse data lake data on the cloud.
While AWS is first up, MongoDB is expected to move quickly to also provide Google Cloud Platform (GCP) and Microsoft Azure support for the MongoDB Atlas Data Lake, which runs as a serverless service. Google, particularly, is looking to promote its MongoDB ties as it vies with AWS and Azure in the cloud market.
Just say NoSQL
Document databases generally, and MongoDB specifically, have become an important part of web and cloud development. MongoDB grew out of web application work in the early 2000s by developers at online advertiser DoubleClick. They were stymied as they tried to build applications to run at the suddenly massive scale of the early internet.
Some of those developers started MongoDB (originally called 10gen) in 2007, using a document-oriented data architecture that did not require developers to create relational data schemas.
MongoDB Atlas supports high scalability on GCP, according to Joshua Kelly, co-founder and CTO of Universe, a part of Live Nation Entertainment. Universe uses both MongoDB Atlas and GCP.
"As an e-commerce provider, latency is extremely important to us and the ability to execute extremely fast reads from secondary nodes, which we can scale horizontally, helps us deliver at the speed at which our customers demand," Kelly said in an email. Kelly added that multi-cloud support is important to Universe.
"MongoDB Atlas helps us place our MongoDB resources in the clouds most important to us, including Google Cloud Platform," he said.
Google's value-add to the MongoDB Atlas portfolio includes recent updates that provide integrations with its Cloud Key Management Service on GCP. Meanwhile, adding cloud data centers in Osaka, Japan, and Zurich, Switzerland, brings the number of Google Cloud regions supporting MongoDB Atlas to 20, according to the company. Add in other clouds, and MongoDB coverage spans 69 regions, according to MongoDB.
Document schemas
But scale is a given need for cloud databases. What distinguishes a document database like MongoDB's is its eschewal of upfront schema, according to James Curtis, an analyst at 451 Research.
"Overall, NoSQL databases effectively scale. But the document architecture, in particular, is something that has great appeal to people for its wide schema flexibility," Curtis said.
Leading cloud vendors such as AWS, Microsoft and Google all offer their own document databases, but MongoDB boasts many developer adherents, and has continued to expand.
Other competitors in the space include Couchbase and MarkLogic, which have been similarly active.
Last month, MarkLogic added embedded machine learning to its MarkLogic Data Hub 5.0 and MarkLogic 10 multi-model database. Also last month, Couchbase updated its software to include a Couchbase Autonomous Operator, a Kubernetes tool for arranging deployment of data-based containers.
Atlas scaled
MongoDB Atlas Data Lake, as the company described it at the conference, is a nod to the fact that S3 and other object stores are displacing file systems and databases in some cloud applications. Avro and Parquet are also among the storage formats to be targeted. Atlas Data Lake will compete with AWS Athena and Azure Data Lake from Microsoft.
Because Atlas Data Lake is designed to work natively with MongoDB, veteran MongoDB developers should be able to rapidly create queries that run on the cloud against S3 and other object formats to come, according to MongoDB.
Another enhancement the document database vendor revealed at its user conference this week was full-text search on MongoDB Atlas. The search capability is based on Apache Lucene technology.
Also, an updated Kubernetes Operator interface now provides the same view across on-premises, hybrid and public cloud infrastructure.
Cloud dealings
In April, MongoDB was among several players that forged a partnership to offer cloud data services on GCP. For MongoDB and others that have had licensing squabbles with cloud leader AWS, the deal with Google can be seen as a salvo against AWS, which has aggressively marketed its growing collection of cloud databases.
In the wake of a MongoDB-AWS skirmish that coincided with AWS' January release of its DocumentDB and the ascent of new Google Cloud chief Thomas Kurian, Google launched an effort to work closely with several open source development shops, notably including MongoDB.
The deals Google is forming with MongoDB and others are an important part of its strategy. Google is playing catch-up and taking a long view on the cloud market, said Carl Olofson, an analyst at IDC.
"Google is in third position -- so it has to be all about partnerships and relationships. That is Job 1," he said. "It isn't going to get into a petty war over document databases. It is pushing GCP."
Such activity is a kind of hedge against AWS' success. In MongoDB's and others' cases, changes in license arrangements have been one way to try to counter the cloud leader's liberal use of their open source software. Smaller competitors have also targeted Microsoft and its Azure Cloud for the same reason.
"What the smaller database vendors are trying to do is to prevent an online service that resembles what they have to be offered without the online service paying any kind of royalty," Olofson said.
Meanwhile, both Microsoft and Amazon have maintained that they are committed to open source, with Microsoft looking to build on its acquisition of open source development mainstay GitHub, and Amazon joining the influential Apache Software Foundation.