12 top open source databases to consider
Open source databases are viable alternatives to proprietary ones. Here's information on 12 open source and source available technologies for weighing database options.
Databases are fundamental to modern IT -- both traditional on-premises ones and newer cloud databases help facilitate all manner of applications. Initially, the database market was dominated by proprietary technologies owned and controlled by a single vendor. Such products are still widely used, but open source databases have also gained broad user adoption.
Open source software offers user organizations the promise of source code that's openly available and typically developed in a community process. The aim is to expand the number of people involved in the development process and not lock users into a specific vendor's technology. The increased use of open source databases was partly spurred by the rise in popularity of Linux and cloud-based systems, which often rely on the open source OS. It was also driven by the emergence of NoSQL databases, many of which adhere to or align with the open source model.
What are open source databases?
Open source databases are developed and released under an open source license. While open source is sometimes used as a marketing term, it has a very specific definition when it comes to software licenses. To be a bona fide open source technology, a database needs to use a license approved by the Open Source Initiative. The OSI is the governing body that determines whether licenses adhere to the Open Source Definition (OSD), which is the guiding document for open source licensing.
Things have become more complicated, though. A growing number of vendors that created open source databases have adopted licenses that largely adhere to the tenets of the OSD but aren't OSI-approved. Most commonly, that's because they require cloud providers offering database as a service (DBaaS) implementations of a technology to publicly release modified or related source code under the same license. Such licenses are typically referred to as source available ones. Nonetheless, databases licensed under them are often still listed together with technologies that remain fully open source as alternatives to proprietary, closed source databases.
The broad category of open source and source available databases contains various types of database software. That includes SQL-based relational databases, the most widely used type, and the four primary NoSQL technologies -- key-value stores, document databases, wide-column stores and graph databases. Open source versions of special-purpose systems, such as vector databases and time series databases, are also available. In addition, many vendors now offer multimodel databases that support more than one data model.
Potential benefits of using open source databases
Open source databases offer many potential benefits, some of which also apply to source available technologies. The following are among the primary benefits for user organizations:
- Easy to get started. A core premise of the open source approach is that the technology is freely available. As a result, users can easily try out and deploy an open source database without first having to pay for it, although vendors do offer paid support as well as closed source versions of databases with additional features in many cases.
- Community support and engagement. Open source or source available code typically comes with a community of engaged users and contributors who can help new users with the technology. It also enables participation in the code development process. For example, users can submit bug reports and feature requests and become contributors themselves.
- Understandable source code. When source code is open and can be viewed by anyone, there's a better chance to understand how a database works and how it can be used effectively to meet business needs.
- Flexibility and customization. With some open source licenses, developers are free to modify the database software to meet specific custom requirements.
- Improved security. Because the source code is open, developers, users and security researchers can thoroughly scrutinize it to identify vulnerabilities. That enables rapid patching of vulnerabilities after they're discovered.
The technologies listed below are some of the most prominent open source and source available databases. The list was compiled by TechTarget editors based on research of the database market, including vendor rankings by Gartner and database management system (DBMS) popularity rankings on the DB-Engines website. However, the list itself is unranked. It includes five relational open source databases, three NoSQL ones and four source available technologies, organized in that order.
The writeups about each technology provide details on key features, potential use cases, licensing and commercial support options to help organizations choose the right open source database for their application needs.
1. MySQL
MySQL is among the most widely deployed open source databases. It was first released in 1996 as an independent effort led by Michael "Monty" Widenius and two other developers, who co-founded MySQL AB to create the database. The company was acquired in 2008 by Sun Microsystems, which was then bought by Oracle in 2010. MySQL has remained a core part of Oracle's database portfolio ever since while being maintained as open source software.
A relational database, MySQL was originally positioned as an online transaction processing (OLTP) system. It's still primarily geared to transactional uses, although Oracle's MySQL HeatWave cloud database service now also supports analytics and machine learning applications. MySQL gained much of its early popularity as a cornerstone of the LAMP stack of open source technologies -- Linux, Apache, MySQL and PHP, Perl or Python -- that powered the first generation of web development. It continues to be an underlying database on many websites.
Common use cases: Like other relational databases, MySQL complies with the ACID properties -- atomicity, consistency, isolation and durability -- for ensuring data integrity and reliability. Because of that, it supports a wide range of applications. For example, it's commonly used as the database behind web applications, cloud applications and content management systems.
Licensing: MySQL is dual-licensed under the GPL version 2 open source license and an Oracle one for organizations looking to distribute the database along with commercial applications.
Source code repository: https://github.com/mysql/mysql-server
Commercial support options: There are numerous commercial implementations of MySQL. Oracle provides multiple options in addition to MySQL HeatWave, including Enterprise and Standard editions and an embedded version. MySQL is also available in the cloud as part of the Amazon Relational Database Service (RDS) from AWS, as well as Google's Cloud SQL and Microsoft's Azure Database services. Vendors such as Aiven, PlanetScale and Percona offer MySQL cloud services, too.
2. MariaDB
MariaDB debuted in 2009 as a fork of MySQL that was created by a team also led by Widenius, who left Sun early that year because he was concerned about the direction and development of MySQL. Work on MariaDB started when he was still at Sun, and it was originally designed to be a drop-in replacement for MySQL. But that was only fully the case until the 5.5 releases of the two databases. After that, new features not in MySQL were added to MariaDB, which has used different numbering on subsequent releases.
Even with newer updates, though, it's relatively easy to migrate from MySQL to MariaDB. The latter's data files are generally binary compatible with MySQL ones, and the client protocols of the databases are also compatible. In many cases, users can simply uninstall MySQL and install MariaDB. MariaDB PLC, which leads development of the software through the MariaDB Foundation, maintains a list of incompatibilities and feature differences with MySQL.
Common use cases: MariaDB is commonly used for the same purposes as MySQL, including in web and cloud applications involving both transaction processing and analytics workloads.
Licensing: The free MariaDB Server software -- referred to by the company as MariaDB Community Server -- is released under the GPLv2 license.
Source code repository: https://github.com/MariaDB/server
Commercial support options: MariaDB PLC sells a MariaDB Enterprise Server version of the database that also supports JSON data and columnar storage. SkySQL, a company that was spun out of MariaDB PLC in late 2023, offers a fully managed DBaaS implementation. MariaDB is also available as part of Amazon RDS and Azure Database, although Microsoft plans to retire its offering in September 2025.
3. PostgreSQL
PostgreSQL got its start as Postgres in 1986 at the University of California, Berkeley. The Postgres project was initiated by relational database pioneer Michael Stonebraker, then a professor at the school, as a more advanced alternative to Ingres, a proprietary RDBMS that he also played a lead role in developing. The software became open source in 1995, when a SQL language interpreter was also added, and it was officially renamed PostgreSQL in 1996. Decades later, though, PostgreSQL and Postgres are still used interchangeably by developers, vendors and users to refer to the database.
PostgreSQL offers full RDBMS features, including ACID compliance, SQL querying and support for procedural language queries to create stored procedures and triggers in databases. Like MySQL, MariaDB and many other database technologies, it also supports multiversion concurrency control (MVCC) so data can be read and updated by different users at the same time. In addition, PostgreSQL supports types of database objects other than standard relational tables, and it's described as an object-relational DBMS on the open source project's website.
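To make the MVCC idea more concrete, here is a minimal Python sketch of a multiversion store. It is a toy model, not how PostgreSQL or any other database on this list implements MVCC: the `MVCCStore` class, its method names and the integer commit counter are all hypothetical, chosen only to show that a reader working from an older snapshot keeps seeing the value that existed when the snapshot was taken, even after a writer adds a newer version.

```python
import itertools


class MVCCStore:
    """Toy multiversion store (hypothetical; for illustration only)."""

    def __init__(self):
        self._versions = {}              # key -> list of (commit_ts, value), oldest first
        self._clock = itertools.count(1)  # stand-in for a commit sequence number
        self._last_commit = 0

    def write(self, key, value):
        # A write never overwrites in place; it appends a new version.
        ts = next(self._clock)
        self._versions.setdefault(key, []).append((ts, value))
        self._last_commit = ts
        return ts

    def snapshot(self):
        # A reader remembers the latest commit visible when it started.
        return self._last_commit

    def read(self, key, snapshot_ts):
        # Return the newest version committed at or before the snapshot.
        visible = [v for ts, v in self._versions.get(key, []) if ts <= snapshot_ts]
        return visible[-1] if visible else None


store = MVCCStore()
store.write("balance", 100)
reader_snapshot = store.snapshot()   # a reader starts here
store.write("balance", 250)          # a writer updates concurrently

old_view = store.read("balance", reader_snapshot)   # reader still sees its snapshot
new_view = store.read("balance", store.snapshot())  # a new reader sees the update
```

Because the writer appends rather than overwrites, neither side has to wait for the other in this sketch. Real engines add transaction isolation rules and cleanup of old versions on top of the same basic idea.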
Common use cases: PostgreSQL is commonly positioned as an open source alternative to the proprietary Oracle Database. It's widely used to support enterprise applications that require complex transactions and high levels of concurrency, and sometimes for data warehousing.
Licensing: The software is available under the OSI-approved PostgreSQL License.
Source code repository: https://git.postgresql.org/gitweb/?p=postgresql.git;a=summary
Commercial support options: PostgreSQL has a wide range of commercial support and cloud offerings. EDB, formerly known as EnterpriseDB, specializes in PostgreSQL and provides both self-managed and DBaaS versions in the cloud. Managed PostgreSQL cloud services are also available from AWS, Google, Microsoft and Oracle, as well as vendors such as Aiven, Percona and NetApp's Instaclustr subsidiary.
4. Firebird
The Firebird open source relational database's technology roots go back to the early 1980s, when the proprietary InterBase database was created. After InterBase was acquired by multiple vendors, commercial product development ended and the final release was made available under an open source license in 2000. Within a week, the Firebird project was created to continue developing a fork of the technology.
Firebird supports ACID-compliant transactions, external user-defined functions and various standard SQL features, and it includes a multi-generational architecture that provides MVCC capabilities. The software has a relatively small footprint and is available in an embedded single-user version, but it can also be used to run multi-terabyte databases with hundreds of concurrent users. It shouldn't be confused with Firestore and Firebase Realtime Database, two commercial NoSQL databases developed by Google.
Common use cases: Firebird can handle both operational and analytics applications. It's used in various types of enterprise applications, including ERP and CRM systems.
Licensing: Firebird is made available under the InterBase Public License (IPL) and the Initial Developer's Public License (IDPL). Both are variants of the Mozilla Public License Version 1.1, which is OSI-approved though now superseded by Version 2.0. The IPL covers the original InterBase source code, while the IDPL applies to added or improved code developed as part of the Firebird project.
Source code repository: https://github.com/FirebirdSQL/firebird
Commercial support options: Firebird is an independent open source project not driven by a particular vendor, and the software is free to use, including for commercial purposes. The Firebird website lists six companies that provide commercial support, consulting and training services. Firebird cloud services running on Windows Server 2019 are available for purchase in the AWS, Azure and Google clouds, although support will end on the Google Cloud one in August 2024.
5. SQLite
SQLite is a lightweight embedded RDBMS that runs inside applications. It was created in 2000 by computer analyst and programmer D. Richard Hipp while he was working as a government contractor in support of a U.S. Navy project, which needed a database that could run without a database administrator (DBA) in environments with minimal resources. Hipp continues to lead development of the software as project architect through Hipp, Wyrick & Company Inc., a software engineering firm commonly known as Hwaci for short.
As an embedded database, SQLite is self-contained, meaning it's fully functional within the application it powers. The software is a library that embeds a full-featured SQL database engine supporting ACID transactions. There are no separate database server processes. Data reads and writes are done directly to ordinary disk files, and a complete SQLite database that includes tables, indices, triggers and views can be contained in a single file.
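The single-file, serverless design described above can be tried out with the `sqlite3` module in Python's standard library. The following is a small sketch, with a hypothetical file name and schema, that creates a table, an index, a trigger and a view in one database file and then shows a transaction being rolled back as a unit when one of its statements fails.

```python
import os
import sqlite3
import tempfile

# Hypothetical file name; the whole database lives in this one ordinary file.
db_path = os.path.join(tempfile.mkdtemp(), "inventory.db")

conn = sqlite3.connect(db_path)  # opens the file directly; no server process
conn.executescript("""
    CREATE TABLE items (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        qty  INTEGER NOT NULL CHECK (qty >= 0)
    );
    CREATE INDEX idx_items_name ON items (name);
    CREATE TABLE audit (item_id INTEGER, old_qty INTEGER, new_qty INTEGER);
    CREATE TRIGGER trg_qty AFTER UPDATE OF qty ON items
    BEGIN
        INSERT INTO audit VALUES (OLD.id, OLD.qty, NEW.qty);
    END;
    CREATE VIEW low_stock AS SELECT name, qty FROM items WHERE qty < 5;
""")
conn.execute("INSERT INTO items (name, qty) VALUES ('widget', 10)")
conn.commit()

# Atomicity: the second UPDATE violates the CHECK constraint, so the
# connection's context manager rolls back the first UPDATE as well.
try:
    with conn:
        conn.execute("UPDATE items SET qty = qty - 7 WHERE name = 'widget'")
        conn.execute("UPDATE items SET qty = qty - 7 WHERE name = 'widget'")
except sqlite3.IntegrityError:
    pass

qty_after = conn.execute(
    "SELECT qty FROM items WHERE name = 'widget'").fetchone()[0]
audit_rows = conn.execute("SELECT COUNT(*) FROM audit").fetchone()[0]
object_types = sorted(
    {row[0] for row in conn.execute("SELECT type FROM sqlite_master")})
conn.close()
```

After the failed transaction, the quantity is unchanged and the audit table is empty, because the trigger's insert was rolled back along with the update that fired it.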
Common use cases: SQLite is commonly used in mobile applications, web browsers and IoT devices due to its small footprint and ability to operate without a separate server process.
Licensing: The SQLite source code is in the public domain and is free to use, modify and distribute for any purpose without a license. Hwaci does sell a warranty of title with a perpetual right-to-use license to organizations that want one for legal reasons.
Source code repository: https://sqlite.org/src/doc/trunk/README.md
Commercial support options: Hwaci provides paid technical support, maintenance and testing services, and it offers a set of proprietary extensions to SQLite that are sold under separate licenses. As with Firebird, SQLite database services are available on AWS, Azure and Google Cloud, in this case all running on Ubuntu Server 20.04.
6. Apache Cassandra
The Cassandra wide-column store traces its roots back to 2007, when it was originally developed by Facebook to support a new inbox search feature that was being added to the social network. The NoSQL database was open sourced in 2008 and became part of the Apache Software Foundation in 2009, initially as an incubator project before it was elevated to top-level project status the following year.
Cassandra is a fault-tolerant distributed database that can be used to store and manage large amounts of data across a cluster consisting of numerous commodity servers. The software replicates data on multiple server nodes to avoid single points of failure, and it can be scaled dynamically by adding more servers to a cluster based on processing demand. Cassandra currently provides eventual consistency, which can limit its transactional uses due to temporary data inconsistencies, but the Apache project is working to add support for ACID transactions.
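The replication and scale-out behavior described above can be illustrated with a toy hash ring in Python. This is a simplified sketch and not Cassandra's actual partitioner or replication strategy: the `Ring` class, the node names and the use of MD5 to derive tokens are all assumptions made for the example. Each key is hashed to a position on the ring and stored on the next few distinct nodes, so losing one node still leaves other copies, and a new server can join the ring without redesigning it.

```python
import bisect
import hashlib


def _token(value: str) -> int:
    # Hypothetical token function: first 8 bytes of an MD5 digest.
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class Ring:
    """Toy hash ring with a fixed replication factor (illustration only)."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        self._ring = sorted((_token(n), n) for n in nodes)

    def add_node(self, node):
        # Scaling out: a new server simply takes a position on the ring.
        bisect.insort(self._ring, (_token(node), node))

    def replicas(self, key):
        # Walk clockwise from the key's position and take the next N nodes.
        tokens = [t for t, _ in self._ring]
        start = bisect.bisect_right(tokens, _token(key)) % len(self._ring)
        count = min(self.replication_factor, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
owners = ring.replicas("user:42")

# If the first replica fails, the remaining ones can still serve the key.
survivors = [n for n in owners if n != owners[0]]

ring.add_node("node-e")
owners_after_scaling = ring.replicas("user:42")
```

With three copies of each key, a single node failure leaves two replicas available in this sketch, which is the property the article describes as avoiding single points of failure.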
Common use cases: Cassandra is designed for uses that require fast performance, scalability and high availability. It's deployed for various applications, including inventory management, e-commerce, social media analytics, messaging systems and telecommunications, among others.
Licensing: The Cassandra software is covered by the Apache License 2.0.
Source code repository: https://github.com/apache/cassandra/tree/trunk
Commercial support options: Multiple vendors provide commercial support for Cassandra and DBaaS versions of the database, including DataStax, Aiven and Instaclustr. Amazon Keyspaces (for Apache Cassandra) and Azure Managed Instance for Apache Cassandra are also available as database services from AWS and Microsoft, respectively.
7. Apache CouchDB
CouchDB is a NoSQL document database that was first released in 2005 by software engineer Damien Katz and became an Apache project in 2008. The Couch part of the name is an acronym for "cluster of unreliable commodity hardware," which stems from the project's original goal: to create a reliable database system that could run efficiently on ordinary hardware. CouchDB can be deployed on one server node but also as a single logical system across multiple nodes in a cluster, which can be scaled as needed by adding more servers.
The database uses JSON documents to store data and JavaScript as its query language. Other key features include support for MVCC and the ACID properties in individual documents, although an eventual consistency model is used for data stored on multiple database servers -- a tradeoff that prioritizes availability and performance over absolute data consistency. Data is synchronized across servers through an incremental replication feature that can be set up for bidirectional tasks and used to support mobile apps and other offline-first applications.
Common use cases: CouchDB is used for various purposes, including data analytics, time series data storage and mobile applications that require offline storage and functionality.
Licensing: CouchDB is licensed under the Apache License 2.0.
Source code repository: https://github.com/apache/couchdb
Commercial support options: The IBM Cloudant cloud database is based on CouchDB with added open source technology that supports full-text search and geospatial indexing. Several other companies also offer support for CouchDB, including packaged instances in the AWS, Azure and Google clouds.
8. Neo4j
Neo4j is a NoSQL graph database that's well suited for representing and querying highly connected data sets. Neo4j uses a property graph database model consisting of nodes, which represent individual data entities, and relationships -- also referred to as edges -- that define how different nodes are organized and connected. Nodes and relationships can also include properties, or attributes, in the form of key-value pairs that further describe them.
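As a rough illustration of the property graph model, the Python sketch below represents nodes and relationships as plain data classes, each carrying a dictionary of key-value properties. It does not use Neo4j or Cypher; the class names, sample data and `targets` helper are hypothetical and only show how connected data can be queried by following relationships instead of joining tables.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: int
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: int   # id of the node the relationship leaves
    end: int     # id of the node it points to
    type: str
    properties: dict = field(default_factory=dict)


# Hypothetical sample data.
nodes = {
    1: Node(1, {"kind": "person", "name": "Ada"}),
    2: Node(2, {"kind": "person", "name": "Grace"}),
    3: Node(3, {"kind": "company", "name": "Acme"}),
}
relationships = [
    Relationship(1, 2, "KNOWS", {"since": 2019}),
    Relationship(1, 3, "WORKS_AT", {"role": "engineer"}),
    Relationship(2, 3, "WORKS_AT", {"role": "analyst"}),
]


def targets(start_id, rel_type):
    """Names of nodes reached from start_id via relationships of rel_type."""
    return [nodes[r.end].properties["name"]
            for r in relationships
            if r.start == start_id and r.type == rel_type]


def sources(end_id, rel_type):
    """Names of nodes that point at end_id via relationships of rel_type."""
    return [nodes[r.start].properties["name"]
            for r in relationships
            if r.end == end_id and r.type == rel_type]


known_by_ada = targets(1, "KNOWS")
acme_staff = sources(3, "WORKS_AT")
```

A graph database indexes these connections so that traversals like the two above stay fast as the data set grows, which a linear scan over a Python list does not attempt to do.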
First released as open source software in 2007, Neo4j is overseen by database vendor Neo4j Inc. It originally was solely a Java-based graph database, but it has been expanded with additional capabilities, including vector search and data storage. Key features include full ACID compliance, horizontal scalability through an Autonomous Clustering architecture and the Cypher query language. Neo4j Inc. plans to converge Cypher with GQL, a standard graph query language published by the International Organization for Standardization in April 2024 that uses syntax based on both SQL and Cypher.
Common use cases: Typical uses for Neo4j include social networking, recommendation engines, network and IT operations, fraud detection and supply chain management, with generative AI applications also now supported through the vector search feature.
Licensing: Neo4j Community Edition is licensed under the GPL version 3. An open source version of Cypher named openCypher is also available under the Apache License 2.0.
Source code repository: https://github.com/neo4j/neo4j
Commercial support options: Neo4j Inc. provides several supported commercial offerings, including a Neo4j Enterprise Edition with added closed source components and the subscription-based Neo4j Aura cloud database service.
9. Couchbase Server
Couchbase Server is a NoSQL document database with multimodel capabilities for storing data both in JSON documents and as key-value pairs. The technology resulted from the 2011 merger of two open source database companies: CouchOne, which had been founded by CouchDB creator Damien Katz to offer systems based on that database, and Membase, which was set up to build a key-value store by developers of the memcached distributed caching technology. The combined company became Couchbase, leading to the development of Couchbase Server.
Despite their similar names and partly shared origins, Couchbase Server and CouchDB aren't directly related or compatible -- they're different database technologies with their own code and APIs. Couchbase Server supports strong consistency, distributed ACID transactions and SQL++, a SQL-like language for JSON data querying. It also includes both vector and full-text search capabilities plus a multidimensional scaling feature that enables different database functions to be isolated and separately scaled up based on workload demands.
Common use cases: Couchbase Server is often used to support distributed application workloads and for mobile, edge and IoT applications.
Licensing: Originally available under the Apache License 2.0, Couchbase Server was switched in 2021 to the Business Source License (BSL) 1.1, a source available license that restricts commercial use of the software by other vendors. Database releases are converted back to the Apache open source license four years after they become available.
Source code repository: https://github.com/couchbase/manifest
Commercial support options: Couchbase offers an enterprise edition of Couchbase Server for cloud and on-premises deployments, as well as a mobile version of the database and a fully managed DBaaS technology named Couchbase Capella.
10. MongoDB
MongoDB is another NoSQL document database that was initially developed as open source software and is now a source available technology. First released in 2009, MongoDB stores data in a JSON-like document format named BSON, which is short for Binary JSON. As the full name indicates, BSON encodes data in a binary structure that's designed to support more data types and faster indexing and querying performance than JSON provides.
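To show what "binary structure" means in practice, here is a minimal Python sketch that encodes a flat dictionary into BSON bytes by hand, based on the publicly documented BSON layout: a little-endian 32-bit length, a sequence of typed elements and a terminating zero byte. The `encode_bson` function is a hypothetical helper written for this example and handles only four value types; real applications would rely on the BSON support that ships with a MongoDB driver.

```python
import struct


def encode_bson(doc: dict) -> bytes:
    """Encode a flat dict of str, bool, 32-bit int and float values as BSON.

    Hypothetical teaching helper, not a complete encoder.
    """
    body = b""
    for key, value in doc.items():
        name = key.encode("utf-8") + b"\x00"          # element name as a C string
        if isinstance(value, bool):                    # check bool before int
            body += b"\x08" + name + (b"\x01" if value else b"\x00")
        elif isinstance(value, int):
            body += b"\x10" + name + struct.pack("<i", value)   # int32
        elif isinstance(value, float):
            body += b"\x01" + name + struct.pack("<d", value)   # 64-bit double
        elif isinstance(value, str):
            data = value.encode("utf-8") + b"\x00"
            body += b"\x02" + name + struct.pack("<i", len(data)) + data
        else:
            raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")
    # Total length counts the 4-byte length prefix and the trailing zero byte.
    return struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"


encoded = encode_bson({"hello": "world"})
```

Each element starts with a one-byte type code and strings carry an explicit length, so a reader can skip over fields without parsing their contents. That is part of what makes the format faster to scan than text-based JSON.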
The database is often seen as an attractive option for developers who want to build applications without the constraints of a fixed schema. In addition to its document data model, MongoDB includes native support for graph, geospatial and time series data. MongoDB Atlas, a cloud database service offered by lead developer MongoDB Inc., also provides vector and full-text search features that can be used free of charge for development and testing in local environments. Other key features in MongoDB include multi-document ACID transactions, sharding for horizontal scalability and automatic load balancing.
Common use cases: MongoDB is widely deployed for uses that include AI, edge computing, IoT, mobile, payment and gaming applications, as well as website personalization, content management and product catalogs.
Licensing: Since 2018, new versions of MongoDB Community Server and patches for previous releases have been made available under the Server Side Public License (SSPL) Version 1, a source available license created by MongoDB Inc.
Source code repository: https://github.com/mongodb/mongo
Commercial support options: In addition to MongoDB Atlas, MongoDB Inc. offers a self-managed MongoDB Enterprise Server that also provides additional capabilities beyond what's in the community edition. MongoDB support and managed services are also available from vendors such as Datavail and Percona. Amazon DocumentDB (with MongoDB compatibility) is a fully managed DBaaS offering from AWS that supports versions 4.0 and 5.0 of MongoDB but not the newer 6.0 and 7.0 releases.
11. Redis
Redis is a NoSQL in-memory database that was converted to a source available technology in March 2024. The Redis project was created in 2009 by software programmer Salvatore Sanfilippo, known by the nickname antirez, to help solve a database scaling problem with a real-time website log analysis tool. Short for Remote Dictionary Server, Redis originally was positioned as software that provided a key-value data store as a caching technology to accelerate existing databases and application workloads.
The database caching functionality remains the foundation of Redis, with features that include built-in replication, on-disk data persistence and support for complex data types. But the platform has been expanded to include additional capabilities, such as support for storing JSON documents and both vector and time series data. A graph database module was also added, but lead developer Redis Inc. stopped developing it in 2023.
Common use cases: While Redis can be used as a full database, one of its most common uses is still as a database query caching layer. It's also often used to support real-time notifications through an integrated pub/sub capability and as a session store to help manage user sessions for web and mobile applications.
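The query caching use case typically follows a cache-aside pattern: check the cache first, fall back to the database on a miss and store the result with an expiration time. The Python sketch below illustrates that flow with an in-process dictionary standing in for Redis. The `TTLCache` class, the `slow_query` function and the key format are hypothetical, and no Redis client library is used.

```python
import time


class TTLCache:
    """In-memory stand-in for a key-value cache with expiring entries."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expires_at)
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)


calls = {"count": 0}  # tracks how often the "database" is actually queried


def slow_query(user_id):
    # Hypothetical placeholder for an expensive database query.
    calls["count"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user(cache, user_id, ttl_seconds=60):
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached                    # cache hit: the database is not touched
    row = slow_query(user_id)            # cache miss: query the database
    cache.set(key, row, ttl_seconds)     # keep the result for later requests
    return row


cache = TTLCache()
first = get_user(cache, 7)
second = get_user(cache, 7)  # served from the cache
```

The expiration time bounds how stale a cached result can get. Choosing it is a tradeoff between load on the underlying database and freshness of the data returned to users.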
Licensing: As of March 2024, the core Redis software is dual-licensed under the Redis Source Available License 2.0 and the SSPL v1. The added database modules and a Redis Stack bundle that combines them have been covered by those licenses since 2022.
Source code repository: https://github.com/redis/redis
Commercial support options: Redis Inc. provides a closed source Redis Enterprise offering under a commercial license, as well as a fully managed Redis Cloud service. Microsoft's Azure Cache for Redis is a managed service that includes both the core Redis software and Redis Enterprise as options. Redis managed services are available from Aiven and Instaclustr, too. In addition, AWS, Google and Oracle offer cloud services with Redis compatibility.
12. CockroachDB
CockroachDB is a source available distributed SQL database loosely inspired by Google's proprietary Spanner database. Developed primarily by vendor Cockroach Labs, CockroachDB was first released in 2015, with an initial production version appearing two years later. Like the insect it's named after, CockroachDB is designed to be hard to kill. The cloud-native database is built to be a fault-tolerant, resilient and consistent data management platform.
CockroachDB scales horizontally and can survive various types of equipment failures with minimal disruptions to users and no manual intervention required by DBAs, according to its developers. Key features include automated repair and recovery, support for ACID transactions with strong consistency, a SQL API and geo-partitioning of data to boost application performance. It also has a "multi-active availability" model that enables users to read and write data from any cluster node with no conflicts.
Common use cases: CockroachDB is well suited for high-volume OLTP applications and distributed database deployments across multiple data centers and geographic regions.
Licensing: Since 2019, most of CockroachDB's core features have been licensed under a version of the BSL that requires other vendors to buy a license from Cockroach Labs if they want to offer a commercial database service. Other core features are covered by the Cockroach Community License (CCL), which allows source code to be viewed and modified but not reused without an agreement on that with Cockroach Labs. The features licensed under the BSL convert to the Apache License 2.0 and become open source three years after a new database release, a change that doesn't apply to the CCL ones.
Source code repository: https://github.com/cockroachdb/cockroach
Commercial support options: Cockroach Labs provides technical support and additional paid enterprise features that are available in both self-managed and DBaaS deployments.
Sean Michael Kerner is an IT consultant, technology enthusiast and tinkerer. He has pulled Token Ring, configured NetWare and been known to compile his own Linux kernel. He consults with industry and media organizations on technology issues.