Google Cloud Spanner overview: 4 features to consider
Cloud Spanner's ability to offer data consistency and horizontal scalability has helped the relational database service gain traction. Learn more about its architecture.
Purpose-built database management systems that meet a unique set of application requirements continue to capture the attention of the IT community. Google Cloud Spanner is one such DBMS, for applications requiring a massively scalable, worldwide online transaction processing platform that provides data consistency.
System administrators have been able to horizontally scale web and application middle-tier systems for over a decade. But data consistency is a major issue when database administrators attempt to scale a relational database management system (RDBMS) horizontally. To ensure consistency, relational clusters require that all replicated nodes receive a copy of each data change before the system returns a response to the application. Adding nodes therefore lengthens transaction times, and the system depends on every node for availability.
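The cost of that approach can be sketched with a few lines of arithmetic. The model below is a simplification, not a description of any particular product: it assumes a commit waits for the slowest replica to acknowledge and that a write succeeds only when every node is up. The function names, acknowledgement times and the 99.9% per-node availability figure are all hypothetical values chosen for illustration.

```python
def commit_latency_ms(ack_latencies_ms):
    """A fully synchronous commit returns only after the slowest replica acknowledges."""
    return max(ack_latencies_ms)


def write_availability(node_availability, node_count):
    """If every node must be reachable to accept a write, per-node availabilities multiply."""
    return node_availability ** node_count


# Hypothetical acknowledgement times (in ms) for replicas added one at a time,
# each a little farther away from the application than the last.
ack_times = [4.0, 6.5, 9.0, 15.0, 22.0]

for count in range(1, len(ack_times) + 1):
    latency = commit_latency_ms(ack_times[:count])
    availability = write_availability(0.999, count)  # assumed 99.9% per node
    print(f"{count} node(s): commit waits {latency:.1f} ms, "
          f"write availability {availability:.4%}")
```

Under these assumptions, each added node can only hold the commit time steady or increase it, and the chance that every node is available at once shrinks with each addition.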
The origins of Google Cloud Spanner
Google describes Cloud Spanner as "the only enterprise-grade, globally distributed and strongly consistent database service built for the cloud specifically to combine the benefits of relational database structure with non-relational horizontal scale." Google built Cloud Spanner as a replacement data store for its Google AdWords and Google Play applications. The company had been using MySQL and found that the database was unable to handle the applications' growing worldwide workloads.
Google initially evaluated NoSQL as a potential solution. Although NoSQL could offer horizontal scalability, the architecture was unable to provide the strong level of data consistency Google required. A project began in 2007 to build a globally distributed database that would provide both data consistency and massive horizontal scalability. In 2012, the company published its first research paper on Spanner, and in May 2017, Google Cloud Spanner became generally available to Google Cloud customers.
Cloud Spanner architecture
Cloud Spanner does not run over the public internet; it runs on Google's private, high-speed global network. During instance creation, administrators choose the number of nodes and configure the environment as a single-region or multi-region system. To provide data redundancy, Cloud Spanner automatically configures three read/write replicas per node for each region.
Because Cloud Spanner is a relational/NoSQL hybrid built for a specific purpose, there is a learning curve for fully understanding the product. Although it combines some of the benefits of both architectures, Cloud Spanner provides its own unique feature set:
Relational schema: Google Cloud Spanner uses relational-like schema definitions to logically store data. Administrators used to the wide range of data definition language options in relational databases will find that Cloud Spanner offers a limited set of basic declarations and commands. Cloud Spanner provides a handful of data types and does not support views, check constraints or foreign keys. The differences are substantial and warrant further investigation by those considering using Cloud Spanner as a relational database replacement.
Cloud Spanner does provide primary and secondary indexes. Like NoSQL products, Cloud Spanner uses the primary key to distribute the data and workloads to different nodes. As a result, key selection is critical. Cloud Spanner stores indexes as tables and favors data scans over index lookups. Developers must add hints to a query to ensure that it uses a particular index.
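To see why key selection matters, consider a toy model of key-based distribution. This is only a sketch, assuming that rows are assigned to nodes by contiguous primary-key ranges; the node count, key space and helper names are hypothetical and do not reflect Cloud Spanner's internal layout. It compares sequential keys with keys derived by hashing the same IDs.

```python
import bisect
import hashlib
from collections import Counter

NODE_COUNT = 4          # hypothetical cluster size
KEY_SPACE = 2 ** 32     # hypothetical size of the integer key space

# Upper-exclusive boundaries that split the key space into equal ranges.
BOUNDARIES = [KEY_SPACE * (i + 1) // NODE_COUNT for i in range(NODE_COUNT)]


def node_for_key(key):
    """Return the index of the node whose key range contains `key`."""
    return bisect.bisect_right(BOUNDARIES, key)


def hashed_key(sequence_id):
    """Derive a well-spread key from a sequential ID by hashing it."""
    digest = hashlib.sha256(str(sequence_id).encode()).digest()
    return int.from_bytes(digest[:4], "big")


def writes_per_node(keys):
    """Count how many writes each node would receive for the given keys."""
    counts = Counter(node_for_key(k) for k in keys)
    return [counts.get(n, 0) for n in range(NODE_COUNT)]


sequential_ids = range(1_000, 2_000)
print("sequential keys:", writes_per_node(sequential_ids))
print("hashed keys:    ", writes_per_node(hashed_key(i) for i in sequential_ids))
```

In this model, the sequential keys all fall into the first range, so one node absorbs every write, while the hashed keys spread across all four nodes. The trade-off is that hashing discards key order, so reading a contiguous range of the original IDs would no longer touch a single node.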
SQL language: Although Cloud Spanner's SQL language is fairly comprehensive, it does not provide the same level of functionality as the leading RDBMS vendor offerings. The syntax options will still allow developers to create most of the complex SQL statements they need to view and manipulate data.
Backup and recovery: Google Cloud Spanner does not provide backup and recovery utilities. The DBMS relies on its highly available architecture to guarantee 99.999% availability. Although a high level of availability ensures that the information will be there, it does not protect data from human error or intentional destruction. Undesirable data changes do occur, and the lack of backup and recovery utilities is an important consideration when evaluating Cloud Spanner. The platform does provide import and export utilities, but using export files for recoveries forces administrators to restore the entire database and prevents them from using change logs to roll the database forward in order to minimize data loss.
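For a sense of scale, an availability percentage can be converted into a yearly downtime budget. The snippet below is plain arithmetic rather than anything specific to Cloud Spanner; the function name is hypothetical, and it assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent):
    """Convert an availability percentage into permitted downtime per year."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


for target in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% availability allows about {minutes:.2f} minutes of downtime per year")
```

At 99.999%, the budget works out to a little over five minutes a year. That figure describes how often the service is reachable; it says nothing about recovering a row that was deleted or overwritten by mistake.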
Security: Most popular relational and NoSQL database offerings provide a broad range of granular permissions to control data access. In relational systems, data access can be restricted to specific tables and columns. Cloud Spanner's Identity and Access Management utility allows permissions to be set only at the database level.
In short, although Google Cloud Spanner does provide a unique set of capabilities, the product requires a thorough evaluation to determine if it fully meets all of your application's specific needs.