New AWS tools simplify access, management of data at scale

The tech giant revealed serverless tools that eliminate limitations on workload size as well as integrations that simplify access to data within the Amazon Web Services ecosystem.

AWS has expanded its suite of data management capabilities with a series of new serverless tools aimed at helping users oversee data at any scale and new integrations to simplify accessing and operationalizing data.

The serverless tools now in preview include Amazon Aurora Limitless Database and Amazon ElastiCache Serverless. AWS also introduced an Amazon Redshift Serverless capability that employs AI to predict workloads and automatically optimize resources to better enable customers to control costs.

The integrations, meanwhile, are between Redshift and Amazon Aurora PostgreSQL, Amazon DynamoDB and Amazon Relational Database Service (RDS) for MySQL. There's also one between DynamoDB and Amazon OpenSearch Service.

All eliminate the need for cumbersome and costly data integration pipelines that require repeated extract, transform and load (ETL) workloads.

AWS introduced the data management features Monday at its AWS re:Invent 2023 user conference in Las Vegas.

Going serverless

The new serverless tools are aimed at enabling customers to analyze and manage data no matter its scale.

When tied to servers, databases and other data repositories are limited in scale. When a server can no longer handle the data volume within a database or data warehouse, workloads slow down. Decoupling data management tasks from servers removes that hindrance.

Amazon Aurora Limitless Database is a new version of AWS' Aurora database that automatically scales beyond the write limits of the existing service. As a result, developers can build applications that exceed what Aurora could previously accommodate.

Applications such as financial transaction processing and online gaming can require petabytes of data, which Aurora previously forced developers to split into smaller subsets, according to AWS. Now, Aurora Limitless Database can meet the needs of those applications.
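To illustrate the kind of manual splitting developers have had to manage, the following is a minimal sketch of hash-based sharding, one common way an application spreads records across several smaller databases. It is not AWS' implementation; the shard names and the `shard_for` function are hypothetical and exist only for illustration.

```python
import hashlib

# Hypothetical shard names. In a real deployment each would map to a
# separate database endpoint that the application must track itself.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for(key: str, shards: list[str] = SHARDS) -> str:
    """Pick a shard for a record key using a stable hash.

    The same key always maps to the same shard, so reads can find
    what earlier writes stored.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap it
    # into the range of available shards.
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


if __name__ == "__main__":
    for account_id in ("acct-1001", "acct-1002", "player-42"):
        print(account_id, "->", shard_for(account_id))
```

Routing logic like this has to live in application code, and changing the number of shards generally means moving data. That is the sort of overhead a database that scales writes automatically is meant to remove.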

Amazon ElastiCache Serverless similarly addresses scale.

Organizations often store data in caches to make it easy to access. The new capability is designed to enable customers to quickly develop easily accessible caches that can scale to any size so that frequently used data doesn't have to be apportioned among numerous caches, which complicates application development.

Finally, Amazon Redshift Serverless addresses scale by using AI to predict workload demands and automatically scale up or down to optimize resources.

Rather than use the same compute power at all times, the AI technology now in Redshift Serverless powers the platform down when fewer workloads are running and powers up the platform when there is greater demand.

The result is that organizations can reduce cloud computing costs by more accurately paying only for the compute power they need.

What stands out most about the new serverless capabilities isn't the technology itself, according to Kevin Petrie, an analyst at Eckerson Group. Instead, it's AWS' acknowledgment that cost control is a problem for customers.

"AWS recognizes customers' concerns about price-performance, especially as it relates to variable and unpredictable processing requirements," Petrie said. "Companies really want to get their cloud costs back under control."

Doug Henschen, an analyst at Constellation Research, similarly highlighted cost control in a blog post he wrote about the serverless offerings.

He noted that Aurora Limitless Database is a catch-up feature, given that IBM introduced similar capabilities as long ago as 2017. But with AWS claiming Aurora Limitless Database is significantly lower in cost than similar databases -- pricing is based on consumption starting at $0.10 per GB, per month -- it is an important addition for the tech giant.
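As a rough illustration of consumption-based pricing, the sketch below multiplies a stored volume by the quoted starting rate of $0.10 per GB, per month. It assumes the rate applies uniformly to stored data and ignores compute, I/O and any pricing tiers, none of which are detailed here; the function name is hypothetical.

```python
# Starting rate quoted above, in U.S. dollars per GB per month.
PRICE_PER_GB_MONTH = 0.10


def monthly_storage_cost(stored_gb: float,
                         price_per_gb_month: float = PRICE_PER_GB_MONTH) -> float:
    """Estimate a monthly charge as volume times a flat per-GB rate.

    Assumption: one flat rate covers all stored data. Real bills may
    include other charges that this sketch does not model.
    """
    if stored_gb < 0:
        raise ValueError("stored_gb must be non-negative")
    return round(stored_gb * price_per_gb_month, 2)


if __name__ == "__main__":
    for gb in (100, 5_000, 1_000_000):
        print(f"{gb:,} GB: ${monthly_storage_cost(gb):,.2f} per month")
```

Under these assumptions, 5,000 GB comes to $500 per month and 1 million GB, roughly a petabyte, to $100,000 per month.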

The AI in Redshift Serverless is also an attempt to catch up with Oracle and Snowflake by providing something similar, according to Henschen.

However, a cost-competitive data warehouse that works in concert with the rest of the AWS ecosystem will be appealing to customers, he continued.

Integrations

Unveiled on Nov. 28, the new integrations between Redshift and other AWS data management tools as well as the one between DynamoDB and Amazon OpenSearch Service are designed to help customers simplify their AWS deployments.

By eliminating ETL workloads between Redshift and AWS' databases, users will be able to more easily move and combine structured data.

Meanwhile, the integration between DynamoDB and Amazon OpenSearch Service is aimed at helping customers perform text and vector searches. Such searches enable users to find and operationalize unstructured data that is critical for training and maintaining generative AI models that require as much data as possible to return accurate results.

Previously, customers had to build and manage data pipelines to integrate Redshift with AWS databases such as Amazon Aurora PostgreSQL, Amazon DynamoDB and Amazon RDS for MySQL to combine and analyze different types of data.

Those data pipelines take time and effort to develop and are labor-intensive to manage. As a result, cloud computing costs can easily add up and exceed an organization's expectations.

In addition, when data is difficult to move, it can become isolated within a database and ultimately go unused when data models, reports, dashboards and other data products are developed.

To help customers better control costs and make data easier to access and combine, AWS is eliminating the need to build ETL pipelines to connect Redshift with certain AWS databases.

The desired result is a decision-making process informed by more complete data, according to AWS.

Meanwhile, the zero-ETL integrations should lead users who may have been tempted to explore data infrastructures from other vendors to stay with AWS, according to Petrie.

"This definitely simplifies the job of the AWS-focused data engineers and AWS-only environments," he said. "It will make companies a little more likely to stick with AWS and a little less likely to adopt a second cloud provider."

However, the benefits of the zero-ETL integrations apply only to organizations that use AWS for both data warehousing in Redshift and database storage, Petrie added.

Enterprises that combine AWS' data management tools with those from other vendors, such as cloud rivals Google and Microsoft, will still need to develop and manage ETL pipelines.

"A growing majority of companies, especially larger enterprises, already use multiple cloud providers," Petrie said. "Many of their analytics projects will require integration of data across AWS, Google and/or Azure, as well as heterogeneous on-premises environments. This announcement does not simplify data engineering for complex environments like these."

Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.
