
Amazon exec: New AWS S3 capabilities see rapid user adoption

AWS Storage VP Andrew Warfield talks about how customers are using the latest features the Amazon S3 team launched at the vendor's Pi Day 2025 conference.

Data management capabilities took center stage Friday at AWS Pi Day 2025, the hyperscaler's online conference, but AWS's S3 storage remains the backbone of the massive cloud's services.

SageMaker Unified Studio, now generally available, provides AWS customers with a unified platform for data management, analytics and AI development. 

Some of the new S3 features include S3 Tables with Amazon SageMaker Lakehouse, new APIs for connecting with the Apache Iceberg REST Catalog (IRC) and price reductions for S3 object tagging.

The Amazon S3 development team, under the leadership of Andrew Warfield, vice president and distinguished engineer at AWS, has added several new services and capabilities, including S3 Express One Zone, S3 Tables and S3 Metadata.

In this Q&A, Warfield discusses the user reactions to these services and how he sees them evolving.

Editor's Note: This Q&A has been edited for clarity and brevity.

Now that S3 Tables is widely available, what's the user reaction been like? How are customers using the service?


Andrew Warfield: S3 Tables had a good reception, which was surprising even [compared] to where the team thought it was going to land. We've seen almost immediate adoption. Customers are pulling it into existing analytics pipelines and workloads, [and] asking for new features.

Since [launch in December], the team has been scrambling to get features in. The API service has been simplified, allowing people to do table creation and schema establishment from the API. Today, we're broadening what you can do in the console. The console integrates with the [AWS] Athena query window so you can interact with what's actually in the tables directly from the console. [Apache] Iceberg over the last year developed this capability called the Iceberg REST Catalog, which is a little bit like a data path API into the Iceberg table.

At launch, we'd gone with a different path for integrating with Iceberg, which was that we shipped a little library that let clients talk to S3 Tables. What we found was that the community squarely wanted support for this Iceberg REST Catalog API, as it lets tools integrate with Iceberg without needing to redeploy clients every time a change is made.

Today, we're integrating IRC support [into S3 Tables], and it makes it easier for anything that talks to Iceberg to [also] talk to S3 Tables.
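To make the IRC integration concrete, here is a minimal sketch of the client-side configuration an Iceberg tool such as pyiceberg would use. The endpoint format and SigV4 property names follow AWS's documented setup for S3 Tables, but treat the exact values, region and account numbers as illustrative assumptions to verify against the current documentation.

```python
# Sketch: building Iceberg REST Catalog (IRC) properties for S3 Tables.
# Endpoint format and SigV4 property names are assumptions based on AWS's
# documented pyiceberg setup; all identifiers below are placeholders.

def irc_catalog_properties(region: str, table_bucket_arn: str) -> dict:
    """Build pyiceberg `load_catalog` properties for an S3 Tables IRC endpoint."""
    return {
        "type": "rest",
        # Assumed regional IRC endpoint for S3 Tables.
        "uri": f"https://s3tables.{region}.amazonaws.com/iceberg",
        "warehouse": table_bucket_arn,
        # Requests are signed with SigV4 for the "s3tables" service.
        "rest.sigv4-enabled": "true",
        "rest.signing-name": "s3tables",
        "rest.signing-region": region,
    }

props = irc_catalog_properties(
    "us-east-1",
    "arn:aws:s3tables:us-east-1:111122223333:bucket/example-table-bucket",
)
# With pyiceberg installed and AWS credentials configured, the catalog would
# then be opened with:
#   from pyiceberg.catalog import load_catalog
#   catalog = load_catalog("s3tables", **props)
print(props["uri"])
```

The point of the REST catalog, as Warfield notes, is exactly this: any engine that speaks the IRC protocol can reach the tables through configuration alone, with no S3 Tables-specific client library to redeploy.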

In most conversations about Iceberg, you're going to hear a [customer] voice that comes from a very analytics background. Iceberg was built as a way of doing a better job with Apache Spark over S3. But as we've taken on [this feature] and built a managed Iceberg and table primitive in S3, we actually want to do much more than that.

What about S3 Metadata? What's the adoption on that service been like?

Warfield: We're seeing [customers] using it to do all sorts of stuff, ranging from asking questions about what happens in the history of their bucket, looking to identify who deleted things and doing operational analysis of their buckets, like where capacity is being spent. [Customers are using it] increasingly as a place to store metadata for higher-level applications.

[Customers] are building adjacent tables. We'll populate the metadata table with all the objects and the basic metadata in the object headers. We're seeing customers build a second table with S3 Tables and as objects land [in those tables], they'll go and do things like run an ML model against video data [to] generate closed-caption summaries and content ratings. They will pump stuff into their own table, but it will share a join key with an S3 Metadata table. They're then able to sit in Athena or in Spark, [where] they're able to ask questions about their content, and that works across those two tables.
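The adjacent-table pattern Warfield describes can be sketched as follows. In practice both tables live in S3 Tables and the join runs in Athena or Spark; here the two tables are modeled in memory, and all column names and values are hypothetical.

```python
# In-memory sketch of the adjacent-table pattern: an S3 Metadata table joined
# to a customer-built enrichment table on a shared key. Column names and
# values are hypothetical; in practice this is a SQL join in Athena or Spark.

# Rows S3 Metadata would populate automatically: object key plus basic
# metadata from the object headers.
s3_metadata = [
    {"key": "videos/a.mp4", "size_bytes": 104_857_600, "storage_class": "STANDARD"},
    {"key": "videos/b.mp4", "size_bytes": 52_428_800, "storage_class": "STANDARD"},
]

# A second, customer-built table: ML-generated summaries and content ratings,
# sharing the object key as the join key.
enrichment = [
    {"key": "videos/a.mp4", "content_rating": "PG", "summary": "Product demo"},
]

# Equivalent in spirit to:
#   SELECT m.key, m.size_bytes, e.content_rating, e.summary
#   FROM s3_metadata m LEFT JOIN enrichment e ON m.key = e.key
by_key = {row["key"]: row for row in enrichment}
joined = [{**m, **by_key.get(m["key"], {})} for m in s3_metadata]
print(joined[0]["content_rating"])  # -> PG
```

The shared join key is what lets a query engine answer questions "across those two tables" without S3 Metadata needing to know anything about the customer's enrichment schema.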

We're starting to see metadata act as the basis for pretty interesting workflows around content that's in buckets.

While on the topic of launches, AWS debuted S3 Express One Zone in 2023. How's that currently being used?

Warfield: Its [uptake] wasn't seen as a surprise, but it's interesting to see the workloads it's attracted. There are a lot of cases where customers build applications on top of S3 that take a lot of small updates. [They] then want to do analysis on those small updates at some time in the future. Simple things like writing out logs, or [data from] CCTV cameras [and] IoT sensors. There's a lot of things that produce infrequent, small amounts of data, and that data is ultimately consumed in big batches. S3 Express One Zone is a very low latency, very direct way of absorbing those little updates and staging them somewhere that has a lot of performance before you do the aggregation and write it as a large object to S3.


How do you see the fundamental AWS S3 service evolving?

Warfield: [S3] is a reasonably mature system and with that maturity, especially in the storage space, there's a tendency to be very careful when we ship things. With the way our customers hold S3, the team is cautious on every dimension in terms of durability and security. They have a crazy high bar for quality.

A thing I'm really pleased with on these two launches is that instead of defining the perfect system we eventually want to get to, we focused really hard on a launch that was going to be exciting for most of our customers. If you had talked to me two or three years ago and asked me what S3 is in a word, I would have said objects. Over the past couple of years, [we've come to understand that] objects are central to what S3 is, but what draws customers to build things on top of S3 is not so much the data type it supports as the simplicity of using it. Customers think about it from an API perspective and then immediately give us a hard time over the API getting more complicated over time.

But the thing we've done a ton of work on, maybe without even realizing it, is removing the sharp edges around scaling capacity and performance, [or] making encryption automatic.

Tim McCarthy is a news writer for Informa TechTarget covering cloud and data storage.
