Matt Wood talks AWS' AI platform, ethical use
AWS' AI tools have gained traction with users but also criticism from civil rights advocates. Matt Wood, an AWS deep learning and AI exec, discusses AWS' evolving portfolio.
NEW YORK -- AWS spotlighted its evolving AI offerings at AWS Summit this week, with a variety of features and upgrades.
The company incorporated one of those offerings, Amazon Rekognition, into the event's check-in process: the service scanned consenting attendees' faces and matched them against photos they had submitted during registration.
But despite guidelines for customer use, the AWS AI platform is not immune to growing concerns over potentially unethical applications of these advanced systems. Civil rights advocacy groups worry that technology providers' breakneck pace to deliver AI capabilities, such as Rekognition, could lead to abuses of power in the public sector and law enforcement, among other areas.
Matt Wood, AWS general manager of deep learning and AI, discussed advancements to the AWS AI platform, adoption trends, customer demands and ethical concerns in this interview.
AWS has added a batch transform feature to its SageMaker machine learning platform to process data sets for non-real-time inferencing. How does that capability apply to customers trying to process larger data files?
Matt Wood: We support the two major ways you'd want to run predictions. You want to run predictions against fresh data as it arrives in real time; you can do that with SageMaker-hosted endpoints. But there are tons of cases in which you want to apply predictions to large amounts of data -- data that arrives in bulk or gets exported from a data warehouse, or that is simply too large in raw size to process record by record. These two things are highly complementary.
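For readers wanting a concrete picture, the batch transform workflow Wood describes boils down to a single `CreateTransformJob` API call that points a trained model at an S3 prefix. Here is a minimal sketch (not AWS's official sample) that assembles the request parameters; the bucket, model and job names are hypothetical, and the final `boto3` call is left commented out since it requires AWS credentials and a deployed model.

```python
def build_transform_job(job_name, model_name, input_s3, output_s3):
    """Assemble the parameters for the sagemaker:CreateTransformJob API call."""
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,           # a model already registered in SageMaker
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": input_s3,     # e.g. a quarterly sales export
                }
            },
            "ContentType": "text/csv",
            "SplitType": "Line",           # score the file line by line
        },
        "TransformOutput": {"S3OutputPath": output_s3},
        "TransformResources": {
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
        },
    }

# Hypothetical names for illustration only.
params = build_transform_job(
    "demand-forecast-q3",
    "demand-model",
    "s3://example-bucket/input/",
    "s3://example-bucket/output/",
)
# import boto3
# boto3.client("sagemaker").create_transform_job(**params)
```

Unlike a hosted endpoint, the job spins up instances, scores everything under the input prefix, writes results to the output path and shuts down -- which is why it suits the end-of-quarter reporting cases Wood mentions.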
We see a lot of customers that want to run billing reports or forecasting. They want to look at product sales at the end of a quarter or the end of a month [and] predict the demand going forward. Another really good example is [to] build a machine learning model and test it out on a data set you understand really well, which is really common in oil and gas, medicine and medical imaging.
In the keynote, you cited 100 new machine learning features or services [AWS has developed] since re:Invent last year. What feedback do you get from customers for your current slate [of AI services]?
Wood: What we heard very clearly was a couple of things. No. 1, customers really value strong encryption and strong network isolation. A lot of that has to do with making sure customers have good encryption integrated with Key Management Service inside SageMaker. We also recently added PrivateLink support, which means you can connect your notebooks and training environment directly to DynamoDB, Redshift or S3 without that data ever flowing out over the public internet. And you can put your endpoints behind PrivateLink as well. [Another] big trend is around customers using multiple frameworks together. You'll see a lot of focus on improving TensorFlow, improving Apache MXNet, adding Chainer support, adding PyTorch support and making sure ONNX [Open Neural Network Exchange] works really well across those engines, so that customers can take models trained in one and run them in a different engine.
What do you hear from enterprises that are reluctant or slow to adopt AI technologies? And what do you feel that you have to prove to those customers?
Wood: It's still early for a lot of enterprises, and particularly for regulated workloads, there's a lot of due diligence to do -- around HIPAA [Health Insurance Portability and Accountability Act], for example, getting HIPAA compliance in place. The question is: 'How can I move more quickly?' That's what we hear all the time.
There are two main pathways that we see [enterprises take] today. The first is: They try to learn from the academic literature, [which] is very fast-moving, but also very abstract. It's hard to apply it to real business problems. The other is: You look around on the web, find some tutorials and try to learn it that way. That often gives you something up and running that works, but again, it glosses over the fundamentals: how do you collect training data, how do you label that data, how do you build and define a neural network, how do you train that neural network.
To help developers learn, you want a very fast feedback loop. You want to be able to try something out, learn what worked and what didn't, then make a change. It's kick-starting that flywheel, which is very challenging with machine learning.
What are some usage patterns or trends you've seen from SageMaker adopters that are particularly interesting?
Wood: A really big one is sports analytics. Major League Baseball selected SageMaker and the AWS AI platform to power the production stats they use in their broadcasts and on [their] app. They've got some amazing ideas about how to build more predictive and more engaging visuals and analytics for their users. [It's the] same thing with Formula 1 [F1]. They're taking 65 years' worth of performance data from the cars -- they have terabytes of the stuff -- to model the performance of different cars, but also to look at race prediction and build an entirely new category of visuals for F1 fans. The NFL [is] doing everything from computer vision to using player telemetry -- their position on the field -- to do route prediction and things like that. Sports analytics drives such an improvement in the experience for fans, and it's a big area of investment for us.
Another is healthcare and medical imaging. We see a lot of medical use cases -- things like disease prediction, such as how likely you are to have congestive heart failure in the next 12 months, outpatient prediction, readmission prediction, those sorts of things. We can actually look inside an X-ray and identify very early-stage lung cancer before the patient even knows they're sick. [And] you can run that test so cheaply. You can basically run it against any chest X-ray.
You partnered with Microsoft on Gluon, the deep learning library. What's the status of that project? What other areas might you collaborate with Microsoft or another major vendor on an AI project?
Wood: Gluon is off to a great start. Celgene, a biotech that's doing drug toxicity prediction, is trying to speed up clinical trials to get drugs to market more quickly. All of that runs in SageMaker, and they use Gluon to build models. That's one example; we have more.
Other areas of collaboration we see are around other engines. For example, we were a launch partner for PyTorch 1.0 [a Python-based machine learning library, at Facebook's F8 conference]. PyTorch has a ton of interest from research scientists, particularly in academia, [and we] bring that up to SageMaker and work with Facebook on the development.
Microsoft President Bradford Smith recently called on Congress to consider federal regulation for facial recognition services. What is Amazon's stance on AI regulation? How much should customers determine ethical use of AI, facial recognition or other cloud services, and what is AWS' responsibility?
Wood: Our approach is that Rekognition, like all of our services, falls under our Acceptable Use Policy, [which] is very clear about what it allows and what it does not allow. One of the things it does not allow is anything unconstitutional; mass surveillance, for example, is ruled out. We're very clear that customers need to take that responsibility, and if they fall outside our Acceptable Use [Policy], just like anyone else on AWS, they will lose access to those services, because we won't support them. They need to be responsible with how they test, validate and communicate their use of these technologies, because they can be hugely impactful.
The American Civil Liberties Union, among others, has asked AWS to stop selling Rekognition to law enforcement agencies. Will you comply with that request? If not, under what circumstances might that decision change?
Wood: Again, that's covered under our Acceptable Use Policy. If any customer in any domain is using any of our services in a way which falls outside of acceptable use, then they will lose access to that service.
Certainly, the Acceptable Use Policy covers lawful use, but do you think that also covers ethical use? That's a thornier question.
Wood: It is a thornier question. I think it's part of a broader dialogue that we need to have, just as we've had with motor cars and any large-scale technology which provides a lot of opportunity, but which also needs a public and open discussion.