
Cloud computing for machine learning offers on-demand tools

Automated machine learning and MLaaS tools are now being developed for the cloud, and enterprises need better workflows and infrastructure to successfully integrate the technology.

Fueled by the need to expand machine learning, cloud service providers are offering machine learning as a service, along with extensive storage and processing options, to save time, money and resources. These tools now offer services spanning model training, data processing, model evaluation and prediction.

Companies with a cloud-based infrastructure are starting to take advantage of cloud computing for machine learning services to provision AI training and inferencing tasks in the cloud. Popular cloud computing tools are deployed through APIs and integrated into existing microservices infrastructure.

How cloud powers AI

Cloud platforms are starting to create better building blocks for setting up and automating many aspects of the machine learning development and deployment pipeline, according to Shekhar Vemuri, CTO of Clairvoyant, an AI consultancy firm.

"Cloud providers are abstracting the infrastructure, giving data scientists a head start on most of the components, the building blocks and providing application developers the ability to consume pretrained models," Vemuri said.

These tools are offered on demand, so enterprises only pay for what they use, which makes the overall ecosystem more compelling. As the industry sees growth in two broad categories of AI-related services, machine learning as a service (MLaaS) and automated machine learning tools, providers are rolling out their own automated machine learning offerings, such as Google Cloud AutoML, Amazon SageMaker Autopilot and Watson Studio AutoAI. In addition, automated machine learning capabilities can be provisioned on top of popular open source packages, like auto-sklearn and MLBox, as well as through cloud services from companies like DataRobot and H2O.ai.

Ease of use has also improved, with better integration upstream with data wrangling and downstream with model serving capabilities. The aim of cloud platforms and tools is to better support the entire AI pipeline.

"Even until fairly recently, [automated machine learning] and MLaaS tools were more stand-alone tools that teams would use where data had to be moved in and out, whenever they had to build models and run experiments," Vemuri said.

Deploying machine learning

Deploying AI through cloud computing for machine learning can broaden enterprise adoption without requiring expert data scientists. Jeff Fried, director of product management at InterSystems, sees many enterprises using machine learning to find a sweet spot in self-tuning applications. Instead of having to analyze and tune the parameters of memory use, operations can use machine learning to find the right parameters and adjust them dynamically.

"Not only can you eliminate the analysis work, you can deploy without necessarily engineering for peak load, knowing that the system will adjust as needed," Fried said.

Another application is in anomaly detection. Operational systems create a huge amount of data, and it's hard to figure out what to pay attention to. Developers can use machine learning to identify unusual events -- for example, a transaction or user load that is much higher than expected or a CPU utilization that could correlate with future issues.

"Trying to do that with traditional models and thresholds is a ton of work, especially because things vary a lot normally with time of day, weekends, holidays or seasons. [Machine learning] makes it much easier for DevOps," Fried said.
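The anomaly detection approach Fried describes can be sketched with a standard library. This is a minimal illustration, assuming scikit-learn and synthetic CPU-utilization data (the metric values and parameters here are invented for the example, not taken from the article):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic CPU-utilization samples: mostly normal load around 40%,
# plus a few extreme values that should stand out as anomalies.
rng = np.random.default_rng(seed=0)
normal_load = rng.normal(loc=40.0, scale=5.0, size=(200, 1))
spikes = np.array([[95.0], [97.0], [3.0]])
cpu_samples = np.vstack([normal_load, spikes])

# Fit an isolation forest; contamination is the expected share of anomalies.
detector = IsolationForest(contamination=0.02, random_state=0)
labels = detector.fit_predict(cpu_samples)  # -1 = anomaly, 1 = normal

anomalies = cpu_samples[labels == -1].ravel()
print(sorted(anomalies))
```

Unlike a fixed threshold, the model learns what "normal" looks like from the data itself, which is what makes this approach more tolerant of the daily and seasonal variation Fried mentions.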

Fried believes that many enterprises will have to navigate various cultural challenges to make this democratization work in practice. Developers and test teams have traditionally worked with an expectation that the behavior of applications in production is reproducible in a test environment. But machine learning applications can change behavior in production for reasons that may be opaque to the developer and may be difficult, if not impossible, to reproduce in a test environment. As a result, development teams adopting AI in the cloud will need to learn how to deal with scenarios that can only be adjusted and not necessarily debugged, Fried said.

Better workflow is required to succeed in the cloud

For companies to get the most out of cloud computing for machine learning services, they will need to improve their process of provisioning and updating machine learning models, said Alicia Frame, lead project manager of data science at Neo4j, a graph database tools provider.


Similar to what many enterprises have already done for DevOps, development teams need to figure out the right level of services. Although cloud AI services make it easier to augment capacity for a given workload, provisioning too much means paying for more than you need, while provisioning too little means the application might crash when demand spikes.
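One simple way to reason about that trade-off is to size capacity from the observed peak plus expected growth and a safety buffer. The following is a minimal sketch; the function name, growth factor and headroom values are illustrative assumptions, not a prescribed formula:

```python
def recommended_capacity(observed_peak_rps: float,
                         growth_factor: float = 1.25,
                         headroom: float = 0.30) -> float:
    """Size capacity for expected growth plus safety headroom.

    observed_peak_rps: highest recent request rate seen in production.
    growth_factor: expected usage growth until the next review (assumption).
    headroom: extra buffer so short spikes don't crash the service.
    """
    return observed_peak_rps * growth_factor * (1.0 + headroom)

# Example: a service peaking at 800 requests/sec.
capacity = recommended_capacity(800)
print(capacity)  # 800 * 1.25 * 1.30 = 1300.0
```

The point is not the exact numbers but making the over- versus under-provisioning decision explicit and reviewable, rather than leaving it to guesswork.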

Developers also need to start thinking about the process of updating models -- teams need to decide how often a model should be retrained based on time or in response to degradation in its performance.

Both provisioning and updating require forecasting models to estimate growth trajectories for usage and data, as well as frequent monitoring and alerts to make sure you upgrade your infrastructure or retrain your model before it's too late.
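A retraining policy combining the two triggers mentioned above, elapsed time and performance degradation, can be sketched in a few lines. The function name, thresholds and accuracy figures below are hypothetical, chosen only to illustrate the decision logic:

```python
from datetime import datetime, timedelta

def should_retrain(last_trained: datetime,
                   now: datetime,
                   baseline_accuracy: float,
                   current_accuracy: float,
                   max_age: timedelta = timedelta(days=30),
                   tolerated_drop: float = 0.05) -> bool:
    """Retrain on a schedule OR when performance degrades.

    Triggers if the model is older than max_age, or if accuracy has
    dropped more than tolerated_drop below its baseline.
    """
    too_old = (now - last_trained) > max_age
    degraded = (baseline_accuracy - current_accuracy) > tolerated_drop
    return too_old or degraded

now = datetime(2020, 6, 1)
# Fresh model, but accuracy fell from 0.92 to 0.84 -> retrain.
print(should_retrain(datetime(2020, 5, 20), now, 0.92, 0.84))  # True
# Fresh model, stable accuracy -> no retrain needed.
print(should_retrain(datetime(2020, 5, 20), now, 0.92, 0.90))  # False
```

Hooking a check like this into a monitoring pipeline is one way to get the automated alerts Frame describes, so retraining happens before users notice the degradation.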

"It's meta -- you need machine learning to keep your machine learning models current and performant," Frame said.

Focus on the solution

Although broad cloud-based AI services can simplify many aspects of AI development, it's important to ensure that developers are using them to solve the right problem. Scott Stephenson, CEO and co-founder of speech recognition tools provider Deepgram, believes that many of these services focus on the technology rather than the appropriate goal.

"It's possible to get a good result if the problem that companies are trying to solve fits in a narrow window; otherwise, they end up hooking up five services where each of them are unoptimized for their specific needs," Stephenson said.

This leaves teams spinning their wheels, burning engineering time and spending millions of dollars learning how to do it right. As a result, Stephenson sees many companies move away from these cloud AI services or use them only for basic tasks. Similarly, cloud AI users need to ensure that the resulting model is easy to deploy into production, said Christian Selchau-Hansen, CEO and co-founder of Formation, which develops software for customizing marketing experiences.

"While it is now much easier to move a model from R&D into production, particularly if you already have the requisite data in the cloud, MLaaS tools are not a panacea," he said.

Before deploying MLaaS tools, enterprises should evaluate and solve the last-mile problem of seamlessly connecting raw data, AI models and production applications. If an enterprise wants to create value from machine learning insights with its customers, it needs to invest in tools, as well as processes and personnel, that can easily implement those insights across a variety of interactions in the customer journey.
