
How to develop a successful, modern AI infrastructure

Before AI can revolutionize business processes or decision-making, companies need a strong foundation. These tools, platforms and applications help enterprises get started with AI.

Artificial intelligence, enabled by machine learning and cognitive technologies, has taken many industries by storm. Only in the past decade has AI as a concept entered the day-to-day experience of the enterprise, and companies are now rushing to implement AI projects of all shapes and sizes. Correspondingly, government, enterprise and venture capital investment in the tools and technology that support this widespread adoption of AI has increased dramatically.

The shift from experimental research and academic approaches to accepted, everyday production systems is rapidly reshaping the tools landscape. In this environment of continuous development, it's difficult to keep track of the new tools that organizations of all types are using to build out their AI infrastructure.

Data-centric ML development tools

In the early days of AI development, all machine learning model creation and management was done locally, on machines owned and operated by data scientists. As a result, the platforms that saw early traction focused on the individual data scientist or their immediate team. Open source dominates this space, especially offerings in the Python and R ecosystems: the vastly popular scikit-learn, Keras, TensorFlow and PyTorch toolkits; the Jupyter notebook and Google's Colaboratory built on top of it; and a wide range of other open source tools and toolkits covered in a previous article on this topic.
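
As a rough illustration of this local, notebook-centric workflow, the following sketch trains and evaluates a simple classifier with scikit-learn on one of its bundled datasets. The dataset, algorithm and parameters are arbitrary choices made for illustration, not a recommended recipe.

```python
# Minimal local model-development loop with scikit-learn
# (illustrative choices of dataset, algorithm and parameters).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a small bundled dataset and hold out a test split.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit a model and evaluate it -- the kind of loop a data scientist
# typically runs interactively in a Jupyter or Colaboratory notebook.
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```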

However, open source is not the end-all for machine learning model development. On their own, these tools lack the model and data management capabilities that serious machine learning-focused data scientists and developers need. As a result, over the past decade, tools focused on the immediate needs of building and training machine learning models have emerged. These tools center on algorithm selection, tuning and evaluation, with the final result being Python, R, Java or other objects that can be used directly to answer specific ML or data science questions, or handed to production teams to be operationalized at scale.
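
A minimal sketch of that selection-and-tuning step, again using scikit-learn: a grid search picks hyperparameters, and the fitted pipeline is serialized into an artifact that a production team could load elsewhere. The parameter grid and file name here are illustrative assumptions.

```python
# Tune a model and package the result as a deployable artifact
# (illustrative parameter grid and output path).
import joblib
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Search over a small hyperparameter grid with cross-validation.
pipeline = make_pipeline(StandardScaler(), SVC())
search = GridSearchCV(
    pipeline,
    param_grid={"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01]},
    cv=5,
)
search.fit(X, y)

# The tuned pipeline becomes a Python object that can be handed off
# to production teams and loaded with joblib.load() at serving time.
joblib.dump(search.best_estimator_, "model.joblib")
print("best params:", search.best_params_)
```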

These tools include those from the major platform vendors, including Amazon, Microsoft, Google and IBM, as well as from focused data science and ML vendors such as H2O, RapidMiner, DataRobot, Databricks, Anaconda, Dataiku, Domino, KNIME, Alteryx, Ayasdi, SAS and MathWorks. Since data science and ML model development is so data-dependent and data-centric, big data vendors such as Cloudera and SAP have entered this space as well. The tools all share a focus on data centricity, with many having their origins in big data or data analytics. As a result, the core features of these systems are algorithm- and model-focused, rather than operationalization- or consumption-focused. However, model operationalization, or ML ops, is rapidly becoming the forefront of evolution for these tools.

The biggest change in the machine learning development space has been the emergence of AutoML. Given the shortage of data science skills and expertise, many ML modeling and development tools have added capabilities that automatically handle aspects of model development that used to require the time and expertise of the user. Previously, data scientists and ML developers had to clean and prepare their data, select among a wide array of algorithms, configure and manage model training, tune hyperparameters, evaluate the model and carry out a variety of additional steps to operationalize the resulting models. AutoML tools have emerged to handle many, if not all, of those steps. As a result, organizations can simply drop a data set into a tool, click a few options, then watch and wait as a suitable model is automatically selected, tuned, configured and set up for operationalization. AutoML vendors include open source solutions such as Auto-sklearn, Auto-WEKA, OptiML AutoML and TPOT, as well as commercial offerings from companies such as Cloudera, DataRobot, Google, H2O.ai, RapidMiner and others.
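
As a small example of the AutoML workflow, the open source TPOT library searches over candidate pipelines automatically. The sketch below assumes TPOT is installed and uses deliberately small search settings so the run stays short; the dataset and settings are illustrative only.

```python
# AutoML with TPOT: automated pipeline, algorithm and hyperparameter search
# (small generations/population settings chosen only to keep the run short).
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# TPOT explores preprocessing steps, algorithms and hyperparameters
# with a genetic search, then exposes the best pipeline it found.
automl = TPOTClassifier(generations=5, population_size=20,
                        random_state=42, verbosity=2)
automl.fit(X_train, y_train)
print("held-out score:", automl.score(X_test, y_test))

# Export the winning pipeline as ordinary scikit-learn code.
automl.export("best_pipeline.py")
```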

ML as a service, cloud ML, and model as a service

Given machine learning algorithms' need for data, big cloud vendors have been some of the biggest proponents and supporters of machine learning. Amazon, Google, Microsoft, IBM, Oracle, SAP and others are building substantial portfolios for machine learning development and management. In the world of developer-oriented tools, the cloud-based offerings of Amazon, Google, Microsoft and IBM stand apart from the rest.

Known as machine learning as a service (MLaaS) or cloud ML, these offerings provide the full range of development, management and operationalization tools needed to put machine learning and AI to work across a wide range of organizations' AI infrastructure. The Amazon Web Services (AWS) machine learning solution is offered primarily through Amazon SageMaker, but also includes a number of higher-level AI and ML capabilities for computer vision, natural language processing, predictive analytics and other AI application areas. IBM's Watson was one of the first commercially available platforms for developers to experiment with ML and put AI into real-world, enterprise settings. Google's Cloud ML Engine brings Google's hosted platform to bear, enabling developers and data scientists to develop and run machine learning models against their datasets. Microsoft Azure ML likewise provides a wide range of tools and solutions for data scientists, developers and administrators looking to put ML into production.
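
As one concrete MLaaS illustration, the sketch below uses the AWS SageMaker Python SDK to train a scikit-learn script on managed infrastructure and deploy it behind a hosted endpoint. The training script name, S3 path, IAM role, container version and instance types are placeholders, and the exact SDK arguments vary by version, so treat this as an assumption-laden sketch rather than a definitive recipe.

```python
# Hedged sketch: training and deploying with the AWS SageMaker Python SDK.
# Placeholders: train.py, the S3 path, the IAM role and the instance types.
import sagemaker
from sagemaker.sklearn.estimator import SKLearn

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

# Point SageMaker at a local training script; it runs on managed instances.
estimator = SKLearn(
    entry_point="train.py",        # your training script (placeholder)
    framework_version="1.2-1",     # scikit-learn container version (may differ)
    instance_type="ml.m5.large",
    role=role,
    sagemaker_session=session,
)
estimator.fit({"train": "s3://your-bucket/training-data/"})  # placeholder path

# Deploy the trained model behind a managed HTTPS endpoint.
predictor = estimator.deploy(initial_instance_count=1,
                             instance_type="ml.m5.large")
```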

Separate from the MLaaS market is the concept of model as a service. Rather than providing an environment to build, run and manage your own models, model as a service gives you access to prebuilt, pretrained models for specific tasks. Clarifai, GumGum, ModelDepot, Imagga and Sighthound are among the major companies building and curating ML models for this kind of use. As a developer, you query these hosted models and receive results for the task at hand. For example, some models identify specific objects in images, while others categorize text or process natural language. Many model-as-a-service offerings focus on specific models targeted at image recognition or text analysis, but an emerging class of companies is trying to gather broadly curated sets of models applicable to many different domains.
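
Consuming a model-as-a-service offering typically amounts to an authenticated API call. The example below posts an image to a hypothetical image-tagging endpoint using Python's requests library; the URL, header names and response fields are invented for illustration and differ from vendor to vendor.

```python
# Hedged sketch: querying a hosted, pretrained model over a REST API.
# The endpoint URL, auth header and response schema are hypothetical;
# real vendors each define their own request and response formats.
import requests

API_URL = "https://api.example-model-service.com/v1/image/tags"  # hypothetical
API_KEY = "YOUR_API_KEY"  # placeholder credential

with open("photo.jpg", "rb") as f:
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"image": f},
        timeout=30,
    )
response.raise_for_status()

# A typical response maps detected concepts to confidence scores.
for tag in response.json().get("tags", []):
    print(tag["label"], tag["confidence"])
```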

ML ops and the need to manage model usage

One of the newest movements in machine learning development comes with the realization that, for many organizations, the challenges start not with producing models but with using and consuming them. The need to manage the operationalization of machine learning models, or ML ops, is becoming increasingly urgent as the number of models in production continues to grow rapidly. Not only are companies producing and consuming their own models, but they're increasingly making use of vendor and third-party models as well.

Using models in production raises a number of concerns, including ensuring that models deliver reliable, secure and manageable results in an environment of continuous change. An emerging set of ML ops tools provides capabilities for machine learning model governance, version control, security, model discovery, model transparency, and model monitoring and management. These tools, such as ParallelM, ensure that only qualified users can make use of certain models, help ensure that new versions of models don't cause unpredictable results, help safeguard models from data poisoning and other cybersecurity attacks, and make sure that models continue to deliver results at the levels of accuracy and precision their usage requires.
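
No vendor's ML ops API is shown here; as a generic illustration of the kinds of checks these tools automate, the sketch below tracks approved model versions and flags accuracy drift against a threshold. All names, versions and thresholds are invented for the example.

```python
# Illustrative-only sketch of two common ML ops checks:
# version approval and accuracy-drift monitoring. All names, versions
# and thresholds are invented; real tools expose this through their own APIs.
from dataclasses import dataclass


@dataclass
class ModelRecord:
    name: str
    version: str
    baseline_accuracy: float    # accuracy measured at deployment time
    allowed_drop: float = 0.05  # tolerated degradation before alerting


# A toy "registry" of approved model versions.
registry = {
    ("churn-classifier", "1.3.0"): ModelRecord("churn-classifier", "1.3.0", 0.91),
}


def check_model(name: str, version: str, live_accuracy: float) -> None:
    record = registry.get((name, version))
    if record is None:
        raise PermissionError(f"{name}:{version} is not an approved model version")
    if live_accuracy < record.baseline_accuracy - record.allowed_drop:
        print(f"ALERT: {name}:{version} accuracy {live_accuracy:.2f} "
              f"dropped below baseline {record.baseline_accuracy:.2f}")
    else:
        print(f"{name}:{version} within tolerance")


# Example: a monitoring job reports accuracy computed on recent traffic.
check_model("churn-classifier", "1.3.0", live_accuracy=0.84)
```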

Fundamental skills still needed

Before an enterprise can get started with AI, there are a few key considerations to address. While an increasingly diverse range of users can develop and make use of models, ML developers and users still need the right skill sets to use these systems effectively. At the most fundamental level, organizations still need data scientists with mathematical knowledge and a solid understanding of algorithms in order to build their own models. To not only produce effective results but also analyze and understand those results, a citizen data scientist on the team should be well-versed in probability and statistics, an integral part of working with machine learning.

Since a fair amount of exploration and theorizing goes into determining how to use these systems to produce the desired results, having employees who can think outside the box and are willing to explore the full extent of these systems is important to getting the most out of them. Because AI platforms and tools are constantly changing, the ML team also needs to stay up to date on modern methods for ML model creation, AutoML capabilities, ML ops and other fast-moving parts of the technology ecosystem. The artificial intelligence landscape is constantly shifting, so those working in this area should understand that today's AI infrastructure platform investment might have to change tomorrow.
