Defining enterprise AI: From ETL to modern AI infrastructure

The promise of enterprise AI is built on old ETL technologies, and it relies on an AI infrastructure effectively integrating and processing loads of data.

Let's cut through the marketing hype and define enterprise artificial intelligence. Enterprise AI simulates the cognitive functions of the human mind -- learning, reasoning, perception, planning, problem solving and self-correction -- with computer systems. Enterprise AI is part of a range of business applications, such as expert systems, machine learning, speech recognition and image recognition.

Enterprise AI also encompasses automation tools, including robotic process automation, which perform high-volume, repeatable tasks otherwise handled by human workers, as well as physical robots found in warehouses or on factory production lines. Ultimately, this combination increases efficiencies and helps to perform mundane and dangerous tasks.

In addition, machine learning enables computers to learn without explicit programming. Machine learning evolved from the study of pattern recognition and computational learning theory in AI, and it now involves the construction of algorithms that can learn from and make predictions about data.

True to its label, image recognition enables machines to see. The technology captures and analyzes visual data using a camera, analog-to-digital conversion and digital signal processing. Common applications of image recognition include medical imaging and signature identification.

Organizations want greater business insights and competitive advantages. This factor is driving increased demand for enterprise AI uses and big data. As enterprises gather more and more data from a variety of sources, artificial intelligence can help identify trends and patterns much more efficiently than people.

AI has a major impact on enterprise data management strategies and IT infrastructures, and organizations need to take steps to support AI initiatives. For example, an AI infrastructure must support frameworks such as TensorFlow, Caffe, Theano and Torch, as well as GPU environments that deliver fast computational power.

Many AI infrastructure capabilities come from the cloud and infrastructure-as-a-service offerings. Service providers must continue to support AI functionality and integrate AI as a key component of their infrastructure and services.

Components of AI

AI data evolution: ETL and integration

Effective data integration is critical for enterprise AI. Data is the lifeblood of enterprise AI applications, so its extraction and storage must be optimized.

One of the earlier integration methods, ETL (extract, transform and load), refers to three distinct functions of database management combined into one programming tool. Enterprises use ETL to move data from one database to another, often to load it to and from data warehouses.

The extract function reads data from a specific database and extracts a subset of it. The transform function works with the acquired data, using rules or creating combinations with other data to convert it to the desired state. And the load function writes the resulting data to a target database.
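
The three functions described above can be sketched in a few lines of Python. This is a minimal illustration rather than a production pipeline: the in-memory SQLite databases, the `orders` and `customer_totals` tables, and the rule that keeps only orders from 2024 onward are all assumptions made up for the example.

```python
import sqlite3


def extract(source: sqlite3.Connection) -> list[tuple]:
    # Extract: read only a subset of the source table (hypothetical rule:
    # orders from 2024 onward).
    return source.execute(
        "SELECT customer, amount_cents FROM orders WHERE year >= 2024"
    ).fetchall()


def transform(rows: list[tuple]) -> dict[str, float]:
    # Transform: apply rules (normalize names, convert cents to dollars)
    # and combine rows into one total per customer.
    totals: dict[str, float] = {}
    for customer, amount_cents in rows:
        key = customer.strip().lower()
        totals[key] = totals.get(key, 0.0) + amount_cents / 100
    return totals


def load(target: sqlite3.Connection, totals: dict[str, float]) -> None:
    # Load: write the transformed result to the target database.
    target.execute(
        "CREATE TABLE IF NOT EXISTS customer_totals "
        "(customer TEXT PRIMARY KEY, total_dollars REAL)"
    )
    target.executemany(
        "INSERT OR REPLACE INTO customer_totals VALUES (?, ?)",
        sorted(totals.items()),
    )
    target.commit()


def run_etl_demo() -> list[tuple]:
    # Made-up source data standing in for an operational database.
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE orders (customer TEXT, amount_cents INTEGER, year INTEGER)"
    )
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            (" Acme ", 1250, 2024),
            ("acme", 750, 2024),
            ("Globex", 5000, 2023),  # dropped by the extract step
            ("Globex", 300, 2025),
        ],
    )
    target = sqlite3.connect(":memory:")
    load(target, transform(extract(source)))
    return target.execute(
        "SELECT customer, total_dollars FROM customer_totals ORDER BY customer"
    ).fetchall()
```

Calling `run_etl_demo()` returns one row per customer from the target database. Keeping each step in its own function mirrors the way the three functions are described separately, even though they ship as one tool.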

Numerous data integration approaches came after ETL. These include enterprise information integration (EII), enterprise application integration (EAI) and cloud-based integration platform as a service (iPaaS).

  • EII consolidates a large group of distinct data sources into a single resource, providing a unified view of an organization. Data can be saved in multiple formats, including relational databases, text, XML, and storage systems with proprietary indexing and data access schemes.
  • With EAI, companies share business processes and data across business applications, including enterprise resource planning, customer relationship management, supply chain management and other resources.
  • iPaaS services provide real-time interoperability between cloud-based applications and databases.

The latest evolution of data integration includes stream processing, data preparation and messaging platforms.

  • Stream processing refers to processing data in motion, or the processing of data as soon as it's produced or received from websites, business applications, mobile devices, sensors or other sources. After receiving an event from a stream, a processing application reacts to the event. Streaming computations can process multiple data streams at the same time.
  • Data preparation involves the collection, cleansing and consolidation of data into a single file or data table, mainly for analytics. Preparation is especially important when data is unstructured or combined from multiple sources.
  • Messaging platforms connect applications, services and devices across various technologies via a cloud-based messaging framework.
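
To make the stream processing item in the list above more concrete, here is a small sketch in which an application reacts to each event as it arrives instead of waiting for a complete batch. The `sensor_events` generator is a hypothetical stand-in for a live feed, and the event fields (`sensor`, `temp`) are assumptions for the example; a real deployment would typically read from a streaming or messaging platform.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def sensor_events() -> Iterator[dict]:
    # Hypothetical event source. Events from two sensors arrive interleaved,
    # standing in for multiple streams handled at the same time.
    yield {"sensor": "a", "temp": 20.0}
    yield {"sensor": "b", "temp": 30.0}
    yield {"sensor": "a", "temp": 22.0}
    yield {"sensor": "b", "temp": 34.0}


def running_averages(events: Iterable[dict]) -> Iterator[tuple[str, float]]:
    # React to every event as soon as it is received: update the state for
    # that sensor and emit the new running average immediately.
    counts: dict[str, int] = defaultdict(int)
    sums: dict[str, float] = defaultdict(float)
    for event in events:
        sensor = event["sensor"]
        counts[sensor] += 1
        sums[sensor] += event["temp"]
        yield sensor, sums[sensor] / counts[sensor]


if __name__ == "__main__":
    for sensor, average in running_averages(sensor_events()):
        print(sensor, average)
```

Because both functions are generators, no event is held back until the stream ends, which is the property that distinguishes this style from batch-oriented ETL.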

The difference between ETL and a similar term, ELT (extract, load, transform), lies in when the transformation takes place. With ELT, raw data is loaded into the data warehouse first and transformed there; with ETL, the data is transformed before it is loaded.
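
The ELT ordering can be illustrated with a short sketch. It assumes an in-memory SQLite database playing the role of the data warehouse, and the `raw_orders` and `customer_totals` table names are invented for the example. The point to notice is that the raw rows are loaded untouched and the cleanup happens afterward, inside the warehouse, in SQL.

```python
import sqlite3


def run_elt_demo() -> list[tuple]:
    warehouse = sqlite3.connect(":memory:")

    # Load: raw records land in the warehouse exactly as extracted,
    # including inconsistent spacing and capitalization.
    warehouse.execute(
        "CREATE TABLE raw_orders (customer TEXT, amount_cents INTEGER)"
    )
    warehouse.executemany(
        "INSERT INTO raw_orders VALUES (?, ?)",
        [(" Acme ", 1250), ("acme", 750), ("Globex", 300)],
    )

    # Transform: runs after loading, using the warehouse's own SQL engine.
    warehouse.execute(
        """
        CREATE TABLE customer_totals AS
        SELECT lower(trim(customer)) AS customer,
               SUM(amount_cents) / 100.0 AS total_dollars
        FROM raw_orders
        GROUP BY lower(trim(customer))
        """
    )
    return warehouse.execute(
        "SELECT customer, total_dollars FROM customer_totals ORDER BY customer"
    ).fetchall()
```

One consequence of this ordering is that the untransformed `raw_orders` table stays available in the warehouse, so a different transformation can be applied later without extracting the data again.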

AI data evolution: Processing and neural networks

Data processing has also evolved over time, and neural networks -- or artificial neural networks -- are a significant development within the context of enterprise AI. Modeled to some extent after the human brain, these networks form the basis of deep learning technologies. By feeding computers loads of data, deep learning technologies enable the computer to learn. For that reason, they are vital to AI initiatives.

A neural network consists of hardware and software designed to work like brain neurons. From a commercial application standpoint, neural networks aim to solve complex signal processing and pattern recognition problems, including speech-to-text transcription, handwriting recognition and facial recognition.

Typically, a neural network involves many processors that operate in parallel, arranged in highly connected tiers. The first tier receives raw data, and each successive tier receives the output from the tier preceding it. The final tier produces the output.
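
The tiered flow described above can be sketched as a simple forward pass. This is an illustration under stated assumptions: the sigmoid activation, the two-tier layout and the weight values are arbitrary choices for the example, not something prescribed here, and real systems would use a framework such as TensorFlow rather than plain Python lists.

```python
import math


def sigmoid(x: float) -> float:
    # Squashes any real number into the range (0, 1).
    return 1 / (1 + math.exp(-x))


def tier_output(inputs: list[float],
                weights: list[list[float]],
                biases: list[float]) -> list[float]:
    # Each row of weights belongs to one unit in the tier. Every unit sees
    # all outputs of the preceding tier, which makes the tiers highly connected.
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + bias)
        for row, bias in zip(weights, biases)
    ]


def forward(raw: list[float],
            tiers: list[tuple[list[list[float]], list[float]]]) -> list[float]:
    # The first tier receives the raw data; each later tier receives the
    # output of the tier before it; the last tier's result is the output.
    activations = raw
    for weights, biases in tiers:
        activations = tier_output(activations, weights, biases)
    return activations


# Made-up weights: a tier of three units followed by a one-unit output tier.
DEMO_TIERS = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.05]),
]

if __name__ == "__main__":
    print(forward([1.0, 2.0], DEMO_TIERS))
```

The units within a tier do not depend on one another, which is why this computation lends itself to the parallel processors and GPU environments mentioned earlier.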

Neural networks are adaptive and able to alter themselves as they learn over time. Data scientists and engineers train networks with large amounts of information. With each training pass, a network's output moves closer to the desired result.
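
As a rough illustration of this kind of training, the sketch below fits a single-weight model with gradient descent. It is a deliberately tiny stand-in for a real network: the training samples (which follow the rule "output is twice the input"), the learning rate and the number of passes are assumptions chosen for the example.

```python
def train(samples: list[tuple[float, float]],
          epochs: int = 50,
          learning_rate: float = 0.05) -> tuple[float, list[float]]:
    # A one-weight "network": prediction = weight * x.
    weight = 0.0
    errors: list[float] = []
    for _ in range(epochs):
        total_squared_error = 0.0
        for x, target in samples:
            prediction = weight * x
            error = prediction - target
            # Adjust the weight in the direction that reduces the error.
            weight -= learning_rate * error * x
            total_squared_error += error ** 2
        # Record the mean squared error seen during this pass.
        errors.append(total_squared_error / len(samples))
    return weight, errors


# Made-up training data where the desired output is twice the input.
SAMPLES = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

if __name__ == "__main__":
    learned_weight, history = train(SAMPLES)
    print(learned_weight, history[0], history[-1])
```

The recorded error shrinks from one pass to the next on this data, and the learned weight settles near 2, the value that produces the desired output.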

Another significant aspect of AI data processing is the need for high-quality data. While data quality has always been important, it's arguably more vital than ever with AI initiatives.

Data quality can determine enterprise AI success. If the incoming data is poor -- for example, if it contains incorrect or out-of-date figures -- the outgoing results will likely reflect that low quality and decrease the value of AI for the organization. As AI data quality improves, so do the insights.
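
One way to act on this is to screen records before they reach a model. The sketch below is a minimal example and not a complete data quality framework: the record fields (`name`, `revenue`, `updated`), the rule that revenue must be present and non-negative, and the 365-day freshness limit are all hypothetical.

```python
from datetime import date, timedelta


def quality_issues(record: dict, today: date, max_age_days: int = 365) -> list[str]:
    # Return a list of problems found in one record (empty if it looks fine).
    issues: list[str] = []
    revenue = record.get("revenue")
    if revenue is None or revenue < 0:
        issues.append("incorrect revenue")
    if today - record["updated"] > timedelta(days=max_age_days):
        issues.append("out of date")
    return issues


def split_by_quality(records: list[dict], today: date):
    # Separate records that pass every check from those that fail any,
    # keeping the reasons so rejected data can be repaired at the source.
    clean: list[dict] = []
    rejected: list[tuple[str, list[str]]] = []
    for record in records:
        issues = quality_issues(record, today)
        if issues:
            rejected.append((record["name"], issues))
        else:
            clean.append(record)
    return clean, rejected


# Made-up records for illustration.
RECORDS = [
    {"name": "north", "revenue": 120.0, "updated": date(2024, 5, 1)},
    {"name": "south", "revenue": -5.0, "updated": date(2024, 4, 1)},
    {"name": "east", "revenue": 80.0, "updated": date(2021, 1, 1)},
]

if __name__ == "__main__":
    print(split_by_quality(RECORDS, today=date(2024, 6, 1)))
```

Passing `today` in explicitly, rather than reading the system clock inside the function, keeps the freshness check repeatable.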

AI myths and the need for a reality check

AI offers clear benefits for organizations. But companies need to be aware of misconceptions about the technology and do a reality check when it comes to expectations.

In 2017, research firm Gartner described a few of the myths related to AI. One is that AI is a single entity that companies can buy. In reality, it's a collection of technologies used in applications and systems to add specific functional capabilities, and it requires a robust AI infrastructure and an organization-wide commitment. Without C-level commitment and proof of ROI, AI investments can fail.

Another myth is that every problem is an optimal use case for AI. There is enough data out there for organizations to understand where to apply artificial intelligence technologies. No longer is it advisable to spread the investment widely; rather, organizations should target their AI programs.

Yet another myth is that AI has human characteristics. AI developers use advanced analytics, special algorithms and large amounts of data to deceive people into thinking that their product learns on its own and understands and empathizes with the user. Organizations need to guard against thinking the technologies are more capable than they really are.
