With AI techniques, don't get stuck in the data quantity rut
Gartner's Erick Brethenoux gives practical advice on how CIOs can get started with AI. One of his tips: Think data quality, not data quantity.
Machine learning algorithms are often characterized as data-hungry, but Gartner's Erick Brethenoux said recently that for some AI use cases, CIOs should consider data quality -- not quantity -- when getting started.
In a recent webinar promoting his new research, Brethenoux seemed off-script compared to most analysts when he asserted, "The size of your enterprise, even the amount of data is not really conducive to whether you should be using these AI techniques."
It was one example of the practical -- and slightly unorthodox -- advice he provided to companies just getting started on the AI front. Here's a closer look at why Brethenoux highlighted data quality over quantity for some use cases, as well as why he believes it's crucial to start with a business problem. He also shared the five questions he asks of all his clients who tell him they're ready for AI.
Clean data is greater than more data
One of Brethenoux's most surprising tips during the webinar: When deciding whether to use AI techniques, data quantity is not a good rule of thumb. He gave the example of a small, regional bank client that's using AI not only to personalize loans, but also to begin providing a new service to customers with microloans.
"The amount of data they had was not great," he said. "However, the quality of that data had to be irreproachable."
In a phone interview afterward, Brethenoux reiterated this point. "The more data you have usually means the better you're able to find interesting correlations, but it's not necessary to have a huge amount of data to find something interesting," he said. If needed, enterprise AI teams can offset an internal data dearth with external data from providers.
Low-quality data, on the other hand, could trigger wild algorithmic goose chases. When an insurance company in Canada wanted to measure the risk of policyholders based on how far they lived from work, the analysis produced junk results. It turned out that the distance data was labeled in both miles and kilometers, skewing the results, according to Brethenoux.
While quality is important, he said CIOs should not feel compelled to scrub every piece of data clean and delay getting started.
"What you end up doing is continuously cleaning, and a lot of the data might not be helpful for what you're trying to do from a machine learning or predictive analytics perspective," he said. "That's why I insist you start with a use case."
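The mixed miles-and-kilometers problem Brethenoux described can be caught with a small normalization step before any analysis runs. The following is a minimal sketch, not the insurer's actual process: it assumes each record carries an explicit unit label, and the record layout, field names and sample values are all hypothetical.

```python
from statistics import mean

KM_PER_MILE = 1.609344  # international mile, exact by definition


def to_km(value, unit):
    """Convert a distance to kilometers; fail loudly on an unknown unit."""
    unit = unit.strip().lower()
    if unit in ("km", "kilometer", "kilometers"):
        return value
    if unit in ("mi", "mile", "miles"):
        return value * KM_PER_MILE
    raise ValueError(f"unknown distance unit: {unit!r}")


def normalize(records):
    """Return new records with every commute distance expressed in kilometers."""
    return [
        {"policy_id": r["policy_id"], "distance_km": to_km(r["distance"], r["unit"])}
        for r in records
    ]


# Hypothetical commute records with mixed units (illustrative values only).
records = [
    {"policy_id": "A1", "distance": 10.0, "unit": "mi"},
    {"policy_id": "B2", "distance": 10.0, "unit": "km"},
    {"policy_id": "C3", "distance": 25.0, "unit": "miles"},
]

normalized = normalize(records)

# Averaging the raw numbers silently mixes units; the normalized mean does not.
naive_mean = mean(r["distance"] for r in records)
clean_mean = mean(r["distance_km"] for r in normalized)
```

Raising an error on an unrecognized unit, rather than guessing, is a deliberate choice here: it surfaces the labeling problem early instead of letting it skew results downstream.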
Focus on business problems
To introduce AI techniques, CIOs should start with a business problem and work toward a solution. The emphasis is on the business problem. "A use case is not a good idea. A use case is a business case," Brethenoux said.
He recommended CIOs talk to the business to identify pain points and then "scope down" the problem into something they can deliver on in weeks. That's how CIOs will figure out what they don't know and what they need to look at, Brethenoux said.
The advice came partly from his observation that "asset-centric companies," such as those in manufacturing or energy, have seen more success with AI techniques than "service-centric companies," such as marketing firms.
"Most asset-centric organizations are dealing with engineering-centric cultures that start with a use case and work backward to the data and techniques needed to solve the problem," he said during the webinar. In service-centric companies, Brethenoux said the process often happens in reverse: Data professionals find trends or correlations in the data and then go looking for a problem.
Brethenoux elaborated on this point during a phone interview, adding that service-centric companies can sometimes suffer from a lack of focus, which can become problematic when introducing AI techniques to the enterprise.
"When you start dealing with a lot more data and problems, it becomes the parochial hammer looking for a nail," he said. "It's a little less of a disciplined approach to problem-solving."
Five questions to ask
When clients tell Brethenoux they want to use AI to solve a problem, he runs through a list of five questions to determine if AI is the appropriate course of action and if the client is prepared to go down the AI path. The questions are as follows:
- What is the business use case? Brethenoux said this process should include mapping out expectations and desired results, as well as determining how to measure business value if AI technologies are introduced.
- Do you have the skills? CIOs may not need to hire an army of data scientists. Brethenoux suggested they start by taking stock of internal skills and consider training the competent data engineer or subject-matter expert who is a proven Excel wizard. He also recommended hosting a hackathon to help uncover talent. "Or you may want to borrow those skills as well: Hire consultants, in the short term, to help you out and bootstrap some of these efforts," he said.
- Do you have the data? Companies may not need as much data as they think, but they will need data and it should be in reasonable shape. When Brethenoux asked a large insurance company that wanted to take on-the-spot pictures of accidents and predict the type of claim and the cost for its data, he got a stack of manila folders with photographs paper-clipped to them. The company spent the next two months digitizing, curating and labeling the images. "When you address the use case you're trying to solve, you need to make sure you have the appropriate data," he said.
- What kind of technology do you need? The business problem will drive technology decisions. Brethenoux cited several AI techniques that range in maturity -- from probabilistic reasoning such as machine learning, identified as the most mature technique, to agent-based programming, identified as the least mature technique.
- How do we organize? Once the company has a couple of proofs of concept under its belt, it should consider where such a team should live, to whom the team should report and how the team should be supported to advance its skills and techniques.