NLP makes augmented data discovery a reality in analytics
BI vendors are increasingly using NLP technology to make their products work more like a web search, with simplified user interfaces and improved ease of use for customers.
Adding natural language processing (NLP) capabilities to business intelligence (BI) and analytics tools makes them easier to use for augmented data discovery. And these vendors are on a mission to democratize complex data analysis.
"The big difference is the use of technologies like deep learning, which allow you to have a much better understanding of language," said Nigel Duffy, global artificial intelligence leader at professional services firm EY. "We have a richer understanding of what words mean."
The current state of the art, at least among BI and analytics vendors, is text-based natural language-querying capabilities. For example, if a user types in a general query such as "all sales for April," the tool typically provides a type-ahead menu of more specific options, much as a Google search does, such as "all sales for April by count" or "all sales for April by dollar value."
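To make the type-ahead behavior concrete, here is a minimal Python sketch of prefix-based query suggestions. It is an illustration only, not how any vendor named in this article implements the feature; the `CANDIDATE_QUERIES` list and the `suggest` function are hypothetical names invented for the example.

```python
# Hypothetical catalog of fully specified queries. A real tool would
# derive these from its data model and usage logs (an assumption).
CANDIDATE_QUERIES = [
    "all sales for April by count",
    "all sales for April by dollar value",
    "all sales for May by count",
    "all returns for April by count",
]


def suggest(partial_query, candidates=CANDIDATE_QUERIES, limit=5):
    """Return candidates that begin with the typed text, ignoring case."""
    prefix = partial_query.strip().lower()
    if not prefix:
        return []
    matches = [c for c in candidates if c.lower().startswith(prefix)]
    return matches[:limit]


print(suggest("all sales for April"))
```

Typing "all sales for April" narrows the menu to the two April sales options. Production tools would likely rank suggestions and tolerate typos rather than rely on exact prefixes.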
Some vendor options with these capabilities include Tableau, Knowi BI Natural Language 2.0 platform, Microsoft Power BI and Arcadia Data. Other vendors, such as Sisense, appear to be laying the groundwork for an NLP search capability with aspirational content. But the absence of NLP search doesn't necessarily mean a tool lacks natural language capabilities.
For example, Sisense, Qlik and Tableau use Narrative Science's Quill natural language generation capabilities to "narrate" data visualizations. This facilitates a common understanding of data visualizations, which is helpful because individuals might interpret the same data visualization differently.
Few BI and analytics vendors offer NLP search today, although Gartner expects that, by 2020, half of all analytical queries will be generated via NLP or voice search -- or they'll be generated automatically.
How BI and analytics vendors are improving NLP search
Some of the deep learning techniques used to improve NLP search are convolutional neural networks (CNNs), which are generally associated with image analysis, and recurrent neural networks (RNNs), which deal with sequential data.
"There are recurrent neural networks for specific ideas and convolutional neural nets for discovering new categories or decomposing a certain number of dimensions. You can use both," said Erick Brethenoux, senior director analyst at Gartner. "Those techniques have not replaced the others, such as symbolic techniques. They complement each other."
Symbolic techniques are rule-driven, while more modern techniques are statistical.
Gartner expects that the use of graph analytics will grow in the future because enterprises will need to ask complex questions across complex data.
"With knowledge graphs, you represent information in the form of a network of entities that are linked by rich connections that have levels. So, this article is linked to that, this data is linked to that data, this data was derived from that data -- and it is also using that application or that report," Brethenoux said. "It offers a lot of benefits in addition to accelerating search, and the way we can use it at scale now with a very large number of nodes and entities is starting to become very interesting."
Knowledge graphs could be used to discover and show relationships buried in volumes of structured and unstructured data. Like many other approaches, this isn't new. In fact, Brethenoux said the old AltaVista search engine from 1995 used knowledge graphs. Google uses knowledge graphs now.
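As a rough illustration of the idea Brethenoux describes, the following Python sketch stores a tiny knowledge graph as labeled links between entities and walks it to find everything connected to a report. The entity and relation names in `TRIPLES` are invented for the example, and the helper functions are hypothetical; real graph analytics products use dedicated graph stores and query languages.

```python
from collections import defaultdict, deque

# Hypothetical (subject, relation, object) triples; names are illustrative.
TRIPLES = [
    ("q2_sales_report", "uses", "sales_table"),
    ("sales_table", "derived_from", "raw_orders"),
    ("q2_sales_report", "produced_by", "reporting_app"),
    ("forecast_model", "uses", "sales_table"),
]


def build_graph(triples):
    """Index triples so each entity maps to its outgoing labeled links."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def reachable(graph, start):
    """Breadth-first walk returning every entity reachable from start."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        # .get avoids creating empty entries for entities with no links.
        for _relation, neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order


graph = build_graph(TRIPLES)
print(reachable(graph, "q2_sales_report"))
```

Starting from the report, the walk surfaces the table it uses, the application that produced it and the raw data the table was derived from, which is the kind of buried relationship a knowledge graph is meant to expose.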
Context is key in augmented data discovery
Context is essential to rendering a relevant search result. It's not as simple as understanding the context of the words used in a search. There's also the context of the search itself: Who's asking? What do we know about that person (role/persona)? What's the person trying to achieve?
"The semantic can vary depending on the pragmatics that you might be using. Pragmatics is not just context, it's also intention," Brethenoux said. "Maybe I want to look at that sales data because I want to promote one of my salespeople -- or maybe I want to demean that salesperson so that I get the promotion."
Intent must be inferred sometimes; it's not always explicit. For example, what people type or say doesn't always represent what they actually mean. In addition, different people may use different words to convey the same thought.
Another type of context is search histories that can indicate a pattern or patterns of behavior. They can then be used to accelerate the delivery of search results. This is sometimes reflected in type-ahead search results that have been cached in local memory. More enduring results would require greater amounts of remote memory, most likely in the cloud.
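One simple way to picture history-driven type-ahead is a local, in-memory tally of a user's past queries that ranks suggestions by how often each was used. The Python sketch below assumes exactly that; the `SearchHistory` class is a hypothetical helper, not a feature of any product mentioned here.

```python
from collections import Counter


class SearchHistory:
    """In-memory record of one user's past queries (hypothetical helper)."""

    def __init__(self):
        self._counts = Counter()

    def record(self, query):
        """Count one more use of the query, normalized to lowercase."""
        self._counts[query.strip().lower()] += 1

    def suggest(self, prefix, limit=3):
        """Return past queries matching the prefix, most frequent first."""
        prefix = prefix.strip().lower()
        matches = [(q, n) for q, n in self._counts.items() if q.startswith(prefix)]
        # Sort by descending count, then alphabetically for a stable order.
        matches.sort(key=lambda item: (-item[1], item[0]))
        return [q for q, _ in matches[:limit]]


history = SearchHistory()
for query in ["sales by region", "sales by rep", "sales by region", "returns by region"]:
    history.record(query)
print(history.suggest("sales"))
```

Because this tally lives only in local memory, it disappears with the session; keeping it longer would mean persisting the counts somewhere more durable.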
Yet other interesting contexts are time and place -- a search result should be able to adapt to what the user is doing in a particular place at a certain time. Is the user driving to work? If so, a search result should probably be a spoken response. For a user looking at a mobile device on a train or in a restaurant, a simple visual and short description works. For a user on a laptop at work, a more complex presentation might be more appropriate.
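The place-and-time rules described above can be sketched as a small decision function. This Python example is a deliberately simplified assumption: the `SearchContext` fields, their string labels and the returned presentation names are all hypothetical, and a real system would infer context from many more signals.

```python
from dataclasses import dataclass


@dataclass
class SearchContext:
    activity: str  # hypothetical labels, e.g. "driving", "commuting", "working"
    device: str    # hypothetical labels, e.g. "car", "phone", "laptop"


def choose_presentation(ctx):
    """Pick a response format from the user's activity and device."""
    if ctx.activity == "driving":
        # Eyes and hands are busy, so answer aloud.
        return "spoken_response"
    if ctx.device == "laptop":
        # A larger screen at work can carry a more complex presentation.
        return "detailed_presentation"
    # Default for phones and unknown devices: keep it short and visual.
    return "simple_visual_with_short_description"


print(choose_presentation(SearchContext(activity="driving", device="car")))
print(choose_presentation(SearchContext(activity="commuting", device="phone")))
print(choose_presentation(SearchContext(activity="working", device="laptop")))
```

Checking the activity before the device reflects the article's point that what the user is doing matters more than the screen in front of them.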
"It's never as easy as picking up software and making it work," said Beena Ammanath, managing director of AI at Deloitte Consulting. "It's a machine and you need to train it."
The road ahead for natural language search tools
Combining natural language understanding and natural language generation will result in dynamic, bi-directional human-machine communication that will take several forms: text, voice and images. In text and voice scenarios, the BI or analytics solution can converse with the user to render the desired result -- regardless of data-related and query-related search complexity.
Data visualizations also will become more interactive, if not immersive, along the lines of Busby from Oblong Industries. This product focuses on immersive interfaces, not specifically BI or analytics. However, its concepts could have a ripple effect on how people interact with data and thus, augmented data discovery.
"I think the future of BI is no BI," Brethenoux added. "Don't ask me to search and look for things anymore. Give me that piece of information when I need it and if I need it. Come to me when there's something I need to know."