AI-assisted image and video search is next content frontier
You can search text, but it's harder to catalog images and video. New tools are automating those processes at the enterprise level.
The larger the enterprise, the bigger the content lake -- or swamp, depending on the murkiness and degree of difficulty finding stuff. In those metaphorical pools swim vast stores of text, video and images containing content critical to institutional knowledge, employee education and marketing reuse.
The problem is tracking and cataloging the content and making it searchable, so employees can find what they need when they need it. Image and video search is especially difficult, because metadata is the only text a conventional search can latch onto in those files. And the metadata is only as good as the human curation describing the contents.
Rudimentary image and video search utilities have existed for years, but today's tech epoch is marked by a convergence of storage capacity, computing power and AI. Add ever-more-accurate natural language processing (NLP) and voice-, face- and object-recognition technologies, and non-text content on enterprise servers and clouds becomes far more discoverable.
Everyone wants Google-grade search
Enterprise search, in general, can't possibly rival Google search, even though employees might have grown accustomed to that in their personal lives, according to Kim Frehe, manager at SharePoint consultancy Avanade, who spoke earlier this year at SharePoint Fest D.C. "Google employs a lot of people to make their search work like Google," Frehe said during her presentation. "I don't know if your organization has that much money to hire all those people to do that all the time, but it does require a little bit of work."
AI may help push enterprise search toward Google standards, making text, image and video search results more accurate and relevant -- even if that AI-powered search will always lag behind Google and its engineering army. Machine learning search tools are getting smarter and easier to put into action each year, vendors claim. And judging by near-full attendance at conference sessions on content management AI product roadmaps this year, there's a lot of customer interest in using such tools.
Humans can't keep up with content proliferation
Image and video search is more complex than text search, because metadata offers only a cursory description of what those file types contain. AI tools, however, show promise in automating the curation of ever-expanding multimedia content stores by writing each file's metadata based on what facial, speech and object recognition identifies.
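As a rough illustration of that automation, the sketch below turns recognition results into searchable metadata. The detection tuples, the `build_metadata` helper and the 0.8 confidence cutoff are all hypothetical. Real recognition services return richer output, and none of the vendors quoted here has described its pipeline in this detail.

```python
def build_metadata(detections, min_confidence=0.8):
    """Group recognized labels by kind, keeping only confident, unique ones.

    detections: iterable of (kind, label, confidence) tuples, a made-up
    stand-in for the output of face, speech and object recognition.
    """
    tags = {}
    # Visit the most confident detections first so each list is ordered by confidence.
    for kind, label, confidence in sorted(detections, key=lambda d: d[2], reverse=True):
        if confidence < min_confidence:
            continue  # too uncertain to write into the file's metadata
        labels = tags.setdefault(kind, [])
        if label not in labels:
            labels.append(label)
    return tags


# Hypothetical detections for one training video.
detections = [
    ("object", "printer", 0.97),
    ("speech", "toner", 0.91),
    ("object", "laptop", 0.42),
    ("face", "presenter", 0.88),
    ("object", "printer", 0.85),
]
metadata = build_metadata(detections)
print(metadata)
```

The low-confidence "laptop" label is dropped and the duplicate "printer" is stored once, which is the kind of cleanup a human tagger would otherwise do by hand.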
AI tools are helping OpenText Corp. customers catalog previously unfindable or laborious-to-tag content in notoriously difficult digital assets such as video streams, said Guy Hellier, vice president of product management at OpenText, during the company's 2018 user conference.
OpenText plans to expand the capabilities of its AI tools. "If you're using more digital media in more places," Hellier explained, "you need algorithms that are going to help you better take advantage of and describe that content -- and minimize processing costs." Accuracy is improving every year, and "algorithms are changing very rapidly," he added.
After everything is cataloged, there's another issue AI needs to tackle. Search rankings can be problematic with an organization's frequently used topics. Content management system native search tools, such as those in SharePoint, tend to return thousands more results than the searcher wants.
Results then may be ranked by date or other criteria that don't match the searcher's intent, so the one file the searcher needs could be buried several pages down and derail productivity. Properly tuned AI algorithms that record and analyze searcher behavior can learn to rank and filter results and require less employee search time.
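One simple way to picture behavior-based ranking is to boost each result by how often past searchers opened it for the same query. The sketch below is a minimal illustration under that assumption. The `rerank` function, the `weight` parameter and the sample scores are invented for the example and do not represent any vendor's algorithm.

```python
from collections import Counter


def rerank(results, click_counts, weight=0.5):
    """Re-order (doc_id, base_score) pairs using logged click counts.

    click_counts maps doc_id -> how many times searchers opened that file
    after running this query. Each file's share of clicks, scaled by
    `weight`, is added to the score from the native search engine.
    """
    total_clicks = sum(click_counts.values()) or 1  # avoid dividing by zero

    def boosted(item):
        doc_id, base_score = item
        return base_score + weight * click_counts.get(doc_id, 0) / total_clicks

    return [doc_id for doc_id, _ in sorted(results, key=boosted, reverse=True)]


# Hypothetical native results: the oldest file scores highest on text match.
results = [
    ("budget_2016.pptx", 0.90),
    ("budget_2018.xlsx", 0.85),
    ("budget_draft.docx", 0.70),
]
clicks = Counter({"budget_2018.xlsx": 8, "budget_draft.docx": 2})
ranked = rerank(results, clicks)
print(ranked)
```

Here the 2018 spreadsheet moves to the top because most searchers opened it, even though the native engine scored the 2016 file higher.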
Cloudtenna crashes the scene
At least that's the hope for new search players like Cloudtenna. The startup claims it has 30 beta customers, including large enterprises and universities, and is currently funded by $4 million in venture capital and a strategic investment by Citrix.
There are other complicating factors. Most enterprises harbor content on both on-premises servers and in cloud services such as Microsoft Office 365, Google Docs, Box and other popular platforms, making all-enterprise searches difficult. In addition, access and identity controls restrict some content to a subset of employees.
"Files are scattered across various applications, some cloud or on-premises servers," Cloudtenna CEO Aaron Ganek said, adding that cloud applications like Salesforce and Microsoft OneDrive pose technical challenges. "Files are living in multiple different containers. Search requires too much effort; it's an absolute pain to find a file."
Cloudtenna claims it can stitch all these disparate sources together, even for large companies spread across cloud and on-premises networks and no matter how accurate the metadata, by using machine learning tools to return relevant results in 400 to 600 milliseconds. That search speed includes unstructured image and video content searches using a combination of widely available tools like the Google Vision API and Google TensorFlow as well as planned upgrades, Ganek said.
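The general idea of searching several repositories at once while honoring access controls can be sketched in a few lines. This is not Cloudtenna's implementation, which the company has not detailed here. The source names, group names and the assumption that relevance scores from different systems are directly comparable are all hypothetical simplifications.

```python
import heapq


def federated_search(sources, user_groups, limit=3):
    """Merge hits from several repositories into one ranked list.

    sources: dict of source name -> list of (file_name, score, allowed_groups).
    Files are kept only if the user belongs to at least one allowed group.
    Scores are assumed to be comparable across sources.
    """
    visible = [
        (score, file_name, source)
        for source, hits in sources.items()
        for file_name, score, allowed_groups in hits
        if user_groups & allowed_groups  # non-empty intersection means access
    ]
    return heapq.nlargest(limit, visible)


# Hypothetical hits for the query "roadmap" from two repositories.
sources = {
    "on_prem_share": [
        ("roadmap.pptx", 0.92, {"product"}),
        ("salaries.xlsx", 0.95, {"hr"}),
    ],
    "cloud_drive": [
        ("roadmap_notes.docx", 0.81, {"product", "sales"}),
    ],
}
hits = federated_search(sources, user_groups={"sales", "product"})
for score, file_name, source in hits:
    print(f"{score:.2f}  {file_name}  ({source})")
```

The HR-only spreadsheet never appears for this user, even though it has the highest raw score.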
Panopto's own NLP for video
Some organizations that need more powerful image and video search may opt for a third-party tool to augment their enterprise search technology. Synaptics, which makes biometric and touchscreen sensors as well as displays for many consumer and business devices, needed such video search firepower to catalog educational, technical and product-training content for internal use by its 2,400 employees.
"Sometimes a video is worth a million words," said Minette Chan, director of global learning solutions at Synaptics. "Imagine we're trying to fix a smartphone display issue; we're trying to describe 'this pixel' and 'this thumb.' ... The video just makes it so much more easy."
But Chan couldn't possibly catalog all of Synaptics' videos by herself, including content trapped on single copies of VHS tapes. After evaluating several vendors, her team selected multimedia platform vendor Panopto, along with APIs that connect to many popular content management platforms. Now, Chan said, the image and video search tool has made much more Synaptics content accessible to employees and quickly findable.
Panopto's proprietary NLP engine is customized for video files, according to the vendor's marketing vice president, Ari Bixhorn. Customers, he said, are looking to catalog livestreamed content such as corporate all-hands meetings and training materials for later use as well as field-service support content.
Corporate video content differs from the average three-minute clips consumers typically watch on YouTube, Bixhorn said. Customers typically know what they're getting with a quick YouTube clip, so internal search and image recognition adds little value. Corporate video content, however, is a whole other story.
One Panopto customer that dispatches printer repair techs to offices is replacing paper-printed manuals with video content delivered on Apple iPads. That requires video search that can quickly identify the printer's make, model, problem and supporting details. Previously, the company's field techs had to "hunt and peck in the timeline," Bixhorn said. Now, automated speech recognition and cataloging of the video clip content is perhaps the most crucial component in helping the field techs speed up repairs.
"When you look at our typical customer use cases for video, such as compliance training videos or employee onboarding, those videos are 15 to 60 minutes on average," Bixhorn explained. "Employees don't just want to find a video, they want to find a precise moment somewhere inside of that video."
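Finding a moment inside a video usually comes down to searching a time-stamped transcript rather than the video itself. The sketch below assumes speech recognition has already produced (start time, text) segments. The `find_moments` and `as_timestamp` helpers and the sample transcript are hypothetical, and the plain substring matching is far cruder than a production NLP engine such as the one Panopto describes.

```python
def find_moments(transcript, query):
    """Return (start_seconds, text) for segments containing every query word.

    transcript: list of (start_seconds, text) pairs, assumed to come from
    automated speech recognition. Matching is case-insensitive and
    substring-based.
    """
    words = query.lower().split()
    return [
        (start, text)
        for start, text in transcript
        if all(word in text.lower() for word in words)
    ]


def as_timestamp(seconds):
    """Format a number of seconds as MM:SS for display in a player timeline."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


# Hypothetical transcript segments from a long repair-training video.
transcript = [
    (0, "Welcome to the onboarding session"),
    (754, "To replace the fuser, open the rear panel"),
    (1310, "Paper jam errors usually clear after a restart"),
]
moments = find_moments(transcript, "replace fuser")
for start, text in moments:
    print(as_timestamp(start), text)  # prints: 12:34 To replace the fuser, open the rear panel
```

A player could jump straight to the returned start time, which spares the viewer from scrubbing through the timeline by hand.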
New tools that apply AI to assistive technologies like object, voice and facial recognition are bringing image and video search up to par with text search, which Yahoo and later Google essentially perfected more than a decade ago.