How speech recognition can augment UC workflows
Speech technology applications can provide new enhancements for collaboration and productivity, from real-time meeting transcriptions to personal virtual assistants.
While speech technology has been around for more than a decade, applications and use cases have recently emerged that give organizations new ways to improve employee communication and productivity. With help from AI, the accuracy of speech technology applications has approached human quality.
As use cases become more advanced, organizations need to have a thorough understanding of the speech technology features offered by their unified communications (UC) vendor, as well as stand-alone vendors. Organizations also need to understand security concerns that come with adopting speech recognition applications and how their business applications will change.
The basics of speech technology
Speech technology enables an electronic device, such as a mobile phone or computer, to recognize, analyze and understand speech or other audio. The technology uses machine learning to recognize and analyze speech signals. For example, machine learning can identify verbal patterns and predict the outcome of a call.
Machine learning in speech technology uses deep learning models to sort through and process voice data to improve future performance. Speech technology also uses natural language processing (NLP) to understand the complexity of spoken language by analyzing the syntax and semantic meaning of words.
In UC, speech recognition applications include speech to text and interactive voice response for customer service and support; personal digital assistants, like Alexa for Business; and real-time meeting transcriptions.
Speech technology in UC
UC vendors are turning to speech technology to improve workplace collaboration and productivity and to differentiate their services. AI-driven speech technology in UC includes offerings like Cisco Webex Assistant and Alexa for Business.
Organizations can also complement their UC services with purpose-built offerings, such as speech recognition and analytics, from niche vendors, like LumenVox and Speechmatics.
Speech recognition applications in UC come in two flavors: command-and-control apps and AI-driven apps for personal productivity.
Command-and-control apps use speech commands to automate tasks that would otherwise be done through text-based channels, such as searching calendars for meeting availability and sharing documents. Speech commands could also enable users to control the meeting room, from adjusting lighting to controlling AV equipment.
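As a rough illustration of the command-and-control pattern, the Python sketch below routes an already-transcribed utterance to a task using simple keyword rules. The task names, keyword sets and the route_command function are hypothetical and are not taken from any UC vendor's API; a real product would sit behind a speech-to-text engine and use a far more capable intent model.

```python
from typing import Optional

# Hypothetical command table: a task is triggered when all of its
# keywords appear in the transcribed utterance.
COMMANDS = {
    "check_calendar": {"calendar", "availability"},
    "share_document": {"share", "document"},
    "dim_lights": {"dim", "lights"},
}


def route_command(transcript: str) -> Optional[str]:
    """Return the first task whose keywords all occur in the transcript."""
    # Lowercase and strip basic punctuation so "lights." matches "lights".
    cleaned = transcript.lower().replace(",", " ").replace(".", " ")
    words = set(cleaned.split())
    for task, keywords in COMMANDS.items():
        if keywords <= words:  # every keyword was spoken
            return task
    return None  # no command recognized
```

Under these assumptions, route_command("Please dim the lights.") returns "dim_lights", while an utterance that matches no rule returns None and can be ignored.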
AI-driven speech recognition apps are intended to improve personal productivity through speech to text and text to speech. These apps can perform small tasks, like generating an email using voice commands, or larger tasks, like transcribing meetings in real time.
Speech technology security concerns
In the UC space, speech recognition applications are commonly found in the form of chatbots and personal assistants. Chatbots generally operate on a single-turn exchange, where the user gives a command or query to perform a specific task, such as turning on a light or checking messages.
Personal virtual assistants, such as Alexa for Business and Microsoft's Cortana, use NLP and machine learning to improve over time. They can perform the same kinds of tasks as chatbots, such as launching a meeting in a conference room, but they also retain information about a user to create a contextualized experience and better respond to user needs. Some virtual assistants also offer open APIs to integrate with workflows and applications.
Chatbots and virtual assistants do come with some security concerns regarding user privacy, particularly if the chatbot or assistant is part of a device that passively listens to conversations in offices and meeting rooms. But these devices generally don't transmit data until they are activated by a key phrase.
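The key-phrase behavior can be pictured with a small Python sketch. It is a deliberate simplification: real devices detect the key phrase in the audio signal on the device itself, whereas this example works on text utterances to keep the logic visible. The wake phrase "hey assistant" and the function name utterances_to_transmit are hypothetical.

```python
from typing import Iterable, List

WAKE_PHRASE = "hey assistant"  # hypothetical key phrase


def utterances_to_transmit(utterances: Iterable[str]) -> List[str]:
    """Return only the utterance that immediately follows the wake phrase.

    Everything else is dropped locally and is never sent off the device.
    """
    transmitted = []
    armed = False  # True only right after the wake phrase was heard
    for text in utterances:
        if armed:
            transmitted.append(text)
            armed = False  # single turn: return to passive listening
        elif text.lower().strip() == WAKE_PHRASE:
            armed = True
    return transmitted
```

In this sketch, a conversation of ["budget is confidential", "Hey assistant", "start the meeting", "more private talk"] results in only "start the meeting" being passed on.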
Some organizations, particularly in regulated industries, like healthcare, may have concerns regarding archiving conversations and controlling access to chatbots and virtual assistants if they involve sensitive data.
Speech recognition expectations
Not all users will be comfortable communicating with a chatbot or virtual assistant. But, as speech recognition in the enterprise grows, organizations should educate users on what to expect from speech technology applications.
Users should expect workplace applications to become more voice-driven with the adoption of speech technology. They will be able to use voice commands to complete tasks, from note taking to sending email.
But users should keep in mind that speech technology may never be perfect. While the accuracy of speech recognition is approaching human quality and improves the more the technology is used, it may never reach 100%.
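Recognition accuracy is commonly reported as word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the recognized text into a reference transcript, divided by the number of words in the reference. The article does not say which metric underlies its accuracy claims, so the Python sketch below is only one common way an organization might measure a tool against its own recordings. The function name word_error_rate is chosen for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra recognized word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For the reference "schedule the meeting for monday" and the recognized text "schedule a meeting monday", the function counts one substitution and one deletion across five reference words, giving a WER of 0.4. A perfect transcript scores 0.0.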