Get started with natural language processing APIs in cloud
As the cloud makes AI technology increasingly accessible to enterprises, more developers look to incorporate natural language processing into their apps.
Applications that can do more with fewer resources are a shoo-in for market disruption, which is why AI has garnered so much attention among developers.
With the popularity of voice assistant technologies, natural language processing, along with the APIs and services built on it, has become one of the most in-demand -- and best understood -- subdisciplines of AI. Decades of research support the field, and it's used in countless products to analyze speech and text for language and sentiment, improve the ability to search unstructured data and even parse intent from conversations as they happen.
Natural language processing has only recently become affordable enough to productize for the general public. Today, it is so commonplace that the major cloud providers -- as well as a number of smaller players -- offer it as a service. Each vendor has its own feature set to process natural, human-readable text.
Let's review some of the most prominent natural language processing APIs and cloud-based services, as well as ways developers can incorporate them into applications.
Amazon Comprehend
This AWS managed service uses machine learning to extract key phrases and identify the language in a given text. Amazon Comprehend can work with any AWS-supported application, and it has features such as sentiment analysis, tokenization and automatic organization of text files by topic.
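As a rough illustration of how those features fit together, the sketch below chains three Amazon Comprehend calls through the boto3 SDK: detect the dominant language, then request sentiment and key phrases in that language. The function names, the default region and the shape of the returned dictionary are illustrative assumptions, and a real call needs boto3 installed and AWS credentials configured.

```python
def summarize_text(client, text):
    """Detect the language of `text`, then its sentiment and key phrases."""
    # detect_dominant_language returns candidate languages with confidence scores.
    languages = client.detect_dominant_language(Text=text)["Languages"]
    language_code = max(languages, key=lambda lang: lang["Score"])["LanguageCode"]

    # The other two calls need to be told which language the text is in.
    sentiment = client.detect_sentiment(Text=text, LanguageCode=language_code)
    phrases = client.detect_key_phrases(Text=text, LanguageCode=language_code)

    return {
        "language": language_code,
        "sentiment": sentiment["Sentiment"],
        "key_phrases": [phrase["Text"] for phrase in phrases["KeyPhrases"]],
    }


def make_comprehend_client(region_name="us-east-1"):
    """Create a real client. The region is an assumption; use your own."""
    import boto3  # imported lazily so summarize_text can be tried with a stub

    return boto3.client("comprehend", region_name=region_name)
```

With credentials in place, `summarize_text(make_comprehend_client(), "I love this service")` would issue three API requests. Sentiment and key phrase detection support fewer languages than language detection does, so production code should handle the case where the detected language is rejected.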
Amazon Comprehend stands out because of its pretrained medical variant: Amazon Comprehend Medical. This service can pick out medical information in a given text, which enables advanced analysis of medical records from any number of sources to identify details such as medical conditions and medications.
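A minimal sketch of that kind of extraction, assuming the boto3 `comprehendmedical` client and its `detect_entities_v2` operation, might group the returned entities by category. The helper name and the 0.5 confidence cutoff are arbitrary choices made for illustration.

```python
from collections import defaultdict


def extract_medical_entities(client, text, min_score=0.5):
    """Group detected entities by category, e.g. MEDICATION or MEDICAL_CONDITION."""
    response = client.detect_entities_v2(Text=text)
    grouped = defaultdict(list)
    for entity in response["Entities"]:
        # Each entity carries a confidence score; drop the uncertain ones.
        if entity["Score"] >= min_score:
            grouped[entity["Category"]].append(entity["Text"])
    return dict(grouped)


def make_medical_client(region_name="us-east-1"):
    """Create a real client. Needs boto3 and AWS credentials; region is assumed."""
    import boto3  # lazy import keeps the helper above usable with a stub

    return boto3.client("comprehendmedical", region_name=region_name)
```

The cutoff is worth tuning against your own records, because a threshold that suits discharge summaries may be too strict or too loose for free-form clinical notes.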
Microsoft Azure Cognitive Services
The Microsoft Azure portfolio of natural language processing tools is broken out into several more targeted services. For example, if developers want to build applications that can analyze the sentiment or identify the language of a given text, they can use the Azure Text Analytics API. Alternatively, the Azure Language Understanding Intelligent Service (LUIS) can interpret user intent. This is especially valuable when developers build chatbots, voice-powered products and even customer service platforms.
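For the Text Analytics case, a hedged sketch using the `azure-ai-textanalytics` Python package could look like the following. The helper names and the output dictionary are assumptions made for illustration, and the endpoint and key would come from your own Azure resource.

```python
def analyze_documents(client, documents):
    """Return the language and sentiment for each string in `documents`."""
    # Both calls return one result per input document, in input order.
    languages = client.detect_language(documents)
    sentiments = client.analyze_sentiment(documents)

    results = []
    for text, language, sentiment in zip(documents, languages, sentiments):
        # A per-document failure is reported as a result with is_error set.
        if language.is_error or sentiment.is_error:
            results.append({"text": text, "error": True})
            continue
        results.append({
            "text": text,
            "language": language.primary_language.iso6391_name,
            "sentiment": sentiment.sentiment,
            "positive_score": sentiment.confidence_scores.positive,
        })
    return results


def make_text_analytics_client(endpoint, key):
    """Create a real client. Needs the azure-ai-textanalytics package."""
    from azure.ai.textanalytics import TextAnalyticsClient
    from azure.core.credentials import AzureKeyCredential

    return TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
```

Checking `is_error` per document matters because one malformed input does not fail the whole batch.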
Google Cloud Natural Language
With Cloud Natural Language, Google also puts a heavy emphasis on entity extraction, sentiment analysis, syntax analysis and categorization. However, this API differs from the others because it is powered by Google's own deep learning models -- the same ones that drive the query comprehension behind Google Search and the language understanding system behind Google Assistant.
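To give a feel for the sentiment and entity features, here is a minimal sketch that assumes the `google-cloud-language` Python package and its `language_v1` client. The helper names and the returned dictionary are illustrative, and authentication through application default credentials is assumed.

```python
def analyze_with_google(client, document):
    """Return overall sentiment plus entities ordered by salience."""
    sentiment = client.analyze_sentiment(
        request={"document": document}
    ).document_sentiment
    entities = client.analyze_entities(request={"document": document}).entities

    # Salience estimates how central an entity is to the whole text.
    ranked = sorted(entities, key=lambda entity: entity.salience, reverse=True)
    return {
        "score": sentiment.score,          # negative to positive, -1.0 to 1.0
        "magnitude": sentiment.magnitude,  # overall emotional strength, >= 0
        "entities": [(entity.name, entity.salience) for entity in ranked],
    }


def make_google_client_and_document(text):
    """Build a real client and a plain-text document for `text`."""
    from google.cloud import language_v1  # lazy import; needs google-cloud-language

    client = language_v1.LanguageServiceClient()
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    return client, document
```

A score near zero with a high magnitude usually signals mixed feelings, not a neutral text, so it helps to read the two values together.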
Third-party options
There are countless natural language processing APIs and services on the market. For example, companies such as Diffbot offer features that let users extract data specifically from websites, while other vendors, like MonkeyLearn, provide services to automate workflows based on unstructured data. Don't be afraid to hunt around for a more appropriate alternative if a particular service offers far more functionality than you might need.
It is tempting to default to the cloud provider you use for the rest of your infrastructure, but each vendor has its own strengths and weaknesses in its particular training sets. So, before you choose a provider, experiment with its services on samples of your own data.
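One way to structure that experiment is to wrap each candidate service in a function that takes text and returns a label, then score every wrapper against a small set of samples you have labeled by hand. In the sketch below, `provider_a` and `provider_b` are hypothetical stand-ins with trivial keyword rules; in practice each would call a vendor's API.

```python
def score_providers(providers, labeled_samples):
    """Return the fraction of labeled samples each provider gets right."""
    if not labeled_samples:
        raise ValueError("labeled_samples must not be empty")
    scores = {}
    for name, classify in providers.items():
        correct = sum(
            1 for text, expected in labeled_samples if classify(text) == expected
        )
        scores[name] = correct / len(labeled_samples)
    return scores


# Hypothetical stand-ins for real vendor wrappers.
def provider_a(text):
    return "positive" if "good" in text.lower() else "negative"


def provider_b(text):
    return "negative" if "not" in text.lower() else "positive"


samples = [
    ("The support was good", "positive"),
    ("Not good at all", "negative"),
    ("Good luck getting a refund", "negative"),
]

scores = score_providers(
    {"provider_a": provider_a, "provider_b": provider_b}, samples
)
```

Here `provider_a` gets one of three samples right and `provider_b` gets two, and neither handles the sarcastic third sample. That is the kind of gap a test on your own data is meant to expose.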
Dig in
Regardless of the service you choose, it can all feel a little overwhelming at first. Machine learning is a complicated field, and while natural language processing APIs are more accessible than ever, that doesn't necessarily make the technology any less confusing.
Speech to text
You can also use natural language processing to analyze speech. Most providers don't offer direct processing of the spoken word, but they do generally offer other services that convert speech to text. This can serve as an excellent intermediate step for acting on spoken data, because once the speech is converted to text, your models can work with those words like any other document.
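The two-step flow can be sketched as a small pipeline. Both `transcribe` and `analyze_text` are hypothetical callables that you would supply, for example thin wrappers around a provider's speech-to-text service and its text analysis API.

```python
def analyze_speech(audio_bytes, transcribe, analyze_text):
    """Convert audio to text, then analyze the transcript like any document."""
    transcript = transcribe(audio_bytes)
    # Silence or unintelligible audio can yield an empty transcript.
    if not transcript.strip():
        return {"transcript": "", "analysis": None}
    return {"transcript": transcript, "analysis": analyze_text(transcript)}
```

Keeping the transcript in the result makes it easier to tell a transcription mistake from an analysis mistake when the output looks wrong.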
The terminology alone can be difficult to wrap your head around. What exactly is an entity, tokenization or sentiment? It's not necessary to master all of these terms, but understanding them will help you identify other areas where you can improve an application. Get your hands dirty with the API or service you choose, as this enables you to control the input data and better understand how the output was created.
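A deliberately naive, self-contained toy can make those three terms concrete. It is not how any cloud service works internally: the word lists, the capitalization rule and the function names are all invented for illustration.

```python
import re

# Tiny hand-made word lists; real services learn this from training data.
POSITIVE_WORDS = {"love", "great", "excellent"}
NEGATIVE_WORDS = {"slow", "broken", "terrible"}


def tokenize(text):
    """Tokenization: split raw text into word-like units."""
    return re.findall(r"[A-Za-z0-9']+", text)


def naive_entities(tokens):
    """Entities: named things. Here, capitalized words past the first token."""
    return [
        token for index, token in enumerate(tokens)
        if index > 0 and len(token) > 1 and token[0].isupper()
    ]


def lexicon_sentiment(tokens):
    """Sentiment: how positive or negative the text is, as a score and label."""
    score = 0
    for token in tokens:
        word = token.lower()
        if word in POSITIVE_WORDS:
            score += 1
        elif word in NEGATIVE_WORDS:
            score -= 1
    label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    return score, label


tokens = tokenize("I love the new Acme phone, but shipping to Berlin was slow.")
entities = naive_entities(tokens)      # ['Acme', 'Berlin']
sentiment = lexicon_sentiment(tokens)  # (0, 'neutral')
```

The sample sentence scores as neutral because one positive word cancels one negative word. That also shows why a single document-level label can hide mixed opinions.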
Train
Not all training sets are created equal, which becomes obvious when you use nonstandard data. For example, sentiment analysis essentially identifies how positive or negative a piece of text is. This is frequently used in social media trending, online review analysis and customer support triage.
However, many beginners mistakenly assume the engine that powers their sentiment analysis is tuned to their data. In many circumstances, a pretrained sentiment analysis model does an excellent job. But if you have data that is highly nuanced, such as a lot of slang or colloquialisms, you will have far better results if you train the algorithm to identify the sentiment of the specific content that you plan to analyze.
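To show what training on your own content can mean in miniature, the sketch below fits a small naive Bayes classifier, with add-one smoothing, to a handful of slang-heavy examples. It is a self-contained stand-in, not any vendor's custom training feature, and the sample phrases and labels are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def train(samples):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in samples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return {
        "word_counts": word_counts,
        "label_counts": label_counts,
        "vocabulary": vocabulary,
    }


def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    total_docs = sum(model["label_counts"].values())
    vocab_size = len(model["vocabulary"])
    known = [token for token in tokenize(text) if token in model["vocabulary"]]
    best_label, best_score = None, -math.inf
    for label, doc_count in model["label_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(doc_count / total_docs)
        for token in known:  # words never seen in training are ignored
            score += math.log((counts[token] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Slang that a general-purpose model could easily misread.
slang_samples = [
    ("this track is sick", "positive"),
    ("that drop was fire", "positive"),
    ("the show was lit", "positive"),
    ("this app is trash", "negative"),
    ("the update is mid", "negative"),
    ("that ending was whack", "negative"),
]

model = train(slang_samples)
print(predict(model, "that set was sick"))     # positive
print(predict(model, "this update is trash"))  # negative
```

Six examples are far too few for real use, but the pattern carries over: label a sample of your own content, train on it and compare the results with the pretrained model before you commit to either.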