An introduction to intelligent document processing for CIOs
With IDP, enterprises can bring documents into automation workflows, which can help reduce document processing time and save on operational costs. Read more about its value here.
In some ways, documents are the lingua franca of enterprise integration, since they provide a relatively standard way of passing information between individuals, enterprises and institutions. But handling those documents also involves a lot of manual work. Intelligent document processing (IDP) promises to make it easier to automate these workflows through a combination of document capture, language understanding and intelligent automation capabilities.
"IDP provides AI-driven automation for classifying documents, extracting data from them and exporting this data into corporate workflows, which helps streamline a company's business processes," said Artsiom Patotski, lead SharePoint architect and document management consultant at ScienceSoft, an IT consulting and software development company.
IDP is of particular importance for enterprises that operate with large amounts of unstructured data from documents coming from different devices and in various formats. They can combine IDP with any system within the enterprise that relies on documents, which in turn helps avoid human error, reduces document processing time and trims operational costs.
IDP a gateway to process excellence
The unstructured data buried in documents is hard to feed into better business processes. Even relatively standard documents like purchase orders or invoices can vary by company or department. As a result, organizations have often developed ad-hoc and unstructured processes to manage the various types of documents, and these processes can be challenging to connect to automated business processes.
"With intelligent document processing, your automated workflows and automatic business processes can be kicked off with actionable data to manage change, whether that is in response to a customer sending in a query or invoices being processed," said Burley Kawasaki, chief product officer at K2, a process automation tools provider.
The linking of ad-hoc and unstructured processes with business apps and fully automated workflows can deliver significant benefits that span an enterprise's end-to-end business processes.
Turning paper into a digital document is only the first step. The application of AI and machine learning can liberate the actionable business data embedded within analog and digital documents in a way that is accurate, cost-effective and highly scalable.
"The value derived from enterprise applications and digital transformation technology platforms such as business process management [BPM], customer relationship management [CRM] and robotic process automation [RPA] is heavily reliant on the availability of actionable structured business information," said Carl Hillier, senior research director at Deep Analysis, a process management advisory service.
Improving access to this data elevates the level of intelligent automation, thus maximizing the benefits realized in transforming an organization's business operations. The more structured data available, the more valuable these systems become.
Different kinds of transformation
IDP transforms the content, context, relationships and entities buried in documents into meaningful structured data, said Bruce Orcutt, senior vice president of product marketing at ABBYY, an IDP vendor.
According to Orcutt, IDP enables three kinds of intelligence: vision, understanding and insight. Vision relates to the ability to digitize text in a document, apply image analysis to optimize the readability of images and analyze the structural makeup of a document by segmenting words, phrases, sentences and paragraphs. Understanding involves using machine learning to properly classify document types and extract all relevant data, including relationships, context and entities. Insight relates to applying structure and meaning to the text to mimic human judgment, such as establishing the relationships between data (e.g., buyer vs. seller), dates, accounts and line items within tables.
Applications like RPA, CRM, ERP and BPM all need assistance in automating processes involving documents, emails and other unstructured data. IDP can apply vision, understanding and insight to integrate documents like an invoice, purchase order or bill of lading into other types of enterprise applications.
Applications where IDP shows value
According to Hillier, common examples where IDP provides significant value in the financial sector include accounts payable and receivable, supply chain and logistics, know your customer and anti-money laundering.
Orcutt noted that ABBYY is seeing a lot of traction in the insurance industry, where companies need to make sense of complex and highly variable documents relating to notification of loss, reports, estimates, invoices and supporting documents. These are full of free-form text, nested tables and variable information in unpredictable forms from variable sources.
IDP can also make it easier for employees to remotely collaborate on the authoring, editing and processing of documents. "IDP allows enterprises to maintain visibility and end-to-end tracking on documents throughout their lifecycle from anywhere and anytime," Kawasaki said.
For example, many insurance companies have manual triage processes. In this case, IDP can help to automatically triage claims to the correct team and involve a human only when there is not enough information to make a decision. Claims are then dealt with faster, and customer satisfaction improves as a result.
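The triage pattern described above can be sketched in a few lines of Python. This is only an illustration, not any vendor's implementation: the `ClaimPrediction` type, the team names and the 0.80 confidence threshold are all hypothetical, and the classifier that would produce the predictions is assumed to exist elsewhere.

```python
from dataclasses import dataclass

# Hypothetical cutoff: predictions below this confidence go to a person.
REVIEW_THRESHOLD = 0.80


@dataclass
class ClaimPrediction:
    claim_id: str
    team: str          # team suggested by an assumed upstream classifier
    confidence: float  # classifier confidence between 0 and 1


def route_claim(pred: ClaimPrediction, threshold: float = REVIEW_THRESHOLD) -> str:
    """Send confident predictions to the suggested team, the rest to a human."""
    if pred.confidence >= threshold:
        return pred.team
    return "human_review"


claims = [
    ClaimPrediction("C-001", "auto_claims", 0.97),
    ClaimPrediction("C-002", "property_claims", 0.55),
]
routes = {claim.claim_id: route_claim(claim) for claim in claims}
```

In this sketch, the first claim goes straight to its team while the second waits for a reviewer. In practice the threshold would likely be tuned against the cost of a misrouted claim.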
Healthcare organizations are also starting to use IDP to free up medical staff who would otherwise have to manage many types of documents. For example, the UK National Health Service (NHS) used Automation Anywhere's IQ Bot to help process the more than 100,000 COVID-19 test forms it received daily.
IDP has also helped the financial industry respond quickly to new demands in the wake of COVID-19 emergency measures. For example, Sunrise Bank used the Anvil IDP platform to process $127 million in Paycheck Protection Program forgivable loans for 1,600 small businesses in just five days.
Training required
It's critical to train the IDP system to work with poorly scanned documents or ones with printing issues to avoid severe mistakes in document registration and data interpretation.
"Left unnoticed, weaknesses in IDP technologies can cause systematic mistakes, which translate into real business issues such as disordered customer data, incorrect payments and broken workflows," said Ivan Kot, senior manager at Itransition.
CIOs need to budget some time for training. A good practice is to utilize machine-human collaboration, where the IDP works alongside humans to learn from their decisions in order to drive lower error rates and improve automation over time, said Charlie Newark-French, chief operating officer at Hyperscience, an automation platform.
Hillier recommends companies start with a focus on how IDP will impact the achievement of an agreed business objective. Time must be allocated to ensure the AI component's training meets or exceeds a predefined benchmark to maximize the likelihood of positive business outcomes. "Continuous monitoring of the performance of these components is an essential part of routine business operations," Hillier said.
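One simple way to picture the benchmark check is to compare extracted fields against values that staff have already validated. The sketch below is an assumption about how such a check might look, not a method described by Hillier: the function names, the sample invoices and the 0.95 benchmark are all hypothetical.

```python
from __future__ import annotations


def field_accuracy(extracted: list[dict], validated: list[dict]) -> float:
    """Share of fields whose extracted value equals the human-validated value."""
    total = 0
    correct = 0
    for ext, truth in zip(extracted, validated):
        for name, expected in truth.items():
            total += 1
            if ext.get(name) == expected:
                correct += 1
    return correct / total if total else 0.0


def meets_benchmark(accuracy: float, benchmark: float) -> bool:
    """True when measured accuracy reaches the agreed benchmark."""
    return accuracy >= benchmark


# Hypothetical sample: the second invoice total was misread.
extracted = [
    {"invoice_no": "INV-17", "total": "120.00"},
    {"invoice_no": "INV-18", "total": "98.50"},
]
validated = [
    {"invoice_no": "INV-17", "total": "120.00"},
    {"invoice_no": "INV-18", "total": "89.50"},
]

accuracy = field_accuracy(extracted, validated)  # 3 of 4 fields match
ready = meets_benchmark(accuracy, benchmark=0.95)
```

Running the same check on a fresh sample at regular intervals is one way to turn the continuous monitoring Hillier describes into a routine task.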
IDP processing steps
Kashif Mahbub, vice president of product marketing at Automation Anywhere, said common IDP steps include:
- Pre-processing: The pre-processing step improves the quality of the documents by applying techniques such as noise reduction, binarization and de-skewing.
- Classification: Documents can contain multiple pages with different formats. Intelligent document classification uses AI-based technologies to automatically classify and separate multipage documents to pull out the relevant pages of information before extraction.
- Extraction: This step uses optical character recognition to digitize documents and machine learning technologies to extract specific data. Typically, IDP offerings include a library of pretrained extraction models, pre-populated with the right fields for extraction. Relevant information is extracted from the documents before it's validated for accuracy.
- Post-processing: Once data is extracted, it goes through a series of validation rules and AI-driven techniques to improve the extraction results. Personnel overseeing this step can further validate the data, which allows the process to continuously learn and improve over time.
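The four steps above can be pictured as a chain of small functions. The Python sketch below is a toy under stated assumptions rather than how any IDP product works: the `Page` type and every function name are hypothetical, a keyword rule stands in for the AI-based classifier, regular expressions stand in for OCR and trained extraction models, and the OCR text is supplied by hand. Only the binarization in `preprocess` is a direct, if simplified, instance of a technique Mahbub names.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Page:
    pixels: list[list[int]]  # grayscale values, 0 (black) to 255 (white)
    ocr_text: str            # stand-in for what an OCR engine would return
    doc_type: str = "unknown"
    fields: dict = field(default_factory=dict)
    needs_review: bool = False


def preprocess(page: Page, threshold: int = 128) -> Page:
    """Binarization: map every grayscale pixel to pure black or pure white."""
    page.pixels = [[255 if p >= threshold else 0 for p in row] for row in page.pixels]
    return page


def classify(page: Page) -> Page:
    """Keyword rule standing in for an AI-based document classifier."""
    text = page.ocr_text.lower()
    if "invoice" in text:
        page.doc_type = "invoice"
    elif "purchase order" in text:
        page.doc_type = "purchase_order"
    return page


def extract(page: Page) -> Page:
    """Regular expressions standing in for trained extraction models."""
    if page.doc_type == "invoice":
        number = re.search(r"Invoice\s+No\.?\s*:?\s*(\S+)", page.ocr_text, re.IGNORECASE)
        total = re.search(r"Total\s*:?\s*\$?([\d,]+\.\d{2})", page.ocr_text, re.IGNORECASE)
        if number:
            page.fields["invoice_no"] = number.group(1)
        if total:
            page.fields["total"] = total.group(1)
    return page


def postprocess(page: Page) -> Page:
    """Validation rule: flag unknown types or missing required fields for review."""
    required = {"invoice": ["invoice_no", "total"]}.get(page.doc_type, [])
    missing = any(name not in page.fields for name in required)
    page.needs_review = page.doc_type == "unknown" or missing
    return page


def run_pipeline(page: Page) -> Page:
    """Apply the four steps in order."""
    for step in (preprocess, classify, extract, postprocess):
        page = step(page)
    return page


page = run_pipeline(Page(
    pixels=[[12, 200], [130, 90]],
    ocr_text="INVOICE\nInvoice No: INV-2041\nTotal: $1,250.00",
))
```

Keeping each step as a separate function means a real OCR engine or trained model could replace one stand-in without touching the others.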