KnowledgeLake Blog

The History of Capture: From Paper to IDP

Written by Brian Posnanski | Apr 7, 2022 6:51:54 PM

Document capture has a long and impressive heritage. From the early days of microfilm in the 1930s to the invention of computer-based optical character recognition (OCR) in the 1970s, capture technology has evolved significantly. For that we can thank changing business requirements and the arrival of new technologies such as artificial intelligence (AI).

Initially, capture revolved around scanning and ingesting paper-based documents. But the primary objective of capture solutions has changed from the digitization of paper to the extraction of information from any type of unstructured content, such as documents, spreadsheets, and reports. Along the way, capture has continued to get better and better.

This blog post follows the history of capture, from the days of early digitization to the onset of intelligent automation. Interested in how we got here? Read on!

Capture 1.0: Capture Hits the Mainstream

Capture has very humble origins. In the beginning it had one priority: the digitization of paper. As more organizations leaned towards paperless initiatives, electronic capture became an essential part of large enterprises needing to digitize previously manual processes.

The first capture solutions simply focused on scanning documents and turning them into digital images in TIFF format. Early digitization did not include data extraction from the scanned images. It was enough to simply store digital versions of paper documents.

Organizations quickly wanted more. In particular, they wanted to tap the unstructured data stored in documents and turn it into structured data that could feed business processes. For instance, a digital picture of an invoice was of limited value on its own. Automatically extracting the supplier’s name and the total invoice value was far more useful.
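As a rough illustration of that step, the sketch below pulls a supplier name and an invoice total out of text that has already been through OCR. The sample text, the "Supplier:" and "Total:" labels, and the function name extract_invoice_fields are all assumptions made for this example and do not come from any particular capture product. Real invoices vary far more than two regular expressions can handle.

```python
import re
from decimal import Decimal

# Hypothetical OCR output for a scanned invoice (made up for illustration).
SAMPLE_TEXT = """\
Supplier: Acme Paper Co.
Invoice No: 10442
Total: $1,284.50
"""

def extract_invoice_fields(text):
    """Return the supplier and total found in labeled invoice text.

    A field that cannot be found is returned as None.
    """
    supplier = re.search(r"^Supplier:[ \t]*(.+)$", text, re.MULTILINE)
    total = re.search(r"^Total:[ \t]*\$?([\d,]+\.\d{2})[ \t]*$", text, re.MULTILINE)
    return {
        "supplier": supplier.group(1).strip() if supplier else None,
        # Drop thousands separators so the amount parses as a number.
        "total": Decimal(total.group(1).replace(",", "")) if total else None,
    }

fields = extract_invoice_fields(SAMPLE_TEXT)
print(fields)
```

The result is a small structured record that a downstream system could consume, which is the difference between storing an image and extracting data from it.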

This desire to extract data spawned a number of new technologies. Optical character recognition (OCR) recognized typed information. Intelligent character recognition (ICR) interpreted handwriting. And forms processing handled information entered on regular business forms. Each of these technologies increased the value of capture significantly, especially where the information continued downstream into other parts of a business process.

In many organizations, the output from scanning and data recognition was fed directly into Enterprise Content Management (ECM) systems such as SharePoint. These integrations allowed the enterprise to remove manual data entry for many processes dealing with large amounts of similar documents. The bulk processing of documents such as remittance slips and application forms became widespread. And the union of capture and ECM became standard in corporate America.

Capture 2.0: The Birth of Multichannel Capture

As the world became more digital, the three “V”s of the information explosion presented difficulties for capture. The Volume of content was not necessarily a problem. Nor was the Velocity at which content arrived at the enterprise, as capture solutions were well suited for fast and accurate processing of large volumes of content. However, the third V, Variety, was a bigger challenge. Capture 1.0 was primarily focused on converting paper to digital files. The new world saw documents and information in a range of formats, including emails, faxes, and electronic data interchange (EDI).

Business needs were also shifting. Rather than capturing documents at the end of business processes, simply for archiving, organizations were increasingly looking to digitize documents and data at the beginning of processes, with the goal of making those processes more efficient, streamlined, and centralized.

In a 2019 AIIM study on the difficulty of automating capture and information management, 35% of participating organizations predicted that the process would be “very difficult.” In the same study, 50% of participants recognized the importance of automation and saw it as a “highly important” feature to integrate into their systems. The study revealed that organizations wanted to reap the full benefits of capture and automation, but they were unsure how to get there.

Capture needed a new model to address the new needs of the enterprise. Enter Capture 2.0.

With Capture 2.0, organizations were no longer restricted to physical documents. Capture 2.0 incorporated digital content via a technique known as multichannel capture.

Multichannel capture accommodated both physical and digital documents, ingesting them from a variety of sources and channels. Soon, multichannel capture became an essential tool for organizations to streamline and simplify how they received and processed information.

Capture 2.0 introduced automation as well. The rise of workflow engines and business process management (BPM) tools enabled capture vendors to provide comprehensive capabilities to structure and control business processes within the enterprise. The capture engine would transform information into digital data and the process engine would then push that data seamlessly through workflows and into business systems.

The advanced abilities introduced by Capture 2.0 provided much-needed assistance to organizations. The period also proved to be a golden age of capture for many large vendors. However, over time a series of new technologies emerged that would take capture to the next level and bring it closer to the holy grail of capture: end-to-end automation of document processes without the need for humans.

Intelligent Capture

Capture industry veteran Harvey Spencer coined the term Capture 2.0 and defined three specific areas of interest in relation to capture:

  • Capture, which transforms physical information into digital data that can be utilized downstream.
  • Business Process Management (BPM), which enables digital data to flow seamlessly throughout systems and workflows.
  • Robotic Process Automation (RPA), which bridges the gap between capture and BPM (i.e., minimizes or eliminates the manual clicks needed to classify and route content).

At the time, these three areas were distinct entities. Not anymore. In recent years capture, BPM, and RPA have become closely connected with Artificial Intelligence (AI) as the tools that every organization needs to automate business processes.

The evolution of capture to encompass AI and RPA is an obvious and logical move. Capture and AI tools do the hard work of classifying and extracting data from incoming content. RPA acts as the traffic cop, routing the documents and data to the relevant people, processes, and systems.

Organizations can use stand-alone tools for capture, AI, and RPA, but a platform that combines them all is of more practical use.

Intelligent Document Processing (IDP) platforms do exactly that. The cognitive learning capabilities of AI and the routing aspects of RPA allow IDP to extract, classify, and organize documents faster and more effectively than ever before. IDP amalgamates the best technologies available to automate document-centric business processes. It’s the next evolution of capture.

In the eyes of analyst firm Deep Analysis, IDP has three core aspects: acquire, understand, and integrate. IDP acquires content such as emails, PDFs, and physical mail using intelligent capture tools like OCR, AI, and natural language processing (NLP). The platform then reads and classifies data with an advanced understanding of context and meaning, which is only possible with modern AI and machine learning (ML) capabilities. Finally, IDP integrates seamlessly with existing business applications and processes and feeds them data and documents. This third step is where the real business value of IDP lives.
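To make the three stages concrete, here is a toy sketch of the acquire, understand, integrate flow. Everything in it is an assumption for illustration: the Document record, the keyword rules standing in for a trained AI model, and the names of the downstream systems are hypothetical and do not describe how any specific IDP platform works.

```python
from dataclasses import dataclass

@dataclass
class Document:
    channel: str              # where the content arrived, e.g. "email" or "scan"
    text: str                 # text content (assumed already OCR'd if scanned)
    doc_type: str = "unknown"
    destination: str = ""

# Hypothetical keyword rules standing in for a trained classifier.
KEYWORDS = {
    "invoice": ("invoice", "amount due"),
    "loan_application": ("loan", "borrower"),
}

# Hypothetical mapping from document type to a downstream business system.
ROUTES = {
    "invoice": "accounts_payable",
    "loan_application": "loan_origination",
}

def acquire(raw_items):
    """Wrap (channel, text) pairs from any channel in a common Document record."""
    return [Document(channel=c, text=t) for c, t in raw_items]

def understand(doc):
    """Assign a document type by keyword match; a real platform would use ML here."""
    lowered = doc.text.lower()
    for doc_type, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            doc.doc_type = doc_type
            break
    return doc

def integrate(doc):
    """Choose the downstream system; unrecognized documents go to manual review."""
    doc.destination = ROUTES.get(doc.doc_type, "manual_review")
    return doc

inbox = [
    ("email", "Invoice 10442: amount due $1,284.50"),
    ("scan", "Loan request from borrower J. Smith"),
    ("fax", "Happy holidays from all of us"),
]
processed = [integrate(understand(d)) for d in acquire(inbox)]
for d in processed:
    print(d.channel, d.doc_type, d.destination)
```

Note the fallback to manual review in the last stage: anything the classifier does not recognize still lands somewhere a person can handle it.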

IDP sets a high bar for document capture and management. Traditional capture solutions were isolated from other systems and had only one objective: scanning documents. IDP fully integrates with other applications and can provide a platform to quickly and efficiently automate document-heavy business processes, including invoice processing, loan processing, case management, records management, and onboarding procedures.

The New Standard: Intelligent Document Processing

Capture has evolved dramatically since its birth decades ago. Scanners, data entry, multifunctional devices, and information ingestion are still at the core of document capture. But the expectations of what capture can and should do have skyrocketed. Traditional capture now works hand in hand with modern technologies such as RPA and AI. For many organizations, capture has become mission-critical.

Modern organizations must tame their incoming information chaos to advance their digital transformation journey. Modern capture in the form of intelligent document processing has set the standard, at least for now. No doubt future changes lie ahead as AI and ML tools get better and better.