The document iceberg problem
Documents play a much bigger role in digital transformation than we would like. Numerous studies show that the vast majority of digital processes involve at least one document at some point. Almost always, these documents are neither properly digitized nor automated. That must change.
A handbrake on digital transformation
This significantly slows down digital processes and, in many cases, prevents the rapid digital transformation of individual business areas. I experience this every day in discussions with decision-makers tasked with driving their company’s digital transformation.
Reading documents: hasn’t that been solved for a long time?
One might think that the reading of documents has long since been solved.
But although we have been electronically capturing and reading documents for more than 30 years, this is clearly not yet the case in practice. At least not in the way it should be in 2021. We have still not reached the point where any document can be easily read. That’s because today’s solutions are designed primarily for high-volume document types, and even then they only work once a setup or configuration project has prepared each individual case. By a high-volume document type, we mean one that occurs in large numbers within the company. Typical examples are delivery notes, invoices, and order confirmations.
Proven solutions have been available for these document types for a long time. These are often sold and implemented as classic software in projects lasting several months. The configurations used are usually rigid and, if they need to be adapted, require a change request.
This works well because the sheer volume of documents processed allows for high savings. These high savings, in turn, allow companies to also make high investments in such systems.
If we look at the total volume of documents in a company, we quickly notice that these high-volume document types usually represent only 20% of the total document volume of the organization.
The lion’s share, around 80% of all documents, is spread across low-volume document types. With existing systems, the effort required to set up such a document type is largely the same as for a high-volume type, but the volume, as the name suggests, is small. These document types therefore lack the savings potential that would justify the initial investment.
Ergo, these low-volume document types are generally not digitized, and thus 80% of the documents are processed manually, in analog fashion.
In this respect, companies today have the choice between two bad options: either they invest vast sums of money to enable seamless digitization (which de facto does not pay off), or they continue to rely on manual processing, which results in digital processes that are only half digital.
This is because the technology that makes Intelligent Document Processing universal is simply not yet a reality. In this context, we speak of the “technological waterline”: 20% of the document volume is visible for digitization. The remaining 80% remains hidden below the “technological waterline”. This is, at its core, what we call the document iceberg problem.
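To make the economics behind the waterline concrete, here is a minimal sketch in Python. Every number in it is a hypothetical assumption chosen for illustration (the setup cost, the saving per document, the payback period, and the volumes), and the names `DocumentType`, `pays_off` and `visible_share` are invented for this example. It only shows the reasoning: a document type sits above the waterline when its savings cover the one-off setup cost.

```python
from dataclasses import dataclass


@dataclass
class DocumentType:
    name: str
    documents_per_year: int


def pays_off(doc_type, setup_cost, saving_per_document, payback_years):
    """Return True if savings within the payback period cover the one-off setup cost."""
    savings = doc_type.documents_per_year * saving_per_document * payback_years
    return savings >= setup_cost


def visible_share(doc_types, setup_cost, saving_per_document, payback_years):
    """Share of the total document volume that sits above the waterline."""
    total = sum(d.documents_per_year for d in doc_types)
    visible = sum(
        d.documents_per_year
        for d in doc_types
        if pays_off(d, setup_cost, saving_per_document, payback_years)
    )
    return visible / total


# Hypothetical company: 2 high-volume types (20% of the volume)
# and 80 low-volume types (80% of the volume).
doc_types = [
    DocumentType("invoice", 12_000),
    DocumentType("order confirmation", 8_000),
] + [DocumentType(f"low-volume type {i}", 1_000) for i in range(1, 81)]

# Hypothetical setup cost per document type with a classic project-based system.
print(visible_share(doc_types, setup_cost=30_000, saving_per_document=3.0, payback_years=2))

# Same company, but with a much lower (again hypothetical) setup cost per type.
print(visible_share(doc_types, setup_cost=5_000, saving_per_document=3.0, payback_years=2))
```

With these assumed figures, the first call yields 0.2 and the second 1.0. The documents have not changed, only the cost of setting up each document type. That is what lowering the waterline means.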
Universal Intelligent Document Processing
“Intelligent Document Processing” is a relatively new term, but the discipline itself is not.
While new technologies such as machine learning or deep learning and, quite significantly, much cheaper computing capacities have ensured that better results can be achieved quickly, I think we are at the very beginning of a development that will become increasingly important for everyone in the field of digital transformation.
“From many conversations with digital leaders over the past few years, I know the frustration that slowly seeps in as soon as you try to turn digital visions into digital realities.”
Because something always gets in the way. And more often than not, it’s the documents that play a role in digitized processes. It is of overarching importance for digitization that we, as an entire industry, tackle the “document iceberg problem”. This is because documents, contrary to the prevailing narrative, are anything but disappearing. The volume of documents is constantly increasing, and despite all the prophecies of doom, no reversal of the trend is in sight. The only thing that is decreasing is the number of documents printed on paper.
Intelligent Document Processing must become a “commodity”. It must become universal. It’s as simple as that.
An incredibly complex problem
I have dedicated the last four years of my life to solving this problem with my team. At the core of our vision is an API to which the entire industry can send documents to have them read and their data structured, and from which it can consume structured data streams without setup and without manual intervention. For literally every document on the planet. We’re not there yet, but we see a clear, if comparatively long, path to that point. Our current product is used by hundreds of customers and brings a lot of value to their processes. Just as importantly, it lays the foundation for this path to the universal Intelligent Document Processing API.
What I’ve learned is that this challenge is massively underestimated. It’s one thing to build a system that can cover one document type or the document types of one industry. Building an API that is as universal as possible is orders of magnitude more difficult, and the difficulty multiplies with each additional dimension, such as language or geographic coverage. We are, so to speak, lowering the “technological waterline” so that more and more of the iceberg becomes visible for digitization.
I’ll be happy to explain how we approach this at Parashift, why we see ourselves as a tech infrastructure company, and why we benefit massively from the “data network effect” in another article.
The potential for digitization is simply huge
The closer we get to universal Intelligent Document Processing, the clearer the incredible potential of this technology in a business context will become. You will be able to realize automated business processes that you cannot even imagine today.
Not only will it be possible to drastically reduce costs and processing times by cutting manual work. New possibilities will also emerge that we don’t even know about yet.
The fact that all of humanity spends so much time every day reading data from documents and manually entering it into systems is simply insane. We only accept this because we have no alternatives. It’s time to change that. Bit by bit. Document type by document type.