Generative vs. Discriminative AI: How Parashift makes Document Automation smarter


This article was written by our ML team.

Document classification in the Parashift Platform
Page separation in the Parashift Platform

Embeddings

RGB as example
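The original illustration for this section did not survive extraction, but the idea behind the RGB analogy can be sketched in a few lines: a color is "embedded" as a three-dimensional vector, and similar colors end up with similar vectors, just as an embedding model maps semantically similar tokens to nearby points in a high-dimensional space. This is a hypothetical illustration, not Parashift code; the color values and the similarity measure are our own choices.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Colors as 3-dimensional "embeddings" (R, G, B).
red = (255, 0, 0)
orange = (255, 165, 0)
blue = (0, 0, 255)

# Perceptually similar colors have similar vectors ...
print(cosine_similarity(red, orange))  # high, close to 1
# ... while unrelated colors are orthogonal.
print(cosine_similarity(red, blue))    # 0.0
```

Text embeddings work the same way, only with hundreds or thousands of dimensions instead of three, and with dimensions learned from data rather than fixed color channels.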

AI Extraction in the Parashift Platform
  • By predicting a specific label for each token, it is straightforward to pinpoint the exact tokens in the document that carry the extracted information.
  • Every prediction comes with a ‘confidence score’ that reflects how confident the model is that it has found the correct value. By calibrating these confidences against actual benchmarks on a given task, one obtains a reliable measure of how trustworthy the model’s predictions really are.
  • Discriminative models are usually on the small side, especially compared to current LLMs. They typically range from a few million to a few hundred million parameters and can be trained in minutes to hours on a single GPU. This makes it feasible to train multiple specialized models tailored to specific tasks.
  • The small size of these models also brings the advantage of speed, where speed is measured as the time it takes the model to produce an interpretable answer. For document understanding, typical values are on the order of milliseconds for a handful of pages, so small documents can be processed in under a second.
  • Privacy: by their very nature, these models cannot leak sensitive training data between customers. All they do is attach labels to existing text / tokens in a document, so there is no way a model could directly reveal, for example, addresses that appeared in the training data.
  • A lot of the ‘heavy lifting’ can be done by the embedding models, whose embeddings can then be shared across multiple downstream tasks.
  • A fundamental limitation is that they cannot naturally generalize to unseen problems, even if the underlying embedding model is powerful enough to generalize beyond the tasks our downstream models were trained on. For each new problem we must either extend the training of an existing model or train a new one specialized for the new task. Since the advent of LLMs and their astonishing generalization capabilities, this approach seems almost antiquated: we would not want to train a new model from scratch just because we now also want to extract phone numbers from documents.
  • Assigning labels is a fundamentally less general way of solving problems than the generative model approach. There are many use cases that simply cannot be tackled with purely discriminative models.
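The first two points above can be sketched in a few lines of code: a token-classification head assigns one label per token and reports a softmax confidence for each prediction. This is a minimal, self-contained sketch with made-up label names and mock logits standing in for the output of an embedding backbone; it is not Parashift's actual model.

```python
import math

# Hypothetical label set for illustration only.
LABELS = ["O", "INVOICE_NUMBER", "DATE"]

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_tokens(token_logits):
    """For each token, pick the most likely label and its confidence."""
    results = []
    for token, logits in token_logits:
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((token, LABELS[best], probs[best]))
    return results

# Mock per-token logits, standing in for a trained head's output
# on the shared embeddings of a document.
mock = [
    ("Invoice",    [2.0, 0.1, -1.0]),
    ("4711",       [-0.5, 3.2, 0.0]),
    ("2024-01-31", [0.0, -0.2, 2.9]),
]
for token, label, conf in classify_tokens(mock):
    print(f"{token!r} -> {label} ({conf:.2f})")
```

Because each prediction is tied to a concrete token and carries a probability, the extracted value can be located in the document and its confidence calibrated against benchmarks, as described above.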

This is just the start of the conversation. If you’d like to explore how Parashift can support your automation journey, don’t hesitate to contact us.

