We analysed seven of the most popular AI document detection models to test how well they work "out of the box" on a set of digital invoices, and assessed how well they process invoices with various layouts and languages.

| Service | Invoice Detection Accuracy Without Items | Invoice Detection Accuracy With Items | Processing Duration per Page, s | Cost per 1,000 Pages |
|---|---|---|---|---|
| | 85.8% | 85.7% | 4.3 ± 0.2 | $10 |
| GPT-4o using third-party OCR (Prebuilt Layout model by Azure AI) | 90.8% | 86.5% | 33.0 ± 2.3 | $8.8 ¹ |
| | 88.3% | 89.2% | 16.9 ± 1.9 | $8.8 |
| | 83.8% | 68.1% | 3.8 ± 0.2 | $10 |
| | 91.3% | 91.1% | 2.9 ± 0.2 | $10 ² |
| Gemini 2.0 Pro | 90% | 90.2% | 8 ± 1.5 | $4.5 ³ |
| DeepSeek v3 API (Prebuilt Layout model by Azure AI) | 93.3% | 88.1% | 69 | $11 |

¹ Additional $10 per 1,000 pages for the text recognition (OCR) model.
² Additional $0.008 per page after one million pages.
³ Input: $1.25 for prompts ≤ 128k tokens, $2.50 for prompts > 128k tokens; output: $5.00 for prompts ≤ 128k tokens, $10.00 for prompts > 128k tokens (prices per 1 million tokens).
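
To make footnote 1 concrete, here is a minimal sketch of how the OCR surcharge adds to the per-page cost of the GPT-4o pipeline that relies on third-party OCR. The function name `cost_for_pages` and its parameters are hypothetical and used only for illustration; the prices come from the table and footnote above.

```python
def cost_for_pages(pages: int, llm_cost_per_1000: float,
                   ocr_cost_per_1000: float = 0.0) -> float:
    """Total cost in USD for `pages` pages, given prices per 1,000 pages."""
    return pages * (llm_cost_per_1000 + ocr_cost_per_1000) / 1000


# GPT-4o with third-party OCR: $8.8 per 1,000 pages for the LLM
# plus $10 per 1,000 pages for the text recognition model (footnote 1).
total = cost_for_pages(1000, llm_cost_per_1000=8.8, ocr_cost_per_1000=10.0)
print(f"${total:.2f}")  # prints "$18.80"
```

Under these assumptions, the OCR-assisted GPT-4o pipeline comes to roughly $18.80 per 1,000 pages rather than the $8.8 shown in the cost column alone.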
To achieve exceptional accuracy in extracting data from invoices, we combine multiple large language models (LLMs). Matching algorithms compare the outputs of the models, and the final result is selected by majority vote.
This ensemble approach allows us to leverage the unique strengths of each LLM, providing robust and scalable invoice data extraction for real-world business needs.
As a result, we have drastically increased the average extraction accuracy from 85% to 97%.
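
The majority-vote selection described above can be illustrated with a minimal sketch. It is not our production implementation: the `normalize` helper is a simple stand-in for the matching step (it only ignores case and extra whitespace), and the field names, function names, and sample outputs are hypothetical.

```python
from collections import Counter
from typing import Optional


def normalize(value: str) -> str:
    """Stand-in for the matching step: ignore case and extra whitespace."""
    return " ".join(value.split()).casefold()


def majority_vote(candidates: list[Optional[str]]) -> Optional[str]:
    """Return the value proposed by a strict majority of models, else None."""
    present = [c for c in candidates if c]
    if not present:
        return None
    counts = Counter(normalize(c) for c in present)
    winner, votes = counts.most_common(1)[0]
    if votes * 2 <= len(candidates):
        return None  # no value was backed by more than half of the models
    # Return the first original spelling that matches the winning value.
    return next(c for c in present if normalize(c) == winner)


def merge_extractions(outputs: list[dict[str, str]]) -> dict[str, Optional[str]]:
    """Apply the majority vote independently to every extracted field."""
    fields = sorted({key for output in outputs for key in output})
    return {f: majority_vote([o.get(f) for o in outputs]) for f in fields}


# Hypothetical outputs of three models for the same invoice.
outputs = [
    {"invoice_number": "INV-001", "total": "120.50"},
    {"invoice_number": "inv-001", "total": "120.50"},
    {"invoice_number": "INV-007", "total": "125.50"},
]
merged = merge_extractions(outputs)
print(merged)  # {'invoice_number': 'INV-001', 'total': '120.50'}
```

Voting per field rather than per document means one model's mistake in a single field does not discard its correct values elsewhere. Fields with no strict majority come back as `None` in this sketch, so they can be flagged for review instead of being guessed.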