Project: The study "document/content capture in small and medium-sized Slovenian enterprises"
Leader: Andrej Dobrovoljc
Duration: 1.2.2013 – 30.6.2014


  1. Mikrografija, d.o.o., Novo mesto, Slovenia.
  2. Faculty of Information Studies in Novo mesto, Novo mesto, Slovenia.

The business process that starts with the receipt of an invoice and ends with its payment is present in every company. Companies differ mainly in how they approve invoices; the other parts of the process are very similar. The same applies to the received invoices themselves: they contain more or less the same data and differ mainly in layout. Most of the data to be captured in this process is therefore the same, and we can speak of a standard set of metadata that is regularly captured (e.g. invoice number, invoice amount, payment date, VAT number).

Most companies today are interested in automating business processes in which a large amount of data is captured from incoming documents. The main goal of our customer is to develop an application for data capture and optical character recognition (OCR) that is functionally better suited to small and medium-sized companies and more affordable for them. With such a solution, our customer will become more competitive on the market. A basic overview of the market shows that there is high potential for such an application. The core components of such products are OCR engines and related functionality. In addition to commercial OCR libraries (SDKs), there are also some open-source solutions, and the choice between them can have a long-term impact. Before making a final decision on the most suitable OCR engine and related functionality, our customer commissioned us to carry out the following research:

      • identification of the most important OCR products and related SDKs on the market (commercial as well as open source) that allow the development of custom applications;
      • comparison of the identified OCR solutions;
      • research on the use of business applications (ERP, CRM) on the Slovenian market;
      • research on the most frequently used metadata on invoices (the expected result is a standardized set of metadata);
      • development of a prototype application;
      • research on the cost-effectiveness of such a product.
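
To illustrate what a standardized set of invoice metadata might look like in practice, the following minimal Python sketch defines a record with the four example fields named in the description (invoice number, amount, payment date, VAT number) and fills it from text that an OCR engine has already produced. Everything in it is an assumption made for illustration and not a result of the study: the class name InvoiceMetadata, the function extract_metadata, the English field labels, the day.month.year date format and the amount format without thousands separators are all hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal
from typing import Optional


@dataclass
class InvoiceMetadata:
    """Hypothetical standard metadata set; field names are illustrative."""
    invoice_number: Optional[str] = None
    total_amount: Optional[Decimal] = None
    payment_date: Optional[date] = None
    vat_number: Optional[str] = None


# Assumed label patterns for an English-language invoice. Real invoices
# differ in layout and wording, so these would need to be adapted.
PATTERNS = {
    "invoice_number": re.compile(r"Invoice\s*(?:No\.?|number)\s*:?\s*(\S+)", re.I),
    # Assumes amounts like 122,00 or 122.00 with no thousands separator.
    "total_amount": re.compile(r"\bTotal\s*:?\s*([0-9]+(?:[.,][0-9]{2})?)", re.I),
    # Assumes dates written as day.month.year, e.g. 15.3.2013.
    "payment_date": re.compile(r"Due\s*date\s*:?\s*(\d{1,2}\.\d{1,2}\.\d{4})", re.I),
    # Assumes a two-letter country prefix followed by 8 to 12 digits.
    "vat_number": re.compile(r"VAT\s*(?:ID|No\.?|number)\s*:?\s*([A-Z]{2}\d{8,12})", re.I),
}


def _first_match(field: str, text: str) -> Optional[str]:
    """Return the first captured value for a field, or None if absent."""
    match = PATTERNS[field].search(text)
    return match.group(1) if match else None


def extract_metadata(ocr_text: str) -> InvoiceMetadata:
    """Fill the metadata record from already recognized (OCR) text."""
    number = _first_match("invoice_number", ocr_text)
    amount = _first_match("total_amount", ocr_text)
    due = _first_match("payment_date", ocr_text)
    vat = _first_match("vat_number", ocr_text)
    return InvoiceMetadata(
        invoice_number=number,
        # Normalize a decimal comma to a decimal point before parsing.
        total_amount=Decimal(amount.replace(",", ".")) if amount else None,
        payment_date=datetime.strptime(due, "%d.%m.%Y").date() if due else None,
        vat_number=vat.upper() if vat else None,
    )


if __name__ == "__main__":
    sample = "Invoice No: 2013-0042\nVAT ID: SI12345678\nTotal: 122,00\nDue date: 15.3.2013\n"
    print(extract_metadata(sample))
```

Fields that are not found stay empty (None), so a downstream step could flag them for manual entry. Label-based patterns like these only cover invoices whose wording matches the assumed labels; handling differing layouts is exactly where the compared OCR products and SDKs would come in.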