Office Automation - Document recognition

OData support
Dr. Max Gyula
Department of Automation and Applied Informatics

As computers have evolved, automating workflows has become increasingly important in every area of industry. For an average accounting firm, this means removing the human factor from simple data-processing tasks. Most electronic data storage methods keep only the image of the scanned document. High-resolution scanners are now widespread and available to these firms at a relatively low price, which makes it possible to scan invoices in good quality. Afterwards, however, clerks still have to process the scanned images and type in the data found on them.

The purpose of my thesis work is to create an automated document recognition process. The result of this process is a displayable and editable representation of the crucial data found on the input documents. The topic covers scanning invoices that are available in printed form, recognizing their text, and exporting it to a table.

During the implementation I scanned each invoice with a digital scanner and analyzed the text on it with an optical character recognition program. Then, using the software I implemented, I recognized and stored the most important data from the invoices and exported them into a table.

In this thesis I describe the character recognition software and the functions of it that I used, and I present and evaluate the finished program. Finally, I discuss possible improvements.


