OCR Software Design and Development
Project overview

The project automates OCR processing of the client’s document archives and provides a way to store and work with the information extracted from the documents. Its main goal is an AI-based OCR solution that recognizes text and images quickly and accurately and minimizes manual proofreading. An equally important part of the project is a tool for navigating the scanned text and finding the requested serial or part numbers.

Client
The client is a German manufacturer that operates in industrial engineering and serves various sectors. It provides solutions for oil and gas, renewable energy, water transport, and other industries, and produces pumps, motors, engines, and other equipment.
Key Features
  • The OCR tool works seamlessly once document scans have been uploaded to the S3 bucket, providing accurate text recognition.
  • The document digitization software has an easy-to-use, easy-to-navigate web interface that supports convenient search by keywords or numbers, with the exact result highlighted on the scanned page.
Project Facts
Technologies: Angular, TypeScript, SQL, AWS Textract, S3 Bucket
Location: Germany
Project duration: 600 man-days
Team: Senior developer, mid-level web developer, designer, QA engineer, project manager
Software development process: Waterfall
Business Challenge

The customer sought a partner to help with a long-term project. The main issue was that whenever one of their clients requested additional parts or the replacement of old pumps, staff had to search manually for the product specifications and BOMs (bills of materials) in the physical archive, which could take days, if not weeks. The customer wanted to digitize the process and give the sales department a tool for navigating the old documents quickly and efficiently.

Solution

The Chudovo team approached this task with great attention to detail and to the client’s requests. After the discovery phase and discussions, it was agreed to divide the project into several milestones, one for each of the major functionalities requested:

  • OCR software core development
  • Web interface development
  • Automation of data flow
  • Storage and text recognition optimization
  • Management tools for the project, admin panel

Chudovo specialists researched the OCR software solutions available on the market. AWS Textract was chosen because it offered optimal operational costs and works well with storage solutions such as S3 buckets, which allowed for seamless integration of the products and smooth operation of the service. The web interface was designed and developed to give easy access to all processed pages of the documents and to include a search function for finding content by serial number, part number, part name, etc. It was built with Angular and TypeScript. With these technologies, Chudovo delivered a fast and optimized tool that helps users quickly find the desired text or serial numbers using forms.
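
As an illustration only, the search-and-highlight behaviour described above could be sketched in TypeScript roughly as follows. The sketch assumes the recognized text is available as Textract-style blocks, of which only a small subset of fields is modelled here. The names `findPartNumber`, `toPixels`, and `SearchHit`, as well as the normalization rule, are hypothetical and are not taken from the project itself.

```typescript
// Subset of the fields found in an AWS Textract block; the real response has more.
interface BoundingBox { Left: number; Top: number; Width: number; Height: number; }
interface TextractBlock {
  BlockType: string;                      // e.g. "PAGE", "LINE", "WORD"
  Text?: string;
  Page?: number;
  Geometry?: { BoundingBox: BoundingBox };
}

// Hypothetical result shape a UI could use to draw a highlight.
interface SearchHit { page: number; text: string; box: BoundingBox; }

// Assumed normalization: uppercase, drop whitespace and hyphens,
// so that "pn 4471-b" matches "PN-4471-B".
function normalize(value: string): string {
  return value.toUpperCase().replace(/[\s-]+/g, "");
}

// Return every recognized line that contains the requested number.
function findPartNumber(blocks: TextractBlock[], query: string): SearchHit[] {
  const needle = normalize(query);
  if (needle.length === 0) return [];
  const hits: SearchHit[] = [];
  for (const block of blocks) {
    // Only LINE blocks are searched, so a match is not reported twice via its WORD blocks.
    if (block.BlockType !== "LINE" || !block.Text || !block.Geometry) continue;
    if (normalize(block.Text).includes(needle)) {
      // Page may be absent for single-page results, so fall back to 1.
      hits.push({ page: block.Page ?? 1, text: block.Text, box: block.Geometry.BoundingBox });
    }
  }
  return hits;
}

// Bounding box values are ratios of the page size; convert them to pixels
// for drawing a highlight rectangle over the scanned image.
function toPixels(box: BoundingBox, imageWidth: number, imageHeight: number) {
  return {
    x: Math.round(box.Left * imageWidth),
    y: Math.round(box.Top * imageHeight),
    width: Math.round(box.Width * imageWidth),
    height: Math.round(box.Height * imageHeight),
  };
}

// Example with made-up data.
const sampleBlocks: TextractBlock[] = [
  { BlockType: "LINE", Text: "Bill of materials", Page: 2,
    Geometry: { BoundingBox: { Left: 0.1, Top: 0.1, Width: 0.4, Height: 0.05 } } },
  { BlockType: "LINE", Text: "Part No. PN-4471-B  Impeller", Page: 2,
    Geometry: { BoundingBox: { Left: 0.1, Top: 0.25, Width: 0.5, Height: 0.05 } } },
];
for (const hit of findPartNumber(sampleBlocks, "pn 4471-b")) {
  console.log(hit.page, hit.text, toPixels(hit.box, 1000, 2000));
}
```

Searching line blocks rather than word blocks keeps a number that was split across several words findable, at the cost of highlighting the whole line instead of the exact characters.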

Business Impact

With the OCR software solution in place, the client not only achieved the desired increase in the digitization of old archived documents but also improved the ability of the sales and aftermarket departments to provide the best solution for their customers. The client also entered the tool in government programs that facilitate the use of AI tools in software development, which allowed the project to receive additional funding.
