
The project aims to automate OCR processing of the client’s document archives and to provide a way to store and work with the information extracted from those documents. Its core is an AI-based OCR solution that recognizes text and images quickly and precisely, minimizing manual proofreading. An equally vital part is a tool for navigating the scanned text and finding the requested serial or part numbers.
OCR Software Design and Development
- The OCR tool runs seamlessly once document scans have been uploaded to the S3 bucket, providing accurate text recognition.
- The document digitization software has an easy-to-navigate web interface that supports searching by keyword or number, with the exact result highlighted on the scanned page.
Business Challenge
The customer sought a partner to help with a long-term project. Their main problem was that whenever a client returned with a request for additional parts or for replacement of old pumps, staff had to manually search the physical archive for product specifications and BOMs (bills of materials), which could take days if not weeks. The customer wanted to digitize this process and give the sales department a tool for navigating the old documents quickly and efficiently.
Solution
The Chudovo team approached this task with close attention to detail and to the client’s requests. After the discovery phase and follow-up discussions, it was agreed to divide the project into several milestones, one for each major piece of functionality requested:
- OCR software core development
- Web interface development
- Automation of data flow
- Storage and text recognition optimization
- Management tools for the project, admin panel
Chudovo specialists researched the OCR software solutions available on the market. AWS Textract was chosen because it offered optimal operational costs and fits naturally with AWS storage, such as an S3 bucket, which allowed the products to be integrated seamlessly and the service to run smoothly. The web interface was designed and developed to give easy access to all processed pages of the documents, along with a search function that finds content by serial number, part number, part name, and similar fields. It was built with Angular and TypeScript. With these technologies, Chudovo delivered a fast, optimized tool that helps users quickly find the desired text or serial numbers using forms.
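To illustrate how a search with highlighting over OCR output might work, here is a minimal TypeScript sketch. It is not the project's actual code. It assumes the OCR results are available as a list of blocks carrying the `BlockType`, `Text`, `Page`, `Confidence`, and `Geometry.BoundingBox` fields that AWS Textract returns; the names `OcrBlock`, `SearchHit`, `normalize`, `searchBlocks`, and `toPixels` are hypothetical and introduced only for this example.

```typescript
// Subset of the fields found on an AWS Textract block.
// Bounding box values are ratios of the page width/height (0..1).
interface BoundingBox { Width: number; Height: number; Left: number; Top: number; }

interface OcrBlock {
  BlockType: string;      // e.g. "PAGE", "LINE", "WORD"
  Text?: string;
  Page?: number;
  Confidence?: number;
  Geometry?: { BoundingBox: BoundingBox };
}

// Hypothetical result type for this sketch.
interface SearchHit { page: number; text: string; confidence: number; box: BoundingBox; }

// Assumption: part and serial numbers should match regardless of case,
// spaces, or dashes, so "4471-a" finds "PN 4471-A".
function normalize(s: string): string {
  return s.toUpperCase().replace(/[^A-Z0-9]/g, "");
}

// Return every recognized text line that contains the query.
function searchBlocks(blocks: OcrBlock[], query: string): SearchHit[] {
  const needle = normalize(query);
  if (needle.length === 0) return [];
  const hits: SearchHit[] = [];
  for (const b of blocks) {
    if (b.BlockType !== "LINE" || !b.Text || !b.Geometry) continue;
    if (normalize(b.Text).indexOf(needle) !== -1) {
      hits.push({
        page: b.Page !== undefined ? b.Page : 1, // default for single-page results
        text: b.Text,
        confidence: b.Confidence !== undefined ? b.Confidence : 0,
        box: b.Geometry.BoundingBox,
      });
    }
  }
  return hits;
}

// Convert a relative bounding box into a pixel rectangle that a UI
// could draw as a highlight over the scanned page image.
function toPixels(box: BoundingBox, pageWidth: number, pageHeight: number) {
  return {
    x: box.Left * pageWidth,
    y: box.Top * pageHeight,
    width: box.Width * pageWidth,
    height: box.Height * pageHeight,
  };
}
```

In a web interface, `searchBlocks` could run over the stored OCR results for a document, and `toPixels` could position a highlight rectangle over the page image. Stripping separators during normalization makes the search tolerant of scanning noise, at the cost of occasional matches that span word boundaries.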
Business Impact
With the delivered OCR software solution, the client not only achieved the desired goal of digitizing more of its old archived documents but also improved the ability of its sales and aftermarket departments to offer customers the best solution. The client also submitted the tool to government programs that promote the use of AI tools in software development, which brought the project additional funding.
