An AI-based solution for translating technical documentation for a major Chinese company that manufactures power plant equipment. The translation is performed from Chinese to English while preserving the structure of the documents.
Our client is a major Chinese company specializing in the production of power plant equipment. While the manufacturing is based in China, the equipment is supplied worldwide. As a result, the client faced the challenge of translating technical documentation into English. Relying solely on human translators proved to be expensive and time-consuming, especially given the extensive product range.
The primary goal of the project was to automate the translation process from Chinese to English while preserving the document structure and processing text within diagrams and images. This solution significantly reduces the time and costs associated with translation.
This project has one key distinguishing feature. The client required a confidential solution that would not send sensitive data to third parties or use it to train language models on open datasets.
To meet this requirement, we utilized a combination of two solutions: Azure Intelligence and GPT-4. Both Azure and GPT-4 offer flexible security settings, providing corporate clients with special conditions that guarantee data will not be used to improve or train models or shared with third parties.
Texts in the documentation appear in various formats: paragraphs, headers, footers, and tables. GPT-4 handles all these elements.
We applied the following algorithm for working with them:
A particular challenge was managing text volume, as Chinese and English differ significantly, and sentence lengths can vary greatly. To address this, we adapted the document's structure based on the length of the translated text.
Additionally, the client provided a glossary, which significantly improved the accuracy of the translations.
Since images and diagrams also contain textual data, they cannot be overlooked when creating new documentation for a different market. We used Azure Intelligence to recognize text within images.
The workflow for image processing was as follows:
Once the images and text were ready, we saved the new version of the documentation in PDF format.
The issue of text volume was particularly challenging for images, as we could not simply adapt the document structure. To ensure the images looked correct, we modified the text itself (including fonts, font sizes, and text positioning). This approach allowed us to achieve high quality without altering the structure of the images.
Even at the prototype testing stage, we achieved impressive results. The developed prototype was presented to the client, who noted the high quality of translation and the preservation of the documentation's structure.
The cost of translation amounted to just $0.03 per page, making the solution economically efficient and freeing up a significant portion of the client's budget originally allocated for translation.
Do you want to know the total cost of development and realization of the project? Tell us about your requirements, our specialists will contact you as soon as possible.