Our aim is to perform translation verification efficiently within the constraints of a limited budget and limited manpower. This paper describes our approach to translation verification. Since January 2019, this work has been part of the SSHOC project (WP 4.3c).
We have built a program that reads the output provided by translators, stores the translations and their metadata, and performs sanity checks for problems such as empty fields or wrong indexation, followed by a content-related check.
For the latter, we rely on bilingual word embeddings to rate translations. Bilingual word embeddings allow us to compare text in the source and target languages. The final outcome is a report that flags text to be re-checked.
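The checking steps described above can be illustrated with a minimal sketch. It is not the program itself: the toy three-dimensional vectors, the English and German word lists, the averaging of word vectors into a sentence vector, the cosine score, the 0.8 threshold, and the names `EMB`, `sentence_vector`, and `check_items` are all assumptions made for illustration.

```python
import math

# Hypothetical toy bilingual embeddings: both languages share one vector space.
# Real embeddings would be trained or aligned and loaded from files.
EMB = {
    "en": {"house": [0.9, 0.1, 0.0], "red": [0.1, 0.9, 0.0], "dog": [0.0, 0.1, 0.9]},
    "de": {"haus": [0.88, 0.12, 0.02], "rot": [0.12, 0.88, 0.0], "hund": [0.02, 0.1, 0.9]},
}


def sentence_vector(text, lang):
    """Average the vectors of the known words; return None if no word is covered."""
    vecs = [EMB[lang][w] for w in text.lower().split() if w in EMB[lang]]
    if not vecs:
        return None
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def check_items(items, src="en", tgt="de", threshold=0.8):
    """Return a report of (index, reason) pairs for items that need re-checking."""
    report = []
    for expected_index, item in enumerate(items, start=1):
        if item["index"] != expected_index:
            # Sanity check: indexation does not match the expected sequence.
            report.append((item["index"], "wrong index"))
        elif not item["target"].strip():
            # Sanity check: the translator left the field empty.
            report.append((item["index"], "empty field"))
        else:
            # Content check: compare source and target in the shared space.
            u = sentence_vector(item["source"], src)
            v = sentence_vector(item["target"], tgt)
            if u is None or v is None or cosine(u, v) < threshold:
                report.append((item["index"], "low similarity"))
    return report


items = [
    {"index": 1, "source": "red house", "target": "rot haus"},
    {"index": 2, "source": "dog", "target": ""},
    {"index": 3, "source": "red dog", "target": "rot haus"},
    {"index": 5, "source": "house", "target": "haus"},
]
report = check_items(items)
```

In this sketch, item 2 is flagged as an empty field, item 3 as a low-similarity translation, and the last item as wrongly indexed. Averaging word vectors ignores word order, which is one reason a word-level approach of this kind can struggle with long sentences.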
Our approach processes high volumes of text efficiently. In this paper, we measure the incidence of false positives, i.e. flagged items that were in fact translated properly. The program still needs further improvement to be reliable for long sentences and out-of-context situations. Using bilingual phrase embeddings is the next step towards improving the performance of our checking activity.
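As a rough illustration of the false-positive measure, the sketch below computes the share of flagged items that a human reviewer judged to be correct. Taking the number of flagged items as the denominator is an assumption, as are the function name `false_positive_incidence` and the example item identifiers.

```python
def false_positive_incidence(flagged_ids, correct_ids):
    """Share of flagged items that were actually translated properly.

    Assumption: the denominator is the number of flagged items.
    """
    flagged = set(flagged_ids)
    if not flagged:
        return 0.0
    return len(flagged & set(correct_ids)) / len(flagged)


# Hypothetical example: four flagged items, two of which a reviewer judged correct.
flagged = [2, 3, 5, 8]
judged_correct = [3, 8, 9]
rate = false_positive_incidence(flagged, judged_correct)
```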