Analysis of predictive typing methods at translating natural languages

Supervisor:
Dr. Juhász Sándor
Department of Automation and Applied Informatics

This thesis addresses automatic typing completion (predictive typing) for translators working in computer-assisted translation tools. The aim is to reduce the number of keystrokes needed to produce the translated text. The algorithmic foundations of the topic are explored.

The first part of the thesis summarizes related work, covering both predictive typing on devices with limited text input capabilities and the field of machine translation.

Next, the operating environment of the predictive typing implementation, a translation tool, is discussed, showing how this feature fits in among the others. The algorithms presented in this thesis create dictionaries from a large bilingual corpus; these dictionaries can later be used to give typing suggestions to the user.
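As a rough illustration of what building a dictionary from a bilingual corpus can look like, the following minimal sketch counts how often source and target words appear together in aligned sentence pairs and keeps the pairs that reach a threshold. This is not the algorithm of the thesis: the function name `build_bilingual_dictionary`, the `min_count` parameter, the whitespace tokenization and the toy corpus are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_bilingual_dictionary(sentence_pairs, min_count=2):
    """Map each source word to the target words that co-occur with it
    in at least `min_count` aligned sentence pairs (hypothetical scoring)."""
    cooccurrence = defaultdict(Counter)
    for source, target in sentence_pairs:
        # Use sets so a word is counted once per sentence pair.
        source_words = set(source.lower().split())
        target_words = set(target.lower().split())
        for s in source_words:
            cooccurrence[s].update(target_words)

    dictionary = {}
    for s, counts in cooccurrence.items():
        kept = [(t, c) for t, c in counts.items() if c >= min_count]
        if kept:
            # Most frequent candidates first, ties broken alphabetically.
            dictionary[s] = sorted(kept, key=lambda tc: (-tc[1], tc[0]))
    return dictionary


# Toy English-German corpus, invented for illustration only.
pairs = [
    ("the house is red", "das haus ist rot"),
    ("the house is big", "das haus ist gross"),
    ("the car is red", "das auto ist rot"),
]
dictionary = build_bilingual_dictionary(pairs, min_count=2)
```

Raw co-occurrence counts also keep frequent function words such as "das" as candidates for every source word, which is one reason a practical method would need a more careful scoring step.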

For comparison, a monolingual algorithm was created that resembles the implementations running on mobile devices. It ignores the source-language side of the input corpus and gathers information only from the language being typed. A bilingual algorithm is then explained; it has many parameters and extension points and takes both languages of the input corpus into account.
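The monolingual idea can be sketched as a frequency-ranked prefix lookup over the target-language side of the corpus. The sketch below is only an illustration under simple assumptions (whitespace tokenization, word-level units, ranking by raw frequency); the names `build_monolingual_dictionary` and `suggest` are hypothetical and do not come from the thesis.

```python
from collections import Counter


def build_monolingual_dictionary(target_sentences, min_count=1):
    """Count word frequencies using only the target-language sentences."""
    counts = Counter(
        word
        for sentence in target_sentences
        for word in sentence.lower().split()
    )
    return {w: c for w, c in counts.items() if c >= min_count}


def suggest(dictionary, prefix, limit=3):
    """Return up to `limit` completions of `prefix`, most frequent first."""
    matches = [
        (w, c)
        for w, c in dictionary.items()
        if w.startswith(prefix) and w != prefix
    ]
    matches.sort(key=lambda wc: (-wc[1], wc[0]))
    return [w for w, _ in matches[:limit]]


# Toy target-language sentences, invented for illustration only.
sentences = ["das haus ist rot", "das haus ist gross", "das auto ist rot"]
word_counts = build_monolingual_dictionary(sentences)
completions = suggest(word_counts, "ha")
```

Because the source sentence is never consulted, the same suggestions are offered regardless of what is being translated, which is the main limitation a bilingual method aims to address.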

To compare the different methods discussed in this thesis, an automatic measurement framework was created. It generates many dictionaries by varying the parameters of the algorithms and evaluates the resulting dictionaries against given metrics by simulating translator work.

Measurements were carried out for English, German and Hungarian (in both directions) with various text unit types, thresholds and scoring methods. Dictionaries are compared according to how much help they provide to the translator, which is best measured by the decrease in the number of keystrokes needed to produce the translated text.
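A keystroke-based comparison of this kind can be sketched by simulating a translator who types each word letter by letter and accepts the top suggestion as soon as it matches the intended word. The sketch below is a simplified assumption, not the measurement framework of the thesis: it counts one keystroke per typed letter and one per accepted suggestion, ignores spaces and punctuation, and uses hypothetical names (`top_suggestion`, `keystrokes_for_word`, `keystroke_saving`).

```python
def top_suggestion(word_counts, prefix):
    """Return the most frequent word that extends `prefix`, or None."""
    candidates = [w for w in word_counts if w.startswith(prefix) and w != prefix]
    if not candidates:
        return None
    return min(candidates, key=lambda w: (-word_counts[w], w))


def keystrokes_for_word(word, word_counts):
    """Simulate typing `word`: accept the suggestion once it is correct."""
    for typed in range(1, len(word)):
        if top_suggestion(word_counts, word[:typed]) == word:
            return typed + 1  # typed letters plus one acceptance key
    return len(word)  # no useful suggestion, the whole word is typed


def keystroke_saving(text, word_counts):
    """Fraction of letter keystrokes saved compared to typing everything."""
    words = text.split()
    baseline = sum(len(w) for w in words)
    if baseline == 0:
        return 0.0
    assisted = sum(keystrokes_for_word(w, word_counts) for w in words)
    return 1 - assisted / baseline


# Toy dictionary and target text, invented for illustration only.
word_counts = {"haus": 2, "hand": 1, "rot": 2}
saving = keystroke_saving("haus rot hand", word_counts)
```

In this toy run "haus" and "rot" are each completed after one letter, while "hand" is only suggested after three letters and therefore saves nothing, so 8 keystrokes are needed instead of 11. Running the same simulation over dictionaries built with different parameters gives a single comparable number per dictionary.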
