Implementing a filter testing framework for a CAT system

Supervisor: Benedek Zoltán
Department of Automation and Applied Informatics

People have always wanted to read texts written in other languages, translated into their own. This demand is about as old as writing itself. Today, many companies specialize in document translation.

These companies need software that helps their employees work more efficiently: an application that imports translatable content from digital documents, supports its translation, and exports the result back to the original file format. To meet this need, Kilgray started developing memoQ about 8 years ago. memoQ has many features. For example, it can store already translated sentences in a translation memory and offer them again whenever they are needed, and it can use term bases to suggest possible translations for words and expressions. It has many other features, but listing them all is beyond the scope of this abstract.

memoQ supports more than 20 document formats (e.g. txt, docx, pdf), and there is a filter for each of them. A filter’s task is to recognize the translatable content within a document, import it, and, after translation, export it back to its original place, producing a document in the original file format but with the translated content. All of these filters are built on a common filter framework, which can change over time. Such changes sometimes break one or more of the filters in ways that cannot be detected at compile time.

One goal of my thesis is to create a tool that can be used to define and configure automated tests for the filters, run these tests, and notify the responsible people when an error occurs. They can then use the detailed error descriptions to find and fix the problems.
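The kind of check such a tool might run can be illustrated with a round-trip test: import a document, "translate" every segment to itself, export, and compare the result with the original. The following is only a minimal sketch under stated assumptions. It is written in Python for brevity, and the `Filter` class, the toy key=value format, and all function names are hypothetical; none of them are taken from memoQ or its filter framework.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Filter:
    """Hypothetical filter: a pair of import/export functions for one format."""
    name: str
    import_segments: Callable[[str], List[str]]
    export: Callable[[str, List[str]], str]


def kv_import(document: str) -> List[str]:
    # Toy format (an assumption): each line is "key=value";
    # only the value is translatable.
    return [line.split("=", 1)[1] for line in document.splitlines() if "=" in line]


def kv_export(document: str, translations: List[str]) -> str:
    # Put the translations back in place of the original values.
    remaining = iter(translations)
    out = []
    for line in document.splitlines():
        if "=" in line:
            key = line.split("=", 1)[0]
            out.append(f"{key}={next(remaining)}")
        else:
            out.append(line)
    return "\n".join(out)


def round_trip_errors(flt: Filter, document: str) -> List[str]:
    """Import, 'translate' each segment to itself, export, and compare."""
    try:
        segments = flt.import_segments(document)
        exported = flt.export(document, segments)
    except Exception as exc:
        # A framework change may break a filter only at run time.
        return [f"{flt.name}: raised {type(exc).__name__}: {exc}"]
    if exported != document:
        return [f"{flt.name}: exported document differs from the original"]
    return []


kv_filter = Filter("key-value", kv_import, kv_export)
sample = "title=Hello\n# comment\nbody=World"
print(round_trip_errors(kv_filter, sample))
```

An empty list means the filter passed. A real test tool would presumably run such checks over many sample documents per filter and forward any collected error descriptions to the people responsible.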

The other goal is to create a filter that can process SDLXLIFF, an extended version of the XLIFF bilingual file format. This format is used by SDL Trados Studio, a product of SDL.
