Multilingual hyphenation using deep neural networks

Ács Judit
Department of Automation and Applied Informatics

Hyphenation algorithms are computer-based methods of syllabification, used mostly in typesetting and document formatting. The hyphenation of words is also relevant to poetry, as well as to text-to-speech and speech recognition algorithms.

The rise of deep learning has increased the demand for solving natural language processing problems with machine learning. The large number of corpora available online facilitates the use of these methods for problems such as hyphenation.

In this thesis, the author presents a new hyphenation algorithm. After a short summary of the algorithms currently in use, the study describes three neural network architectures used in natural language processing (feedforward, convolutional, and recurrent neural networks) and shows how they can be applied to hyphenation.

The thesis also implements a fourth model, a sequence-to-sequence model, for non-standard hyphenation extensions, and introduces a multilingual hyphenation algorithm.
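
As an illustration only, one common way to present hyphenation to a neural network is as per-letter labeling: each letter is marked according to whether a hyphenation point follows it, and a fixed-width character window around the letter serves as the model input. The abstract does not state the encoding used in the thesis, so the sketch below is an assumption, and the helper names `to_labels`, `windows`, and `apply_labels` are hypothetical.

```python
from typing import List, Tuple


def to_labels(hyphenated: str, sep: str = "-") -> Tuple[str, List[int]]:
    """Split a hyphenated word into its letters and per-letter labels.

    labels[i] == 1 means a hyphenation point follows letter i.
    (Assumed encoding, not taken from the thesis.)
    """
    letters: List[str] = []
    labels: List[int] = []
    for ch in hyphenated:
        if ch == sep:
            if labels:  # ignore a separator at the very start
                labels[-1] = 1
        else:
            letters.append(ch)
            labels.append(0)
    return "".join(letters), labels


def windows(word: str, size: int = 2, pad: str = "_") -> List[str]:
    """Return one character window of width 2 * size + 1 per letter."""
    padded = pad * size + word + pad * size
    return [padded[i:i + 2 * size + 1] for i in range(len(word))]


def apply_labels(word: str, labels: List[int], sep: str = "-") -> str:
    """Re-insert hyphenation points from per-letter labels."""
    out: List[str] = []
    for ch, label in zip(word, labels):
        out.append(ch)
        if label:
            out.append(sep)
    return "".join(out)


word, labels = to_labels("hy-phen-ation")
pairs = list(zip(windows(word), labels))  # (input window, target label)
```

Each `(window, label)` pair could serve as one training example for a classifier, and `apply_labels` turns predicted labels back into a hyphenated word. A sequence-to-sequence model would instead map the whole letter sequence to an output sequence, which is what allows output that differs from the input beyond inserted hyphens.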

