An essential step of research is discovering studies related to the research problem.
In a popular research area, it might be difficult to find the most
Researchers assign keywords to their studies to give readers quick insight into the
content and to make the studies easier to find. In some cases, however, the keywords
are not relevant, and in other cases the set of assigned keywords is incomplete. Deep
learning has become one of the main technologies of natural language processing, so
it offers promising opportunities to automate keyword assignment.
The main topic of this thesis is keyword extraction based on abstracts of scientific
studies. First, I introduce the technical and theoretical background of the problem:
the main concepts, some related techniques, definitions and metrics, and the
following three applications of natural language processing:
• Keyword extraction
• Summary generation
• Title suggestion
I describe how machine learning techniques, and more specifically deep learning
techniques, can handle natural language texts, and which types of deep neural
networks are the most suitable for this problem and why. Evaluating and measuring
the results is an important part of deep learning and of other approaches to
classification tasks. I cover evaluation in two parts: first classification
evaluation measures in general, and then the background and some of the evaluation
measures of multi-label classification, because keyword extraction is treated in
this document as a multi-label classification task.
Because the dataset available to me was an important part of my work, I introduce it
in its own chapter.
I describe in detail the work carried out over the previous two semesters, which
further features can be added to the implemented solutions, how the results can be
improved, and which other approaches can be used.
Finally, I summarize the topics described in this document and try to draw the final