Analysing Company Relations by Means of Natural Language Processing Methods

Supervisor:
Gáspár Csaba
Department of Telecommunications and Media Informatics

Business relationships play an important role in the operation of firms, companies and corporations. Companies communicate with their partners, sign business agreements with them and carry out financial transactions with them. The economic relations established between partners therefore provide important information about companies and strongly influence their operations, so useful conclusions can be drawn from them. However, business relationships are usually not public. In order to analyze the network of inter-company connections, each relation has to be identified individually.

This thesis presents a possible approach for discovering business relations between companies and their employees. Connections between organizations and the people related to them are determined by analyzing articles from online news portals. References to firms, companies, organizations and office-holders are identified in the articles, and the extracted information is then used to build the network of economic actors. Because a very large number of natural-language documents have to be processed and analyzed, solving the subtasks requires data mining, natural language processing and machine learning techniques.
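The pipeline described above (find mentions of companies and office-holders in articles, then link them into a network) can be illustrated with a deliberately simplified sketch. It is not the method of the thesis: hand-written regular expressions stand in for the machine learning and NLP models, and the patterns, the legal suffixes, the role list, the function name `build_network` and the sample sentences are all assumptions made up for illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Assumption: a company is one or more capitalised words followed by a legal suffix.
COMPANY = r"(?:[A-Z][A-Za-z&]*\s)+(?:Ltd|Inc|Corp|Plc)\b"
COMPANY_RE = re.compile(r"\b" + COMPANY)

# Assumption: office-holders appear as "<First Last>, <role> of <Company>".
ROLE_RE = re.compile(
    r"\b([A-Z][a-z]+ [A-Z][a-z]+), (CEO|CFO|chairman|director) of (" + COMPANY + ")"
)


def build_network(articles):
    """Return (company_links, office_holders) extracted from article texts.

    company_links counts how many articles mention each pair of companies.
    office_holders keeps the text segment that supports each person-company link.
    """
    company_links = Counter()
    office_holders = []
    for text in articles:
        # Companies mentioned together in one article are treated as related.
        companies = sorted(set(COMPANY_RE.findall(text)))
        for a, b in combinations(companies, 2):
            company_links[(a, b)] += 1
        # Person-role-company triples, stored with their evidence snippet.
        for match in ROLE_RE.finditer(text):
            person, role, company = match.groups()
            office_holders.append(
                {"person": person, "role": role, "company": company,
                 "evidence": match.group(0)}
            )
    return company_links, office_holders


# Made-up sample articles, for illustration only.
articles = [
    "Jane Doe, CEO of Acme Ltd, signed an agreement with Globex Corp on Monday.",
    "Globex Corp and Acme Ltd announced a merger.",
]
links, holders = build_network(articles)
```

Keeping the evidence snippet next to each extracted office-holder mirrors the idea of showing the article segment that justified a link, so that a wrongly identified person can be checked against the original text.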

To display the discovered information, a web application is developed that lets users search among corporations and people. Each corporation and person has a profile page that aggregates all the information collected on that firm or person. This information includes the relationships with other organizations and employees, as well as the segments of the articles that were used to identify the employees of the companies. In addition to browsing corporations and office-holders, the application allows users to correct wrongly identified office-holders. These corrections are used in subsequent modeling runs, which leads to more accurate results.
