Automatic method for generation of sememe knowledge bases from machine readable dictionaries
Abstract
Sememes are defined as the minimal semantic units of natural languages. Sememe Knowledge Bases (SKBs) are organized collections of words annotated with appropriate sememes. As external knowledge bases, SKBs have been applied successfully in multiple high-level language processing tasks. However, mainstream SKBs are constructed by linguistic experts over extended periods, which limits their widespread use. We present MRD4SKB, a method that automatically generates SKBs from readily available Machine Readable Dictionaries (MRDs). MRDs are more straightforward to construct than SKBs, and many prominent MRDs are available in various forms. Consequently, MRD4SKB is viable as a fast, flexible, and extendable method for SKB construction. Several variants of MRD4SKB, based on matrix factorization and topic modeling, are proposed to generate SKBs automatically. The performance of the automatically generated SKBs is evaluated and compared with that of other SKBs, which are constructed manually or semi-manually.
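
To make the matrix-factorization idea concrete, the following is a minimal sketch and not the MRD4SKB implementation, whose details the abstract does not give. It assumes a toy dictionary (`TOY_MRD`, hypothetical entries), builds a word-by-definition-term count matrix, and factorizes it with generic non-negative matrix factorization using multiplicative updates. Each latent component is then treated as a stand-in for a sememe and labeled with its highest-weighted definition terms. All function names are illustrative.

```python
import random

# Toy machine-readable dictionary (hypothetical entries, not from the source).
TOY_MRD = {
    "dog": "domestic animal that barks",
    "cat": "domestic animal that purrs",
    "wolf": "wild animal that howls",
    "car": "road vehicle with engine",
    "truck": "large road vehicle with engine",
    "bicycle": "road vehicle with pedals",
}

def build_matrix(mrd):
    """Return (words, terms, V); V[i][j] counts term j in the definition of word i."""
    words = sorted(mrd)
    terms = sorted({t for d in mrd.values() for t in d.split()})
    index = {t: j for j, t in enumerate(terms)}
    V = [[0.0] * len(terms) for _ in words]
    for i, w in enumerate(words):
        for t in mrd[w].split():
            V[i][index[t]] += 1.0
    return words, terms, V

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def frobenius_error(V, W, H):
    """Frobenius norm of V - WH."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx)) ** 0.5

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Approximate V (n x m) as W (n x k) times H (k x m), all entries non-negative."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)] for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)] for i in range(n)]
    return W, H

def annotate(words, terms, W, H, top_terms=2):
    """Label each component by its top terms; give each word its strongest component."""
    labels = []
    for row in H:
        ranked = sorted(range(len(terms)), key=lambda j: -row[j])
        labels.append(tuple(terms[j] for j in ranked[:top_terms]))
    return {w: labels[max(range(len(labels)), key=lambda a: W[i][a])]
            for i, w in enumerate(words)}

words, terms, V = build_matrix(TOY_MRD)
W, H = nmf(V, k=2)
skb = annotate(words, terms, W, H)
for word in words:
    print(word, "->", skb[word])
```

The number of components `k`, the whitespace tokenization, and the single-label assignment are simplifications chosen for brevity. A topic-modeling variant would replace the factorization step with a topic model over the same definitions.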