Automatic method for generation of sememe knowledge bases from machine readable dictionaries
Abstract
Sememes are defined as the minimal semantic units of natural languages. Sememe Knowledge Bases (SKBs) are organized word collections annotated with appropriate sememes. As external knowledge bases, SKBs have been applied successfully in multiple high-level language processing tasks. However, mainstream SKBs are constructed by linguistic experts over extended periods, which restricts their widespread use. We present MRD4SKB, a method for generating SKBs automatically from readily available Machine Readable Dictionaries (MRDs). Constructing MRDs is more straightforward than constructing SKBs, and many prominent MRDs are available in various forms. Consequently, MRD4SKB is viable as a fast, flexible, and extendable method for SKB construction. Several variants of MRD4SKB, based on matrix factorization and topic modeling, are proposed to generate SKBs automatically. The performance of the automatically generated SKBs is evaluated and compared with that of other SKBs that are constructed manually or semi-manually.
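The abstract does not describe how the MRD4SKB variants are implemented, so the following is only a rough sketch of the general matrix-factorization idea, not the authors' method. It assumes a toy dictionary (`MRD`), a small stopword list, a word-by-definition-term count matrix, and non-negative matrix factorization with multiplicative updates; all names, entries, and parameters here are hypothetical and chosen for illustration. Each latent component is labeled by its highest-weight definition terms, which stand in for sememes.

```python
import random

# Toy machine-readable dictionary (hypothetical entries, for illustration only).
MRD = {
    "dog": "domestic animal that barks",
    "cat": "domestic animal that meows",
    "wolf": "wild animal that howls",
    "car": "road vehicle with engine",
    "truck": "large road vehicle with engine",
    "bicycle": "road vehicle with pedals",
}
STOPWORDS = {"that", "with"}  # assumed stopword list


def build_matrix(mrd):
    """Return (words, terms, V) where V[i][j] counts term j in the definition of word i."""
    words = sorted(mrd)
    terms = sorted({t for d in mrd.values() for t in d.split() if t not in STOPWORDS})
    index = {t: j for j, t in enumerate(terms)}
    V = [[0.0] * len(terms) for _ in words]
    for i, w in enumerate(words):
        for t in mrd[w].split():
            if t in index:
                V[i][index[t]] += 1.0
    return words, terms, V


def matmul(A, B):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Squared Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx))


def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Factorize V ~ W H with multiplicative updates; all entries stay non-negative."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # Update H: H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)] for a in range(k)]
        # Update W: W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)] for i in range(n)]
    return W, H


def annotate(words, terms, W, H, top_terms=2):
    """Label each word with the top definition terms of its strongest latent component."""
    labels = [
        [terms[j] for j in sorted(range(len(terms)), key=lambda j: -H[a][j])[:top_terms]]
        for a in range(len(H))
    ]
    return {w: labels[max(range(len(H)), key=lambda a: W[i][a])] for i, w in enumerate(words)}


words, terms, V = build_matrix(MRD)
W, H = nmf(V, k=2)
skb = annotate(words, terms, W, H)
for word, sememes in skb.items():
    print(word, sememes)
```

The number of components `k` plays the role of the sememe inventory size in this sketch; a real dictionary would need far more components, proper tokenization, and term weighting, none of which are specified in the abstract.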