Automatic construction of sememe knowledge bases from machine readable dictionaries

Date

2023-12-28

Abstract

Sememes are the minimum semantic units of natural languages. Words annotated with sememes are organized into Sememe Knowledge Bases (SKBs), which have been successfully applied as external knowledge bases in various high-level language processing tasks. However, existing SKBs are constructed manually or semi-manually by linguistic experts over long periods, which inhibits their widespread utilization, updating, and expansion. We propose MRD2SKB, an approach that constructs an SKB automatically from Machine-Readable Dictionaries (MRDs). Well-established MRDs are readily available, and constructing an MRD is much simpler than constructing an SKB. MRD2SKB therefore allows for fast, flexible, and extendable generation of SKBs. Building upon matrix factorization and topic modeling, we propose several variants of MRD2SKB and construct SKBs fully automatically. Quantitative and qualitative results of extensive experiments demonstrate that the performance of the automatically created SKBs is on par with that of manually and semi-manually prepared SKBs.
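To give a feel for how matrix factorization can surface shared semantic units from dictionary definitions, the following is a minimal sketch and not the authors' MRD2SKB implementation. It assumes a hypothetical six-entry toy dictionary (`TOY_MRD`), a bag-of-words word-by-definition-term matrix, and plain non-negative matrix factorization with multiplicative updates; all function names and parameters here are illustrative assumptions, not taken from the paper.

```python
import random

# Hypothetical toy dictionary: headword -> definition (not from the paper).
TOY_MRD = {
    "dog": "domestic animal that barks",
    "cat": "domestic animal that meows",
    "wolf": "wild animal that howls",
    "hammer": "hand tool that drives nails",
    "saw": "hand tool that cuts wood",
    "drill": "power tool that bores holes",
}

def build_matrix(dictionary):
    """Return (words, vocab, V); V[i][j] counts term j in the definition of word i."""
    words = sorted(dictionary)
    vocab = sorted({t for d in dictionary.values() for t in d.lower().split()})
    col = {t: j for j, t in enumerate(vocab)}
    V = [[0.0] * len(vocab) for _ in words]
    for i, w in enumerate(words):
        for t in dictionary[w].lower().split():
            V[i][col[t]] += 1.0
    return words, vocab, V

def matmul(A, B):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, colv)) for colv in zip(*B)] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Factor V (n x m) into non-negative W (n x k) and H (k x m)
    using multiplicative updates for the Frobenius-norm objective."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)] for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)] for i in range(n)]
    return W, H

def frobenius_error(V, W, H):
    """Frobenius norm of the reconstruction residual V - WH."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx)) ** 0.5

def top_terms(H, vocab, n=3):
    """For each latent dimension, list the n definition terms with the largest weight."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in H]

words, vocab, V = build_matrix(TOY_MRD)
W, H = nmf(V, k=2)
for dim, terms in enumerate(top_terms(H, vocab)):
    print("latent unit", dim, "->", terms)
for word, weights in zip(words, W):
    print(word, [round(x, 2) for x in weights])
```

In this sketch each row of `H` plays the role of a candidate sememe-like unit described by definition terms, and each row of `W` gives a word's weights over those units. A real system would need a far larger dictionary, sense-level entries, and a principled choice of `k`.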

Source Title

IEEE/ACM Transactions on Audio, Speech, and Language Processing

Publisher

Institute of Electrical and Electronics Engineers

Keywords

Sememes, Machine readable dictionary, Sememe knowledge bases, SKB, Machine learning

Language

English