Learning interpretable word embeddings via bidirectional alignment of dimensions with semantic concepts

Limited Access
This item is unavailable until:
2024-03-22
Date
2022-03-22
Source Title
Information Processing & Management
Print ISSN
0306-4573
Electronic ISSN
1873-5371
Publisher
Elsevier Ltd
Volume
59
Issue
3
Pages
102925-1 - 102925-17
Language
English
Abstract

We propose bidirectional imparting (BiImp), a generalized method for aligning embedding dimensions with concepts during the embedding learning phase. While preserving the semantic structure of the embedding space, BiImp makes dimensions interpretable, which is critical for deciphering the black-box behavior of word embeddings. BiImp uses the two directions of each vector space dimension separately: each direction can be assigned to a different concept, which increases the number of concepts that can be represented in the embedding space. Our experimental results demonstrate the interpretability of BiImp embeddings without compromising semantic task performance. We also use BiImp to reduce gender bias in word embeddings by encoding gender-opposite concepts (e.g., male–female) in a single embedding dimension. These results highlight the potential of BiImp for reducing biases and stereotypes present in word embeddings. Furthermore, task- or domain-specific interpretable word embeddings can be obtained by adjusting the word groups assigned to embedding dimensions according to the task or domain. As a result, BiImp offers considerable flexibility in studying word embeddings without additional effort.
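
The idea of assigning a different concept to each direction of a dimension can be illustrated with a toy sketch. This is not the objective used in the paper: the penalty function, the concept groups, the example vectors, and every name below (`CONCEPTS`, `alignment_penalty`, `top_words`) are assumptions chosen only for illustration.

```python
import math

# Hypothetical assignment: (dimension index, direction) -> concept word group.
# Direction +1 uses the positive half of the axis, -1 the negative half.
CONCEPTS = {
    (0, +1): ["king", "father"],   # illustrative "male" group
    (0, -1): ["queen", "mother"],  # illustrative "female" group
}

def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def alignment_penalty(embeddings, concepts):
    """Assumed penalty: shrinks as each concept word moves further along
    its assigned direction of its assigned dimension."""
    total = 0.0
    for (dim, sign), words in concepts.items():
        for word in words:
            total += softplus(-sign * embeddings[word][dim])
    return total

def top_words(embeddings, dim, sign, k=2):
    """Words with the largest values along one direction of one dimension."""
    ranked = sorted(embeddings, key=lambda w: sign * embeddings[w][dim], reverse=True)
    return ranked[:k]

# Toy 2-dimensional embeddings (made-up values, not learned vectors).
aligned = {
    "king": [2.0, 0.1], "father": [1.5, -0.3],
    "queen": [-2.0, 0.2], "mother": [-1.5, 0.0],
}
# Same vectors with dimension 0 mirrored, so every word points the wrong way.
flipped = {w: [-v[0], v[1]] for w, v in aligned.items()}
```

In this sketch, a term like `alignment_penalty` would be added to an ordinary embedding training loss, so that the two halves of dimension 0 can be read off with `top_words(aligned, 0, +1)` and `top_words(aligned, 0, -1)`. Swapping the word groups in `CONCEPTS` is how a task- or domain-specific variant would be set up under these assumptions.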
