Learning interpretable word embeddings via bidirectional alignment of dimensions with semantic concepts

Date

2022-03-22

Source Title

Information Processing & Management

Print ISSN

0306-4573

Electronic ISSN

1873-5371

Publisher

Elsevier Ltd

Volume

59

Issue

3

Pages

102925-1 - 102925-17

Language

English

Abstract

We propose bidirectional imparting (BiImp), a generalized method for aligning embedding dimensions with concepts during the embedding learning phase. While preserving the semantic structure of the embedding space, BiImp makes dimensions interpretable, which is critical for deciphering the black-box behavior of word embeddings. BiImp uses the two directions of a vector space dimension separately: each direction can be assigned to a different concept. This increases the number of concepts that can be represented in the embedding space. Our experimental results demonstrate the interpretability of BiImp embeddings without compromising performance on semantic tasks. We also use BiImp to reduce gender bias in word embeddings by encoding gender-opposite concepts (e.g., male–female) in a single embedding dimension. These results highlight the potential of BiImp for reducing biases and stereotypes present in word embeddings. Furthermore, task- or domain-specific interpretable word embeddings can be obtained by adjusting the word groups assigned to the embedding dimensions according to the task or domain. As a result, BiImp offers considerable flexibility for studying word embeddings without additional effort.
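
The idea of assigning each direction of a dimension to a different concept can be illustrated with a toy penalty term. The abstract does not state the actual BiImp objective, so the sketch below is an assumption for illustration only: the function name `bidirectional_penalty`, the `strength` parameter, the word groups, and the embedding values are all hypothetical and do not come from the paper.

```python
import numpy as np

def bidirectional_penalty(emb, dim, pos_words, neg_words, strength=1.0):
    """Toy penalty (hypothetical, not the paper's objective).

    The value is lower when the words in `pos_words` have large positive
    entries on dimension `dim` and the words in `neg_words` have large
    negative entries on the same dimension.
    """
    pos_vals = emb[pos_words, dim]  # entries for the positive-direction concept
    neg_vals = emb[neg_words, dim]  # entries for the negative-direction concept
    return -strength * (pos_vals.sum() - neg_vals.sum())

# Made-up 2-dimensional embeddings for four words (rows).
emb = np.array([
    [0.9, 0.1],    # word 0, e.g. "king"
    [0.7, -0.2],   # word 1, e.g. "man"
    [-0.8, 0.3],   # word 2, e.g. "queen"
    [-0.6, 0.0],   # word 3, e.g. "woman"
])

# Dimension 0 already separates the two groups, so this assignment scores low.
aligned = bidirectional_penalty(emb, dim=0, pos_words=[0, 1], neg_words=[2, 3])

# Swapping the groups assigns each concept to the wrong direction.
swapped = bidirectional_penalty(emb, dim=0, pos_words=[2, 3], neg_words=[0, 1])
```

In a training setup, a term of this kind would presumably be added to the embedding objective so that one dimension carries two opposing concepts, such as a male–female pair, rather than one concept per dimension.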
