Semantic change detection with Gaussian word embeddings

buir.contributor.author: Yüksel, Arda
buir.contributor.author: Uğurlu, Berke
buir.contributor.author: Koç, Aykut
buir.contributor.orcid: Koç, Aykut|0000-0002-6348-2663
dc.citation.epage: 3361 [en_US]
dc.citation.spage: 3349 [en_US]
dc.citation.volumeNumber: 29 [en_US]
dc.contributor.author: Yüksel, Arda
dc.contributor.author: Uğurlu, Berke
dc.contributor.author: Koç, Aykut
dc.date.accessioned: 2022-01-27T12:43:10Z
dc.date.available: 2022-01-27T12:43:10Z
dc.date.issued: 2021-10-20
dc.department: Department of Electrical and Electronics Engineering [en_US]
dc.department: National Magnetic Resonance Research Center (UMRAM) [en_US]
dc.description.abstract: The diachronic study of language evolution is important in natural language processing (NLP). Recent years have witnessed a surge of computational approaches for the detection and characterization of lexical semantic change (LSC), driven by the availability of diachronic corpora and advances in word representation techniques. We propose a Gaussian word embedding (w2g)-based method and present a comprehensive study of LSC detection. W2g is a probabilistic, distribution-based word embedding model that represents words as Gaussian mixture models, using covariance information along with the existing mean (word vector). We also extensively study several aspects of w2g-based LSC detection under the SemEval-2020 Task 1 evaluation framework as well as on the Google N-gram corpus. In Sub-task 1 (LSC binary classification) of SemEval-2020 Task 1, we report the highest overall ranking as well as the highest ranks for two (German and Swedish) of the four languages (English, Swedish, German and Latin). We also report the highest Spearman correlation in Sub-task 2 (LSC ranking) for Swedish. Our overall rankings in the LSC classification and ranking sub-tasks are 1st and 7th, respectively. A qualitative analysis is also presented. [en_US]
dc.description.provenance: Submitted by Evrim Ergin (eergin@bilkent.edu.tr) on 2022-01-27T12:43:10Z. No. of bitstreams: 1. Semantic_change_detection_with_gaussian_word_embeddings.pdf: 1860756 bytes, checksum: d694826eca16c2c9be9506caee0419db (MD5) [en]
dc.description.provenance: Made available in DSpace on 2022-01-27T12:43:10Z (GMT). No. of bitstreams: 1. Semantic_change_detection_with_gaussian_word_embeddings.pdf: 1860756 bytes, checksum: d694826eca16c2c9be9506caee0419db (MD5). Previous issue date: 2021-10-20 [en]
dc.identifier.doi: 10.1109/TASLP.2021.3120645 [en_US]
dc.identifier.eissn: 2329-9304
dc.identifier.issn: 2329-9290
dc.identifier.uri: http://hdl.handle.net/11693/76841
dc.language.iso: English [en_US]
dc.publisher: IEEE [en_US]
dc.relation.isversionof: https://doi.org/10.1109/TASLP.2021.3120645 [en_US]
dc.source.title: IEEE/ACM Transactions on Audio, Speech, and Language Processing [en_US]
dc.subject: Diachronic embeddings [en_US]
dc.subject: Semantic change computation [en_US]
dc.subject: Semantic change detection [en_US]
dc.subject: Lexical semantic change [en_US]
dc.subject: Diachronic NLP [en_US]
dc.subject: Word embeddings [en_US]
dc.subject: Word2gauss [en_US]
dc.title: Semantic change detection with Gaussian word embeddings [en_US]
dc.type: Article [en_US]
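
Illustrative note (not part of the repository record): the abstract describes words represented by a mean vector together with covariance information. The sketch below shows one way a change score between two time periods could be computed from such representations. It assumes, for simplicity, a single Gaussian with diagonal covariance per word, embeddings from both periods that live in a comparable space, and a symmetrised KL divergence as the score. None of these choices is confirmed by this record, and the function names and toy values are hypothetical.

```python
import math


def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """KL(P || Q) for two Gaussians with diagonal covariances.

    Each argument is a list of floats of equal length: per-dimension
    means (mu_*) and per-dimension variances (var_*).
    """
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        # Per-dimension term of the closed-form Gaussian KL divergence.
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


def symmetric_kl(mu_a, var_a, mu_b, var_b):
    """Symmetrised KL: KL(A || B) + KL(B || A). An assumed change score."""
    return (kl_diag_gaussians(mu_a, var_a, mu_b, var_b)
            + kl_diag_gaussians(mu_b, var_b, mu_a, var_a))


# Hypothetical 2-dimensional embeddings of one word in two periods.
old_mu, old_var = [0.0, 0.0], [1.0, 1.0]
new_mu, new_var = [1.0, 0.0], [1.0, 1.0]

score = symmetric_kl(old_mu, old_var, new_mu, new_var)
```

A larger score would suggest that the word's distribution differs more between the two periods; ranking words by such a score is one plausible route to a ranking task like Sub-task 2, and thresholding it is one plausible route to a binary decision like Sub-task 1.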

Files

Original bundle
Name: Semantic_change_detection_with_gaussian_word_embeddings.pdf
Size: 1.77 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.69 KB
Format: Item-specific license agreed upon to submission