Measuring cross-lingual semantic similarity across European languages

buir.contributor.author: Şenel, Lütfü Kerem
buir.contributor.author: Çukur, Tolga
dc.citation.epage: 363 [en_US]
dc.citation.spage: 359 [en_US]
dc.contributor.author: Şenel, Lütfü Kerem [en_US]
dc.contributor.author: Yücesoy, V. [en_US]
dc.contributor.author: Koç, A. [en_US]
dc.contributor.author: Çukur, Tolga [en_US]
dc.coverage.spatial: Barcelona, Spain [en_US]
dc.date.accessioned: 2018-04-12T11:46:58Z
dc.date.available: 2018-04-12T11:46:58Z
dc.date.issued: 2017 [en_US]
dc.department: Department of Electrical and Electronics Engineering [en_US]
dc.department: National Magnetic Resonance Research Center (UMRAM) [en_US]
dc.department: Interdisciplinary Program in Neuroscience (NEUROSCIENCE) [en_US]
dc.department: Aysel Sabuncu Brain Research Center (BAM) [en_US]
dc.description: Date of Conference: 5-7 July 2017 [en_US]
dc.description: Conference Name: 40th International Conference on Telecommunications and Signal Processing, TSP 2017 [en_US]
dc.description.abstract: This paper studies cross-lingual semantic similarity (CLSS) between five European languages (i.e., English, French, German, Spanish and Italian) via unsupervised word embeddings from a cross-lingual lexicon. The vocabulary of each language is projected onto a separate high-dimensional vector space, and these vector spaces are then compared using several distance measures (e.g., correlation and cosine) to measure the pairwise semantic similarity between the languages. A substantial degree of similarity is observed between the vector spaces learned from corpora of the European languages. Null hypothesis testing and bootstrap methods (by resampling without replacement) are used to verify the results. [en_US]
dc.description.provenance: Made available in DSpace on 2018-04-12T11:46:58Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 179475 bytes, checksum: ea0bedeb05ac9ccfb983c327e155f0c2 (MD5) Previous issue date: 2017 [en]
dc.identifier.doi: 10.1109/TSP.2017.8076005 [en_US]
dc.identifier.uri: http://hdl.handle.net/11693/37656
dc.language.iso: English [en_US]
dc.publisher: IEEE [en_US]
dc.relation.isversionof: https://doi.org/10.1109/TSP.2017.8076005 [en_US]
dc.source.title: Proceedings of the 40th International Conference on Telecommunications and Signal Processing, TSP 2017 [en_US]
dc.subject: Cross-lingual semantic similarity [en_US]
dc.subject: Language models [en_US]
dc.subject: Natural language processing [en_US]
dc.subject: Semantic similarity [en_US]
dc.subject: Word embedding [en_US]
dc.subject: Natural language processing systems [en_US]
dc.title: Measuring cross-lingual semantic similarity across European languages [en_US]
dc.type: Conference Paper [en_US]
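
Illustrative sketch (not part of the record): the abstract describes comparing per-language vector spaces with correlation and cosine measures and checking the result against a null hypothesis. The Python below is one possible reading of that idea, not the authors' procedure. It assumes that row i of each embedding matrix holds the vector for the same lexicon entry in two languages, scores the two spaces by correlating their within-language cosine similarities, and builds a null distribution by shuffling the word alignment. All function names, the synthetic data, and the choice of a permutation test are assumptions made for illustration.

```python
import numpy as np


def cosine_similarity_matrix(emb):
    """Pairwise cosine similarities between the rows (word vectors) of emb."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return unit @ unit.T


def space_similarity(emb_a, emb_b):
    """Pearson correlation between the word-to-word cosine similarities
    of two spaces. Assumes row i of emb_a and row i of emb_b are
    translations of the same lexicon entry (hypothetical setup)."""
    upper = np.triu_indices(emb_a.shape[0], k=1)  # each word pair once
    sims_a = cosine_similarity_matrix(emb_a)[upper]
    sims_b = cosine_similarity_matrix(emb_b)[upper]
    return float(np.corrcoef(sims_a, sims_b)[0, 1])


def permutation_test(emb_a, emb_b, n_perm=200, seed=0):
    """Null distribution obtained by shuffling which row of emb_b is
    paired with each row of emb_a, which breaks the translation alignment."""
    rng = np.random.default_rng(seed)
    observed = space_similarity(emb_a, emb_b)
    null_scores = np.array([
        space_similarity(emb_a, emb_b[rng.permutation(emb_b.shape[0])])
        for _ in range(n_perm)
    ])
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (np.sum(null_scores >= observed) + 1) / (n_perm + 1)
    return observed, float(p_value)


# Synthetic stand-in data: the second "language" is a rotated, slightly
# noisy copy of the first, so the two spaces share their geometry.
rng = np.random.default_rng(42)
emb_a = rng.normal(size=(50, 20))
rotation, _ = np.linalg.qr(rng.normal(size=(20, 20)))
emb_b = emb_a @ rotation + 0.01 * rng.normal(size=(50, 20))

observed, p_value = permutation_test(emb_a, emb_b)
print(f"similarity = {observed:.3f}, p = {p_value:.4f}")
```

Comparing similarity structure rather than raw coordinates is a deliberate choice in this sketch: separately trained embedding spaces need not share axes, while cosine similarities between words are unchanged by rotation.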

Files

Original bundle
Name:
08076005.pdf
Size:
181.61 KB
Format:
Adobe Portable Document Format
Description:
Full printable version