Disclosing Zipfian regularities in semantic breadth of words via multimodal Gaussian embeddings

buir.advisor: Koç, Aykut
dc.contributor.author: Şahinuç, Furkan
dc.date.accessioned: 2021-11-30T09:52:49Z
dc.date.available: 2021-11-30T09:52:49Z
dc.date.copyright: 2021-11
dc.date.issued: 2021-11
dc.date.submitted: 2021-11-26
dc.department: Department of Electrical and Electronics Engineering [en_US]
dc.description: Cataloged from PDF version of article. [en_US]
dc.description: Thesis (Master's): Bilkent University, Department of Electrical and Electronics Engineering, İhsan Doğramacı Bilkent University, 2021. [en_US]
dc.description: Includes bibliographical references (leaves 59-69). [en_US]
dc.description.abstract: Zipf's law for word frequencies, one of the most common empirical regularities, is a power-law relation between word frequencies and the frequency ranks of words. In this thesis, the semantic uncertainty (i.e., semantic coverage) of words is quantitatively studied through non-point, distribution-based word embeddings, and a new Zipfian regularity is revealed. The uncertainty or semantic coverage of a word can increase for several reasons, such as polysemy, having a broad meaning (such as the relation between the broader "emotion" and the narrower "exasperation"), or a combination of both. Although there are studies that touch upon measuring the generality-specificity levels of words, Zipfian patterns of these features have not been shown quantitatively with a theoretical background. The main aim of this thesis is to bridge this gap in the Zipfian literature. To this end, variances of Gaussian embeddings are utilized to quantify the extent to which a word can be used in different senses or contexts. Using the variance information embedded in the non-point Gaussian embeddings, Zipfian patterns that exist in the semantic breadth of words are quantitatively shown when polysemy is controlled. This outcome is complementary to Zipf's law of meaning distribution and the related meaning-frequency law by indicating the existence of Zipfian patterns: more frequent words tend to be generic and uncertain, whereas less frequent ones tend to be specific. To verify the generalization of our findings, Zipfian patterns are investigated in the scope of polysemy neutralization, various language properties, and several languages from different language families: English, German, Spanish, Russian, and Turkish. Such regularities provide valuable information to extract and understand relationships between semantic properties of words and word frequencies. In various applications, performance improvements can be obtained by employing these fundamental regularities. A method is also proposed to leverage the Zipfian regularity to improve the performance of baseline lexical entailment detection algorithms. To the best of our knowledge, this thesis is the first quantitative study that uses Gaussian embeddings to examine the relationships between word frequencies and semantic breadth. [en_US]
dc.description.degree: M.S. [en_US]
dc.description.provenance: Submitted by Betül Özen (ozen@bilkent.edu.tr) on 2021-11-30T09:52:49Z No. of bitstreams: 1 Disclosing Zipfian regularities in semantic breadth of words via multimodal Gaussian embeddings.pdf: 1328028 bytes, checksum: 930841ac2da5a7b6d1383bc711f57cb9 (MD5) [en]
dc.description.provenance: Made available in DSpace on 2021-11-30T09:52:49Z (GMT). No. of bitstreams: 1 Disclosing Zipfian regularities in semantic breadth of words via multimodal Gaussian embeddings.pdf: 1328028 bytes, checksum: 930841ac2da5a7b6d1383bc711f57cb9 (MD5) Previous issue date: 2021-11 [en]
dc.description.statementofresponsibility: by Furkan Şahinuç [en_US]
dc.format.extent: xv, 69 leaves : charts, tables ; 30 cm. [en_US]
dc.identifier.itemid: B133878
dc.identifier.uri: http://hdl.handle.net/11693/76703
dc.language.iso: English [en_US]
dc.publisher: Bilkent University [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: Word variances [en_US]
dc.subject: Word frequencies [en_US]
dc.subject: Zipf's law [en_US]
dc.subject: Meaning-frequency relation [en_US]
dc.subject: Zipfian regularities [en_US]
dc.subject: Word entailment [en_US]
dc.subject: Semantic breadth [en_US]
dc.title: Disclosing Zipfian regularities in semantic breadth of words via multimodal Gaussian embeddings [en_US]
dc.title.alternative: Çok modlu Gauss kelime temsilleri ile sözcüklerin anlamsal genişliğindeki Zipf'sel düzenliliklerin ortaya çıkarımı [en_US]
dc.type: Thesis [en_US]

Files

Original bundle
Name: Disclosing Zipfian regularities in semantic breadth of words via multimodal Gaussian embeddings.pdf
Size: 1.27 MB
Format: Adobe Portable Document Format
Description: Full printable version
License bundle
Name: license.txt
Size: 1.69 KB
Format: Item-specific license agreed upon to submission