Browsing by Subject "Meaning-frequency relation"
Item Open Access
Disclosing Zipfian regularities in semantic breadth of words via multimodal Gaussian embeddings (2021-11)
Şahinuç, Furkan
Being one of the most common empirical regularities, Zipf's law for word frequencies is a power-law relation between word frequencies and frequency ranks of words. In this thesis, the semantic uncertainty (i.e., semantic coverage) of words is quantitatively studied through non-point distribution-based word embeddings and a new Zipfian regularity is revealed. Uncertainty or semantic coverage of a word can increase for several reasons, such as polysemy, having a broad meaning (such as the relation between broader emotion and narrower exasperation), or a combination of both. Although there are studies that touch upon measuring the generality-specificity levels of words, Zipfian patterns of these features are not shown quantitatively with a theoretical background. The main aim of this thesis is to bridge this gap in the Zipfian literature. To this end, variances of Gaussian embeddings are utilized to quantify to what extent a word can be used in different senses or contexts. Using the variance information embedded in the non-point Gaussian embeddings, Zipfian patterns which exist in the semantic breadth of words are quantitatively shown when polysemy is controlled. This outcome is complementary to Zipf's law of meaning distribution and the related meaning-frequency law by indicating the existence of Zipfian patterns: more frequent words tend to be generic and uncertain. In contrast, less frequent ones tend to be specific. To verify the generalization of our findings, Zipfian patterns are investigated in the scope of polysemy neutralization, various language properties and several languages from different language families: English, German, Spanish, Russian, and Turkish. Such regularities provide valuable information to extract and understand relationships between semantic properties of words and word frequencies.
In various applications, performance improvements can be obtained by employing these fundamental regularities. A method is also proposed to leverage the Zipfian regularity to improve the performance of baseline lexical entailment detection algorithms. To the best of our knowledge, this thesis is the first quantitative study that uses Gaussian embeddings to examine the relationships between word frequencies and semantic breadth.

Item Open Access
Zipfian regularities in “non-point” word representations (Elsevier Ltd, 2021-05)
Şahinuç, Furkan; Koç, Aykut
Being one of the most common empirical regularities, Zipf’s law for word frequencies is a power-law relation between word frequencies and frequency ranks of words. We quantitatively study semantic uncertainty of words through non-point distribution-based word embeddings and reveal Zipfian regularities. Uncertainty of a word can increase due to polysemy, the word having “broad” meaning (such as the relation between broader emotion and narrower exasperation), or a combination of both. Variances of Gaussian embeddings are utilized to quantify the extent to which a word can be used in different senses or contexts. By using the variance information embedded in the non-point Gaussian embeddings, we quantitatively show that semantic breadth of words also exhibits Zipfian patterns when polysemy is controlled. This outcome is complementary to Zipf’s law of meaning distribution and the related meaning-frequency law by indicating the existence of Zipfian patterns: more frequent words tend to be generic while less frequent ones tend to be specific. Results for two languages, English and Turkish, which belong to different language families, are also provided. Such regularities provide valuable information to extract and understand relationships between semantic properties of words and word frequencies. In various applications, performance improvements can be obtained by employing these regularities.
We also propose a method that leverages the Zipfian regularity to improve the performance of baseline textual entailment detection algorithms. To the best of our knowledge, our approach is the first quantitative study that uses Gaussian embeddings to examine the relationships between word frequencies and semantic breadth.
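
Both abstracts describe Zipf's law as a power-law relation between word frequency and frequency rank. As a rough illustration only, and not the authors' method, the exponent of such a relation can be estimated by a least-squares fit of log frequency against log rank. The function name `zipf_exponent` and the toy corpus below are hypothetical and chosen for this sketch; the Gaussian-embedding analysis of the listed works is not reproduced here.

```python
import math
from collections import Counter


def zipf_exponent(tokens):
    """Estimate alpha in f(r) ~ C / r**alpha by an ordinary least-squares
    fit of log(frequency) against log(rank). Hypothetical helper."""
    # Frequencies sorted in descending order; position + 1 is the rank.
    counts = sorted(Counter(tokens).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct words to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The fitted slope is negative for a decaying power law; return its magnitude.
    return -sxy / sxx


# Toy corpus (an assumption for illustration): counts follow 60 / rank exactly.
toy_tokens = (
    ["the"] * 60 + ["of"] * 30 + ["word"] * 20 + ["law"] * 15 + ["zipf"] * 12
)
alpha = zipf_exponent(toy_tokens)
print(f"estimated Zipf exponent: {alpha:.3f}")
```

On real corpora the fit is usually restricted to a range of ranks, since the highest and lowest ranks tend to deviate from a pure power law.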