Browsing by Author "Sevim, Nurullah"
Item Open Access
Analysis of gender bias in legal texts using natural language processing methods (2023-07)
Sevim, Nurullah
Word embeddings have become important building blocks that are used extensively in natural language processing (NLP). Despite their several advantages, word embeddings can unintentionally accommodate some gender- and ethnicity-based biases that are present within the corpora they are trained on. Therefore, ethical concerns have been raised, since word embeddings are extensively used in several high-level algorithms. Furthermore, transformer-based contextualized language models constitute the state of the art in several NLP tasks and applications. Despite their utility, contextualized models can contain human-like social biases, as their training corpora generally consist of human-generated text. Evaluating and removing social biases in NLP models has been an ongoing and prominent research endeavor. In parallel, NLP approaches in the legal area, namely legal NLP or computational law, have also been increasing recently. Eliminating unwanted bias in the legal domain is doubly crucial, since the law has the utmost importance and effect on people. We approach the gender bias problem from the scope of the legal text processing domain. In the first stage of our study, we focus on the gender bias in traditional word embeddings, like Word2Vec and GloVe. Word embedding models trained on corpora composed of legal documents and legislation from different countries have been utilized to measure and eliminate gender bias in legal documents. Several methods have been employed to reveal the degree of gender bias and observe its variations across countries. Moreover, a debiasing method has been used to neutralize unwanted bias. The preservation of semantic coherence of the debiased vector space has also been demonstrated by using high-level tasks. In the second stage, we study the gender bias encoded in BERT-based models.
We propose a new template-based bias measurement method with a bias evaluation corpus using crime words from the FBI database. This method quantifies the gender bias present in BERT-based models for legal applications. Furthermore, we propose a fine-tuning-based debiasing method using the European Court of Human Rights (ECtHR) corpus to debias legal pre-trained models. We test the debiased models on the LexGLUE benchmark to confirm that the underlying semantic vector space is not perturbed during the debiasing process. Finally, overall results and their implications are discussed in the scope of NLP in the legal domain.

Item Open Access
Gender bias in legal corpora and debiasing it (Cambridge University Press, 2022-03-30)
Koç, Aykut; Sevim, Nurullah; Şahinuç, Furkan
Word embeddings have become important building blocks that are used extensively in natural language processing (NLP). Despite their several advantages, word embeddings can unintentionally accommodate some gender- and ethnicity-based biases that are present within the corpora they are trained on. Therefore, ethical concerns have been raised, since word embeddings are extensively used in several high-level algorithms. Studying such biases and debiasing them has recently become an important research endeavor. Various studies have been conducted to measure the extent of bias that word embeddings capture and to eradicate it. Concurrently, as another subfield that has recently started to gain traction, the applications of NLP in the field of law have started to increase and develop rapidly. As law has a direct and utmost effect on people’s lives, the issues of bias for NLP applications in the legal domain are certainly important. However, to the best of our knowledge, bias issues have not yet been studied in the context of legal corpora. In this article, we approach the gender bias problem from the scope of the legal text processing domain.
Word embedding models trained on corpora composed of legal documents and legislation from different countries have been utilized to measure and eliminate gender bias in legal documents. Several methods have been employed to reveal the degree of gender bias and observe its variations across countries. Moreover, a debiasing method has been used to neutralize unwanted bias. The preservation of semantic coherence of the debiased vector space has also been demonstrated by using high-level tasks. Finally, overall results and their implications are discussed in the scope of NLP in the legal domain.

Item Restricted
Halis Turgut Cinlioğlu ve Tokat [Halis Turgut Cinlioğlu and Tokat] (Bilkent University, 2018)
Bakır, Alihan; Somtürk, Melike; Sevim, Nurullah; Atalay, Can; Seke, Batuhan

Item Open Access
Türkçe kelime temsillerinde cinsiyetçi ön yargının incelenmesi [Investigation of gender bias in Turkish word embeddings] (IEEE, 2021-07-19)
Sevim, Nurullah; Koç, Aykut
The study of gender bias in natural language processing applications has recently gained importance because of the potential negative consequences of a gender-biased approach. Such biases have been examined in various contexts, and much research has been carried out, particularly on English word embeddings. In this study, Turkish word embeddings are examined for gender bias, and the structure of Turkish is compared with that of English in terms of gender bias. Measurements of gender bias in the word embeddings lead to the conclusion that the structure of Turkish carries less gender bias than that of English.
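
The abstracts above refer to measuring gender bias in word embeddings and neutralizing it, without naming the specific methods used. As a rough illustration only, the sketch below assumes a projection-based measure along a "he" minus "she" direction and a neutralize step that removes that component, which is one common approach in the debiasing literature and not necessarily the authors' method. The function names and the three-dimensional toy vectors are hypothetical and do not come from any trained model.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def unit(u):
    """Scale a non-zero vector to unit length."""
    n = math.sqrt(dot(u, u))
    return [a / n for a in u]

def gender_direction(male_vec, female_vec):
    """Unit vector pointing from the female anchor to the male anchor."""
    return unit([m - f for m, f in zip(male_vec, female_vec)])

def bias_score(word_vec, direction):
    """Cosine between a word vector and the (unit) gender direction."""
    return dot(unit(word_vec), direction)

def neutralize(word_vec, direction):
    """Remove the component of word_vec lying along the gender direction."""
    proj = dot(word_vec, direction)
    return [w - proj * d for w, d in zip(word_vec, direction)]

# Toy, hand-made vectors (assumption: illustrative values only).
embeddings = {
    "he": [1.0, 0.0, 0.2],
    "she": [-1.0, 0.0, 0.2],
    "judge": [0.4, 0.9, 0.1],
}

g = gender_direction(embeddings["he"], embeddings["she"])
before = bias_score(embeddings["judge"], g)
after = bias_score(neutralize(embeddings["judge"], g), g)
print(f"bias before: {before:.3f}, after: {after:.3f}")
```

In this toy setup the score for "judge" is positive before the neutralize step and zero afterwards. With real embeddings, the direction is usually estimated from several gendered word pairs and not from a single one.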