Gender bias in legal corpora and debiasing it

Koç, Aykut; Sevim, Nurullah; Şahinuç, Furkan

buir.contributor.author	Koç, Aykut
buir.contributor.author	Sevim, Nurullah
buir.contributor.author	Şahinuç, Furkan
dc.citation.epage	34	en_US
dc.citation.spage	1	en_US
dc.contributor.author	Koç, Aykut
dc.contributor.author	Sevim, Nurullah
dc.contributor.author	Şahinuç, Furkan
dc.date.accessioned	2023-02-16T07:02:36Z
dc.date.available	2023-02-16T07:02:36Z
dc.date.issued	2022-03-30
dc.department	Department of Electrical and Electronics Engineering	en_US
dc.department	National Magnetic Resonance Research Center (UMRAM)	en_US
dc.description.abstract	Word embeddings have become important building blocks that are used extensively in natural language processing (NLP). Despite their advantages, word embeddings can unintentionally absorb gender- and ethnicity-based biases present in the corpora they are trained on. Because word embeddings are used in many high-level algorithms, this has raised ethical concerns. Studying and removing such biases has recently become an important research endeavor, and various studies have measured the extent of bias that word embeddings capture and sought to eliminate it. Concurrently, applications of NLP in the field of law, another subfield that has recently gained traction, have started to increase and develop rapidly. As law has a direct and profound effect on people’s lives, bias in NLP applications in the legal domain is certainly important. However, to the best of our knowledge, bias issues have not yet been studied in the context of legal corpora. In this article, we approach the gender bias problem from the perspective of legal text processing. Word embedding models trained on corpora composed of legal documents and legislation from different countries are used to measure and eliminate gender bias in legal documents. Several methods are employed to reveal the degree of gender bias and observe how it varies across countries. Moreover, a debiasing method is used to neutralize unwanted bias. The semantic coherence of the debiased vector space is shown to be preserved on high-level tasks. Finally, the overall results and their implications are discussed in the context of NLP in the legal domain.	en_US
dc.identifier.doi	10.1017/S1351324922000122	en_US
dc.identifier.eissn	1469-8110
dc.identifier.issn	1351-3249
dc.identifier.uri	http://hdl.handle.net/11693/111389
dc.language.iso	English	en_US
dc.publisher	Cambridge University Press	en_US
dc.relation.isversionof	http://dx.doi.org/10.1017/S1351324922000122	en_US
dc.source.title	Natural Language Engineering	en_US
dc.subject	Bias	en_US
dc.subject	NLP in law	en_US
dc.subject	Legal text processing	en_US
dc.subject	Law	en_US
dc.subject	Computational law	en_US
dc.title	Gender bias in legal corpora and debiasing it	en_US
dc.type	Article	en_US

Files

Original bundle

Name: Gender_bias_in_legal_corpora_and_debiasing_it.pdf
Size: 1.47 MB
Format: Adobe Portable Document Format

Collections

Scholarly Publications - UMRAM
Scholarly Publications - Electrical and Electronics Engineering