Browsing by Subject "NLP in law"
Gender bias in legal corpora and debiasing it
(Cambridge University Press, 2022-03-30) Koç, Aykut; Sevim, Nurullah; Şahinuç, Furkan
Word embeddings have become important building blocks that are used profoundly in natural language processing (NLP). Despite their several advantages, word embeddings can unintentionally accommodate some gender- and ethnicity-based biases that are present within the corpora they are trained on. Therefore, ethical concerns have been raised since word embeddings are extensively used in several high-level algorithms. Studying such biases and debiasing them have recently become an important research endeavor. Various studies have been conducted to measure the extent of bias that word embeddings capture and to eradicate them. Concurrently, as another subfield that has started to gain traction recently, the applications of NLP in the field of law have started to increase and develop rapidly. As law has a direct and utmost effect on people’s lives, the issues of bias for NLP applications in legal domain are certainly important. However, to the best of our knowledge, bias issues have not yet been studied in the context of legal corpora. In this article, we approach the gender bias problem from the scope of legal text processing domain. Word embedding models that are trained on corpora composed by legal documents and legislation from different countries have been utilized to measure and eliminate gender bias in legal documents. Several methods have been employed to reveal the degree of gender bias and observe its variations over countries. Moreover, a debiasing method has been used to neutralize unwanted bias. The preservation of semantic coherence of the debiased vector space has also been demonstrated by using high-level tasks. Finally, overall results and their implications have been discussed in the scope of NLP in legal domain.

Named-entity recognition in Turkish legal texts
(Cambridge University Press, 2022-07-11) Çetindağ, Can; Yazıcıoğlu, Berkay; Koç, Aykut
Natural language processing (NLP) technologies and applications in legal text processing are gaining momentum. Being one of the most prominent tasks in NLP, named-entity recognition (NER) can substantiate a great convenience for NLP in law due to the variety of named entities in the legal domain and their accentuated importance in legal documents. However, domain-specific NER models in the legal domain are not well studied. We present a NER model for Turkish legal texts with a custom-made corpus as well as several NER architectures based on conditional random fields and bidirectional long-short-term memories (BiLSTMs) to address the task. We also study several combinations of different word embeddings consisting of GloVe, Morph2Vec, and neural network-based character feature extraction techniques either with BiLSTM or convolutional neural networks. We report 92.27% F1 score with a hybrid word representation of GloVe and Morph2Vec with character-level features extracted with BiLSTM. Being an agglutinative language, the morphological structure of Turkish is also considered. To the best of our knowledge, our work is the first legal domain-specific NER study in Turkish and also the first study for an agglutinative language in the legal domain. Thus, our work can also have implications beyond the Turkish language.

Retrieving Turkish prior legal cases with deep learning
(Bilkent University, 2023-06) Öztürk, Ceyhun Emre
This study utilizes deep learning models to retrieve prior legal cases in the Court of Cassation in Turkey. Given the vast legal databases that legal professionals need to navigate and the ability of computers to handle large amounts of text quickly, information retrieval algorithms prove beneficial for legal practitioners. In this thesis, we introduce our legal recurrent neural network (RNN) models and the BERTurk-Legal model. We also introduce dense word embeddings for the Turkish legal domain. Moreover, we employ RNN autoencoders, Legal RNN autoencoders, combinations of RNN autoencoders with BM25 algorithms, and BERTurk-Legal to retrieve prior legal cases. We obtain the best results with the BERTurk-Legal model.