Detection and elimination of systematic labeling bias in code reviewer recommendation systems

buir.contributor.author: Tüzün, Eray
buir.contributor.author: Dibeklioğlu, Hamdi
buir.contributor.orcid: Tüzün, Eray | 0000-0002-5550-7816
dc.citation.epage: 191
dc.citation.spage: 181
dc.contributor.author: Tecimer, K. Ayberk
dc.contributor.author: Tüzün, Eray
dc.contributor.author: Dibeklioğlu, Hamdi
dc.contributor.author: Erdoğmuş, Hakan
dc.coverage.spatial: Trondheim, Norway
dc.date.accessioned: 2022-02-03T06:08:29Z
dc.date.available: 2022-02-03T06:08:29Z
dc.date.issued: 2021-06-21
dc.department: Department of Computer Engineering
dc.description: Conference Name: EASE 2021: Evaluation and Assessment in Software Engineering
dc.description: Date of Conference: June 2021
dc.description.abstract: Reviewer selection in modern code review is crucial for effective code reviews. Several techniques exist for recommending reviewers appropriate for a given pull request (PR). Most code reviewer recommendation techniques in the literature build and evaluate their models on datasets collected from real open-source or industrial projects. These techniques invariably presume that such datasets reliably represent the "ground truth." In the context of a classification problem, ground truth refers to the objectively correct class labels used to build models from a dataset or to evaluate a model's performance. In a project dataset used to build a code reviewer recommendation system, the reviewer picked for a PR is usually assumed to be the best code reviewer for that PR. In practice, however, the picked reviewer may not be the best possible reviewer, or even a qualified one. Recent code reviewer recommendation studies suggest that the datasets used tend to suffer from systematic labeling bias, making the ground truth unreliable. Therefore, models and recommendation systems built on such datasets may perform poorly in real practice. In this study, we introduce a novel approach to automatically detect and eliminate systematic labeling bias in code reviewer recommendation systems. The bias that we remove results from selecting reviewers who do not ensure a permanently successful fix for a bug-related PR. To demonstrate the effectiveness of our approach, we evaluated it on two open-source project datasets (HIVE and QT Creator) and with five code reviewer recommendation techniques (Profile-Based, RSTrace, Naive Bayes, k-NN, and Decision Tree). Our debiasing approach appears promising, since it improved the Mean Reciprocal Rank (MRR) of the evaluated techniques by up to 26% on the datasets used.
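For context on the headline result, Mean Reciprocal Rank is a standard metric for ranked recommendation lists: for each PR it scores the reciprocal of the rank at which the ground-truth reviewer appears, then averages over all PRs. The sketch below illustrates the standard formula only; the function name, reviewer names, and data are illustrative, not taken from the paper.

```python
def mean_reciprocal_rank(recommendations, actual_reviewers):
    """Standard MRR over ranked recommendation lists.

    recommendations: one ranked list of candidate reviewers per PR.
    actual_reviewers: the ground-truth reviewer label for each PR.
    """
    total = 0.0
    for ranked, actual in zip(recommendations, actual_reviewers):
        if actual in ranked:
            # rank is 1-based, so reciprocal rank = 1 / (index + 1)
            total += 1.0 / (ranked.index(actual) + 1)
        # a ground-truth reviewer absent from the list contributes 0
    return total / len(actual_reviewers)

# Illustrative data: the actual reviewer is ranked 1st, then 2nd,
# then absent, so MRR = (1/1 + 1/2 + 0) / 3 = 0.5
print(mean_reciprocal_rank(
    [["alice", "bob"], ["carol", "alice"], ["dave"]],
    ["alice", "alice", "eve"],
))  # → 0.5
```

A systematic labeling bias in `actual_reviewers` (e.g., labels drawn from fixes that were later reopened) would distort this score for every technique evaluated against it, which is why the paper cleans the labels before comparing recommenders.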
dc.description.provenance: Submitted by Betül Özen (ozen@bilkent.edu.tr) on 2022-02-03T06:08:29Z. No. of bitstreams: 1. Detection_and_Elimination_of_Systematic_Labeling_Bias_in_Code_Reviewer_Recommendation_Systems.pdf: 1283429 bytes, checksum 32ef700fe268d1cd66681494646b82ef (MD5).
dc.identifier.doi: 10.1145/3463274.3463336
dc.identifier.isbn: 978-145039053-8
dc.identifier.uri: http://hdl.handle.net/11693/76981
dc.language.iso: English
dc.publisher: Association for Computing Machinery
dc.relation.isversionof: https://doi.org/10.1145/3463274.3463336
dc.source.title: EASE 2021: Evaluation and Assessment in Software Engineering
dc.subject: Modern code review
dc.subject: Ground truth
dc.subject: Labeling bias elimination
dc.subject: Systematic labeling bias
dc.subject: Data cleaning
dc.subject: Code review recommendation
dc.title: Detection and elimination of systematic labeling bias in code reviewer recommendation systems
dc.type: Conference Paper

Files

Original bundle

Name: Detection_and_Elimination_of_Systematic_Labeling_Bias_in_Code_Reviewer_Recommendation_Systems.pdf
Size: 1.22 MB
Format: Adobe Portable Document Format
License bundle

Name: license.txt
Size: 1.69 KB
Format: Item-specific license agreed upon to submission