Authors: Doğan, Emre; Tüzün, Eray; Tecimer, Kazım Ayberk; Güvenir, Halil Altay
Date Available: 2020-01-27
Date Issued: 2019
ISBN: 9781728129693; 9781728129686
ISSN: 1949-3770; 1949-3789
Handle: http://hdl.handle.net/11693/52823
Date of Conference: 19-20 September 2019
Conference Name: 13th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2019
Title: Investigating the validity of ground truth in code reviewer recommendation studies
Type: Conference Paper
DOI: 10.1109/ESEM.2019.8870190
Language: English
Keywords: Reviewer recommendation; Ground truth; Cognitive bias; Attribute substitution; Systematic noise; Threats to validity

Abstract:
Background: Selecting the ideal code reviewer is a crucial first step toward effective modern code review. Several algorithms have been proposed in the literature for recommending the ideal code reviewer for a given pull request. The success of these code reviewer recommendation algorithms is measured by comparing the recommended reviewers against the ground truth, i.e., the reviewers actually assigned in real life. In practice, however, the assigned reviewer may not be the ideal reviewer for a given pull request.
Aims: In this study, we investigate the validity of the ground truth data used in code reviewer recommendation studies.
Method: Through an informal literature review, we compared the reviewer selection heuristics used in real life with the algorithms used in recommendation models. We further support our claims with empirical data from code reviewer recommendation studies.
Results: The literature review and the accompanying empirical data show that the ground truth data used in code reviewer recommendation studies is potentially problematic, which reduces the validity of the code reviewer datasets and of the reviewer recommendation studies.
Conclusion: We demonstrate cases in which the ground truth in code reviewer recommendation studies is invalid and discuss potential solutions to address this issue.