Leveraging file significance in bus factor estimation

Date

2025-01

Advisor

Tüzün, Eray

Abstract

Software projects often face developer turnover for various reasons. Since developers are key sources of knowledge in these projects, their absence inevitably leads to some degree of knowledge loss. The Bus Factor (BF) is a metric used to assess the impact of this knowledge loss on a project’s continuity. Traditionally, BF is defined as the smallest group of developers whose departure would result in a loss of more than half of the project’s knowledge. Current state-of-the-art methods calculate developers’ knowledge based on the number of files they have authored, using data from version control systems (VCS). However, numerous studies have highlighted that not all files in software projects hold the same level of significance. In this study, we investigate the impact of weighting files based on their significance on the performance of two widely used BF estimators. Significance scores are calculated using five established graph metrics derived from the project’s Dependency Graph: PageRank, In-/Out-/All-Degree, and Betweenness Centralities. Additionally, we introduce BFSig, a prototype implementing our approach. Lastly, we present a new dataset featuring BF scores reported by software practitioners from five prominent GitHub repositories. Our findings show that BFSig surpasses the baseline methods, achieving up to an 18% reduction in Normalized Mean Absolute Error (NMAE). Additionally, BFSig reduces False Negatives by 18% when identifying potential risks linked to low BF. Furthermore, our respondents validated BFSig’s versatility, highlighting its capability to evaluate the BF of individual project subfolders. In conclusion, we believe that when estimating BF from authorship, software components of greater significance should be given higher weight.
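
The idea of weighting files by significance can be illustrated with a minimal Python sketch. It is not the BFSig implementation, nor either of the baseline estimators evaluated in the thesis. It assumes a simple greedy heuristic (repeatedly remove the developer covering the most weighted knowledge until more than half is lost), treats a file as lost once none of its authors remain, and uses in-degree in the dependency graph as the significance score. The function names `in_degree_weights` and `greedy_bus_factor` and the toy data are hypothetical.

```python
from collections import defaultdict


def in_degree_weights(files, dependencies):
    """Assumed significance score: 1 + number of files that depend on a file."""
    weights = {f: 1.0 for f in files}
    for _src, dst in dependencies:  # each edge means "src depends on dst"
        weights[dst] += 1.0
    return weights


def greedy_bus_factor(authors_by_file, weights=None):
    """Greedy estimate: remove developers until more than half of the
    (weighted) files have no remaining author."""
    if weights is None:
        weights = {f: 1.0 for f in authors_by_file}  # unweighted baseline
    total = sum(weights[f] for f in authors_by_file)
    remaining = {f: set(devs) for f, devs in authors_by_file.items()}
    removed, lost = [], 0.0
    while lost <= total / 2:
        # Weighted knowledge each remaining developer still covers.
        coverage = defaultdict(float)
        for f, devs in remaining.items():
            for d in devs:
                coverage[d] += weights[f]
        if not coverage:
            break
        # Sort names first so ties are broken deterministically.
        top = max(sorted(coverage), key=coverage.get)
        removed.append(top)
        for devs in remaining.values():
            devs.discard(top)
        lost = sum(weights[f] for f, devs in remaining.items() if not devs)
    return len(removed), removed


# Toy project: every other file depends on core.py.
authors_by_file = {
    "core.py": {"alice"},
    "a.py": {"bob"},
    "b.py": {"carol"},
    "c.py": {"dave"},
    "d.py": {"bob"},
}
dependencies = [(f, "core.py") for f in ("a.py", "b.py", "c.py", "d.py")]

unweighted = greedy_bus_factor(authors_by_file)
weighted = greedy_bus_factor(
    authors_by_file, in_degree_weights(authors_by_file, dependencies)
)
print(unweighted)  # (2, ['bob', 'alice'])
print(weighted)    # (1, ['alice'])
```

In this toy project, counting files equally gives an estimate of 2, while weighting by in-degree gives 1, because the sole author of the heavily depended-upon `core.py` alone accounts for more than half of the weighted knowledge.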

Degree Discipline

Computer Engineering

Degree Level

Master's

Degree Name

MS (Master of Science)

Language

English

Type