Abstract:
Background. Human resources departments hire employees based on behavioral and technical assessments, but these decisions can be biased. When an employee resigns, the firm loses both the time and the money it invested in hiring. Machine learning algorithms can help alleviate this problem if applied correctly: they can process data in bulk and automate processes that would otherwise be slow and tedious.
Objectives. This study explores the factors that influence attrition, with the aim of helping to reduce the costs a company incurs. We use machine learning algorithms to predict employee turnover, analyze the factors that lead to employee attrition, and group these factors by their impact on the number of years an employee is willing to stay at the company.
Methods. We test and compare several machine learning algorithms on our (fictitious) dataset: Random Forest, Decision Tree, Naïve Bayes, Logistic Regression, Adaptive Boosting, Support Vector Machine, K-nearest neighbors, and Artificial Neural Networks. We also evaluate the features that contribute to attrition by ranking them from most to least important. Moreover, we build a regression model and highlight the features most strongly correlated with employee attrition, whether positively or negatively. Finally, we present a retention plan for avoiding attrition, based on the combined results of our analysis.
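As an illustration of how candidate models can be compared by AUROC, the following is a minimal sketch in plain Python. It is not the code used in this study: the labels, the scores, and the names `model_a` and `model_b` are hypothetical toy values chosen only to show the computation. AUROC is computed here as the probability that a randomly chosen positive case (an employee who left) receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
from typing import Dict, List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the AUROC: the fraction of (positive, negative) pairs in which
    the positive example has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_models(
    labels: Sequence[int], model_scores: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Score every model with AUROC and sort from best to worst."""
    results = [(name, auroc(labels, s)) for name, s in model_scores.items()]
    return sorted(results, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 1 = employee left, 0 = employee stayed.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
# Hypothetical predicted attrition probabilities from two placeholder models.
toy_scores = {
    "model_a": [0.9, 0.2, 0.7, 0.4, 0.1, 0.8, 0.3, 0.5],
    "model_b": [0.6, 0.5, 0.4, 0.7, 0.2, 0.9, 0.3, 0.1],
}

ranking = rank_models(y_true, toy_scores)
for name, value in ranking:
    print(f"{name}: AUROC = {value:.2f}")
```

The pairwise definition is quadratic in the number of examples, which is acceptable for a small illustration; on a dataset of realistic size, a library routine based on sorting would be the more practical choice.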
Results. Random Forest achieved the best results on our dataset in terms of AUROC and the other evaluation measures. The features with the greatest influence on attrition were overtime, total satisfaction score, marital status, stock option level, and monthly income. Moreover, age and monthly income showed a positive correlation with the number of years an employee stayed at the company, whereas the distance from home to work and the number of companies an employee had previously worked for showed a negative correlation.
Conclusion. The findings of this thesis highlight the reasons behind employee attrition. Based on our results, we provide detailed recommendations for reducing attrition and lowering its costs. The approach and methodology followed in this work can be extended and applied to real-world HR datasets.