Feature Selection for Bankruptcy Prediction
The prediction of company bankruptcy is a critical concern for stakeholders due to its potential financial implications. This project addresses the challenge of identifying the key attributes that contribute to bankruptcy, given the complexity and the significant class imbalance in the data.
To tackle this, I combined three feature-selection filters with an oversampling step:
- Variance Threshold
- Mutual Information Technique
- Pearson Correlation
- SMOTE (Synthetic Minority Oversampling Technique)
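As a rough illustration of how these four steps could be chained, here is a minimal sketch that assumes NumPy and scikit-learn. It is not the project's actual code: the function names (`select_features`, `smote_sketch`), the thresholds, and the synthetic data are all hypothetical, and the oversampler is a simplified stand-in for a full SMOTE implementation such as the one in imbalanced-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import VarianceThreshold, mutual_info_classif
from sklearn.neighbors import NearestNeighbors


def select_features(X, y, var_min=1e-4, mi_keep=10, corr_max=0.9, seed=0):
    """Return the column indices that survive the three filters (hypothetical helper)."""
    # 1) Variance threshold: drop near-constant columns.
    vt = VarianceThreshold(threshold=var_min).fit(X)
    idx = np.flatnonzero(vt.get_support())

    # 2) Mutual information: keep the mi_keep columns most informative about y,
    #    ordered from most to least informative.
    mi = mutual_info_classif(X[:, idx], y, random_state=seed)
    idx = idx[np.argsort(mi)[::-1][:mi_keep]]

    # 3) Pearson correlation: walk the columns in that order and skip any column
    #    whose absolute correlation with an already kept column exceeds corr_max.
    corr = np.atleast_2d(np.abs(np.corrcoef(X[:, idx], rowvar=False)))
    kept = []
    for j in range(len(idx)):
        if all(corr[j, k] <= corr_max for k in kept):
            kept.append(j)
    return idx[kept]


def smote_sketch(X, y, minority=1, k=5, seed=0):
    """Simplified SMOTE: interpolate between minority points and their neighbours."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority]
    n_new = int((y != minority).sum() - len(X_min))
    if n_new <= 0:
        return X, y
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    # Drop the first neighbour, which is normally the point itself.
    neigh = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)
    pick = neigh[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    X_new = X_min[base] + gap * (X_min[pick] - X_min[base])
    y_new = np.full(n_new, minority, dtype=y.dtype)
    return np.vstack([X, X_new]), np.concatenate([y, y_new])


# Synthetic, imbalanced stand-in data (not the Taiwanese dataset).
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           n_redundant=5, weights=[0.95, 0.05], random_state=0)
cols = select_features(X, y)
X_bal, y_bal = smote_sketch(X[:, cols], y)
print("kept columns:", cols)
print("class counts after oversampling:", np.bincount(y_bal))
```

In a real evaluation, the oversampling step would normally be applied to the training split only, so that synthetic points do not leak into the test data.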
Using a Taiwanese dataset, I applied these preprocessing steps to reduce the feature set and rebalance the classes. I then trained four classifiers (Random Forest, Decision Tree, AdaBoost, and XGBoost) to compare their performance on the refined data.
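A comparison of this kind could be sketched as follows, assuming scikit-learn and, optionally, the `xgboost` package. The synthetic data, the hyperparameters, and the choice of F1 as the metric are assumptions made for illustration, not details taken from the project. F1 is used here because plain accuracy can look high on imbalanced data even when the minority class is predicted poorly.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data (not the Taiwanese dataset).
X, y = make_classification(n_samples=600, n_features=15, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)
# A stratified split keeps the class ratio the same in both parts.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
}
try:
    # XGBoost is a separate package; it is skipped if not installed.
    from xgboost import XGBClassifier
    models["XGBoost"] = XGBClassifier(random_state=0)
except ImportError:
    pass

scores = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    # F1 on the minority (positive) class, scored on held-out data.
    scores[name] = f1_score(y_te, model.predict(X_te), zero_division=0)

for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:14s} F1 = {score:.3f}")
```

Any oversampling would be fitted on `X_tr`, `y_tr` only, leaving the test split untouched.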
The results showed improved predictive accuracy on the refined data and give a side-by-side comparison of the four algorithms for bankruptcy prediction. The approach both identifies the most informative attributes and addresses the class imbalance, which supports more reliable bankruptcy prediction models.

