TY - JOUR
T1 - Bayesian network-based over-sampling method (BOSME) with application to indirect cost-sensitive learning
AU - Delgado de la Torre, Rosario
AU - Núñez-González, José-David
N1 - Publisher Copyright:
© 2022, The Author(s).
PY - 2022/5/24
Y1 - 2022/5/24
N2 - Traditional supervised learning algorithms do not satisfactorily solve the classification problem on imbalanced data sets, since they tend to assign the majority class, to the detriment of minority-class classification. In this paper, we introduce the Bayesian network-based over-sampling method (BOSME), which is a new over-sampling methodology based on Bayesian networks. Over-sampling methods handle imbalanced data by generating synthetic minority instances, with the benefit that classifiers learned from a more balanced data set have a better ability to predict the minority class. What makes BOSME different is that it relies on a new approach, generating artificial instances of the minority class following the probability distribution of a Bayesian network that is learned from the original minority class by likelihood maximization. We compare BOSME with the benchmark synthetic minority over-sampling technique (SMOTE) through a series of experiments in the context of indirect cost-sensitive learning, with some state-of-the-art classifiers and various data sets, showing statistical evidence in favor of BOSME with respect to the expected (misclassification) cost.
AB - Traditional supervised learning algorithms do not satisfactorily solve the classification problem on imbalanced data sets, since they tend to assign the majority class, to the detriment of minority-class classification. In this paper, we introduce the Bayesian network-based over-sampling method (BOSME), which is a new over-sampling methodology based on Bayesian networks. Over-sampling methods handle imbalanced data by generating synthetic minority instances, with the benefit that classifiers learned from a more balanced data set have a better ability to predict the minority class. What makes BOSME different is that it relies on a new approach, generating artificial instances of the minority class following the probability distribution of a Bayesian network that is learned from the original minority class by likelihood maximization. We compare BOSME with the benchmark synthetic minority over-sampling technique (SMOTE) through a series of experiments in the context of indirect cost-sensitive learning, with some state-of-the-art classifiers and various data sets, showing statistical evidence in favor of BOSME with respect to the expected (misclassification) cost.
KW - Algorithms
KW - Bayes Theorem
UR - https://www.mendeley.com/catalogue/3882b0a9-ac83-3787-9bc1-6142bc27603a/
UR - https://portalrecerca.uab.cat/en/publications/86784661-8cf3-4b07-b904-87885df42e95
UR - http://www.scopus.com/inward/record.url?scp=85130718844&partnerID=8YFLogxK
U2 - 10.1038/s41598-022-12682-8
DO - 10.1038/s41598-022-12682-8
M3 - Article
C2 - 35610323
SN - 2045-2322
VL - 12
SP - 8724
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 8724
ER -