Breast Cancer Detection Using Ensemble Classifiers for Accuracy Improvement

Shamsi, Mahboubeh; Karimian, Mohadaseh; Karimian, Marziyeh

Authors

¹ Assistant Professor, Faculty of Electrical and Computer Engineering, Qom University of Technology, Qom, Iran. Email: shamsi@qut.ac.ir

² M.Sc. in Computer Engineering, Faculty of Electrical and Computer Engineering, Shahab Danesh University, Qom, Iran. Email: m.karimian90@gmail.com

³ M.Sc. in Computer Engineering, Faculty of Electrical and Computer Engineering, Shahab Danesh University, Qom, Iran. Email: m.karimian64@gmail.com

Abstract

Early detection of breast cancer plays a key role in treating the patient. Data mining algorithms can now offer the health system intelligent methods that detect breast cancer with high accuracy. The aim of this study is to detect breast cancer using ensemble classifiers on the preprocessed WBC and WDBC databases. On the WBC database, our proposed model (feature reduction with CFS + sample optimization with the Resample method + an ensemble classifier of KStar, random forest, Bayes network, and Naïve Bayes) has the best detection accuracy (100%), an implementation time of 0 seconds, and no errors. On the WDBC database, the proposed model (feature reduction with CFS + sample optimization with the Resample method + an ensemble classifier of IBk, Bayes network, Naïve Bayes, and KStar) has an accuracy of 99.29%, an implementation time of 0 seconds, and a mean absolute error of 0.007. The results of this study show that ensemble classification methods applied to a preprocessed database make it possible to design new systems that assist physicians and facilitate diagnostic and treatment processes.
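The ensemble step named in the abstract can be illustrated with a minimal sketch. It assumes the base classifiers are combined by averaging their class-probability outputs, which is one common combination rule; the abstract does not state which rule the authors used. The function names, the `benign`/`malignant` labels, and the sample probabilities are hypothetical, and the error measure follows one assumed convention for mean absolute error over class probabilities.

```python
def average_probabilities(distributions):
    """Average the class-probability outputs of several base classifiers.

    `distributions` holds one dict per base classifier, mapping each class
    label to the probability that classifier assigns to it.
    """
    n = len(distributions)
    classes = distributions[0].keys()
    return {c: sum(d[c] for d in distributions) / n for c in classes}


def predict_label(distributions):
    """Pick the class with the highest averaged probability."""
    avg = average_probabilities(distributions)
    return max(avg, key=avg.get)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def mean_absolute_error(y_true, ensemble_distributions):
    """MAE between predicted probabilities and 0/1 indicators of the true
    class, averaged over classes and samples (an assumed convention)."""
    total = 0.0
    for truth, dist in zip(y_true, ensemble_distributions):
        per_class = [abs(p - (1.0 if c == truth else 0.0)) for c, p in dist.items()]
        total += sum(per_class) / len(per_class)
    return total / len(y_true)


# Hypothetical outputs of four base classifiers for a single sample.
sample = [
    {"benign": 0.75, "malignant": 0.25},
    {"benign": 0.5, "malignant": 0.5},
    {"benign": 0.25, "malignant": 0.75},
    {"benign": 1.0, "malignant": 0.0},
]
combined = average_probabilities(sample)
label = predict_label(sample)
print(label, combined)
```

Averaging probabilities rather than counting votes lets a confident base classifier outweigh several uncertain ones, which matters when the ensemble mixes instance-based and Bayesian learners as described here.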

Article Title [English]

Breast Cancer Detection Using Ensemble Classifiers for Accuracy Improvement

Authors [English]

Mahboubeh Shamsi ¹
Mohadaseh Karimian ²
Marziyeh Karimian ³

¹ Assistant Professor, Faculty of Electrical and Computer Engineering, Qom University of Technology, Qom, Iran. Email: shamsi@qut.ac.ir

² M.Sc. in Computer Engineering, Faculty of Electrical and Computer Engineering, Shahab Danesh University, Qom, Iran. Email: m.karimian90@gmail.com

³ M.Sc. in Computer Engineering, Faculty of Electrical and Computer Engineering, Shahab Danesh University, Qom, Iran. Email: m.karimian64@gmail.com

Abstract [English]

Early diagnosis of breast cancer plays a crucial role in treating the patient. Data mining algorithms can now provide the healthcare system with intelligent methods that detect breast cancer with high accuracy. The purpose of this study is to detect breast cancer using ensemble classifiers on the preprocessed WBC and WDBC databases. On the WBC database, the proposed model (feature reduction with CFS + sample optimization with Resample + an ensemble classifier combining KStar, random forest, Naïve Bayes, and Bayes network) achieves the best detection accuracy (100%), with an implementation time of 0 seconds and no errors. On the WDBC database, the proposed model (feature reduction with CFS + sample optimization with Resample + an ensemble classifier combining IBk, Naïve Bayes, Bayes network, and KStar) achieves an accuracy of 99.29%, an implementation time of 0 seconds, and a mean absolute error of 0.007. These results show that ensemble classifiers applied to a preprocessed database can be used to design new systems that assist physicians and facilitate diagnostic and treatment processes.
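The CFS (correlation-based feature selection) step scores a candidate feature subset by rewarding correlation with the class and penalizing redundancy among the features. The sketch below computes Hall's merit heuristic, merit = k * r_cf / sqrt(k + k * (k - 1) * r_ff), from correlation values supplied by the caller. The function name and the example correlations are illustrative assumptions; how the correlations are estimated and how subsets are searched is not covered, and the abstract gives no detail on either.

```python
import math


def cfs_merit(feature_class_corr, feature_feature_corr):
    """Merit of a feature subset under the CFS heuristic.

    feature_class_corr: one correlation per feature with the class label.
    feature_feature_corr: correlations for each pair of features in the subset.
    """
    k = len(feature_class_corr)
    if k == 0:
        return 0.0
    # Mean feature-class correlation (relevance).
    r_cf = sum(abs(r) for r in feature_class_corr) / k
    # Mean feature-feature correlation (redundancy); zero for a single feature.
    if feature_feature_corr:
        r_ff = sum(abs(r) for r in feature_feature_corr) / len(feature_feature_corr)
    else:
        r_ff = 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


# Two equally relevant features: uncorrelated versus fully redundant.
independent = cfs_merit([0.6, 0.6], [0.0])
redundant = cfs_merit([0.6, 0.6], [1.0])
print(independent > redundant)
```

With equal relevance, the uncorrelated pair scores higher than the fully redundant pair, which is the behaviour that lets CFS shrink the feature set before classification.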

Keywords [English]

Accuracy Improvement
Data Mining
Ensemble Classifiers
Feature Selection
Sampling

Volume 8, Issue 2 - Serial Number 15
Mehr 1401 (September-October 2022)
Pages 92-109
