Prediction the Choice of Financing for Start-ups using Machine Learning Algorithms and Behavioral Biases

Niazi, Naimeh; Razavi, Hamideh

doi:10.22091/jemsc.2024.11203.1200

Document Type : Original Article

Authors

Department of Industrial Engineering, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad, Iran

Abstract

This paper aims to predict financing methods in order to support decision-making by startup founders and their investors. First, the factors influencing the choice of financing method were identified, covering structural, demographic, and behavioral factors. These factors were then measured with a 32-item questionnaire sent online to startup founders. Based on the 70 responses received, the financing methods chosen by startups were predicted using binary relevance, classifier chains, label powerset, K-nearest neighbors, extreme gradient boosting (XGBoost), categorical boosting (CatBoost), and random forest. A comparison of the results shows that the boosting ensemble algorithm, with an F1 score of 89% and a precision of 85%, predicts the selected financing methods on the test dataset better than the other algorithms. In addition, the data analysis indicates that startups lean toward personal funding methods, which is consistent with the prevalence of loss aversion bias among entrepreneurs. After loss aversion, the most frequent biases among entrepreneurs were overconfidence, anchoring, and illusion of control.
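
To illustrate the kind of multi-label evaluation the abstract refers to, the following is a minimal sketch, not the authors' implementation. It shows how a label powerset transformation turns each combination of financing methods into a single class, and how precision and F1 can be computed over multi-label predictions. The three label columns, the toy data, and the use of micro-averaging are assumptions made for illustration; the abstract does not state which averaging scheme was used.

```python
from typing import Dict, List, Sequence, Tuple

LabelRow = Tuple[int, ...]


def to_powerset_classes(rows: Sequence[LabelRow]) -> Tuple[List[int], Dict[LabelRow, int]]:
    """Label powerset: map each distinct label combination to one class id."""
    class_of: Dict[LabelRow, int] = {}
    encoded: List[int] = []
    for row in rows:
        key = tuple(row)
        if key not in class_of:
            class_of[key] = len(class_of)  # next unused class id
        encoded.append(class_of[key])
    return encoded, class_of


def micro_precision_f1(y_true: Sequence[LabelRow], y_pred: Sequence[LabelRow]) -> Tuple[float, float]:
    """Micro-averaged precision and F1: pool counts over all labels and samples."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, f1


# Hypothetical label columns: (personal funds, bank loan, venture capital).
y_true = [(1, 0, 0), (1, 1, 0), (0, 0, 1), (1, 0, 0)]
y_pred = [(1, 0, 0), (1, 0, 0), (0, 1, 1), (1, 0, 0)]

encoded, class_of = to_powerset_classes(y_true)
precision, f1 = micro_precision_f1(y_true, y_pred)

print("powerset classes:", encoded)
print(f"precision={precision:.2f}, F1={f1:.2f}")
```

Binary relevance and classifier chains would instead train one classifier per label column; the same metric function applies to their predictions, so the methods can be compared on a common footing.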

Main Subjects

Decision Support System (DSS)

Engineering Management and Soft Computing

Volume 10, Issue 1 - Serial Number 18
August 2024
Pages 238-261