Portfolio Optimization using Deep Reinforcement Learning

Namdari-Birgani, Somayeh; Seddighi, Amir Hossein; Molla-Alizadeh-Zavardehi, Saber

doi:10.22091/jemsc.2025.11158.1192

Document Type: Original Article

Authors

¹ Department of Management, Masjed-Soleiman Branch, Islamic Azad University, Masjed-Soleiman, Iran

² Department of Management, Masjed-Soleiman Branch, Islamic Azad University, Masjed-Soleiman, Iran

³ Department of Industrial Engineering, Masjed-Soleiman Branch, Islamic Azad University, Masjed-Soleiman, Iran

Abstract

This research aims to train an intelligent trading agent, using concepts from artificial intelligence, that helps investors make optimal decisions when investing in a stock portfolio. To this end, a portfolio optimization method based on deep Q reinforcement learning is presented. In this method, a policy network and a target policy network are used to learn the actions, while a learning network and a target network are used to estimate the optimal Q-value. The method is evaluated on data for the constituent companies of the Dow Jones Industrial Average (DJIA) from March 2008 to October 2021. Its performance is compared with conventional investment strategies and with two deep reinforcement learning algorithms, PPO and SAC. The results indicate that, among the methods investigated, the proposed method performs best on the test data, with a total profit of 35.6%. The proposed method also attains the highest Sharpe ratio, which suggests that it balances profit and risk better than the other strategies.
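The two evaluation measures named in the abstract, total profit and the Sharpe ratio, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the zero risk-free rate, the 252-trading-day annualization factor, and the sample portfolio values are all assumptions made for illustration.

```python
import math
import statistics


def simple_returns(values):
    """Per-period simple returns from a series of portfolio values."""
    return [later / earlier - 1.0 for earlier, later in zip(values, values[1:])]


def total_return(values):
    """Cumulative return from the first to the last portfolio value."""
    return values[-1] / values[0] - 1.0


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of per-period returns.

    risk_free_rate is an annual rate (assumed 0 here); periods_per_year=252
    assumes daily data. At least two returns are needed for the sample
    standard deviation.
    """
    per_period_rf = risk_free_rate / periods_per_year
    excess = [r - per_period_rf for r in returns]
    spread = statistics.stdev(excess)  # sample standard deviation
    if spread == 0:
        raise ValueError("Sharpe ratio is undefined for zero-variance returns")
    return statistics.fmean(excess) / spread * math.sqrt(periods_per_year)


if __name__ == "__main__":
    # Hypothetical portfolio values, not data from the article.
    portfolio_values = [100.0, 102.0, 101.0, 105.0]
    daily = simple_returns(portfolio_values)
    print(f"total return: {total_return(portfolio_values):.2%}")
    print(f"annualized Sharpe ratio: {sharpe_ratio(daily):.2f}")
```

A higher Sharpe ratio means more excess return per unit of return volatility, which is the sense in which the abstract speaks of balancing profit and risk.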

Main Subjects

Applications of Artificial Intelligence (AI)


Engineering Management and Soft Computing

Volume 10, Issue 2 - Serial Number 19
March 2025
Pages 1-22
