Development of a Deep Reinforcement Learning Algorithm in a Dynamic Cellular Manufacturing System Considering Order Rejection, Case Study: Stone Paper Factory

Akbari, Amir Hossein; Jafari, Mostafa

doi:10.22091/jemsc.2025.11853.1230

Document Type : Original Article

Authors

¹ PhD Student, Department of Industrial Engineering, Iran University of Science & Technology, Tehran, Iran, Email: akbari_amir@ind.iust.ac.ir

² Associate Professor, Department of Industrial Engineering, Iran University of Science & Technology, Tehran, Iran

Abstract

This research proposes a deep reinforcement learning algorithm for the cellular manufacturing system problem that accounts for the costs of order delay and order rejection. Orders with different characteristics, including revenue, lead time, delivery date, and delay cost, enter the system dynamically at different times. Because system capacity is limited, not all orders can be accepted; some must be rejected on arrival so that the remaining orders can be completed on time. A bi-objective mathematical model that maximizes profit and minimizes the number of rejected orders is presented, and a deep reinforcement learning algorithm is used to solve it. The proposed algorithm is compared with algorithms from the literature on several categories of test problems and on real-world problems, and the comparison demonstrates its efficiency. The results show an advantage of 36.3% in profit and 13.87% in the number of accepted orders. In addition, accepting 1% more orders decreases profit by 2.7% on average.
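
The trade-off between the two objectives can be illustrated with a small sketch. This is not the authors' model or algorithm: it assumes a single resource that processes accepted orders in arrival order, and the names `Order`, `processing_time`, and `evaluate` are hypothetical, introduced only for illustration. It shows how rejecting one low-revenue order can raise profit while increasing the rejection count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    revenue: float
    processing_time: float  # assumption: stands in for the order's lead time
    due_date: float         # delivery date
    delay_cost: float       # penalty per unit of time delivered late
    arrival: float = 0.0    # time the order enters the system


def evaluate(orders, accepted):
    """Return (profit, rejected_count) for one accept/reject decision vector.

    Simplifying assumption: accepted orders run one after another, in list
    order, on a single resource.
    """
    clock = 0.0
    profit = 0.0
    rejected = 0
    for order, take in zip(orders, accepted):
        if not take:
            rejected += 1
            continue
        # An order cannot start before it arrives or before the resource is free.
        clock = max(clock, order.arrival) + order.processing_time
        tardiness = max(0.0, clock - order.due_date)
        profit += order.revenue - order.delay_cost * tardiness
    return profit, rejected


# Made-up illustrative data, not taken from the paper.
orders = [
    Order(revenue=100.0, processing_time=4.0, due_date=5.0, delay_cost=10.0),
    Order(revenue=30.0, processing_time=3.0, due_date=6.0, delay_cost=30.0),
    Order(revenue=80.0, processing_time=2.0, due_date=8.0, delay_cost=20.0),
]

accept_all = evaluate(orders, [True, True, True])
reject_second = evaluate(orders, [True, False, True])
print(accept_all)     # (160.0, 0)
print(reject_second)  # (180.0, 1)
```

In this toy instance, accepting every order delays the second and third orders and yields a profit of 160, whereas rejecting the second order yields 180 at the price of one rejection. A learning agent in the setting the abstract describes would have to make this kind of decision at each order arrival, without knowing future arrivals.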

Main Subjects

Soft Computing

Engineering Management and Soft Computing

Volume 10, Issue 2 - Serial Number 19
March 2025
Pages 204-222
