Dynamic Scheduling Method for Job-Shop Manufacturing Systems by Deep Reinforcement Learning with Proximal Policy Optimization

Zhang, Ming ORCID: https://orcid.org/0000-0001-5202-5574, Lu, Yang ORCID: https://orcid.org/0000-0002-0583-2688, Hu, Youxi, Amaitik, Nasser ORCID: https://orcid.org/0000-0002-0962-4341 and Xu, Yuchun ORCID: https://orcid.org/0000-0001-6388-813X (2022) Dynamic Scheduling Method for Job-Shop Manufacturing Systems by Deep Reinforcement Learning with Proximal Policy Optimization. Sustainability, 14 (9). p. 5177.

Text
Dynamic Scheduling Method for Job-Shop Manufacturing Systems by Deep Reinforcement Learning with Proximal Policy Optimization _ Enhanced Reader.pdf - Published Version
Available under License Creative Commons Attribution.

Official URL: http://dx.doi.org/10.3390/su14095177

Related URLs:

https://www.mdpi.com/2071-1050/14/9/5177

Abstract

With the rapid development of Industry 4.0, modern manufacturing systems are undergoing a profound digital transformation. New technologies help to improve production efficiency and product quality. However, as production systems grow more complex, operational decision making faces greater challenges in sustaining manufacturing that satisfies the rapidly changing demands of customers and markets. Rule-based heuristic approaches are currently widely used for scheduling management in production systems, but they depend heavily on expert domain knowledge. As a result, the efficiency of decision making cannot be guaranteed, and the dynamic scheduling requirements of the job-shop manufacturing environment may not be met. In this study, we propose using deep reinforcement learning (DRL) methods to tackle the dynamic scheduling problem in a job-shop manufacturing system with unexpected machine failures. The proximal policy optimization (PPO) algorithm was used in the DRL framework to accelerate the learning process and improve performance. The proposed method was tested in a real-world dynamic production environment, where it performed better than the state-of-the-art methods.

Item Type:	Article
Status:	Published
DOI:	10.3390/su14095177
Subjects:
	Q Science > Q Science (General)
	Q Science > Q Science (General) > Q325 Machine learning
	Q Science > QA Mathematics
	Q Science > QA Mathematics > QA75 Electronic computers. Computer science
	Q Science > QA Mathematics > QA76 Computer software
	Q Science > QA Mathematics > QA76.9.H85 Human-Computer Interaction; Virtual Reality; Mixed Reality; Augmented Reality; Extended Reality
	T Technology > T Technology (General)
	T Technology > TA Engineering (General). Civil engineering (General)
	T Technology > TA Engineering (General). Civil engineering (General) > TA174 Engineering design
	T Technology > TJ Mechanical engineering and machinery > TJ227-240 Machine design and drawing
	T Technology > TP Chemical technology > TP368-456 Food processing and manufacture
	T Technology > TS Manufactures
	T Technology > TS Manufactures > TS171 Product design
	Z Bibliography. Library Science. Information Resources > ZA Information resources
	Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4050 Electronic information resources
	Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4450 Databases
School/Department:	School of Science, Technology and Health
URI:	https://ray.yorksj.ac.uk/id/eprint/6350

Deposit and Record Details

ID Code:	6350
Depositing User:	Lu, Dr Yang
Deposited On:	03 May 2022 12:20
Last Modified:	02 Jul 2025 17:45