Study on the application of single-agent and multi-agent reinforcement learning to dynamic scheduling in manufacturing environments with growing complexity: Case study on the synthesis of an industrial IoT Test Bed

Open Access

  • true

Peer Reviewed

  • true

Abstract

  • Industry 4.0, smart manufacturing and smart products have recently attracted substantial attention and are becoming increasingly prevalent in manufacturing systems. With the successful implementation of these technologies, highly customized products can be manufactured using responsive, autonomous manufacturing processes at a competitive cost. This study was conducted at HTW Dresden’s Industrial Internet of Things Test Bed, which simulates state-of-the-art manufacturing scenarios for educational and research purposes. In addition to the physical production facility itself, the associated operational information systems have been fully interconnected to allow fast and efficient information exchange between the various manufacturing stages and systems. This interconnection provides a strong foundation for dealing appropriately with unexpected or planned environmental changes, as well as prevailing uncertainty, which greatly increases the overall system’s resilience. The main objective of this study is to increase the efficiency of the manufacturing system in order to optimize resource consumption and minimize the overall completion time (makespan). This manuscript discusses our experiments in the area of flexible job-shop scheduling problems (FJSP). As part of our research, different methods of representing the state space were explored; heuristic, meta-heuristic, reinforcement learning (RL), and multi-agent reinforcement learning (MARL) methods were evaluated; and various methods of interaction with the system (designing the action space and filtering actions in certain situations) were examined. Furthermore, the design of the reward function, which plays an important role in formulating the dynamic scheduling problem as an RL problem, is discussed in depth.
Finally, this paper studies the effectiveness of single-agent and multi-agent RL approaches, with a special focus on the Proximal Policy Optimization (PPO) method, on the fully fledged digital twin of an industrial IoT system at HTW Dresden. In our experiments, in a multi-agent setting with an individual agent for each manufacturing operation, PPO managed the resources in a way that significantly improved the manufacturing system’s performance.

  • The manuscript discusses our results-oriented research progress in the area of FJSP.
  • We demonstrate how to break down a complex real-world problem into sub-problems.
  • Different methods of representing the state space were explored.
  • Heuristic, meta-heuristic, and (multi-agent) RL methods were evaluated.
  • Different experiments were carried out to study the effectiveness of RL approaches.
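
To make the makespan objective mentioned in the abstract concrete, the following is a minimal sketch of a flexible job-shop instance scheduled with a simple earliest-completion-time dispatching rule. It is an illustration only and not the method used in the study: the instance data, the greedy rule, and the names `greedy_schedule` and `makespan_reward` are all assumptions introduced here. The reward function shows one hypothetical way a makespan-based signal could be shaped for an RL agent.

```python
from typing import Dict, List, Tuple

# A job is an ordered list of operations; each operation maps an
# eligible machine name to its processing time on that machine.
Operation = Dict[str, int]
Job = List[Operation]
PlanEntry = Tuple[int, int, str, int, int]  # (job, op index, machine, start, end)


def greedy_schedule(jobs: List[Job]) -> Tuple[int, List[PlanEntry]]:
    """Schedule with an earliest-completion-time rule (illustrative heuristic).

    Returns the makespan and the list of scheduled operations.
    """
    machine_free: Dict[str, int] = {}   # time at which each machine becomes idle
    job_ready = [0] * len(jobs)         # earliest start of each job's next operation
    next_op = [0] * len(jobs)           # index of each job's next unscheduled operation
    plan: List[PlanEntry] = []
    remaining = sum(len(job) for job in jobs)

    while remaining:
        best = None
        for j, job in enumerate(jobs):
            if next_op[j] == len(job):
                continue  # job already finished
            for machine, duration in job[next_op[j]].items():
                start = max(job_ready[j], machine_free.get(machine, 0))
                candidate = (start + duration, j, machine, start)
                if best is None or candidate < best:
                    best = candidate
        end, j, machine, start = best
        plan.append((j, next_op[j], machine, start, end))
        machine_free[machine] = end
        job_ready[j] = end
        next_op[j] += 1
        remaining -= 1

    makespan = max(entry[4] for entry in plan)
    return makespan, plan


def makespan_reward(previous_makespan: int, current_makespan: int) -> int:
    """Hypothetical dense reward: penalize growth of the partial makespan."""
    return previous_makespan - current_makespan


# Assumed toy instance: two jobs, two machines (M1, M2).
jobs = [
    [{"M1": 3, "M2": 5}, {"M2": 2}],
    [{"M1": 4}, {"M1": 3, "M2": 6}],
]
makespan, plan = greedy_schedule(jobs)
print(makespan, plan)
```

In an RL formulation, the greedy choice inside the loop would be replaced by the agent's action (which operation to dispatch to which machine), and a reward such as `makespan_reward` would be computed after each dispatching step.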

Publication date

  • December 1, 2024