Dynamic Job Shop Scheduling in an Industrial Assembly Environment Using Various Reinforcement Learning Techniques

Open Access

  • false

Peer Reviewed

  • false

Abstract

  • The high volatility and dynamics within global value networks have recently led to a noticeable shortening of product and technology cycles. To realize effective and efficient production, a dynamic regulation system is required. Currently, this is mostly accomplished statically via a Manufacturing Execution System, which makes decisions for whole lots and usually cannot react to uncertainties such as the failure of an operation, variations in operation times, or variations in raw-material quality. In this paper, we incorporated Reinforcement Learning to minimize makespan in the assembly line of our Industrial IoT Test Bed (at HTW Dresden), in the presence of multiple machines supporting the same operations as well as uncertain operation times. While multiple machines supporting the same operations improve the system’s reliability, they also pose a challenging scheduling problem. Additionally, uncertainty in operation times adds complexity to planning, which is largely neglected in traditional scheduling approaches. As a means of optimizing the scheduling problem under these conditions, we implemented and compared four reinforcement learning methods: Deep Q-Networks (DQN), REINFORCE, Advantage Actor-Critic (A2C), and Proximal Policy Optimization (PPO). According to our results, PPO achieved greater accuracy and faster convergence than the other approaches while minimizing the total makespan.
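The setting described in the abstract can be illustrated with a minimal sketch: a toy job-shop environment in which each operation may run on any machine from an eligible set, and its duration is sampled around a nominal time to model uncertainty. All jobs, machines, and numbers below are invented for illustration; this is not the HTW Dresden test bed or the paper's actual environment.

```python
import random

class JobShopEnv:
    """Toy dynamic job-shop environment (illustrative sketch, not the paper's code).

    jobs[j] is a list of operations for job j; each operation is a tuple
    (eligible_machines, nominal_time). Operation durations are perturbed by
    a uniform noise factor to model uncertain operation times.
    """

    def __init__(self, jobs, n_machines, noise=0.2, seed=0):
        self.jobs = jobs
        self.n_machines = n_machines
        self.noise = noise
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.next_op = [0] * len(self.jobs)            # next operation index per job
        self.machine_free = [0.0] * self.n_machines    # time each machine becomes free
        self.job_free = [0.0] * len(self.jobs)         # time each job's last op finishes
        return tuple(self.next_op)

    def done(self):
        return all(self.next_op[j] >= len(self.jobs[j])
                   for j in range(len(self.jobs)))

    def step(self, job, machine):
        """Dispatch the job's next operation on the given machine.

        Reward is the negative increase in makespan, so an RL agent
        maximizing return minimizes the total makespan.
        """
        eligible, nominal = self.jobs[job][self.next_op[job]]
        assert machine in eligible, "machine cannot perform this operation"
        start = max(self.machine_free[machine], self.job_free[job])
        duration = nominal * (1 + self.rng.uniform(-self.noise, self.noise))
        finish = start + duration
        old_makespan = max(self.machine_free)
        self.machine_free[machine] = finish
        self.job_free[job] = finish
        self.next_op[job] += 1
        reward = -(max(self.machine_free) - old_makespan)
        return tuple(self.next_op), reward, self.done()

    def makespan(self):
        return max(self.machine_free)


# Usage: two jobs, two machines, deterministic times (noise=0.0).
env = JobShopEnv(
    jobs=[[({0, 1}, 3.0), ({1}, 2.0)],   # job 0: op on m0 or m1, then op on m1
          [({0}, 2.0)]],                 # job 1: single op on m0
    n_machines=2,
    noise=0.0,
)
env.step(0, 0)   # job 0, op 0 on machine 0: runs 0 -> 3
env.step(1, 0)   # job 1, op 0 on machine 0: runs 3 -> 5
env.step(0, 1)   # job 0, op 1 on machine 1: waits for job 0, runs 3 -> 5
print(env.makespan())  # 5.0
```

Any of the four compared methods (DQN, REINFORCE, A2C, PPO) could be trained against such an interface by treating the `(job, machine)` dispatch choice as the action and the negative makespan increase as the reward.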

Publication Date

  • March 6, 2023