Reinforcement Q-Learning and ILC with Self-Tuning Learning Rate for Contour Following Accuracy Improvement of Biaxial Motion Stage
Abstract
Biaxial motion stages are commonly used in precision motion applications. However, the contour following accuracy of a biaxial motion stage often suffers from system nonlinearities and external disturbances. To address this problem, a control scheme consisting of a reinforcement Q-learning controller with a self-tuning learning rate and two iterative learning controllers is proposed in this paper. In particular, the reinforcement Q-learning controller compensates for friction and copes with the dynamics mismatch between different axes. In addition, one of the two iterative learning controllers suppresses periodic external disturbances, while the other adjusts the learning rate of the reinforcement Q-learning controller. Results of contour following experiments indicate that the proposed approach is feasible.
Introduction
Contour following is commonly seen in industrial processes such as machining, cutting, polishing, deburring, painting, and welding, where product quality depends on contour following accuracy. Generally speaking, better contour following accuracy can be achieved by reducing tracking errors and/or the contour error [14]. In fact, tracking error reduction is one of the most important research topics in contour following for multi-axis motion stages [1]-[4]. Due to factors such as external disturbances, system nonlinearities, servo lag, and mismatched axis dynamics, the contour following accuracy of a multi-axis motion stage may fail to meet accuracy requirements [5]-[7].
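As background, the tracking error of each axis is the difference between the reference and actual positions, while the contour error is the deviation of the actual position from the desired path. This excerpt does not spell out an estimator, so the sketch below shows only the standard linearized contour-error approximation for a biaxial stage; the function and variable names are illustrative, not from the paper.

```python
import numpy as np

def tracking_and_contour_error(ref, act, ref_prev):
    """Linearized contour-error estimate for a biaxial (x, y) stage.

    ref, act, ref_prev are 2-vectors: the current reference point, the
    actual position, and the previous reference point (used to obtain
    the path tangent). This is the common first-order approximation;
    the paper itself does not prescribe this particular estimator.
    """
    e = ref - act                                # axial tracking errors (e_x, e_y)
    tangent = ref - ref_prev                     # local path direction
    theta = np.arctan2(tangent[1], tangent[0])   # path tangent angle
    # Linearized contour error: component of e normal to the path
    eps = -e[0] * np.sin(theta) + e[1] * np.cos(theta)
    return e, eps
```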
There are many existing approaches that can be used in practice to reduce the tracking error of a multi-axis motion stage [9]-[12]. For example, the commonly used multi-loop feedback control scheme with command feedforward is very effective in reducing tracking error caused by servo lag [8]. In addition, advanced control schemes such as sliding mode control and adaptive control can reduce tracking error as well. Recently, the number of studies exploiting artificial neural networks to improve the contour following accuracy of multi-axis motion stages has risen steadily [13]-[22]. For instance, Wen and Cheng [13] proposed a fuzzy CMAC with a critic-based learning mechanism to cope with external disturbance and nonlinearity so as to reduce tracking error. Later, Wen and Cheng [15] further proposed a recurrent fuzzy cerebellar model articulation controller with a self-tuning learning rate to improve the contour following accuracy of a piezoelectric actuated dual-axis micro motion stage. Beyond tracking error reduction, artificial neural networks have been applied to fields such as wind power generation [24], the game of Go [22], and robotic object grasping [25]. Generally, a neural network needs to be trained before it can be used to solve a particular problem. Among the various training mechanisms for neural networks, reinforcement learning has received considerable attention recently [21].

In this paper, a control scheme consisting of a reinforcement Q-learning controller with an adjustable learning rate and two iterative learning controllers (ILCs) is proposed to improve the contour following accuracy of a biaxial motion stage. In the proposed approach, the reinforcement Q-learning controller is responsible for friction compensation and also deals with the dynamics mismatch between different axes. In addition, one of the two ILCs is exploited to counteract the adverse effects of periodic external disturbances arising from repetitive motions, while the other ILC tunes the learning rate of the Q-learning controller based on the current tracking error so as to further improve contour following accuracy.
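The paper's specific update laws are not reproduced in this excerpt, so the following is only a minimal sketch of how such a scheme could be organized: a tabular Q-learning update whose per-sample learning rate is itself adapted from repetition to repetition, alongside a P-type ILC that accumulates a feedforward signal to cancel periodic disturbances. All dimensions, gains (L_ILC, the adaptation constant, the clipping bounds), and the P-type ILC form are assumptions made for illustration, not the paper's design.

```python
import numpy as np

# --- Illustrative sizes and gains (assumptions, not from the paper) ---
N_STATES, N_ACTIONS = 100, 7   # discretized error states / control actions
N_SAMPLES = 1000               # samples per repetition of the contour
GAMMA = 0.95                   # discount factor
L_ILC = 0.5                    # P-type ILC learning gain (assumed)

Q = np.zeros((N_STATES, N_ACTIONS))   # action-value table
u_ilc = np.zeros(N_SAMPLES)           # ILC feedforward over one repetition
alpha = np.full(N_SAMPLES, 0.1)       # per-sample Q-learning rate

def q_update(s, a, r, s_next, k):
    """One Q-learning step at sample k with the ILC-tuned rate alpha[k]."""
    td_error = r + GAMMA * Q[s_next].max() - Q[s, a]
    Q[s, a] += alpha[k] * td_error

def end_of_repetition(e_k):
    """After each repetition, update both ILC laws from the tracking-error
    profile e_k (length N_SAMPLES) recorded over that repetition."""
    global u_ilc, alpha
    # ILC 1: P-type update builds feedforward that suppresses the
    # periodic disturbance seen during repetitive motion
    u_ilc = u_ilc + L_ILC * e_k
    # ILC 2: raise the Q-learning rate where tracking error persists,
    # keeping it within bounds (adaptation law assumed for illustration)
    alpha = np.clip(alpha + 0.05 * np.abs(e_k), 0.01, 0.5)
```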
Conclusion
This paper has proposed a motion control scheme consisting of two ILCs and one reinforcement Q-learning controller for contour following accuracy improvement. In particular, one ILC is used to tune the learning rate of the reinforcement Q-learning controller, which mainly copes with system nonlinearities, while the other ILC is exploited to suppress periodic disturbances during repetitive contour following motions. Results of contour following experiments conducted on a biaxial motion stage indicate that the proposed control scheme is feasible and outperforms the other control schemes tested in the experiments.