Feedforward Learning Control For Multi Actuator Hard Drives And Freeform 3D Printers
This work addresses feedforward disturbance rejection and trajectory tracking for both small-scale linear systems and large-scale nonlinear systems. The feedforward disturbance rejection problem, the acoustic control problem, and the trajectory tracking problem can all be formulated in a unified manner as an optimization problem: minimize the difference between the plant output, under time-varying disturbances, and the desired reference response.
For a single-input single-output (SISO) linear system, the compensator or tracking controller can be represented by a finite impulse response (FIR) or infinite impulse response (IIR) filter. When the statistics of the reference signal are unknown, an adaptive filter is used to find the optimal controller parameters with a recursive algorithm. The least mean squares (LMS) and recursive least squares (RLS) algorithms have been widely used for feedforward adaptive control. However, when the reference is a sequence of impulses or wavelets, these algorithms may converge slowly or even diverge.
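As a point of reference for the adaptive filtering described above, the following is a minimal sketch of the textbook LMS update for an FIR filter. It is not the implementation used in this work: the function names `lms_identify` and `fir`, the step size `mu`, the tap count, and the white-noise test signal are all assumptions chosen for illustration.

```python
import random


def fir(h, x):
    """Filter the sequence x with FIR coefficients h (zero initial conditions)."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def lms_identify(x, d, num_taps, mu):
    """Adapt FIR weights w so that the filter output tracks d (textbook LMS).

    x: reference (input) samples, d: desired response samples,
    mu: step size (an assumed tuning value, not taken from the source).
    """
    w = [0.0] * num_taps
    errors = []
    for n in range(len(x)):
        # Regressor: the num_taps most recent inputs, zero-padded at the start.
        u = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(wk * uk for wk, uk in zip(w, u))  # current filter output
        e = d[n] - y                              # instantaneous error
        # Stochastic-gradient step on the squared error.
        w = [wk + mu * e * uk for wk, uk in zip(w, u)]
        errors.append(e)
    return w, errors


# Hypothetical test case: identify a 3-tap system from white-noise input.
random.seed(0)
h_true = [0.5, -0.3, 0.1]
x = [random.gauss(0.0, 1.0) for _ in range(5000)]
d = fir(h_true, x)
w, errors = lms_identify(x, d, num_taps=3, mu=0.01)
```

With a white-noise input, every sample excites all taps and the weights approach `h_true`. If the reference is instead a sparse train of impulses, the regressor is nonzero for only `num_taps` samples after each impulse, which is one way to see why sample-by-sample updates can make slow progress on such references.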
In the first part of this work, a novel iterative batch least squares (IBLS) learning algorithm is developed for adaptive filtering when the reference consists of a sequence of impulses or wavelets. The algorithm is formulated as a stochastic Newton optimization method with batch processing. The IBLS algorithm is applied to a multi-actuator hard disk drive (HDD) to attenuate the vibration generated by the seeking actuator.
For a large-scale multi-input multi-output (MIMO) nonlinear system, the trajectory tracking problem can be hard to solve. Recent progress in deep reinforcement learning provides the flexibility to solve complex tasks from high-dimensional sensory inputs without knowing the dynamics of the environment. In these methods, deep neural networks are used to approximate the action-value function (Q-function).
In the second part of this thesis, a modified deep deterministic policy gradient (DDPG) algorithm is developed to address the trajectory following problem in a large-scale system with unknown dynamics. In this method, the reward is defined as a function of the system state and its reference, and it is maximized when the state of the system follows the desired trajectory. The modified DDPG algorithm is applied to a freeform 3D printing system to neutralize the effect of gravity and build a filament with the desired shape.
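To make the reward description above concrete, here is one possible form such a reward could take, sketched under stated assumptions. The abstract does not give the actual reward used with the modified DDPG algorithm; the negative squared tracking error below and the name `tracking_reward` are illustrative choices only.

```python
def tracking_reward(state, reference):
    """Hypothetical tracking reward: negative squared Euclidean error.

    The reward reaches its maximum value of 0 exactly when the state
    equals the reference, and decreases as the state moves away from it.
    """
    if len(state) != len(reference):
        raise ValueError("state and reference must have the same length")
    return -sum((s - r) ** 2 for s, r in zip(state, reference))


# Example: a perfectly tracked point versus one with some tracking error.
on_track = tracking_reward([1.0, 2.0], [1.0, 2.0])
off_track = tracking_reward([1.0, 2.0], [0.0, 0.0])
```

Any reward with this shape, peaking when the state coincides with the reference, is consistent with the description; the specific distance measure and scaling are design choices.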