Learning and Tuning Fuzzy Logic Controllers Through Reinforcements


January 1992 | Hamid R. Berenji, Pratap Khedkar
This paper presents a new method for learning and tuning a fuzzy logic controller based on reinforcements from a dynamic system. The Generalized Approximate Reasoning-based Intelligent Control (GARIC) architecture learns and tunes a fuzzy logic controller even when only a weak reinforcement, such as a binary failure signal, is available. It introduces a new conjunction operator for computing the rule strengths of fuzzy control rules and a new localized mean of maximum (LMOM) method for combining the conclusions of several firing control rules, and it learns to produce real-valued control actions. Learning is achieved by integrating fuzzy inference into a feedforward network, which can then adaptively improve its performance by gradient descent. Applied to a cart-pole balancing system, GARIC demonstrates significant improvements over previous cart-pole schemes in both learning speed and robustness to changes in the dynamic system's parameters.

The paper first covers the fundamentals of fuzzy logic control, reinforcement learning, and credit assignment. GARIC addresses two related problems: designing rule-based controllers from qualitative linguistic rules, and learning directly from experience. Reinforcement learning assumes there is no supervisor to judge the chosen control action at each time step; the learning system is told only indirectly about the effect of its chosen action. This raises the credit assignment problem: the overall performance of a process must be apportioned among the individual elements that contributed to it.
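The two inference operators highlighted above can be made concrete. Below is a minimal Python sketch, assuming the softmin form Σᵢ μᵢe^(−kμᵢ) / Σᵢ e^(−kμᵢ) for the conjunction and triangular consequent membership functions whose alpha-cut midpoint serves as the LMOM inversion; the function names, parameters, and example values are illustrative, not the paper's code.

```python
import numpy as np

def softmin(mu, k=10.0):
    """Soft conjunction of antecedent membership degrees mu.
    Approaches the hard min as k grows; k is a design parameter."""
    mu = np.asarray(mu, dtype=float)
    w = np.exp(-k * mu)
    return float(np.sum(mu * w) / np.sum(w))

def lmom_inverse(strength, center, left, right):
    """Crisp action suggested by one rule: the midpoint of the consequent
    triangle's alpha-cut at level `strength` (a localized mean of maximum).
    `center`, `left`, `right` parameterize a triangular membership function."""
    return center + 0.5 * (right - left) * (1.0 - strength)

def defuzzify(strengths, consequents):
    """Combine the firing rules' suggested actions, weighted by rule strength."""
    num = sum(w * lmom_inverse(w, *c) for w, c in zip(strengths, consequents))
    den = sum(strengths)
    return num / den if den > 0 else 0.0

# Example: two rules firing at different strengths
strengths = [softmin([0.7, 0.4]), softmin([0.2, 0.9])]
consequents = [(5.0, 2.0, 2.0), (-3.0, 1.0, 1.0)]  # (center, left, right)
action = defuzzify(strengths, consequents)
```

The point of softmin over a hard min is differentiability: gradients can flow through every antecedent's membership degree, which is what lets the gradient descent step mentioned above tune the membership-function parameters.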
The ARIC architecture, GARIC's precursor, extends Anderson's method by including the prior control knowledge of expert operators in the form of fuzzy control rules. It uses one neural network to perform action and state evaluations and two coupled neural networks to select a control action at each time step.

GARIC itself uses a neural network to implement fuzzy inference. It has three components: an Action Selection Network (ASN) that maps a state vector into a recommended action F; an Action Evaluation Network (AEN) that maps a state vector and a failure signal into a scalar score r; and a Stochastic Action Modifier (SAM) that uses both F and r to produce the action F' actually applied to the plant. The AEN plays the role of an adaptive critic element, constantly predicting the reinforcement associated with different input states. The ASN selects an action by implementing an inference scheme based on fuzzy control rules. The SAM takes the value of r from the previous time step together with the action F recommended by the ASN and stochastically generates F' as a Gaussian random variable with mean F and standard deviation σ(r(t-1)). The paper details the learning mechanisms in both networks: weight updating in the AEN and gradient descent through the fuzzy inference in the ASN.

Finally, the architecture is evaluated on the cart-pole balancing problem, in which the controller must keep the pole vertically balanced while keeping the cart within the boundaries of its rail track.
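The SAM's exploration step is easy to state in code. The architecture requires only that σ be a nonnegative, monotonically decreasing function of the critic's previous score, so that exploration shrinks as predicted reinforcement improves; the exponential used here is an illustrative choice, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigma(r_prev):
    """Exploration magnitude: nonnegative and decreasing in the
    previous internal reinforcement (illustrative choice of function)."""
    return np.exp(-r_prev)

def stochastic_action_modifier(F, r_prev):
    """Perturb the ASN's recommendation F with Gaussian noise whose
    spread shrinks as the critic's prediction improves."""
    return rng.normal(loc=F, scale=sigma(r_prev))
```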
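The score the AEN produces is an internal reinforcement of the temporal-difference kind standard in this adaptive-critic family (following Barto, Sutton, and Anderson). The exact timing and discounting below are a hedged reconstruction, not the paper's equations verbatim.

```python
def internal_reinforcement(r, v_prev, v_now, failed, gamma=0.9):
    """Temporal-difference style internal reinforcement driving the
    weight updates in both networks; gamma is a discount factor."""
    if failed:
        return r - v_prev          # no prediction extends past a failure
    return r + gamma * v_now - v_prev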
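The weak reinforcement itself is just a binary failure signal derived from the cart-pole state. A sketch using the classic benchmark bounds (a pole angle beyond ±12 degrees or a cart beyond the ±2.4 m track limits), which are assumed here rather than quoted from the paper:

```python
import math

def failed(theta, x, theta_max=math.radians(12.0), x_max=2.4):
    """Binary failure signal: True once the pole falls past theta_max
    or the cart leaves the rail track (classic benchmark bounds)."""
    return abs(theta) > theta_max or abs(x) > x_max
```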