January 2024 | Gaochen Cui, Qing-Shan Jia, and Xiaohong Guan
This paper proposes a model-free reinforcement learning (RL) approach for energy management in networked microgrids (MGs) under real-time pricing. The goal is to coordinate the MGs in a distribution network by setting reference price sequences as incentive signals: the distribution system operator (DSO) sets these prices, and the MGs use them to plan their generation and charging. The challenge lies in the high uncertainty of loads and renewable resources, which necessitates real-time pricing. Moreover, privacy concerns prevent the MGs from sharing their response behavior, making it difficult to build a closed-form model of how they react to prices. To address this, the pricing problem is formulated as a Markov decision process (MDP), and model-free RL is applied to optimize the pricing policy without requiring knowledge of the MGs' response behavior.
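To make the setup concrete, the sketch below casts one dispatch day as an episodic MDP and trains a linear-Gaussian pricing policy with a plain REINFORCE update. This is a minimal illustration, not the paper's algorithm: the state features, the MGs' response model, the DSO's reward, and all numerical constants are assumptions standing in for details the summary does not give.

```python
import numpy as np

rng = np.random.default_rng(0)
N_MGS, HORIZON = 4, 24        # hypothetical sizes; the case study uses 4 MGs

def mg_response(price, t):
    """Black-box stand-in for the MGs' private price response.
    The DSO only observes the resulting power exchange, never this model."""
    demand = (1.0 + 0.3 * np.sin(2 * np.pi * t / HORIZON)
              + 0.1 * rng.standard_normal(N_MGS))        # uncertain net load
    return np.clip(demand - 0.5 * price, 0.0, None)      # price-responsive import

def features(t):
    """State features for time slot t: a simple time-of-day encoding."""
    return np.array([1.0, np.sin(2 * np.pi * t / HORIZON),
                     np.cos(2 * np.pi * t / HORIZON)])

theta, sigma, lr = np.zeros(3), 0.1, 1e-3   # linear-Gaussian pricing policy

for episode in range(2000):
    grads, rewards = [], []
    for t in range(HORIZON):                             # one episode = one day
        phi = features(t)
        mean = phi @ theta
        price = mean + sigma * rng.standard_normal()     # sampled reference price
        total = mg_response(price, t).sum()
        # Illustrative DSO reward: revenue from MG imports minus a quadratic
        # cost of supplying them (the paper's true objective may differ).
        rewards.append(price * total - 0.6 * total**2)
        grads.append((price - mean) / sigma**2 * phi)    # grad of log pi(a|s)
    returns = np.cumsum(rewards[::-1])[::-1]             # reward-to-go
    adv = (returns - returns.mean()) / (returns.std() + 1e-8)
    for g, a in zip(grads, adv):
        theta += lr * g * a / HORIZON                    # REINFORCE update
```

The point of the black-box `mg_response` is that the gradient estimate uses only observed rewards, so the DSO never needs the MGs' private cost or response models, which is exactly what motivates the model-free formulation.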
The proposed framework is bi-level: at the upper level the DSO sets reference prices, and at the lower level the MGs respond to them. A reference policy is incorporated into the RL algorithm to improve training efficiency. The algorithm is tested on a 4-MG system in the IEEE 33-bus distribution network, where it successfully coordinates the MGs. The results show that the RL algorithm achieves performance close to that of a model-based method while preserving the MGs' privacy, and it remains effective when the MGs have quadratic cost functions and (dis)charging losses. It is also fast: the DSO agent computes reference price sequences in under 0.002 seconds per time slot, making it suitable for online dispatch. The number of MGs affects the convergence of the algorithm, but the optimized pricing policy still encourages lower-cost generators to produce more power. Overall, the study demonstrates that the proposed RL approach is practical and effective for coordinating MGs in a distribution network with real-time pricing.
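The summary does not specify how the reference policy enters the algorithm. One common construction, sketched below under that assumption, is to parameterize the learned policy as a residual correction on top of a reference price profile, so that training starts from the reference behavior rather than from random prices; `reference_price` and `policy_mean` are hypothetical names, not identifiers from the paper.

```python
import numpy as np

def reference_price(t):
    """Hypothetical reference policy: a fixed time-of-use price profile.
    In practice it could come from a coarse model-based rule or a tariff."""
    return 1.5 if 8 <= t < 20 else 1.0       # higher price in daytime slots

def policy_mean(theta, phi, t):
    """Residual parameterization: the learned policy outputs a correction
    on top of the reference, so theta = 0 reproduces the reference policy
    and early training explores around sensible prices."""
    return reference_price(t) + phi @ theta

# Warm-start property: with zero parameters the policy equals the reference.
phi, theta = np.array([1.0, 0.0, 1.0]), np.zeros(3)
assert policy_mean(theta, phi, 12) == reference_price(12)
```

With this parameterization, the REINFORCE update in the earlier sketch is unchanged except that the policy mean becomes `policy_mean(theta, phi, t)`, which is one plausible way a reference policy could speed up training as the paper reports.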