DRL-Based Orchestration of Multi-User MISO Systems with Stacked Intelligent Metasurfaces

This paper proposes a deep reinforcement learning (DRL)-based approach for optimizing multi-user multiple-input single-output (MISO) wireless systems assisted by stacked intelligent metasurfaces (SIM). The SIM enables wave-based precoding with low-complexity transmit radio frequency (RF) chains. The joint design of the SIM phase shifts and the transmit power allocation is formulated as an optimization problem and solved with a customized DRL approach, which continuously observes pre-designed states of the SIM-parametrized smart wireless environment and learns to maximize the system's sum rate.

The system model is a SIM-assisted multi-user MISO transmission system in which the SIM performs precoding in the wave domain. The paper introduces a spatially correlated channel model and formulates a problem that jointly optimizes the coefficients of the SIM meta-atoms and the power allocation across the transmit antennas. This problem is solved dynamically with deep deterministic policy gradient (DDPG), which handles continuous action spaces. The DRL formulation specifies the action space, the state space, and the reward function, and the paper explains the optimization process in detail. The agent continuously collects channel coefficients and uses four neural networks for policy and value function approximation (in DDPG, an actor, a critic, and a target copy of each). By interacting with the environment, it learns the phase shifts and the transmit power allocation strategy.

Simulation results show that the DRL-optimized SIM-assisted multi-user MISO system achieves a 2 bps/Hz sum-rate improvement over a state-of-the-art alternating optimization (AO) algorithm. The results also show that a whitening process makes the DRL algorithm more robust. The method is described as efficient, scalable, and suitable for large-scale joint parameter optimization in SIM-aided wireless communications. The paper concludes that the SIM's multilayer structure outperforms conventional precoding schemes in multi-user MISO systems, particularly at low transmit power levels.
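As a rough illustration of the sum-rate objective that the DRL agent is described as maximizing, the following Python sketch evaluates the sum rate for a given set of SIM phase shifts and antenna powers. It is not the authors' implementation. It assumes one data stream per transmit antenna (so the number of antennas equals the number of users), inter-layer propagation matrices supplied by the caller, and a hypothetical mapping from a raw action vector in [-1, 1] to phase shifts and a power split. All function and parameter names are illustrative.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two complex matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def sim_response(phases, propagation):
    """Wave-domain response G = Phi_L W_L ... Phi_1 W_1 of an L-layer SIM.

    phases[l] holds the phase shifts of layer l; propagation[l] is the
    matrix W_l feeding that layer (assumed given, not modeled here).
    """
    g = None
    for theta, w in zip(phases, propagation):
        # Left-multiplying by diag(exp(j*theta)) scales each row of W_l.
        layer = [[cmath.exp(1j * t) * v for v in row] for t, row in zip(theta, w)]
        g = layer if g is None else matmul(layer, g)
    return g


def sum_rate(phases, powers, propagation, channel, noise_power):
    """Sum rate in bps/Hz; row k of `channel` is user k's channel from the last layer."""
    effective = matmul(channel, sim_response(phases, propagation))  # users x antennas
    total = 0.0
    for k, row in enumerate(effective):
        gains = [p * abs(e) ** 2 for p, e in zip(powers, row)]
        interference = sum(g for j, g in enumerate(gains) if j != k)
        total += math.log2(1.0 + gains[k] / (interference + noise_power))
    return total


def decode_action(raw, num_layers, num_atoms, total_power):
    """Hypothetical action mapping: raw values in [-1, 1] -> phases and powers."""
    n_phase = num_layers * num_atoms
    # Map [-1, 1] linearly onto [0, 2*pi] for each meta-atom.
    phases = [[math.pi * (raw[l * num_atoms + n] + 1.0) for n in range(num_atoms)]
              for l in range(num_layers)]
    # Softmax-style split so the powers are positive and sum to total_power.
    weights = [math.exp(v) for v in raw[n_phase:]]
    scale = total_power / sum(weights)
    return phases, [w * scale for w in weights]


if __name__ == "__main__":
    identity = [[1 + 0j, 0j], [0j, 1 + 0j]]
    phases, powers = decode_action([0.2, -0.4, 0.0, 0.0], 1, 2, 2.0)
    print(sum_rate(phases, powers, [identity], identity, noise_power=1.0))
```

In a DDPG loop, a value like the one returned by `sum_rate` could serve as the per-step reward, with the actor's output passed through something like `decode_action` first. The paper's own reward and action definitions may differ from this sketch.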

DRL-Based Orchestration of Multi-User MISO Systems with Stacked Intelligent Metasurfaces

14 Feb 2024 | Hao Liu, Jiancheng An, Derrick Wing Kwan Ng, George C. Alexandropoulos, and Lu Gan