Federated Reinforcement Learning for Peak Demand Mitigation in Residential Energy Systems With Dynamic Comfort Preferences

The growing penetration of flexible loads such as air conditioners, electric vehicles (EVs), and energy storage systems (ESSs) has intensified the challenge of mitigating peak demand in residential communities. Coordinating these devices is essential to reduce energy costs, enhance grid reliability, and preserve user comfort. This paper presents a federated reinforcement learning (FRL) framework for decentralized coordination of Home Energy Management Systems (HEMSs) under demand-dependent dynamic pricing. Each HEMS operates as an autonomous agent trained with the Soft Actor-Critic (SAC) algorithm, while the Federated Proximal (FedProx) method stabilizes local updates and addresses heterogeneity and non-independent and identically distributed (non-IID) data across homes. The framework models diverse appliance profiles, rooftop photovoltaic (PV) generation, ESSs, EVs, and adaptive comfort preferences. A key novelty is the direct integration of personalized comfort preferences into the federated learning process, which lets agents adapt their scheduling policies to heterogeneous flexibility bounds while maintaining coordinated system performance. Simulation results show that the proposed FedProx-SAC method reduces average household energy cost by 42% compared to an unoptimized (no-HEMS) baseline and by 20% compared to FedAvg-SAC, while also lowering peak demand and maintaining comfort satisfaction. The framework further supports scalability and privacy preservation, making it a practical and robust solution for future large-scale residential demand response programs.
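
To illustrate how a proximal term can stabilize local updates under heterogeneous data, the following is a minimal sketch of a FedProx-style local step followed by plain parameter averaging. It is not the paper's implementation: the local loss is a toy quadratic standing in for the SAC actor and critic losses, and the names `fedprox_local_update`, `federated_average`, and `quadratic_grad`, as well as the values of `mu`, `lr`, and `steps`, are hypothetical choices made only for illustration.

```python
from typing import Callable, List

Vector = List[float]


def fedprox_local_update(
    w_global: Vector,
    grad_fn: Callable[[Vector], Vector],
    mu: float = 0.1,    # assumed proximal coefficient
    lr: float = 0.05,   # assumed local learning rate
    steps: int = 20,    # assumed number of local gradient steps
) -> Vector:
    """Run local gradient steps on loss(w) + (mu / 2) * ||w - w_global||^2."""
    w = list(w_global)
    for _ in range(steps):
        g = grad_fn(w)
        # The gradient of the proximal term is mu * (w - w_global); it pulls
        # the local model back toward the current global model.
        w = [wi - lr * (gi + mu * (wi - wg))
             for wi, gi, wg in zip(w, g, w_global)]
    return w


def federated_average(local_weights: List[Vector]) -> Vector:
    """Unweighted mean of the local parameter vectors (server-side step)."""
    n = len(local_weights)
    return [sum(ws) / n for ws in zip(*local_weights)]


def quadratic_grad(target: Vector) -> Callable[[Vector], Vector]:
    """Gradient of the toy local loss 0.5 * ||w - target||^2."""
    return lambda w: [wi - ti for wi, ti in zip(w, target)]


# Two hypothetical homes whose local optima differ (a stand-in for non-IID data).
w_global = [0.0]
home_targets = [[1.0], [3.0]]
local_models = [
    fedprox_local_update(w_global, quadratic_grad(t), mu=1.0)
    for t in home_targets
]
new_global = federated_average(local_models)
```

In this sketch, setting `mu=0.0` reduces the local step to ordinary gradient descent, which corresponds to the FedAvg local update; a larger `mu` keeps each home's model closer to the shared global model between aggregation rounds.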