Hybrid Power Energy Management Strategy Based on Preferring-Reinforcement Learning

TANG Xiangjiao; MAN Xingjia; LUO Shaohua; SHAO Jie

doi:10.3969/j.issn.1001-2222.2024.03.010

Vehicle Engine (车用发动机), 2024, Vol. 0, Issue 3: 58-65.

Abstract

To improve the economy of a hybrid power system while maintaining battery state of charge (SOC) balance and satisfying power performance constraints, an energy management strategy based on preference-based ("preferring") reinforcement learning is proposed. The strategy models the energy management problem as a Markov decision process and uses a deep neural network to learn the nonlinear mapping from input states to optimal control actions. Unlike conventional reinforcement learning algorithms, preference-based reinforcement learning does not require a hand-designed reward function: preference judgments over multiple candidate actions are sufficient for network training to converge, which avoids the difficulty of weighting and normalizing the terms of a reward function. The effectiveness and feasibility of the proposed strategy were verified through simulation experiments and hardware-in-the-loop tests. The results show that, compared with conventional reinforcement learning energy management strategies, the proposed strategy improves economy by 4.6% to 10.6% while satisfying the SOC balance and power performance constraints of the hybrid vehicle.
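The idea of training from preference judgments rather than from a numeric reward can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the state variables, the action set, the network, or the preference rule. The sketch below assumes a single state variable (SOC), a hypothetical discrete set of engine power shares, a hypothetical judge that prefers the share closer to `1 - soc`, and a linear scorer with a pairwise logistic (Bradley-Terry style) loss standing in for the deep neural network.

```python
import math
import random

ACTIONS = [0.0, 0.25, 0.5, 0.75, 1.0]  # hypothetical engine power-share levels


def features(soc, action):
    """Hand-picked (state, action) features; an illustrative assumption."""
    return [action, action * soc, action * action]


def score(weights, soc, action):
    """Learned preference score; higher means 'more preferred'."""
    return sum(w * f for w, f in zip(weights, features(soc, action)))


def oracle_prefers_first(soc, a, b):
    """Hypothetical judge: favors the engine share closer to 1 - soc.
    It returns only a comparison, never a numeric reward."""
    target = 1.0 - soc
    return abs(a - target) < abs(b - target)


def train(num_steps=20000, lr=0.5, seed=0):
    """Fit the scorer by SGD on a pairwise logistic preference loss."""
    rng = random.Random(seed)
    weights = [0.0, 0.0, 0.0]
    for _ in range(num_steps):
        soc = rng.uniform(0.2, 0.8)
        a, b = rng.sample(ACTIONS, 2)  # two distinct candidate actions
        win, lose = (a, b) if oracle_prefers_first(soc, a, b) else (b, a)
        diff = score(weights, soc, win) - score(weights, soc, lose)
        diff = max(-60.0, min(60.0, diff))  # keep exp() in a safe range
        p_win = 1.0 / (1.0 + math.exp(-diff))  # P(model agrees with judge)
        f_win, f_lose = features(soc, win), features(soc, lose)
        for i in range(len(weights)):
            # Gradient ascent on log P(winner preferred over loser).
            weights[i] += lr * (1.0 - p_win) * (f_win[i] - f_lose[i])
    return weights


def greedy_action(weights, soc):
    """Control output: the action with the highest learned score."""
    return max(ACTIONS, key=lambda a: score(weights, soc, a))


def agreement_rate(weights, num_pairs=2000, seed=1):
    """Fraction of fresh random pairs on which model and judge agree."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_pairs):
        soc = rng.uniform(0.2, 0.8)
        a, b = rng.sample(ACTIONS, 2)
        model_first = score(weights, soc, a) > score(weights, soc, b)
        hits += model_first == oracle_prefers_first(soc, a, b)
    return hits / num_pairs


if __name__ == "__main__":
    w = train()
    print("agreement with judge:", agreement_rate(w))
    for soc in (0.25, 0.5, 0.75):
        print("SOC", soc, "-> engine share", greedy_action(w, soc))
```

The learner never sees a weighted sum of economy, SOC, and power terms; it only sees which of two actions was preferred, which is the property the abstract highlights. In a realistic setting the judge would have to encode the SOC balance and power performance requirements, and the scorer would be a neural network over a richer state.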

Cite this article

TANG Xiangjiao, MAN Xingjia, LUO Shaohua, SHAO Jie. Hybrid Power Energy Management Strategy Based on Preferring-Reinforcement Learning[J]. Vehicle Engine. 2024, 0(3): 58-65. https://doi.org/10.3969/j.issn.1001-2222.2024.03.010