Chinese Journal of Quantum Electronics ›› 2025, Vol. 42 ›› Issue (1): 70-0. doi: 10.3969/j.issn.1007-5461.2025.01.007

• Quantum Physics •

Performance optimization of quantum Otto cycle via deep reinforcement learning

LI Jiansong 1, LI Hai 1*, YU Wenli 2, HAO Yaming 1

  1. 1 School of Information and Electronic Engineering, Shandong Technology and Business University, Yantai 264005, China; 2 School of Computer Science and Technology, Shandong Technology and Business University, Yantai 264005, China
  • Received: 2023-02-06 Revised: 2023-04-07 Published: 2025-01-28 Online: 2025-01-28
  • Corresponding author: E-mail: shenghuo2003@126.com
  • About the author: LI Jiansong (1996 - ), from Huai'an, Jiangsu, graduate student, mainly engaged in research on machine learning and quantum thermodynamics. E-mail: ljs1019@foxmail.com
  • Supported by:
    National Natural Science Foundation of China (11547036), Youth Program of the Natural Science Foundation of Shandong Province (ZR2011FL009), Yantai Science and Technology Innovation Development Plan (2022JCYJ044)

Abstract: Realizing a high-performance shortcut-to-adiabaticity quantum Otto cycle (QOC) generally requires complicated control fields. To address this difficulty, this work studies the performance characteristics of a QOC under a linear driving field, which is comparatively easy to manipulate in experiments. Using policy-based deep reinforcement learning, the additional driving field applied during the expansion and compression processes of a QOC with a single qubit as the working medium is optimized, yielding a high-performance QOC under linear driving. Compared with a QOC under non-adiabatic free evolution, the QOC under the optimized additional driving scheme shows significant advantages in output work, power, and efficiency. In particular, for short cycle periods, the QOC under free evolution generates so much irreversible work that the output of positive work is completely suppressed, whereas the QOC under the optimized driving scheme still operates normally (with positive work output). This work provides a preliminary test of the effectiveness of deep reinforcement learning in optimizing the performance of quantum heat engines.

Key words: quantum thermodynamics, quantum Otto cycle, deep reinforcement learning, additional driving field, power and efficiency
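
For orientation, the figures of merit named in the abstract (output work and efficiency) can be illustrated in the textbook quasi-static limit of a single-qubit Otto cycle. The sketch below is not the model or the optimization method of the paper: it assumes a two-level working medium with energy gap `gap` (units with k_B = ħ = 1), perfectly adiabatic expansion and compression (no irreversible work), and full thermalization with each bath. The function names and parameter values are hypothetical.

```python
import math


def excited_population(gap, temperature):
    """Thermal excited-state population of a two-level system (k_B = hbar = 1)."""
    return 1.0 / (1.0 + math.exp(gap / temperature))


def ideal_qubit_otto(gap_cold, gap_hot, temp_cold, temp_hot):
    """Work, absorbed heat, and efficiency of an idealized qubit Otto cycle.

    Assumption: the gap-changing strokes are quasi-static, so populations
    stay fixed while the gap changes, and each bath contact thermalizes
    the qubit completely.
    """
    p_hot = excited_population(gap_hot, temp_hot)     # after hot-bath contact
    p_cold = excited_population(gap_cold, temp_cold)  # after cold-bath contact
    heat_in = gap_hot * (p_hot - p_cold)              # heat taken from hot bath
    work_out = (gap_hot - gap_cold) * (p_hot - p_cold)
    # Efficiency is only meaningful when the cycle absorbs heat from the hot bath.
    efficiency = work_out / heat_in if heat_in > 0 else float("nan")
    return work_out, heat_in, efficiency


if __name__ == "__main__":
    work, heat, eta = ideal_qubit_otto(gap_cold=1.0, gap_hot=2.0,
                                       temp_cold=1.0, temp_hot=4.0)
    print(f"work = {work:.4f}, heat in = {heat:.4f}, efficiency = {eta:.4f}")
```

In this idealized limit the efficiency reduces to 1 - gap_cold/gap_hot, and positive work requires gap_hot/temp_hot < gap_cold/temp_cold. Finite-time, non-adiabatic strokes add irreversible work on top of this picture, which is the regime the abstract addresses.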
