Training effective deep reinforcement learning agents for real-time life-cycle production optimization

Date: 2024-09-03

Recently, Zhang Kai's team made new progress in the field of reservoir production optimization, and the research has been published in the Journal of Petroleum Science and Engineering. The paper is titled "Training effective deep reinforcement learning agents for real-time life-cycle production optimization".

Innovation: Traditional production optimization methods evaluate the quality of a candidate solution by calling a numerical reservoir simulator and iteratively search for the optimal solution. Their main limitation is that they require a large number of numerical simulation evaluations, so the optimization is too slow to meet engineering needs. To address these challenges, this study models the reservoir production optimization problem as a Markov decision process and builds a production optimization framework based on deep reinforcement learning. A deep reinforcement learning algorithm trains an agent that continuously improves its decision-making capability by interacting dynamically with the reservoir environment in a trial-and-error manner. The method achieves good results in terms of both optimization efficiency and optimization performance.
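The trial-and-error interaction described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `ToyReservoirEnv` is a hypothetical stand-in for a numerical reservoir simulator (it is not a physical model), and the state, action, and reward definitions are invented placeholders rather than those used in the paper.

```python
class ToyReservoirEnv:
    """Hypothetical stand-in for a reservoir simulator (not a physical model)."""

    def __init__(self, horizon=5):
        self.horizon = horizon  # number of control steps in one life cycle

    def reset(self):
        self.t = 0
        self.state = 1.0  # toy state: fraction of oil still in place
        return self.state

    def step(self, action):
        # action in [0, 1]: toy normalized injection rate
        produced = 0.2 * action * self.state
        self.state -= produced
        self.t += 1
        reward = produced - 0.05 * action  # toy revenue minus injection cost
        done = self.t >= self.horizon
        return self.state, reward, done


def collect_episode(env, policy, buffer):
    """Run one life cycle and store every transition as experience."""
    state = env.reset()
    done = False
    total_reward = 0.0
    while not done:
        action = policy(state)
        next_state, reward, done = env.step(action)
        buffer.append((state, action, reward, next_state, done))
        state = next_state
        total_reward += reward
    return total_reward


buffer = []
episode_return = collect_episode(ToyReservoirEnv(), lambda s: 1.0, buffer)
```

In a real setup the constant policy would be replaced by the agent's learned policy, and the stored transitions would be reused to update it, which is where the savings in simulator calls would come from.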

Abstract: Life-cycle production optimization aims to obtain the optimal well control scheme at each time control step to maximize financial profit and hydrocarbon production. However, searching for the optimal policy under a limited number of simulation evaluations is a challenging task. In this paper, a novel production optimization method is presented, which maximizes the net present value (NPV) over the entire life-cycle and achieves real-time well control scheme adjustment. The proposed method models the life-cycle production optimization problem as a finite-horizon Markov decision process (MDP), where the well control scheme can be viewed as a sequence of decisions. Soft actor-critic, a state-of-the-art model-free deep reinforcement learning (DRL) algorithm, is subsequently utilized to train DRL agents that can solve the above MDP. The DRL agent strives to maximize long-term NPV rewards as well as the randomness of the control scheme by training a stochastic policy that maps reservoir states to well control variables and an action-value function that estimates the objective value of the current policy. Since the trained policy is an explicit function, the DRL agent can adjust the well control scheme in real time under different reservoir states. Unlike most existing methods, which introduce task-specific sensitive parameters or construct complex supplementary structures, the DRL agent learns adaptively by executing goal-directed interactions with an uncertain reservoir environment and making use of accumulated well control experience, which is similar to the actual field well control mode. The key insight here is the DRL method's ability to utilize gradient information (well-control experience) for higher sample efficiency. The simulation results based on two reservoir models indicate that, compared to other optimization methods, the proposed method can attain higher NPV and achieve excellent performance in terms of oil displacement.
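To make the NPV objective concrete, the following Python sketch computes a per-step discounted cash flow, which could serve as the MDP reward, and sums it over the life cycle. The formula is an assumption: it is the form commonly used in the waterflooding literature (oil revenue minus water handling and injection costs, discounted annually), and the abstract does not state the exact expression used in the paper. All prices, rates, and names such as `Economics` and `step_reward` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Economics:
    # All values are hypothetical placeholders, not taken from the paper.
    oil_price: float = 70.0        # USD per barrel of oil produced
    water_disposal: float = 5.0    # USD per barrel of water produced
    water_injection: float = 5.0   # USD per barrel of water injected
    discount_rate: float = 0.1     # annual discount rate


def step_reward(oil_rate, water_rate, inj_rate, t_end_days, dt_days, econ):
    """Discounted cash flow of one control step (rates in barrels per day)."""
    cash = (econ.oil_price * oil_rate
            - econ.water_disposal * water_rate
            - econ.water_injection * inj_rate) * dt_days
    return cash / (1.0 + econ.discount_rate) ** (t_end_days / 365.0)


def life_cycle_npv(steps, dt_days, econ):
    """Sum the per-step rewards over all control steps of the life cycle.

    `steps` is a list of (oil_rate, water_rate, inj_rate) tuples,
    one per control step of equal length `dt_days`.
    """
    total = 0.0
    for k, (q_oil, q_water, q_inj) in enumerate(steps, start=1):
        total += step_reward(q_oil, q_water, q_inj, k * dt_days, dt_days, econ)
    return total


# Two 180-day control steps with made-up field rates.
npv = life_cycle_npv([(100.0, 50.0, 80.0), (90.0, 70.0, 80.0)], 180.0, Economics())
```

Because the life-cycle NPV is a plain sum of per-step terms, the undiscounted return of the finite-horizon MDP equals the NPV, so an agent that maximizes cumulative reward is maximizing NPV under these assumptions.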

The Journal of Petroleum Science and Engineering covers the fields of petroleum and natural gas exploration, production and flow in their broadest possible sense. Topics include: reservoir engineering; reservoir simulation; rock mechanics; petrophysics; pore-level phenomena; well logging, testing and evaluation; mathematical modeling; enhanced oil and gas recovery; petroleum geology; compaction/diagenesis; petroleum economics; drilling and drilling fluids; thermodynamics and phase behavior; fluid mechanics; multi-phase flow in porous media; production engineering; formation evaluation; exploration methods; CO₂ sequestration in geological formations/sub-surface, etc. The latest impact factor of the journal is 4.346, and its average impact factor over the past 3 years is 3.646. The journal is currently listed in JCR Q1 and in Zone 2 of the Chinese Academy of Sciences journal ranking.

Paper link:

https://doi.org/10.1016/j.petrol.2021.109766

Citation:

Zhang K, Wang Z, Chen G, et al. Training effective deep reinforcement learning agents for real-time life-cycle production optimization [J]. Journal of Petroleum Science and Engineering, 2022, 208: 109766.