A maximum entropy deep reinforcement learning method for sequential well placement optimization using multi-discrete action spaces
Date: 2024-06-12


Recently, Zhang Kai's team has made new progress in the field of reservoir well placement optimization, and the related research has been published in Geoenergy Science and Engineering. The paper is titled "A maximum entropy deep reinforcement learning method for sequential well placement optimization using multi-discrete action spaces".

Innovation: Current well placement optimization methods face challenges such as the high-dimensional discretization of optimization variables, the lack of effective exploration incentives for the policy, and the inability to adjust well placement schemes in real time. These issues lead to suboptimal performance that fails to meet engineering needs. To address them, this study models sequential well placement optimization in reservoirs as a Markov decision process and reconstructs the large-scale discrete action space of the well placement variables into multi-discrete action spaces. A maximum entropy mechanism encourages policy exploration and enhances global optimization capability, establishing a well placement optimization framework based on maximum entropy deep reinforcement learning. In addition, the trained policy can quickly adapt to specific states of the target reservoir without retraining, enabling offline application with good real-time adjustability. The proposed method performs well in both optimization efficiency and offline application.
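The action-space reconstruction described above can be illustrated with a minimal sketch: instead of one categorical choice over every grid cell, the agent picks one index per axis. The grid dimensions, function names, and mapping below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical reservoir grid dimensions (illustrative only).
NX, NY = 60, 60

# Flat discretization: one categorical action over every grid cell.
flat_action_size = NX * NY          # 3600 choices per decision

# Multi-discrete reconstruction: one head per axis, each far smaller.
multi_discrete_sizes = (NX, NY)     # two heads of 60 choices each

def xy_to_flat(ix, iy):
    """Map a (x, y) multi-discrete action to the flat cell index."""
    return ix * NY + iy

def flat_to_xy(a):
    """Map a flat cell index back to grid coordinates."""
    return divmod(a, NY)

# Both parameterizations cover exactly the same candidate well locations.
assert flat_to_xy(xy_to_flat(12, 34)) == (12, 34)
```

The benefit is that the policy network outputs two 60-way distributions (120 logits) rather than one 3600-way distribution, which shrinks the output layer and makes exploration over the grid more tractable.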

Abstract: Well placement optimization is a crucial method for resolving planar conflicts in reservoir development; its goal is to determine the optimal well locations and drilling sequence that maximize the economic benefit of reservoir development. However, current well placement optimization methods face high-dimensional discretization of the optimization variables and a lack of effective incentives for policy exploration, which make it challenging to improve global optimization ability (the ability to escape locally optimal solutions in time and keep searching for better solutions throughout the optimization process) and real-time adjustability under a limited budget of numerical simulation runs. In this paper, we propose a new sequential well placement optimization method based on the Discrete Soft Actor-Critic algorithm (DSAC), which incorporates the maximum entropy mechanism to formulate well placement and drilling-sequencing schemes more efficiently and to maximize the net present value (NPV) over the entire life cycle of reservoir development. Specifically, the method models the well placement optimization problem as a Markov Decision Process (MDP) and achieves sequential well placement optimization by training a Deep Reinforcement Learning (DRL) agent that maps reservoir states to a stochastic policy over the well placement variables and evaluates the value function of the current policy. The DRL agent can determine the optimal infill well location in real time based on the reservoir state at different stages of development, thus obtaining the optimal drilling sequence. The proposed method has two innovations. First, by reconstructing the large-scale discrete action space of the well placement variables into multi-discrete action spaces, and by using the maximum entropy mechanism to encourage policy exploration, it improves global optimization capability. Second, the trained policy can swiftly adapt the subsequent well placement scheme to a specific state of the target reservoir without retraining from scratch, which enables offline application of the trained policy and provides better real-time adjustability. To verify the effectiveness of the proposed method, it is tested on 2D and 3D reservoir models. The results show that DSAC not only outperforms a gradient-based optimization method, classical evolutionary algorithms, and the existing reinforcement learning proximal policy optimization (PPO) method in terms of global optimization ability, but also shows better real-time adjustability of the trained policy when applied offline.
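As a rough illustration of the maximum entropy mechanism behind discrete SAC: for a discrete policy head, the entropy bonus and the soft state value can be computed from the full action distribution rather than sampled actions. The temperature alpha, array shapes, and function names below are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def soft_policy_stats(logits, alpha=0.2):
    """Probabilities and entropy for one discrete policy head.

    logits : unnormalized scores over discrete actions (e.g. a well x-index)
    alpha  : entropy temperature (assumed value, typically tuned or learned)
    """
    z = logits - logits.max()                    # numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    entropy = -(probs * np.log(probs + 1e-12)).sum()
    return probs, entropy

def soft_value(q_values, logits, alpha=0.2):
    """Soft state value used in discrete SAC targets:
    V(s) = sum_a pi(a|s) * (Q(s,a) - alpha * log pi(a|s)).
    """
    probs, _ = soft_policy_stats(logits, alpha)
    return float((probs * (q_values - alpha * np.log(probs + 1e-12))).sum())

# A uniform policy over 4 actions has maximum entropy log(4).
probs, entropy = soft_policy_stats(np.zeros(4))
```

The entropy term rewards the agent for keeping its well-location distribution spread out early in training, which is what discourages premature convergence to a locally optimal placement scheme.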

Geoenergy Science and Engineering covers the fields of petroleum and natural gas exploration, production, and flow in the broadest possible sense. Topics include: reservoir engineering; reservoir simulation; rock mechanics; petrophysics; pore-level phenomena; well logging, testing, and evaluation; mathematical modeling; enhanced oil and gas recovery; petroleum geology; compaction/diagenesis; petroleum economics; drilling and drilling fluids; thermodynamics and phase behavior; fluid mechanics; multi-phase flow in porous media; production engineering; formation evaluation; exploration methods; CO2 sequestration in geological formations and the subsurface; etc. The journal's latest impact factor is 4.4, and its average impact factor over the past three years is 4.5. The journal is currently ranked in JCR Q1 and in Q2 of the Chinese Academy of Sciences ranking.

Paper link:

https://doi.org/10.1016/j.geoen.2024.213004

Citation:

Zhang K, Sun Z, Zhang L, et al. A maximum entropy deep reinforcement learning method for sequential well placement optimization using multi-discrete action spaces[J]. Geoenergy Science and Engineering, 2024: 213004.




Copyright © The Zhang Group