Predicting the properties of sand reservoirs plays a key role in the exploration and development of oil and gas fields. Traditional approaches often run into limitations caused by nonlinear dependencies and by the heterogeneity and variability of the rocks. These problems reduce the accuracy of reservoir prediction, which creates risks in field design and development. This makes it relevant to introduce machine learning methods that can automatically identify complex patterns, account for multifactor relationships, and adapt to changing conditions, opening new opportunities for improving the accuracy and reliability of predictions. This paper reviews modern neural network prediction methods, their advantages and disadvantages, and practical aspects of applying machine learning to the prediction of sand reservoirs. Particular attention is paid to selecting input data, designing the neural network architecture, tuning the computation parameters, and interpreting the results. The study aims to demonstrate the effectiveness of neural network technologies in predicting the characteristics of sand reservoirs. The results are expected to help optimize geological exploration and improve the economic efficiency of field development.
Keywords: neural network prediction, hierarchical neural network, Kohonen self-organizing maps
Introduction
The long history of oil and gas production in a number of key sectors of the fuel and energy complex has shifted attention from simple-structured reservoirs to reservoirs with complex spatial morphology and hard-to-recover reserves. Today, the focus is on technologies that simplify development support for such assets, which are becoming increasingly common among the largest Russian companies. One such Rosneft asset is a field in the north of West Siberia, where the target is the PK1-7 reservoirs of the Pokur Series, which have an extremely complex geological structure. The upper part of the reservoir is saturated with gas, and a highly viscous oil zone beneath the gas cap is currently under development. The layers were deposited in a wide range of transitional environments, from continental to marine. They consist of interbedded sandy and silty-clayey units of varying thickness with high facies heterogeneity. According to the latest sedimentological analysis, three distinct cycles of reservoir formation have been identified (Fig. 1).
A particular problem is the development of the second cycle, where drilling reveals a significant discrepancy between the geological model forecast and the actual data, which leads to an incorrect assessment of well potential in this zone. The accuracy of the initial flow rate forecast drops by 30% in cycle 2. At the same time, this cycle accounts for the majority of the remaining planned well stock, which makes it important to improve the approach to building the geological model. The high rate at which the model fails to be confirmed is associated with extreme lateral variability, which is difficult to track even with the available 3D seismic data. The seismic data, in turn, are distorted by a massive gas reservoir, which creates a “shadowing” effect due to a sharp decrease in density and elastic wave velocity in gas-saturated rocks. To minimize the impact of these complicating factors on the seismic data, this study tested a new technique based on local detailing of the seismic prediction using a neural network analysis module in domestic software.
Materials and methods
To improve the quality of geological model prediction, a method has been developed that locally refines the seismic prediction using neural networks.
Work methodology:
1. Select a prediction zone in the area of planned production drilling.
2. Select input data (wells for training and testing, estimated curves, seismic base).
3. Perform neural network prediction runs.
4. Update the geological model.
5. Optimize production drilling.
6. Repeat the process for every new production well pad.
The advantage of this method is the speed with which the geological model can be updated (1–2 days).
Selecting a Prediction Zone
For a detailed seismogeological analysis and to improve the quality of the forecast, a key area in the northern part of the field was chosen. The selection was based on three criteria:
1) The presence of a representative dataset of actual wells, providing a reliable statistical foundation for model building.
2) A recorded significant discrepancy between the predictions of the current geological model and the well logging data (high model misfit), indicating an area of substantial uncertainty.
3) Development drilling planned within the area in the near future, enabling prompt verification of the analysis results.
In accordance with these criteria, an area of 19 km² was selected, containing 7 pads with actual wells and 22 pads of the development wells planned for drilling. The forecasting was conducted within the oil-saturated interval of PK2-PK5.
Selecting Input Data
The minimum set of input data required for the prediction includes well data (prediction curve) and a seismic volume. The prediction quality directly depends on the volume and representativeness of the initial sample: the better the prediction parameter is described, the higher the prediction quality.
The study used data from 91 wells, 70% of which were used to train the model, in line with the 70/30 train/test split commonly used in machine learning. The net-reservoir/non-reservoir curve was used as the prediction parameter, which is relevant for assessing reservoir properties. The training wells were chosen to be well separated from one another so that they characterize the study area more completely. The lithology logs used as input data were not reprocessed beforehand.
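The source does not state how the spatially separated training wells were picked. As a minimal sketch, assuming a greedy farthest-point rule and hypothetical well-head coordinates (both are illustrative assumptions, not the authors' procedure), a 70/30 split of 91 wells could be formed as follows:

```python
import math


def select_training_wells(coords, train_fraction=0.7):
    """Greedy farthest-point selection (an assumed rule, not from the paper).

    coords: dict mapping well name -> (x, y) in metres.
    Returns (train_ids, blind_ids).
    """
    ids = sorted(coords)
    n_train = int(len(ids) * train_fraction)
    if n_train == 0:
        return [], ids
    train = [ids[0]]  # arbitrary but deterministic seed well
    remaining = set(ids[1:])
    while len(train) < n_train:
        # Pick the well whose nearest already-selected well is farthest away.
        best = max(
            sorted(remaining),
            key=lambda w: min(math.dist(coords[w], coords[t]) for t in train),
        )
        train.append(best)
        remaining.remove(best)
    return train, sorted(remaining)


# Hypothetical coordinates on a 300 m grid, for illustration only.
wells = {f"W{i:02d}": ((i % 10) * 300.0, (i // 10) * 300.0) for i in range(91)}
train_ids, blind_ids = select_training_wells(wells)
print(len(train_ids), len(blind_ids))
```

With 91 wells and a 0.7 fraction this yields 63 training wells, which matches the number of wells quoted for the test runs, and leaves the rest as “blind” wells.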
The geological features of the study area should be noted: the structure is complex because of the transition from continental to coastal-marine depositional environments, and the cross-section is highly heterogeneous, with frequent interbedding of sand reservoirs up to 1 m thick and clay interlayers. For this reason, the more detailed and accurate the initial prediction volume, the higher the modeling accuracy, especially in complex geological sections.
The following depth-domain seismic volumes, previously resampled to a common 1 m sampling interval, were tested as the seismic basis:
1) Amplitude volume.
2) Elastic properties volumes: acoustic impedance and Vp/Vs ratio, i.e. the key parameters for lithological separation based on seismic data (Ampilov et al., 2009).
3) Volumetric lithology forecast based on results of deterministic simultaneous inversion.
4) Volume of net-reservoir/non-reservoir probability based on high-resolution stochastic inversion (Yakovlev et al., 2011).
It should be noted that no additional preprocessing of the seismic volumes is necessary, since the application software inherently accounts for data dimensions.
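The resampling of the seismic volumes to a common 1 m step is not described in detail in the source. A minimal sketch for a single depth-domain trace, assuming simple linear interpolation (the function name and the sample values are hypothetical), might look like this:

```python
def resample_trace(depths, values, step=1.0):
    """Linearly interpolate one trace onto a regular depth grid.

    depths must be strictly increasing; step is the target interval in metres.
    """
    n_out = int((depths[-1] - depths[0]) / step + 1e-9) + 1
    out_depths, out_values = [], []
    j = 0  # index of the input interval that contains the current depth
    for k in range(n_out):
        d = depths[0] + k * step
        while j < len(depths) - 2 and depths[j + 1] < d:
            j += 1
        d0, d1 = depths[j], depths[j + 1]
        t = (d - d0) / (d1 - d0)
        out_depths.append(d)
        out_values.append(values[j] + t * (values[j + 1] - values[j]))
    return out_depths, out_values


# Hypothetical trace sampled every 4 m, resampled to 1 m.
depths = [1000.0, 1004.0, 1008.0]
amps = [0.0, 4.0, 0.0]
new_depths, new_amps = resample_trace(depths, amps)
print(new_amps)
```

Applying the same function to every trace of every volume would put all inputs on one vertical grid; production software would normally also apply anti-alias filtering when downsampling, which this sketch omits.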
Neural Network Prediction
Neural networks are a class of machine learning models loosely inspired by the human brain: they consist of interconnected artificial neurons whose connection weights are adjusted during training.
This work considered various options for neural network prediction:
• Multiple linear regression: a statistical method that analyzes the relationship between a dependent variable and several independent variables, so that the influence of multiple factors is considered simultaneously.
• Classical neural networks: nonlinear regression, with a prediction operator based on a classical neural network with hybrid learning technologies.
• The Kolmogorov neural network: a network built on the representation theorem proven by the Soviet mathematician Andrey Nikolaevich Kolmogorov in 1957. The theorem itself is not a practical training algorithm but a fundamental theoretical result: any continuous function of many variables can be represented as a superposition (combination) of a finite number of continuous functions of one variable and the addition operation.
• A hierarchical neural network: a deep learning-based extension of the Kohonen self-organizing map (SOM), where clustering is performed using a nearest-neighbor similarity measure.
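For reference, the Kolmogorov representation mentioned in the list above is usually written as follows for a continuous function f on the unit cube [0, 1]^n (the symbols Φ_q and φ_{q,p} are the standard textbook notation, not taken from the source):

$$ f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \varphi_{q,p}(x_p) \right) $$

Here each Φ_q and φ_{q,p} is a continuous function of a single variable, and the inner functions φ_{q,p} do not depend on f.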
Testing showed that linear regression and classical neural networks each took about 40 minutes to run on 63 wells and yielded no correlation between the “net-reservoir/non-reservoir” curves in the “blind” wells and the prediction results. The Kolmogorov neural network ran in 15–20 minutes but showed only a weak relationship between the data. The hierarchical neural network ran in 1–2 minutes and gave the highest-quality prediction with a large number of wells.
Based on the test results, the hierarchical neural network was chosen. It relies on the method of analogies: if the seismic field at a given point is similar to the field around a well point, the property values are also likely to be similar.
Let us consider the algorithm in more detail (Priezzhev et al., 2025):
1) Form a training sample
The training sample is formed by creating a set of “seismic response — predictive parameter” pairs in wells. Each pair is formed as follows:
- The predicted parameter is calculated as the average value of the target characteristic within a specified interval along the well trajectory. In this study, the discrete “reservoir/non-reservoir” property along the wells was predicted. For a discrete property, the averaging is performed by selecting the most frequent value among the nearest neighbors, which gives the most probable value of the predicted parameter.
- The seismic response is a fragment of the wave field of size m × n, where m is the number of vertical samples and n is the number of adjacent traces. Parameters m and n are set by the user, so the analysis can be adjusted to the sampling of the seismic base.
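The pair construction described above can be sketched in a few lines. This is an illustrative simplification, not the software's implementation: it assumes a 2D seismic section stored as a list of traces, a vertical well located on one trace, odd m and n, and a majority vote over the same m-sample window as the label. All names are hypothetical.

```python
from collections import Counter


def majority_label(labels):
    """Most frequent discrete value (1 = reservoir, 0 = non-reservoir)."""
    return Counter(labels).most_common(1)[0][0]


def build_pairs(section, well_trace, well_labels, m=3, n=3):
    """Build (seismic response, predicted parameter) pairs along one well.

    section[trace][sample] holds seismic values; the response is the
    flattened m x n fragment centred on the well trace and on sample k.
    """
    half_m, half_n = m // 2, n // 2
    pairs = []
    for k in range(half_m, len(well_labels) - half_m):
        patch = [
            section[t][s]
            for t in range(well_trace - half_n, well_trace + half_n + 1)
            for s in range(k - half_m, k + half_m + 1)
        ]
        label = majority_label(well_labels[k - half_m : k + half_m + 1])
        pairs.append((patch, label))
    return pairs


# Hypothetical section: 5 traces x 6 samples, well on trace 2.
section = [[t * 10 + s for s in range(6)] for t in range(5)]
well_labels = [0, 0, 1, 1, 1, 0]
pairs = build_pairs(section, well_trace=2, well_labels=well_labels)
print(len(pairs), pairs[0])
```

Each pair couples a 9-value seismic fragment with the dominant lithology flag in the corresponding depth window; in a real 3D volume the fragment would span adjacent traces in both lateral directions.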
2) Build a search cluster decision tree
Clustering is performed with an unsupervised classification algorithm, Kohonen's Self-Organizing Map (SOM) (Kohonen, 2008), which projects the multidimensional space onto a 1D, 2D, or 3D map. The clustering criterion is a measure of similarity between response signatures, determined by the distance between them in the multidimensional space. With this approach, two seismic responses are assigned to the same cluster if their signatures are similar. In the simplest case, the distance can be defined as the sum of squared differences between the samples of the two responses. The result of the procedure is a cluster decision tree.
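To make the clustering step concrete, here is a minimal one-dimensional SOM in pure Python. It is a sketch under stated assumptions, not the module used in the study: the map has only two nodes, the neighborhood function is a fixed 0.5 weight for adjacent nodes, the learning rate and radius decay linearly, and the sample data are invented.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two response vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_som_1d(samples, n_nodes=2, epochs=20, lr0=0.5):
    """Train a 1D self-organizing map; returns the node reference vectors."""
    last = len(samples) - 1
    # Deterministic initialisation: nodes start at evenly spaced samples.
    nodes = [list(samples[i * last // max(1, n_nodes - 1)]) for i in range(n_nodes)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        radius = max(1, n_nodes // 2) * frac
        for x in samples:
            bmu = min(range(n_nodes), key=lambda i: sq_dist(nodes[i], x))
            for i in range(n_nodes):
                grid_d = abs(i - bmu)
                if grid_d <= radius:
                    h = 1.0 if grid_d == 0 else 0.5  # simple neighbourhood weight
                    for j in range(len(x)):
                        nodes[i][j] += lr * h * (x[j] - nodes[i][j])
    return nodes


def assign(nodes, x):
    """Index of the best-matching node (cluster) for response x."""
    return min(range(len(nodes)), key=lambda i: sq_dist(nodes[i], x))


# Two hypothetical groups of seismic responses with distinct signatures.
samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
nodes = train_som_1d(samples)
clusters = [assign(nodes, s) for s in samples]
print(clusters)
```

Responses with similar signatures end up in the same cluster. A hierarchical variant would presumably repeat this clustering inside each cluster to grow the tree, but that step is not shown here.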
3) Perform prediction runs based on a cluster decision tree
Each seismic response is compared with reference samples stored in the nodes of the search cluster decision tree. The comparison process starts from the root node and ends with one of the tree leaves. As a result, the seismic response is assigned a value of the predictive parameter according to the pairs in the training sample.
This approach effectively captures nonlinear relationships between seismic attributes and predicted parameters, which improves interpretation accuracy when addressing complex geological problems.
During testing, all computation modes were tried, and the settings shown in Figure 2 were chosen as optimal. Parameter testing showed that, with a vertical sampling interval of 1 m, the number of nearest neighbors should be about 20. The greater the number of nearest neighbors, the smoother the prediction volume: with values above 30 the layering of the section disappears, and with values below 10 the prediction volume becomes noisy.
Four nearest-neighbor computation modes are available:
1) an average value with weights based on similarity and distance;
2) an average value with weights based on similarity only;
3) an average value with weights based on distance only;
4) a median value.
The similarity and distance mode proved to be more accurate than the others.
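The source lists the four combination modes but not their formulas. The sketch below assumes inverse weighting (a weight of 1 over the dissimilarity, the spatial distance, or their product); this weighting, the function name, and the sample numbers are illustrative assumptions only.

```python
import statistics


def combine_neighbors(values, dissimilarity, distance, mode="similarity_distance", eps=1e-6):
    """Combine the target values of the k nearest neighbours.

    values:        target values at the neighbours (e.g. 0/1 lithology flags)
    dissimilarity: mismatch of seismic responses (smaller = more similar)
    distance:      spatial distance to each neighbour, in metres
    """
    if mode == "median":
        return statistics.median(values)
    if mode == "similarity_distance":
        weights = [1.0 / ((s + eps) * (d + eps)) for s, d in zip(dissimilarity, distance)]
    elif mode == "similarity":
        weights = [1.0 / (s + eps) for s in dissimilarity]
    elif mode == "distance":
        weights = [1.0 / (d + eps) for d in distance]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Three hypothetical neighbours: two reservoir samples and one non-reservoir.
values = [1, 1, 0]
dissimilarity = [0.1, 0.2, 0.4]
distance = [100.0, 400.0, 200.0]
results = {
    mode: combine_neighbors(values, dissimilarity, distance, mode)
    for mode in ("similarity_distance", "similarity", "distance", "median")
}
print(results)
```

For a binary reservoir/non-reservoir target, the weighted averages can be read as a reservoir probability and thresholded (for instance at 0.5) to obtain a discrete prediction.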
Adding a low-frequency component (LFM, low-frequency model) stabilizes the result: a volume with a low-frequency trend, bounded by horizons, is fed into the neural network to account for the low frequencies that are present in the log curve but absent from the seismic spectrum.
Various options for the seismic base were tested. To evaluate the training results, the average prediction accuracy values in “blind” wells were estimated using the formula:
$$ \text{Accuracy} = \frac{m_{col} + m_{noncol}}{n}, \qquad (1) $$
where m_col is the number of true positive outcomes, m_noncol is the number of true negative outcomes, and n is the total number of outcomes.
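The accuracy measure defined above (true positives plus true negatives, divided by the total number of outcomes) can be computed directly from the actual and predicted flags. The following is a minimal sketch with invented sample data; the variable names mirror the symbols in the text.

```python
def blind_well_accuracy(actual, predicted):
    """Share of samples where the predicted flag matches the log-based flag."""
    # m_col: reservoir samples predicted as reservoir (true positives)
    m_col = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    # m_noncol: non-reservoir samples predicted as non-reservoir (true negatives)
    m_noncol = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    n = len(actual)
    return (m_col + m_noncol) / n


# Hypothetical 1 m samples along a "blind" well (1 = reservoir).
actual = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 1, 0, 0, 0, 1, 1]
accuracy = blind_well_accuracy(actual, predicted)
print(accuracy)
```

In this toy case three reservoir and three non-reservoir samples are matched out of ten, giving an accuracy of 0.6.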
The best prediction was obtained with the lithology probability volume based on stochastic inversion, for which the prediction accuracy in “blind” wells increased from 51% to 61%. Table 1 shows how the prediction accuracy in “blind” wells changed before and after neural network training for the various seismic bases. A comprehensive description of the model parameters, the volumes of input data, and the results achieved is available in Appendix 1.
Neural network prediction improved the prediction accuracy by 7–10%, depending on the input data. Given the high geological heterogeneity of the target reservoir, the absolute values of the classification metrics are expected to fall short of those typical of more homogeneous formations. Nevertheless, the proposed approach demonstrated a statistically significant improvement over both random prediction and the pre-training baseline, particularly with the stochastic inversion volume. For this volume, the model achieved a recall of 0.64 after training, meaning that approximately two-thirds of all reservoir intervals were identified correctly. The precision-recall trade-off shows that the model is predominantly tuned to minimize missed reservoir intervals, which is a strategic priority at the development drilling stage.
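Recall and precision, as used in the paragraph above, follow the standard definitions: recall is the share of actual reservoir samples that were predicted as reservoir, and precision is the share of predicted reservoir samples that are actually reservoir. A minimal sketch with invented data:

```python
def precision_recall(actual, predicted):
    """Return (precision, recall) for the reservoir class (flag 1)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical flags along a "blind" well (1 = reservoir).
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 1, 0, 0, 1, 1]
precision, recall = precision_recall(actual, predicted)
print(precision, recall)
```

Here recall (0.75) exceeds precision (0.6): the toy model misses few reservoir samples at the cost of some false alarms, which is the kind of trade-off the text describes as preferable for development drilling.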
Fig. 3 shows the net-reservoir/non-reservoir curves before neural network training (the prediction option based on the net-reservoir/non-reservoir probability volume from stochastic inversion), the actual data (log-based lithology), and the curves after neural network prediction in training and “blind” wells. The log displays clearly show the prediction performance both in thin-bedded sand reservoirs and in thicker units.
The application of neural prediction using the lithology probability volume from stochastic inversion data showed the best result when compared with well curves. The prediction accuracy was evaluated in "blind" wells both before and after training, with a detailed description provided in Appendix 2.
The resulting net-reservoir/non-reservoir prediction volumes based on volumetric lithology prediction from deterministic inversion and the lithology probability volume from stochastic inversion were handed over for further testing and updating of the geological model.
Building a Geological Model
To update the geological model, the existing structural framework and the two reservoir probability volumes obtained from the neural network analysis described above were used. Based on these data, four model options were built: two kriging-based and two based on Sequential Gaussian Simulation (SGS). All four options predicted net reservoir more accurately than the current geological model. The modeling results are shown in Figure 4.
The best results were achieved using the SGS method based on the neural network prediction from stochastic inversion. The modeling result, presented as a comparison of the estimated net pays and well data for the test well set, is shown in Figure 4.
Table 2 also presents model evaluation metrics, calculated from the estimated net pays (MD) in the test wells, allowing for a comparison with the current model in use.
Despite an increase in the maximum error, the new models reproduce the spatial trends of reservoir distribution significantly better: the correlation coefficient rose from 0.63 to 0.82, a critical enhancement for geological modeling. This indicates that the model now reflects the geological heterogeneity of the intervals more faithfully, although it still requires additional calibration in local zones of complex structure.
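The two metrics discussed above, the correlation coefficient and the maximum error, can be computed from paired model and well values. The sketch below assumes the correlation is Pearson's coefficient (the source does not name the variant) and uses invented net pay values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def max_abs_error(x, y):
    """Largest absolute difference between model and actual values."""
    return max(abs(a - b) for a, b in zip(x, y))


# Hypothetical net pays (m) in four test wells: model estimate vs. well data.
model_net_pay = [10.0, 12.0, 8.0, 15.0]
actual_net_pay = [11.0, 13.0, 9.0, 14.0]
r = pearson_r(model_net_pay, actual_net_pay)
err = max_abs_error(model_net_pay, actual_net_pay)
print(round(r, 3), err)
```

The example also shows why the two metrics can move independently: correlation measures how well the trend is reproduced, while the maximum error is driven by a single worst well.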
Results
To assess the feasibility of the methodology, the correctness of the lithology prediction by the geological model was verified in an area of new drilling located at a significant distance (1 km) from the wells used for neural network training and the local model update (Fig. 5). To date, five new vertical wells have been drilled in this area, and the well log interpretation data have been compared with the model predictions.
Figure 6 shows the correlation for the new vertical wells (pilots), where:
Left track — lithology in the current model
Middle track — lithology in the new model
Right track — actual lithology from well logging interpretation data.
Also, Table 3 presents a comparison of total oil net pays from the current and new models with actual well logging interpretation data.
The metrics obtained for assessing the quality of the geological model indicate a significant improvement in the estimation of oil net pays in the studied area, as well as accurate selection of the target interval within the cross-section. This confirms the improved predictive capability of the model even in areas remote from drilling zones.
Summary
The analysis showed that direct application of the neural network algorithm increased the accuracy of reservoir identification from 0.51 to 0.61. Although the gain in absolute accuracy may seem modest, the key effect appeared at the next stage, when the resulting reservoir probability volume was integrated into the geological model. This substantially improved the adequacy of the model: the correlation coefficient between the estimated and actual data increased from 0.63 to 0.82. Thus, despite the modest improvement at the classification stage, the proposed methodology shows significant potential for enhancing the reliability of geological modeling in complex reservoirs. The method can be applied to other fields, provided that 3D seismic data and a sufficient volume of actual drilling data are available for training the neural network.
1. Ampilov Yu. P., Barkov A. Yu., Yakovlev I. V., Filippova K. E., Priezzhev I. I. Almost everything about seismic inversion. Part 1 // Tekhnologii seismorazvedki. 2009. No. 4. P. 3–16. (In Russ.)
2. Kohonen T. Self-Organizing Maps. Moscow: BINOM. Laboratoriya znanii, 2008. 655 p. (In Russ.)
3. Priezzhev I. I., Danko D. A., Onishchenko A. N. Hierarchical neural networks in problems of predicting oil and gas reservoir properties from well and seismic data // Geologiya i geofizika. 2025. Vol. 66, No. 1. P. 131–140. DOI:https://doi.org/10.15372/GiG2024141 (In Russ.)
4. Yakovlev I. V., Ampilov Yu. P., Filippova K. E. Almost everything about seismic inversion. Part 2 // Tekhnologii seismorazvedki. 2011. No. 1. P. 5–15. (In Russ.)



