Evaluation of machine learning methods for formation lithology identification: A comparison of tuning processes and model performances
Identification of underground formation lithology from well log data is an important task in petroleum exploration and engineering. Recently, several computational algorithms have been used for lithology identification to improve prediction accuracy. In this paper, we evaluate five typical machine learning methods, namely Naïve Bayes, Support Vector Machine, Artificial Neural Network, Random Forest and Gradient Tree Boosting, for formation lithology identification using data from the Daniudui gas field and the Hangjinqi gas field. The input to each model consists of features selected from different well log data samples. To determine the best model for classifying lithology type, this study uses validation curves to determine the parameter search range and adopts hyper-parameter optimization to obtain the best parameter set for each model. The performance of each classifier is also evaluated using 5-fold cross-validation. The results suggest that ensemble methods are good algorithm choices for supervised classification of lithology using well log data. The Gradient Tree Boosting classifier is robust to overfitting because it grows trees sequentially, adjusting the weights of the training data distribution to minimize a loss function. The Random Forest classifier is also a suitable option. An evaluation matrix shows that the Gradient Tree Boosting and Random Forest classifiers have lower prediction errors than the other three models. Although all the models have difficulty distinguishing the sandstone classes, Gradient Tree Boosting performs well on this task compared with the other four methods. Moreover, the classification accuracy is remarkably similar across the lithology classes for both the Random Forest and Gradient Tree Boosting models.
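The tuning workflow described in the abstract (a validation curve to pick a search range, hyper-parameter optimization, then 5-fold cross-validation) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' code: it assumes scikit-learn, uses a synthetic dataset from `make_classification` in place of the well log data, and the choice of `max_depth` and `learning_rate` as tuned parameters, the grid values, and `n_estimators=30` are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score, validation_curve

# Synthetic stand-in for well log features and lithology labels (assumption:
# the real inputs would be log curves and core-derived lithology classes).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Step 1: validation curve over one hyper-parameter to choose a search range.
depths = np.array([1, 2, 3, 4])
train_scores, valid_scores = validation_curve(
    GradientBoostingClassifier(n_estimators=30, random_state=0),
    X, y, param_name="max_depth", param_range=depths, cv=5,
)
mean_valid = valid_scores.mean(axis=1)  # average accuracy over the 5 folds
best_depth = int(depths[mean_valid.argmax()])

# Step 2: grid search restricted to a narrow range around the best depth.
param_grid = {
    "max_depth": sorted({max(1, best_depth - 1), best_depth, best_depth + 1}),
    "learning_rate": [0.05, 0.1],
}
search = GridSearchCV(
    GradientBoostingClassifier(n_estimators=30, random_state=0),
    param_grid, cv=5,
)
search.fit(X, y)

# Step 3: 5-fold cross-validation of the tuned classifier.
cv_scores = cross_val_score(search.best_estimator_, X, y, cv=5)
print("best parameters:", search.best_params_)
print("mean 5-fold accuracy:", cv_scores.mean())
```

Because the final cross-validation reuses the data that guided the tuning, the reported accuracy in this sketch is likely optimistic; a held-out test set or nested cross-validation would give a fairer estimate.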
Lithology identification, Supervised learning, Gradient boosting, Tuning parameter
182-193
Xie, Yunxin
Zhu, Chenyang
Zhou, Wen
LI, Zhongdong
Tu, Mei
January 2018
Xie, Yunxin, Zhu, Chenyang, Zhou, Wen, LI, Zhongdong and Tu, Mei (2018) Evaluation of machine learning methods for formation lithology identification: A comparison of tuning processes and model performances. Journal of Petroleum Science and Engineering, 160, 182-193. (doi:10.1016/j.petrol.2017.10.028).
Text: Accepted Manuscript
More information
Accepted/In Press date: 10 October 2017
e-pub ahead of print date: 20 October 2017
Published date: January 2018
Identifiers
Local EPrints ID: 416173
URI: http://eprints.soton.ac.uk/id/eprint/416173
ISSN: 0920-4105
PURE UUID: 09959f16-f1eb-4a24-826a-0cdbfd381617
Catalogue record
Date deposited: 06 Dec 2017 17:30
Last modified: 16 Mar 2024 05:59
Contributors
Author: Yunxin Xie
Author: Chenyang Zhu
Author: Wen Zhou
Author: Zhongdong LI
Author: Mei Tu