Automatic Lithology Identification of Sandstone-type Uranium Deposit in Songliao Basin Based on Ensemble Learning

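  • Summary: Accurate identification of stratigraphic lithology is closely tied to the delineation of sandstone-type uranium ore beds, and correct analysis of lithological assemblages is important for exploration and anomaly identification in sandstone-type uranium deposits. To address the problems of traditional log-based lithology identification methods and of machine learning methods, this paper takes the sandstone-type uranium deposits of the Songliao Basin in northern China as its study object and applies two typical ensemble models (XGBoost and SMOTE random forest) to automatic lithology identification, comparing the results with those of typical machine learning algorithms such as K-nearest neighbors (KNN) and gradient boosting decision trees (GBDT). The results show that both ensemble models identify lithology in sandstone-type uranium strata with accuracy above 95%, a clear improvement over the KNN and GBDT models. XGBoost's regularization term for controlling overfitting, together with its support for multithreaded, per-feature gain computation during node splitting, significantly improved computational efficiency, and SMOTE (synthetic minority oversampling technique) resolved the imbalance in the sample data. The ensemble-based optimization process can provide a theoretical basis and technical support for lithology classification in sandstone-type uranium deposits.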

     

    Abstract: Accurate identification of stratigraphic lithology is closely related to the delineation of sandstone-type uranium deposits. Where stratigraphic structure is complex, correct analysis of lithological assemblages is of great significance for exploration and anomaly identification in such deposits. In uranium exploration, geophysical logging data link changes in geophysical properties to the subsurface geological environment and are an effective, irreplaceable means of understanding subsurface rock structure and reservoir characteristics. Conventional lithology identification methods, such as crossplots, probabilistic statistics, cluster analysis, and conventional machine learning, suffer from low accuracy, low identification efficiency, and poor generalization. Ensemble learning reaches a consensus prediction by combining two or more models, which makes the final learner more comprehensive than any single component model and reduces error. Compared with ordinary machine learning algorithms, ensemble learning algorithms offer advantages in data processing. To address the problems of traditional logging-based lithology identification methods and machine learning methods, this paper takes the sandstone-type uranium deposits of the Songliao Basin in northern China as its study object, and the original data were analyzed and preprocessed. Building on previous studies, two typical ensemble models (XGBoost and SMOTE random forest) were used for automatic lithology identification, and their results were compared with those of typical machine learning models such as K-nearest neighbors (KNN) and gradient boosting decision trees (GBDT).
The results show that both the XGBoost and SMOTE random forest ensemble models identify the lithology of sandstone-type uranium deposits with accuracy above 95%, significantly higher than that of the KNN and GBDT models. XGBoost controls overfitting through its regularization term and supports multithreaded, per-feature gain computation during node splitting, which improves computational efficiency and helps ensure the reliability of the ensemble model. SMOTE (synthetic minority oversampling technique) addresses the sample imbalance problem for the random forest model. The ensemble-based optimization process provides a theoretical basis for lithology classification of sandstone-type uranium deposits and technical support for strategic breakthroughs in uranium exploration.
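
The abstract names two mechanisms without showing them: the regularized gain XGBoost evaluates at each candidate split, and the interpolation step SMOTE uses to create minority-class samples. The following is a minimal standard-library sketch of both, not the authors' implementation. The function names `split_gain` and `smote`, the parameters `lam`, `gamma`, `k` and `seed`, and the toy feature values are all assumptions made for illustration; a real workflow would more likely rely on the `xgboost` and `imbalanced-learn` packages.

```python
import math
import random


def split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    """Regularized gain of one candidate split in the XGBoost objective.

    g_* and h_* are the sums of first- and second-order loss gradients over
    the samples in each child; lam is the L2 penalty on leaf weights and
    gamma is the penalty for adding a leaf.
    """
    def score(g, h):
        return g * g / (h + lam)

    parent = score(g_left + g_right, h_left + h_right)
    children = score(g_left, h_left) + score(g_right, h_right)
    return 0.5 * (children - parent) - gamma


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` within the minority class, itself excluded
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # position along the segment from base to other
        synthetic.append([b + t * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical two-feature log readings for a rare lithology class (made-up values).
rare = [[1.0, 2.0], [1.2, 2.1], [0.9, 1.8], [1.1, 2.3]]
new_points = smote(rare, n_new=6)

# A larger lam shrinks the gain of the same split, so fewer splits pass the threshold.
loose = split_gain(-2, 2, 2, 2, lam=0.0)
strict = split_gain(-2, 2, 2, 2, lam=2.0)
```

Because every synthetic point lies on a segment between two existing minority samples, the oversampled class stays within the range of the observed data. In `split_gain`, each feature's candidate splits can be scored independently, which is what makes per-feature parallel gain computation possible.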

     
