Regression trees for modelling geochemical data- an application to Late Jurassic carbonates (Ammonitico Rosso)
Research on ancient carbonate geochemical records is often supported by multivariate statistical analysis, used among other purposes for data mining. This contribution reports a complementary approach that can be applied to paleoenvironmental research. A machine learning method, here regression trees (RT), was chosen for its ability to learn complex patterns and to integrate multiple types of data with different statistical distributions, yielding a knowledge model of geochemical behaviour along a paleo-platform.
The Late Jurassic epioceanic deposits under study are represented by six stratigraphic sections located in SE Spain and on the island of Majorca. The database comprises a total of 1960 data points corresponding to eight variables (stable C and O isotopes; the elements Ca, Mg, Sr, Fe and Mn; and skeletal content). This study uses RT models in which the geochemical proxies are the predictor variables and skeletal content is the target variable. The resulting model is data driven: it explains variations in the target variable and provides additional information on the relative importance of each variable to each prediction, as well as the corresponding threshold values.
The resulting RT revealed a structured distribution of samples, organized either by stratigraphic section or by sets of nearby sections. Averaged estimated skeletal abundance confirmed the initial observations of higher skeletal content in the most distal sections, with estimated values from 18 to 27%. In contrast, lower skeletal abundance, from 5 to 15%, is proposed for the remaining sections. The geochemical variable that best discriminates this major trend is δ18O, at a threshold value of -0.2‰, interpreted as evidence for separation of water-mass properties across the studied areas. Four other variables were considered relevant by the resulting decision tree (C isotopes, Ca, Sr and Mn), providing new insights for further differentiation between sets of samples.
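To illustrate how a regression tree arrives at a threshold on a predictor such as δ18O, the following is a minimal sketch of a single tree split chosen by minimising the summed squared error of the target on either side. It is not the authors' implementation. The data are synthetic and invented for this sketch, the names `sse`, `best_split`, `d18O`, `d13C` and `skeletal` are hypothetical, and the direction of the relationship between δ18O and skeletal content is an arbitrary choice made only so that the example has something to find.

```python
import random


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(samples, features, target):
    """Return (feature, threshold, total_sse) for the single best split.

    Every midpoint between consecutive observed values of every feature is
    tried, and the split with the lowest summed SSE of the target is kept.
    """
    best = None
    for feature in features:
        levels = sorted({s[feature] for s in samples})
        for low, high in zip(levels, levels[1:]):
            threshold = (low + high) / 2
            left = [s[target] for s in samples if s[feature] <= threshold]
            right = [s[target] for s in samples if s[feature] > threshold]
            total = sse(left) + sse(right)
            if best is None or total < best[2]:
                best = (feature, threshold, total)
    return best


# Synthetic samples (assumption: invented values, not the study's data).
# Skeletal content is made to jump at d18O = -0.2 purely for illustration;
# d13C is generated independently, so it carries no signal here.
rng = random.Random(0)
samples = []
for _ in range(200):
    d18o = rng.uniform(-1.5, 1.0)
    d13c = rng.uniform(1.0, 3.5)
    base = 22.0 if d18o > -0.2 else 10.0
    samples.append({
        "d18O": d18o,
        "d13C": d13c,
        "skeletal": base + rng.gauss(0.0, 1.5),
    })

feature, threshold, total = best_split(samples, ["d18O", "d13C"], "skeletal")
print(f"first split: {feature} at {threshold:.2f}")
```

A full regression tree repeats this search recursively within each resulting subset, which is how further variables and thresholds enter the model; the mean of the target in each final subset serves as the prediction.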
jurassic, carbonate geochemistry, machine learning, regression trees
198-207
Coimbra, R.
5d16b1f6-4560-4348-b8b2-e7c170b54ae5
Rodriguez-Galiano, V.F.
1eb6a1dd-f73d-4e90-a9cf-a51f20712c3c
Oloriz, F.
3914eb4b-0e40-48ef-bc02-59b5d970b7bf
Chica-Olmo, M.
c7291c15-3b53-45d7-942c-06985f77d6f6
Coimbra, R., Rodriguez-Galiano, V.F., Oloriz, F. and Chica-Olmo, M.
(2014)
Regression trees for modelling geochemical data- an application to Late Jurassic carbonates (Ammonitico Rosso).
Computers & Geosciences, 73, 198-207.
(doi:10.1016/j.cageo.2014.09.007).
More information
e-pub ahead of print date: 7 October 2014
Organisations:
Global Env Change & Earth Observation
Identifiers
Local EPrints ID: 370084
URI: http://eprints.soton.ac.uk/id/eprint/370084
ISSN: 0098-3004
PURE UUID: 65609ee2-2472-4a5c-8794-4a33f6c625ab
Catalogue record
Date deposited: 24 Oct 2014 08:11
Last modified: 14 Mar 2024 18:12
Contributors
Author:
R. Coimbra
Author:
V.F. Rodriguez-Galiano
Author:
F. Oloriz
Author:
M. Chica-Olmo