Data for machine learning predictions of pH in the glacial aquifer system, northern USA

Metadata Updated: July 6, 2024

A boosted regression tree (BRT) model was developed to predict pH conditions in three dimensions throughout the glacial aquifer system (GLAC) of the contiguous United States using pH measurements in samples from 18,258 wells and predictor variables that represent aspects of the hydrogeologic setting. Model results indicate that the carbonate content of soils and aquifer materials strongly controls pH and, when coupled with long flow paths, results in the most alkaline conditions. Conversely, in areas where glacial sediments are thin and carbonate-poor, pH conditions remain acidic. At depths typical of drinking-water supplies, predicted pH > 7.5, which is associated with arsenic mobilization, occurs more frequently than predicted pH < 6, which is associated with water corrosivity and the mobilization of other trace elements. A novel aspect of this model was the inclusion of numerically based estimates of groundwater flow characteristics (age and flow path length) as predictor variables. The sensitivity of pH predictions to these variables was consistent with hydrologic understanding of groundwater flow systems and the geochemical evolution of groundwater quality. The model was not developed to provide precise estimates of pH at any given location. Rather, it can be used to identify, more generally, areas where contaminants may be mobilized into groundwater and where corrosivity may be of concern, in order to set priorities among areas for future groundwater monitoring. Data are provided in 2 tables and 3 compressed files that contain various files associated with the BRT model.
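The two pH thresholds named above can be applied to any list of predicted values as a simple screening step. The following is a minimal sketch, not part of the published model archive: the function name, the constant names, and the sample values are hypothetical, and only the thresholds (pH > 7.5 and pH < 6) come from the description.

```python
# Thresholds taken from the dataset description; the names are hypothetical.
ARSENIC_PH = 7.5    # predicted pH above this is associated with arsenic mobilization
CORROSIVE_PH = 6.0  # predicted pH below this is associated with water corrosivity


def classify_ph(ph: float) -> str:
    """Label a predicted pH with the concern named in the dataset description."""
    if ph > ARSENIC_PH:
        return "alkaline: associated with arsenic mobilization"
    if ph < CORROSIVE_PH:
        return "acidic: associated with corrosivity and trace-element mobilization"
    return "intermediate"


# Illustrative values only; these are NOT model output.
illustrative_values = [5.4, 6.8, 7.9]
for value in illustrative_values:
    print(value, classify_ph(value))
```

Because the model was not developed to give precise pH estimates at a given location, labels like these are best read as a way to rank areas for monitoring rather than as site-level findings.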
The 2 tables include:
1) pH_Predictions_GLAC_GeochMod_Dataset.csv (GM dataset): This table is generally a subset of the pH dataset that was used to model pH conditions (the measured pH data for well sites that were separated into the training and testing dataset files “trnData.txt” and “testData.txt” included in model_archive.7z), but it includes more complete geochemical data and some additional wells from Wilson and others (2019). The table includes pH, general chemical characteristics, concentrations of major and trace elements, calculated parameters, and mineral saturation indices (SI) computed with PHREEQC (Parkhurst and Appelo, 2013) for 9,655 groundwater samples from wells in the GLAC.
2) pH_Predictions_GLAC_Variable_Descriptions.txt: A table listing all variables (short abbreviation and long description) used in the BRT model, including the importance rank of the variable, units, and reference.
The 3 compressed files include:
1) model_archive.7z: contains 15 files associated with the BRT model
2) rstack_dom.7z: rstack_dom.txt
3) rstack_pub.7z: rstack_pub.txt
Refer to the README.txt file in model_archive.7z for information about the files in the archive and how to use them to run the BRT model. The "rstack" files represent raster stacks, which are collections of raster layer objects that have the same spatial extent and resolution and are vertically aligned. Rstack.dom consists of raster layer objects at the depth typically used for domestic supplies, and rstack.pub consists of those at the depth typically used for public supplies.
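The raster-stack idea described above (layers that share one extent and one resolution, so that each cell lines up across layers) can be illustrated with a small data structure. This is a minimal sketch in Python under stated assumptions: the class names, the layer names, and the cell values are hypothetical, and it does not reproduce the format of rstack_dom.txt or rstack_pub.txt, which is documented in the archive's README.txt.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class RasterLayer:
    """One hypothetical predictor layer on a regular grid."""
    name: str
    extent: Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
    resolution: float                          # cell size in map units
    values: List[List[float]]                  # rows of cell values


class RasterStack:
    """Collection of layers that must share extent and resolution."""

    def __init__(self) -> None:
        self.layers: List[RasterLayer] = []

    def add(self, layer: RasterLayer) -> None:
        # Reject any layer that is not aligned with the first one.
        if self.layers:
            first = self.layers[0]
            if (layer.extent, layer.resolution) != (first.extent, first.resolution):
                raise ValueError(f"layer {layer.name!r} is not aligned with the stack")
        self.layers.append(layer)

    def cell(self, row: int, col: int) -> Dict[str, float]:
        """Return every layer's value at one grid cell (the 'vertical' slice)."""
        return {layer.name: layer.values[row][col] for layer in self.layers}


# Illustrative layers and values only; not taken from the dataset.
extent = (0.0, 0.0, 2.0, 1.0)
stack = RasterStack()
stack.add(RasterLayer("carbonate", extent, 1.0, [[0.1, 0.4]]))
stack.add(RasterLayer("flow_path_length", extent, 1.0, [[250.0, 900.0]]))
print(stack.cell(0, 1))
```

Checking alignment when a layer is added means that a per-cell lookup can hand one consistent set of predictor values to a model, which is the property the description attributes to the rstack files.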

Access & Use Information

Public: This dataset is intended for public access and use. License: No license information was provided. If this work was prepared by an officer or employee of the United States government as part of that person's official duties it is considered a U.S. Government Work.

Downloads & Resources

Dates

Metadata Created Date June 1, 2023
Metadata Updated Date July 6, 2024

Metadata Source

Harvested from DOI EDI

Additional Metadata

Resource Type Dataset
Publisher U.S. Geological Survey
Maintainer
@Id http://datainventory.doi.gov/id/dataset/7b0a3c85a94396bcfa583ada195fc630
Identifier USGS:5efe239782ce3fd7e8a82651
Data Last Modified 20201216
Category geospatial
Public Access Level public
Bureau Code 010:12
Metadata Context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
Metadata Catalog ID https://datainventory.doi.gov/data.json
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id 14506bd2-7423-4295-92e5-3fd5deb5235e
Harvest Source Id 52bfcc16-6e15-478f-809a-b1bc76f1aeda
Harvest Source Title DOI EDI
Metadata Type geospatial
Old Spatial -120.0,35.0,-65.0,50.0
Publisher Hierarchy White House > U.S. Department of the Interior > U.S. Geological Survey
Source Datajson Identifier True
Source Hash 8067959e2dd77202eb72484b241af383b7a2772d64f3330ff56da9e975328548
Source Schema Version 1.1
Spatial {"type": "Polygon", "coordinates": [[[-120.0, 35.0], [-120.0, 50.0], [-65.0, 50.0], [-65.0, 35.0], [-120.0, 35.0]]]}
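The "Old Spatial" bounding box and the "Spatial" polygon describe the same rectangle. The sketch below shows one way to derive a closed GeoJSON-style ring from the bounding-box string and to test whether a point falls inside it. It assumes the box is ordered west, south, east, north; the function names and the sample point are hypothetical.

```python
import json
from typing import List


def bbox_to_ring(bbox_text: str) -> List[List[float]]:
    """Turn 'west,south,east,north' into a closed ring of [lon, lat] pairs."""
    west, south, east, north = (float(part) for part in bbox_text.split(","))
    # The first and last positions are identical so that the ring is closed.
    return [
        [west, south],
        [west, north],
        [east, north],
        [east, south],
        [west, south],
    ]


def bbox_contains(bbox_text: str, lon: float, lat: float) -> bool:
    """Return True if the point lies inside or on the edge of the box."""
    west, south, east, north = (float(part) for part in bbox_text.split(","))
    return west <= lon <= east and south <= lat <= north


old_spatial = "-120.0,35.0,-65.0,50.0"  # value of the "Old Spatial" field
ring = bbox_to_ring(old_spatial)
polygon = {"type": "Polygon", "coordinates": [ring]}
print(json.dumps(polygon))

# Hypothetical point, used only to show the check.
print(bbox_contains(old_spatial, -93.0, 45.0))
```

The box is a coarse catalog footprint, so a point inside it is not necessarily inside the glacial aquifer system itself.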
