Observed, predicted, and misclassification error data for observations in the training dataset for nitrate and arsenic concentrations in basin-fill aquifers in the Southwest Principal Aquifers study.

Metadata Updated: December 11, 2025

This product "Observed, predicted, and misclassification error data for observations in the training dataset for nitrate and arsenic concentrations in basin-fill aquifers in the Southwest Principal Aquifers study" is a 1:250,000-scale point dataset and was developed as part of a regional Southwest Principal Aquifers (SWPA) study. The study examined the vulnerability of basin-fill aquifers in the southwestern United States to nitrate contamination and arsenic enrichment. Statistical models were developed by using the random forest classifier algorithm to predict concentrations of nitrate and arsenic across a model grid that represents local- and basin-scale measures of source, aquifer susceptibility, and geochemical conditions.
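As a rough illustration of how observed and predicted classes relate to a misclassification error, the sketch below tabulates an error rate from paired labels. The class labels ("low", "high"), the sample records, and the function name are assumptions made for illustration; the record itself does not state how the study defined its concentration classes or computed its error values.

```python
from collections import Counter

# Hypothetical (observed_class, predicted_class) pairs, one per training
# observation. These labels and values are illustrative, not study data.
records = [
    ("low", "low"),
    ("low", "high"),
    ("high", "high"),
    ("high", "high"),
    ("low", "low"),
]


def misclassification_summary(pairs):
    """Return (error_rate, confusion_counts) for observed/predicted pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one observation is required")
    # Count how often each (observed, predicted) combination occurs.
    confusion = Counter(pairs)
    # An observation is misclassified when the two labels differ.
    errors = sum(n for (obs, pred), n in confusion.items() if obs != pred)
    return errors / len(pairs), confusion


rate, confusion = misclassification_summary(records)
print(rate)
```

With the five hypothetical records above, one pair disagrees, so the printed rate is 0.2.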

Separate classifiers were developed for nitrate and arsenic because each constituent was expected to be affected by a different set of factors, and each factor could have a different magnitude or directional influence (increase/decrease) on concentration. For each constituent, two different classifiers were developed: a prediction classifier and a confirmatory classifier. The prediction classifiers were developed specifically to predict nitrate and arsenic concentrations in basin-fill aquifers across the SWPA study area and were based on explanatory variables representing source and susceptibility conditions. These explanatory variables were available throughout the entire SWPA study area and, therefore, did not pose a limitation for using the classifiers to predict concentrations.

The confirmatory classifiers were developed to supplement the prediction classifiers in the evaluation of the conceptual model. The name "confirmatory" reflects the classifier's purpose of evaluating a priori hypotheses and contrasts with other general types of statistical models, such as those used for prediction or exploratory purposes. The confirmatory classifiers included the explanatory variables used in the prediction classifiers, as well as additional variables representing geochemical conditions and basin groundwater budget components. The inclusion of the geochemical and basin groundwater budget variables in the confirmatory classifiers allowed for further evaluation of the conceptual models, which was not possible with the prediction classifiers alone. The geochemical data, however, were only available at specific well locations, and consistent water-budget data were not available for every basin in the study area. The limited availability of the data for these variables constrained the confirmatory classifiers to observations from 16 case-study basins and precluded use of the confirmatory classifiers for predicting concentrations across the SWPA study area. In short, the two classifiers differ in scope: the confirmatory classifiers were developed by using all available explanatory variables but with observations restricted to the 16 case-study basins, whereas the prediction classifiers were unrestricted with respect to spatial extent because they were developed by using the subset of explanatory variables that was available throughout the study area.
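The difference in scope described above can be sketched as two ways of slicing one table of observations: the prediction set keeps every observation but only the widely available variables, while the confirmatory set keeps every variable but only observations from case-study basins. All field names, basin names, and values below are hypothetical placeholders, not names from the dataset.

```python
# Hypothetical variable groups; the real variable names are not given here.
SOURCE_SUSCEPTIBILITY = ["source_var", "susceptibility_var"]
GEOCHEM_BUDGET = ["geochem_var", "budget_var"]


def build_training_sets(observations, case_study_basins):
    """Split observations into prediction and confirmatory training sets."""
    # Prediction set: all observations, restricted to variables that are
    # available everywhere in the study area.
    prediction = [
        {name: row[name] for name in SOURCE_SUSCEPTIBILITY}
        for row in observations
    ]
    # Confirmatory set: all variables, restricted to observations that
    # fall inside a case-study basin.
    confirmatory = [
        {name: row[name] for name in SOURCE_SUSCEPTIBILITY + GEOCHEM_BUDGET}
        for row in observations
        if row["basin"] in case_study_basins
    ]
    return prediction, confirmatory


# Illustrative rows only; the geochemical and budget fields are absent
# for the observation outside the case-study basins.
observations = [
    {"basin": "A", "source_var": 1.0, "susceptibility_var": 0.3,
     "geochem_var": 7.1, "budget_var": 12.0},
    {"basin": "B", "source_var": 0.4, "susceptibility_var": 0.9,
     "geochem_var": 6.8, "budget_var": 5.5},
    {"basin": "C", "source_var": 2.2, "susceptibility_var": 0.1},
]

prediction, confirmatory = build_training_sets(observations, {"A", "B"})
```

In this sketch the observation from basin "C" contributes to the prediction set only, which mirrors why the confirmatory classifiers could not be applied across the whole study area.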

Access & Use Information

Public: This dataset is intended for public access and use. License: No license information was provided. If this work was prepared by an officer or employee of the United States government as part of that person's official duties, it is considered a U.S. Government Work.

Dates

Metadata Created Date September 13, 2025
Metadata Updated Date December 11, 2025

Metadata Source

Harvested from DOI USGS DCAT-US

Additional Metadata

Resource Type Dataset
Publisher U.S. Geological Survey
Identifier http://datainventory.doi.gov/id/dataset/usgs-1d589b73-af80-4229-bd58-c62dd4192bc4
Data Last Modified 2020-11-17T00:00:00Z
Category geospatial
Public Access Level public
Bureau Code 010:12
Metadata Context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
Metadata Catalog ID https://ddi.doi.gov/usgs-data.json
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id ecd483bf-231d-45f4-a214-5757fc5b7554
Harvest Source Id 2b80d118-ab3a-48ba-bd93-996bbacefac2
Harvest Source Title DOI USGS DCAT-US
Metadata Type geospatial
Source Datajson Identifier True
Source Hash 06160c9ba6deb12689f82e4e2d8c855ac9f158a48280a1211b0f2374f6c04fc8
Source Schema Version 1.1