Daily and Annual PM2.5, O3, and NO2 Concentrations at ZIP Codes for the Contiguous U.S., 2000-2016, v1.0

Metadata Updated: August 23, 2025

The Daily and Annual PM2.5, O3, and NO2 Concentrations at ZIP Codes for the Contiguous U.S., 2000-2016, v1.0 data set contains daily and annual concentration predictions for fine particulate matter (PM2.5), ozone (O3), and nitrogen dioxide (NO2) at the ZIP Code level for the years 2000 to 2016. An ensemble of three machine-learning models (random forest, gradient boosting, and neural network) was used to estimate daily PM2.5, O3, and NO2 at the centroids of 1 km x 1 km grid cells across the contiguous U.S. for 2000 to 2016. The predictors included air monitoring data, satellite aerosol optical depth, meteorological conditions, chemical transport model simulations, and land-use variables. The ensemble models performed well, with 10-fold cross-validated R-squared values of 0.86 for PM2.5, 0.86 for O3, and 0.79 for NO2. These high-resolution, validated grid predictions make it possible to estimate ZIP Code-level pollution concentrations accurately. For general ZIP Codes with polygon representations, pollution levels were estimated by averaging the predictions of the grid cells whose centroids lie inside the ZIP Code's polygon. Other ZIP Codes, such as Post Offices or large-volume single customers, were treated as single points and assigned the prediction of the nearest grid cell. The polygon shapes and the point latitudes and longitudes for ZIP Codes were obtained from Esri and the U.S. ZIP Code Database and were updated annually. The data include about 31,000 general ZIP Codes with polygon representations and about 10,000 ZIP Codes represented as single points.
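The two aggregation rules described above (averaging grid-cell centroids inside a polygon, and nearest grid cell for point ZIP Codes) can be illustrated with a minimal Python sketch. This is not the data set's actual processing code. The function names, the toy coordinates, and the concentration values are hypothetical, and the sketch assumes planar coordinates, whereas real grid data would call for a projected coordinate system and a spatial index. Returning None when a polygon contains no centroid is also an assumption; the description does not say how that case was handled.

```python
from math import hypot
from statistics import mean

def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def polygon_zip_mean(cells, polygon):
    """Average the predictions of grid cells whose centroids lie in the polygon."""
    values = [v for (cx, cy, v) in cells if point_in_polygon(cx, cy, polygon)]
    return mean(values) if values else None  # assumption: None if no centroid inside

def point_zip_value(cells, px, py):
    """Assign the prediction of the grid cell whose centroid is nearest the point."""
    return min(cells, key=lambda c: hypot(c[0] - px, c[1] - py))[2]

# Hypothetical grid cells: (centroid x, centroid y, predicted concentration).
cells = [(0.5, 0.5, 8.0), (1.5, 0.5, 10.0), (2.5, 0.5, 12.0), (0.5, 1.5, 9.0)]

# Hypothetical polygon ZIP Code covering the two lower-left cells.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
zip_mean = polygon_zip_mean(cells, square)

# Hypothetical point ZIP Code (for example, a Post Office location).
point_value = point_zip_value(cells, 2.4, 0.6)
```

Here the polygon contains the centroids at (0.5, 0.5) and (1.5, 0.5), so its value is the mean of those two predictions, while the point ZIP Code takes the value of the cell centred at (2.5, 0.5).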
The aggregated ZIP Code-level daily predictions can be used in research fields such as environmental epidemiology, environmental justice, health equity, and political science by linking them with ZIP Code-level demographic and medical data sets, including national inpatient care records, medical claims data, census data, the U.S. Census Bureau American Community Survey (ACS), and the Area Deprivation Index (ADI). The data are particularly useful for studies of rural populations, which are under-represented because rural areas lack air monitoring sites. Compared with the 1 km grid data, the ZIP Code-level predictions are much smaller in size and are manageable in personal computing environments. This lowers a key barrier to participation in air pollution research for scientists in other fields. The units are ug/m^3 for PM2.5 and ppb for O3 and NO2.
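As a rough illustration of the linkage described above, the following Python sketch averages daily PM2.5 values per ZIP Code and joins the result to a ZIP Code-level covariate. The record layout, the placeholder ZIP Codes, and all values are hypothetical and do not reflect the data set's actual file format. ZIP Codes are kept as strings so that leading zeros are preserved.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical daily records: (ZIP Code, date, PM2.5 in ug/m^3).
daily_pm25 = [
    ("00001", "2016-01-01", 6.0),
    ("00001", "2016-01-02", 8.0),
    ("00002", "2016-01-01", 4.0),
]

# Hypothetical ZIP Code-level covariate (for example, a population count).
population = {"00001": 1200, "00002": 800, "00003": 50}

# Group the daily values by ZIP Code.
by_zip = defaultdict(list)
for zip_code, _date, value in daily_pm25:
    by_zip[zip_code].append(value)

# Inner join: keep only ZIP Codes present in both sources.
linked = {
    zip_code: {"mean_pm25": mean(values), "population": population[zip_code]}
    for zip_code, values in by_zip.items()
    if zip_code in population
}
```

An inner join is used here for simplicity; a study that needs to track unmatched ZIP Codes would keep them and record which side was missing.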

Access & Use Information

Public: This dataset is intended for public access and use. License: No license information was provided. If this work was prepared by an officer or employee of the United States government as part of that person's official duties, it is considered a U.S. Government Work.

References

https://doi.org/10.1164/rccm.202107-1596OC
https://doi.org/10.1038/s41586-021-04190-y
https://doi.org/10.1016/j.envres.2022.114636

Dates

Metadata Created Date May 30, 2023
Metadata Updated Date August 23, 2025

Metadata Source

Harvested from NASA Data.json

Additional Metadata

Resource Type Dataset
Publisher SEDAC
Identifier C2563727886-SEDAC
Data First Published 2022-12-09
Language en-US
Data Last Modified 2025-07-17
Category AQDH, geospatial
Public Access Level public
Bureau Code 026:00
Metadata Context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id d1acb6cc-f7c0-407f-9e4e-1dee216b3349
Harvest Source Id 58f92550-7a01-4f00-b1b2-8dc953bd598f
Harvest Source Title NASA Data.json
Metadata Type geospatial
Old Spatial -180.0 17.0 -65.0 72.0
Program Code 026:001
Source Datajson Identifier True
Source Hash 06aa50bd7fe7cde08c89e1b87f454b6389c0614d5186d6d9f5611c4e2617d853
Source Schema Version 1.1
Temporal 2000-01-01T00:00:00Z/2016-12-31T00:00:00Z
