Skip to main content
U.S. flag

An official website of the United States government

Official websites use .gov
A .gov website belongs to an official government organization in the United States.

Secure .gov websites use HTTPS
A lock ( ) or https:// means you’ve safely connected to the .gov website. Share sensitive information only on official, secure websites.

Skip to content

In Silico Prediction of Physicochemical Properties of Environmental Chemicals Using Molecular Fingerprints and Machine Learning

Metadata Updated: November 12, 2020

QSAR Model Reporting Formats. Examples of R code: feature selection and regression analysis. Figure S1: Data distribution of logBCF, BP, MP and logVP. Figures S2–S5: Relationship between model complexity and prediction errors as well as the plots of estimated values versus experimental data for logBCF, BP, MP, and logVP, respectively. Figure S6: Plots of leverage versus standardized residuals for logBCF, BP, MP, and logVP models. Table S1: Chemical product classes for training and test sets. Tables S2–S5: Regression statistics for logBCF, BP, MP, and logVP, respectively. Table S6: Applicability domains for logBCF, BP, MP, and logVP. Tables S7–S12: Chemicals with large prediction residuals for the six properties (PDF)

Chemical names, CAS registry number and SMILES as well as experimentally measured and estimated property values of the training and test sets (XLSX).

This dataset is associated with the following publication: Zang, Q., K. Mansouri, A. Williams, R. Judson, D. Allen, W.M. Casey, and N.C. Kleinstreuer. (Journal of Chemical Information and Modeling) In Silico Prediction of Physicochemical Properties of Environmental Chemicals Using Molecular Fingerprints and Machine Learning. Journal of Chemical Information and Modeling. American Chemical Society, Washington, DC, USA, 57(1): 36-49, (2017).

Access & Use Information

Public: This dataset is intended for public access and use. License: See this page for license information.

Downloads & Resources

References

https://doi.org/10.1021/acs.jcim.6b00625

Dates

Metadata Created Date November 12, 2020
Metadata Updated Date November 12, 2020

Metadata Source

Harvested from EPA ScienceHub

Additional Metadata

Resource Type Dataset
Metadata Created Date November 12, 2020
Metadata Updated Date November 12, 2020
Publisher U.S. EPA Office of Research and Development (ORD)
Maintainer
Identifier https://doi.org/10.23719/1504459
Data Last Modified 2017-12-22
Public Access Level public
Bureau Code 020:00
Schema Version https://project-open-data.cio.gov/v1.1/schema
Harvest Object Id 29cef7dd-3133-40ed-a199-1d51573896da
Harvest Source Id 04b59eaf-ae53-4066-93db-80f2ed0df446
Harvest Source Title EPA ScienceHub
License https://pasteur.epa.gov/license/sciencehub-license.html
Program Code 020:095
Publisher Hierarchy U.S. Government > U.S. Environmental Protection Agency > U.S. EPA Office of Research and Development (ORD)
Related Documents https://doi.org/10.1021/acs.jcim.6b00625
Source Datajson Identifier True
Source Hash fc54d26bbbb7e094d22df00fb0cff73b305609e8
Source Schema Version 1.1

Didn't find what you're looking for? Suggest a dataset here.