QSARs for Plasma Protein Binding: Source Data and Predictions

Metadata Updated: April 27, 2019

The dataset contains all of the information used to create and evaluate three independent QSAR models for the fraction of a chemical not bound to plasma protein (Fub) for environmentally relevant chemicals. In vitro plasma protein binding values for 1245 pharmaceuticals and 406 ToxCast chemicals were collected from the literature (Obach 2008, Zhu 2013, Wetmore 2012, Wetmore 2015). The 21 descriptors calculated by MOE that were used in the models are included, as is an acid/base/neutral/zwitterion classification based on ionization percentages calculated in ADMET Predictor. Finally, the dataset includes the in silico Fub predictions for each chemical from the constructed k-nearest neighbor, support vector machine, and random forest QSAR models, as well as a consensus (average) prediction.
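The consensus value described above is an average of the three model predictions. The following is a minimal sketch of that arithmetic, not the authors' code. The function name `consensus_fub`, the dictionary keys `knn`, `svm`, and `rf`, and the example values are hypothetical and do not reflect the actual column names or contents of the dataset; consult the data dictionary for those.

```python
from statistics import mean


def consensus_fub(knn: float, svm: float, rf: float) -> float:
    """Return the unweighted mean of three Fub predictions.

    Assumption: all three inputs are on the same scale, so a
    simple average is meaningful.
    """
    return mean((knn, svm, rf))


# Hypothetical record for one chemical; keys and values are illustrative only.
example = {"knn": 0.10, "svm": 0.20, "rf": 0.30}
consensus = consensus_fub(example["knn"], example["svm"], example["rf"])
print(f"consensus Fub: {consensus:.3f}")
```

An unweighted mean treats the three models as equally reliable. Whether the published consensus applies any weighting is not stated in this description, so the sketch assumes none.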

This dataset is associated with the following publication: Ingle, B., R. Tornero-Velez, J. Nichols, and B. Veber. Informing the Human Plasma Protein Binding of Environmental Chemicals by Machine Learning in the Pharmaceutical Space: Applicability Domain and Limits of Predictability. Journal of Chemical Information and Modeling. American Chemical Society, Washington, DC, USA, 56(11): 2243-2252, (2016).

Access & Use Information

Public: This dataset is intended for public access and use. License: https://pasteur.epa.gov/license/sciencehub-license.html

Metadata Created Date May 5, 2017
Metadata Updated Date April 27, 2019

Metadata Source

Harvested from EPA ScienceHub

Additional Metadata

Resource Type Dataset
Publisher U.S. EPA Office of Research and Development (ORD)
Unique Identifier A-rbpk-569
Maintainer Rogelio Tornero-Velez
Maintainer Email
Public Access Level public
Bureau Code 020:00
Schema Version https://project-open-data.cio.gov/v1.1/schema
Data Dictionary https://pasteur.epa.gov/uploads/569/documents/DataDictionary_PPB_JCIM.docx
Data Dictionary Type application/vnd.openxmlformats-officedocument.wordprocessingml.document
Harvest Object Id 288dfbc8-22e7-4ec3-9613-66b35d7836e1
Harvest Source Id cf9b0004-f9fd-420e-bade-a86839e82acf
Harvest Source Title EPA ScienceHub
License https://pasteur.epa.gov/license/sciencehub-license.html
Data Last Modified 2016-08-26
Program Code 020:000
Publisher Hierarchy U.S. Government > U.S. Environmental Protection Agency > U.S. EPA Office of Research and Development (ORD)
Related Documents https://dx.doi.org/10.1021/acs.jcim.6b00291
Source Datajson Identifier True
Source Hash 7efa7c0bd5c54e3517b0da3885a6faac3bc5f9d9
Source Schema Version 1.1
