Metadata Updated: May 2, 2021

Read-across is an important data gap-filling technique used within category and analog approaches for regulatory hazard identification and risk assessment. Although much technical guidance describes how to develop category/analog approaches, practical principles for evaluating and substantiating analog validity (suitability) are still lacking. This case study uses hindered phenols as an example chemical class to determine (1) the capability of three structure fingerprint/descriptor methods (PubChem, ToxPrints and MoSS MCSS) to identify analogs for read-across prediction of estrogen receptor (ER) binding activity, and (2) the utility of data confidence measures, physicochemical properties, and chemical R-group properties as filters to improve ER binding predictions. The training dataset comprised 462 hindered phenols and 257 non-hindered phenols. For each chemical of interest (target), source analogs were identified from two datasets (hindered and non-hindered phenols) that had been characterized by a fingerprint/descriptor method, using two cut-offs: (1) minimum similarity distance (range: 0.1 - 0.9) and (2) N closest analogs (range: 1 - 10). Analogs were then filtered using (1) physicochemical properties of the phenol (termed global filtering) and (2) physicochemical properties of the R-groups neighboring the active hydroxyl group (termed local filtering). A read-across prediction was made for each target chemical on the basis of a majority vote of the N closest analogs. The results demonstrate that (1) concordance in ER activity increases with structural similarity, regardless of the structure fingerprint/descriptor method, (2) increased data confidence significantly improves read-across predictions, and (3) filtering analogs using global and local properties can help identify more suitable analogs. This case study illustrates that the quality of the underlying experimental data and the use of endpoint-relevant chemical descriptors to evaluate source analogs are critical to achieving robust read-across predictions.
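
The analog selection and majority-vote step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes fingerprints are represented as sets of "on" bit indices, uses Tanimoto similarity as a stand-in for whichever similarity measure the study applied, and treats the minimum similarity cut-off as a lower bound on that similarity. The function names, parameters, and toy data are all hypothetical, and the global and local property filters are omitted.

```python
from collections import Counter
from typing import Dict, Optional, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def read_across(
    target_bits: Set[int],
    sources: Dict[str, Tuple[Set[int], int]],
    min_similarity: float = 0.5,
    n_closest: int = 3,
) -> Optional[int]:
    """Predict binary activity (1 = active, 0 = inactive) by majority vote.

    Returns None when no source analog passes the similarity cut-off
    or when the vote is tied.
    """
    # Score every source chemical against the target.
    scored = [
        (tanimoto(target_bits, bits), label)
        for bits, label in sources.values()
    ]
    # Keep analogs at or above the cut-off, most similar first,
    # then truncate to the N closest.
    analogs = sorted(
        (s for s in scored if s[0] >= min_similarity),
        key=lambda s: s[0],
        reverse=True,
    )[:n_closest]
    if not analogs:
        return None
    votes = Counter(label for _, label in analogs)
    if votes[1] == votes[0]:
        return None
    return 1 if votes[1] > votes[0] else 0


# Hypothetical toy data: name -> (fingerprint bits, ER binding label).
toy_sources = {
    "A": ({1, 2, 3, 4}, 1),
    "B": ({1, 2, 3, 5}, 1),
    "C": ({1, 2, 6, 7}, 0),
    "D": ({8, 9}, 0),
}
toy_target = {1, 2, 3}

prediction = read_across(toy_target, toy_sources, min_similarity=0.3, n_closest=3)
```

With these toy inputs, analogs A, B and C pass the 0.3 cut-off and the vote is two active to one inactive, so `prediction` is 1. Raising `min_similarity` to 0.9 leaves no analogs and the function returns `None`, which mirrors the trade-off between coverage and similarity that the two cut-offs control.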

This dataset is associated with the following publication: Pradeep, P., K. Mansouri, G. Patlewicz, and R. Judson. A systematic evaluation of analogs and automated read-across prediction of estrogenicity: A case study using hindered phenols. Computational Toxicology. Elsevier B.V., Amsterdam, NETHERLANDS, 4: 22-30, (2017).

Access & Use Information

Public: This dataset is intended for public access and use. License: See this page for license information.

Metadata Created Date November 12, 2020
Metadata Updated Date May 2, 2021

Metadata Source

Harvested from EPA ScienceHub

Additional Metadata

Resource Type Dataset
Publisher U.S. EPA Office of Research and Development (ORD)
Data Last Modified 2017-09-18
Public Access Level public
Bureau Code 020:00
Schema Version
Harvest Object Id 5b1f6ac6-e47c-4e05-a64f-8ed67ee2b53a
Harvest Source Id 04b59eaf-ae53-4066-93db-80f2ed0df446
Harvest Source Title EPA ScienceHub
Program Code 020:095
Publisher Hierarchy U.S. Government > U.S. Environmental Protection Agency > U.S. EPA Office of Research and Development (ORD)
Related Documents
Source Datajson Identifier True
Source Hash 2669dda47031adca0bad106806c80acb9f1b8bd4
Source Schema Version 1.1
