Judson_RefChem_InVitro_Assays

Metadata Updated: May 2, 2021

This supplemental material describes two sets of methods: first, the process used to create the EPA’s LitDB database, and second, how a subset of records was extracted from LitDB for inclusion in the reference chemical database RefChemDB. LitDB is a database of data elements extracted from the XML download of all MEDLINE/PubMed records. Perl scripts extract the identifying information from each citation record, such as title, abstract, authors, and PubMed ID. The MeSH (Medical Subject Headings) terms are also extracted, along with their subheadings (also known as qualifiers).
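As an illustration of the extraction step, here is a minimal sketch in Python. It is not the authors' Perl code. It assumes the standard PubMed XML element names (MedlineCitation, PMID, ArticleTitle, AbstractText, Author, MeshHeading, DescriptorName, QualifierName); the sample record, the function names, and the tab-delimited layout are hypothetical and chosen only for the example.

```python
import xml.etree.ElementTree as ET

# Made-up sample record that follows the PubMed XML element layout.
# All values are placeholders, not real MEDLINE data.
SAMPLE_XML = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>12345</PMID>
      <Article>
        <ArticleTitle>Example title</ArticleTitle>
        <Abstract><AbstractText>Example abstract.</AbstractText></Abstract>
        <AuthorList>
          <Author><LastName>Doe</LastName><Initials>J</Initials></Author>
        </AuthorList>
      </Article>
      <MeshHeadingList>
        <MeshHeading>
          <DescriptorName>Example Descriptor</DescriptorName>
          <QualifierName>example qualifier</QualifierName>
        </MeshHeading>
      </MeshHeadingList>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def text_of(parent, path):
    """Return the full text under parent/path, or '' if the element is absent."""
    node = parent.find(path)
    if node is None:
        return ""
    return "".join(node.itertext()).strip()


def extract_records(xml_text):
    """Return (citations, mesh_rows) as lists of string tuples."""
    root = ET.fromstring(xml_text)
    citations, mesh_rows = [], []
    for cit in root.iter("MedlineCitation"):
        pmid = text_of(cit, "PMID")
        title = text_of(cit, "Article/ArticleTitle")
        # An abstract may be split into several AbstractText sections.
        abstract = " ".join(
            "".join(node.itertext()).strip()
            for node in cit.findall("Article/Abstract/AbstractText")
        )
        authors = "; ".join(
            f"{text_of(a, 'LastName')} {text_of(a, 'Initials')}".strip()
            for a in cit.findall("Article/AuthorList/Author")
        )
        citations.append((pmid, title, abstract, authors))
        # One output row per (descriptor, qualifier) pair; a heading
        # without qualifiers yields a single row with an empty qualifier.
        for heading in cit.findall("MeshHeadingList/MeshHeading"):
            descriptor = text_of(heading, "DescriptorName")
            qualifiers = heading.findall("QualifierName")
            if not qualifiers:
                mesh_rows.append((pmid, descriptor, ""))
            for q in qualifiers:
                mesh_rows.append((pmid, descriptor, "".join(q.itertext()).strip()))
    return citations, mesh_rows


def to_tsv(rows):
    """Serialize rows as tab-delimited text, flattening embedded tabs and newlines."""
    return "\n".join(
        "\t".join(field.replace("\t", " ").replace("\n", " ") for field in row)
        for row in rows
    )


citations, mesh_rows = extract_records(SAMPLE_XML)
print(to_tsv(citations))
print(to_tsv(mesh_rows))
```

Keeping citations and MeSH terms in separate row sets reflects the one-to-many relationship between a citation and its headings; whether LitDB uses the same layout is not stated in the source.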
The Perl scripts write their output to text files, which are then loaded into a MySQL database. The MeSH heading and descriptor tree files, available at https://www.nlm.nih.gov/mesh/filelist.html, are also downloaded and loaded into MySQL tables.
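The loading step can be sketched in the same hedged way. The example below uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a database server. The table name citation_mesh, its columns, and the sample rows are hypothetical and are not taken from LitDB.

```python
import csv
import io
import sqlite3

# Hypothetical tab-delimited extract: pmid, descriptor, qualifier.
MESH_TSV = (
    "12345\tExample Descriptor\texample qualifier\n"
    "12345\tAnother Descriptor\t\n"
)


def load_mesh_rows(conn, tsv_text):
    """Create the (assumed) table if needed and bulk-insert the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS citation_mesh ("
        "pmid INTEGER NOT NULL, descriptor TEXT NOT NULL, qualifier TEXT)"
    )
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    # Store an empty qualifier as NULL so unqualified headings are easy to query.
    rows = [
        (int(pmid), descriptor, qualifier or None)
        for pmid, descriptor, qualifier in reader
    ]
    conn.executemany("INSERT INTO citation_mesh VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_mesh_rows(conn, MESH_TSV)
unqualified = conn.execute(
    "SELECT COUNT(*) FROM citation_mesh WHERE qualifier IS NULL"
).fetchone()[0]
print(loaded, unqualified)
```

With a MySQL server, the same tab-delimited file could be bulk-loaded through a MySQL client instead; the parameterized inserts here only keep the sketch self-contained.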
To make the data more useful for research on chemicals, the data are passed stepwise through a series of algorithms.

This dataset is associated with the following publication: Judson, R., R. Thomas, N. Baker, A. Simha, X.M. Howey, C. Marable, N. Kleinstreuer, and K. Houck. Workflow for Defining Reference Chemicals for Assessing Performance of In Vitro Assays (Altex). ALTEX. Society ALTEX Edition, Kuesnacht, SWITZERLAND, 36(2): 261-276, (2019).

Access & Use Information

Public: This dataset is intended for public access and use. License: https://pasteur.epa.gov/license/sciencehub-license.html

Downloads & Resources

https://gaftp.epa.gov/COMPTOX/NCCT_Publication_...

References

https://doi.org/10.14573/altex.1809281

Dates

Metadata Created Date	November 12, 2020
Metadata Updated Date	May 2, 2021

Metadata Source

Harvested from EPA ScienceHub

Additional Metadata

Resource Type	Dataset
Publisher	U.S. EPA Office of Research and Development (ORD)
Maintainer	Keith Houck
Identifier	https://doi.org/10.23719/1503083
Data Last Modified	2018-12-12
Public Access Level	public
Bureau Code	020:00
Schema Version	https://project-open-data.cio.gov/v1.1/schema
Harvest Object Id	07744aaf-d8d4-4b16-96a2-3040263d3ca4
Harvest Source Id	04b59eaf-ae53-4066-93db-80f2ed0df446
Harvest Source Title	EPA ScienceHub
License	https://pasteur.epa.gov/license/sciencehub-license.html
Program Code	020:095
Publisher Hierarchy	U.S. Government > U.S. Environmental Protection Agency > U.S. EPA Office of Research and Development (ORD)
Related Documents	https://doi.org/10.14573/altex.1809281
Source Datajson Identifier	True
Source Hash	98d32622b976aab24960a7ad04a2911fcb7ab34d
Source Schema Version	1.1
