Training and Test-Related Data for Keyphrase Extraction for Technical Language Processing

Metadata Updated: July 29, 2022

Training and test-related data to accompany "Keyphrase Extraction for Technical Language Processing" by Alden Dima and Aaron Massey (in press). The subdirectories "keyphrase-extraction-jct-train" and "keyphrase-extraction-jct-test" together contain 1153 ThermoML files, each associated with one Journal of Chemical Thermodynamics (JCT) article. Each ThermoML file describes its article in Extensible Markup Language (XML) format, including the title, authors, abstract, digital object identifier (DOI), and keywords. The files also contain thermophysical property data that is unrelated to the keyphrase extraction study. They were obtained from the National Institute of Standards and Technology (NIST) Thermodynamics Research Center (TRC) in Boulder, Colorado (https://trc.nist.gov/). Readers wishing to replicate this work will also need the original JCT articles, which are available from https://www.sciencedirect.com/journal/the-journal-of-chemical-thermodynamics.
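As an illustration of how the bibliographic fields might be read from one of these files, here is a minimal Python sketch using only the standard library. It is not part of the dataset or the authors' method. The element names (sTitle, sAuthor, sAbstract, sDOI, sKeyword) are assumptions about the ThermoML citation block and should be checked against the actual files; the function name extract_citation is hypothetical. Tags are matched by local name so the sketch does not depend on a particular XML namespace.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: these ThermoML citation tag names are not stated on this page;
# verify them against the files before relying on the output.
FIELDS = {
    "title": "sTitle",
    "authors": "sAuthor",
    "abstract": "sAbstract",
    "doi": "sDOI",
    "keywords": "sKeyword",
}


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def extract_citation(xml_text):
    """Collect citation fields from one ThermoML document given as a string.

    Returns a dict mapping each field name to a list of text values,
    in document order. Fields that do not occur map to an empty list.
    """
    root = ET.fromstring(xml_text)
    wanted = {tag: key for key, tag in FIELDS.items()}
    found = {key: [] for key in FIELDS}
    for elem in root.iter():
        key = wanted.get(local_name(elem.tag))
        if key and elem.text and elem.text.strip():
            found[key].append(elem.text.strip())
    return found
```

To process a file on disk, read its contents and pass the text to extract_citation; the "keywords" list would then hold the author-assigned keyphrases for that article, assuming the tag names above match.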

Access & Use Information

Public: This dataset is intended for public access and use. License: See https://www.nist.gov/open/license for license information.

Dates

Metadata Created Date March 11, 2021
Metadata Updated Date July 29, 2022
Data Update Frequency irregular

Metadata Source

Harvested from NIST

Additional Metadata

Resource Type Dataset
Publisher National Institute of Standards and Technology
Maintainer
Identifier ark:/88434/mds2-2161
Data First Published 2019-12-20
Language en
Data Last Modified 2019-12-12 00:00:00
Category Information Technology:Software research, Information Technology:Data and informatics
Public Access Level public
Bureau Code 006:55
Metadata Context https://project-open-data.cio.gov/v1.1/schema/data.json
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id 022de511-a0aa-40c0-89c2-f97c17942a61
Harvest Source Id 74e175d9-66b3-4323-ac98-e2a90eeb93c0
Harvest Source Title NIST
Homepage URL https://data.nist.gov/od/id/mds2-2161
License https://www.nist.gov/open/license
Program Code 006:045
Source Datajson Identifier True
Source Hash 99da5a90fde9f01c6c279f5825ffbd5dab9a312a
Source Schema Version 1.1
