Skip to main content
U.S. flag

An official website of the United States government

Official websites use .gov
A .gov website belongs to an official government organization in the United States.

Secure .gov websites use HTTPS
A lock ( ) or https:// means you’ve safely connected to the .gov website. Share sensitive information only on official, secure websites.

Skip to content

Trojan Detection Software Challenge - nlp-sentiment-classification-mar2021-train

Metadata Updated: September 30, 2023

Round 5 Train DatasetThe data being generated and disseminated is the train data used to construct trojan detection software solutions. This data, generated at NIST, consists of natural language processing (NLP) AIs trained to perform text sentiment classification on English text. A known percentage of these trained AI models have been poisoned with a known trigger which induces incorrect behavior. This data will be used to develop software solutions for detecting which trained AI models have been poisoned via embedded triggers. This dataset consists of 1656 adversarially trained, sentiment classification AI models using a small set of model architectures. The models were trained on text data drawn from movie and product reviews. Half (50%) of the models have been poisoned with an embedded trigger which causes misclassification of the images when the trigger is present. Errata: The following models were contaminated during dataset packaging. This caused nominally clean models to have a trigger. Please avoid using these models. Due to the similarity between the Round5 and Round6 datasets (both contain similarly trained sentiment classification AI models), the dataset authors suggest ignoring the Round5 data and only using the Round6 dataset. Corrupted Models: [id-00000007, id-00000014, id-00000030, id-00000036, id-00000047, id-00000074, id-00000080, id-00000088, id-00000089, id-00000097, id-00000103, id-00000105, id-00000122, id-00000123, id-00000124, id-00000127, id-00000148, id-00000151, id-00000154, id-00000162, id-00000165, id-00000181, id-00000184, id-00000185, id-00000193, id-00000197, id-00000198, id-00000207, id-00000230, id-00000236, id-00000239, id-00000240, id-00000244, id-00000251, id-00000256, id-00000258, id-00000265, id-00000272, id-00000284, id-00000321, id-00000336, id-00000364, id-00000389, id-00000391, id-00000396, id-00000423, id-00000425, id-00000446, id-00000449, id-00000463, id-00000468, id-00000479, id-00000499, id-00000516, id-00000524, id-00000532, id-00000537, id-00000563, id-00000575, id-00000577, id-00000583, id-00000592, id-00000629, id-00000635, id-00000643, id-00000644, id-00000685, id-00000710, id-00000720, id-00000724, id-00000730, id-00000735, id-00000780, id-00000784, id-00000794, id-00000798, id-00000802, id-00000808, id-00000818, id-00000828, id-00000841, id-00000864, id-00000867, id-00000923, id-00000970, id-00000971, id-00000973, id-00000989, id-00000990, id-00000996, id-00001000, id-00001036, id-00001040, id-00001041, id-00001044, id-00001048, id-00001053, id-00001059, id-00001063, id-00001116, id-00001131, id-00001139, id-00001146, id-00001159, id-00001163, id-00001166, id-00001171, id-00001183, id-00001188, id-00001201, id-00001211, id-00001233, id-00001251, id-00001262, id-00001291, id-00001300, id-00001302, id-00001305, id-00001312, id-00001314, id-00001327, id-00001341, id-00001344, id-00001346, id-00001364, id-00001365, id-00001373, id-00001389, id-00001390, id-00001391, id-00001392, id-00001399, id-00001414, id-00001418, id-00001425, id-00001449, id-00001470, id-00001486, id-00001516, id-00001517, id-00001518, id-00001532, id-00001533, id-00001537, id-00001542, id-00001549, id-00001579, id-00001580, id-00001581, id-00001586, id-00001591, id-00001599, id-00001600, id-00001604, id-00001610, id-00001618, id-00001643, id-00001650]

Access & Use Information

Public: This dataset is intended for public access and use. License: See this page for license information.

Downloads & Resources

Dates

Metadata Created Date March 11, 2021
Metadata Updated Date September 30, 2023

Metadata Source

Harvested from NIST

Additional Metadata

Resource Type Dataset
Metadata Created Date March 11, 2021
Metadata Updated Date September 30, 2023
Publisher National Institute of Standards and Technology
Maintainer
Identifier ark:/88434/mds2-2373
Data First Published 2021-03-08
Language en
Data Last Modified 2021-02-24 00:00:00
Category Information Technology:Software research, Information Technology:Cybersecurity, Information Technology:Computational science
Public Access Level public
Bureau Code 006:55
Metadata Context https://project-open-data.cio.gov/v1.1/schema/data.json
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id eb828557-f771-4102-a742-e652fe6eeef5
Harvest Source Id 74e175d9-66b3-4323-ac98-e2a90eeb93c0
Harvest Source Title NIST
Homepage URL https://data.nist.gov/od/id/mds2-2373
License https://www.nist.gov/open/license
Program Code 006:045
Source Datajson Identifier True
Source Hash 30a4249208ba50158009b2f6dde3f404b61d3b4bb0e64579e0027ea769228584
Source Schema Version 1.1

Didn't find what you're looking for? Suggest a dataset here.