NIST Excerpts Benchmark Data
The NIST Excerpts Benchmark Data are a set of target data for deidentification algorithms. The data are configured to work with "SDNist: Synthetic Data Report Tool", a package for evaluating synthetic data generators: https://github.com/usnistgov/SDNist. An installation of SDNist will download the data resources automatically.

Benchmark Excerpts (Jan 2025):
- NIST American Community Survey (ACS) Data Excerpts, 24 demographic features over 40k records
- NIST Survey of Business Owners (SBO) Data Excerpts, 130 demographic and financial features over 161k records

The data are curated subsets of U.S. Census Bureau products.
Complete Metadata
| @type | dcat:Dataset |
|---|---|
| accessLevel | public |
| bureauCode | ["006:55"] |
| contactPoint | {"fn": "Gary Howarth II", "hasEmail": "mailto:gary.howarth@nist.gov"} |
| describedBy | https://github.com/usnistgov/SDNist/tree/main/BenchmarkData |
| description | The NIST Excerpts Benchmark Data are a set of target data for deidentification algorithms. The data are configured to work with "SDNist: Synthetic Data Report Tool", a package for evaluating synthetic data generators: https://github.com/usnistgov/SDNist. An installation of SDNist will download the data resources automatically. Benchmark Excerpts (Jan 2025): NIST American Community Survey (ACS) Data Excerpts, 24 demographic features over 40k records; NIST Survey of Business Owners (SBO) Data Excerpts, 130 demographic and financial features over 161k records. The data are curated subsets of U.S. Census Bureau products. |
| distribution | [{"title": "NIST Excerpt Benchmark Data", "format": "A data repository", "accessURL": "https://github.com/usnistgov/SDNist/tree/main/BenchmarkData", "description": "The NIST Data Excerpts are curated subsets of publicly released tabular data sets, drawn from real households and businesses in the U.S. The Excerpts serve as benchmark data for the [SDNist v2: Deidentified Data Report Tool](https://github.com/usnistgov/SDNist/)."}] |
| identifier | ark:/88434/mds2-2895 |
| issued | 2023-06-02 |
| keyword | ["American Community Survey", "SDNist", "demographic data", "privacy", "synthetic data"] |
| landingPage | https://data.nist.gov/od/id/mds2-2895 |
| language | ["en"] |
| license | https://www.nist.gov/open/license |
| modified | 2025-01-31 00:00:00 |
| programCode | ["006:045"] |
| publisher | {"name": "National Institute of Standards and Technology", "@type": "org:Organization"} |
| theme | ["Information Technology:Data and informatics", "Information Technology:Privacy", "Information Technology:Software research"] |
| title | NIST Excerpts Benchmark Data |