Data from: A Community Resource for Exploring and Utilizing Genetic Diversity in the USDA Pea Single Plant Plus Collection

Metadata Updated: March 30, 2024

Included in this dataset are SNP and fasta data for the Pea Single Plant Plus Collection (PSPPC) and the PSPPC augmented with 25 P. fulvum accessions. These 6 datasets can be roughly divided into two groups. Group 1 consists of three datasets labeled PSPPC which refer to SNP data pertaining to the USDA Pea Single Plant Plus Collection. Group 2 consists of three datasets labeled PSPPC + P. fulvum which refer to SNP data pertaining to the USDA PSPPC with 25 accessions of Pisum fulvum added. SNPs for each of these groups were called independently; therefore SNP names that are shared between the PSPPC and PSPPC + P. fulvum groups should NOT be assumed to refer to the same locus. For analysis, SNP data is available in two widely used formats: hapmap and vcf. These formats can be successfully loaded into TASSEL v. 5.2.25 (http://www.maizegenetics.net/tassel). Explanations of fields (columns) in the VCF files are contained within commented (##) rows at the top of the file. Descriptions of the first 11 columns in the hapmap file are as follows:

rs#- Name of locus (i.e. SNP name).
alleles- Indicates the SNPs for each allele at the locus.
chrom- Irrelevant for these datasets, since markers are unordered.
pos- Irrelevant for these datasets, since markers are unordered.
strand- Irrelevant for these datasets, since markers are unordered.
assembly#- Required field for hapmap format. NA for these datasets.
center- Required field for hapmap format. NA for these datasets.
protLSID- Required field for hapmap format. NA for these datasets.
assayLSID- Required field for hapmap format. NA for these datasets.
panel- Required field for hapmap format. NA for these datasets.
QCcode- Required field for hapmap format. NA for these datasets.
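As an illustration of the column layout described above, the following is a minimal Python sketch that separates the 11 fixed hapmap columns from the per-accession genotype columns. It assumes the file is tab-delimited with a single header row. The function name `read_hapmap`, the sample names, and the example row are hypothetical and are not taken from the dataset.

```python
import io

# The 11 fixed columns that precede the genotype columns in a hapmap file.
HAPMAP_FIXED_COLUMNS = [
    "rs#", "alleles", "chrom", "pos", "strand", "assembly#",
    "center", "protLSID", "assayLSID", "panel", "QCcode",
]


def read_hapmap(handle):
    """Yield (snp_name, alleles, {sample: genotype}) for each data row.

    Assumes a tab-delimited file whose first line is the header.
    """
    header = handle.readline().rstrip("\n").split("\t")
    n_fixed = len(HAPMAP_FIXED_COLUMNS)
    samples = header[n_fixed:]  # everything after the fixed columns
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        snp_name, alleles = fields[0], fields[1]
        genotypes = dict(zip(samples, fields[n_fixed:]))
        yield snp_name, alleles, genotypes


# Made-up two-sample example; real files have one column per accession.
EXAMPLE_HAPMAP = (
    "\t".join(HAPMAP_FIXED_COLUMNS + ["sample_A", "sample_B"]) + "\n"
    + "\t".join(["SNP_1", "A/G"] + ["NA"] * 9 + ["A", "G"]) + "\n"
)

records = list(read_hapmap(io.StringIO(EXAMPLE_HAPMAP)))
for snp_name, alleles, genotypes in records:
    print(snp_name, alleles, genotypes)
```

For a real file, the same function could be called on an open file handle, for example `open("PSPPC.hmp.txt")`. Because the SNPs of the two groups were called independently, records from the PSPPC and PSPPC + P. fulvum files should not be joined on SNP name.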

The fasta sequences containing the SNPs are also available for such downstream applications as development of primers for platform-specific markers. For more information about this dataset, contact Clarice Coyne at Clarice.Coyne@usda.gov or coynec@wsu.edu.

Resources in this dataset:

Resource Title: PSPPC SNPs in hapmap format.
File Name: PSPPC.hmp.txt
Resource Description: 66591 unanchored SNPs for the PSPPC collection in hapmap format.
Resource Software Recommended: TASSEL, url: http://www.maizegenetics.net/tassel

Resource Title: PSPPC SNP FASTA Sequences.
File Name: PSPPC.fa.txt
Resource Description: FASTA sequences for each allele of the PSPPC SNP dataset.

Resource Title: PSPPC + P. fulvum SNPs in hapmap format.
File Name: PSPPC+fulvums.hmp.txt
Resource Description: 67400 SNPs from the PSPPC augmented with 25 P. fulvum accessions in hapmap format. SNP names are independent and unrelated to plain PSPPC SNP files.
Resource Software Recommended: TASSEL, url: http://www.maizegenetics.net/tassel

Resource Title: PSPPC + P. fulvum SNP FASTA Sequences.
File Name: PSPPC+fulvums.fa.txt
Resource Description: FASTA sequences for each allele of the PSPPC + P. fulvum SNP dataset. SNP names are independent and unrelated to plain PSPPC SNP files.

Resource Title: PSPPC + P. fulvum SNPs in vcf format.
File Name: PSPPC+fulvums.vcf.txt
Resource Description: 67400 SNPs from the PSPPC augmented with 25 P. fulvum accessions in vcf format. SNP names are independent and unrelated to plain PSPPC SNP files.
Resource Software Recommended: TASSEL, url: http://www.maizegenetics.net/tassel

Resource Title: PSPPC SNPs in vcf format.
File Name: PSPPC.vcf.txt
Resource Description: 66591 SNPs from the PSPPC in vcf format.
Resource Software Recommended: TASSEL, url: http://www.maizegenetics.net/tassel

Resource Title: README.
File Name: Data Dictionary.docx
Resource Description: These data are for the Pea Single Plant Plus Collection (PSPPC) and the PSPPC augmented with 25 P. fulvum accessions.
The 6 datasets can be divided into two groups. Group 1 consists of 3 datasets labeled “PSPPC” which refer to SNP data pertaining to the USDA Pea Single Plant Plus Collection. Group 2 consists of 3 datasets labeled “PSPPC + P. fulvum” which refer to SNP data pertaining to the PSPPC with 25 accessions of Pisum fulvum added. SNPs for each of these groups were called independently; therefore any SNP name that is shared between the PSPPC and PSPPC + P. fulvum groups should NOT be assumed to refer to the same locus. For analysis, SNP data is available in two widely used formats: hapmap and vcf. These files were successfully loaded into the standalone version of TASSEL v. 5.2.25 (http://www.maizegenetics.net/tassel). Explanations of fields (columns) in the VCF files are contained within commented (##) rows at the top of the file. The first 11 columns required for the hapmap format are as follows:

rs#- Name of locus (i.e. SNP name).
alleles- Indicates the SNPs for each allele at the locus.
chrom- N/A, since markers are unordered.
pos- N/A, since markers are unordered.
strand- N/A, since markers are unordered.
assembly#- N/A
center- N/A
protLSID- N/A
assayLSID- N/A
panel- N/A
QCcode- N/A

The fasta sequences containing the SNPs are also available here for such downstream applications as development of primers for platform-specific markers.
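Since the field explanations for the VCF files live in the commented (##) rows, a short Python sketch like the one below could pull those rows out for reading before analysis. It assumes the usual VCF layout of `##` meta-information rows followed by one `#CHROM` column-header row. The function name `split_vcf_header` and the example text are hypothetical and do not reproduce the actual file contents.

```python
import io


def split_vcf_header(handle):
    """Return (meta_lines, column_names) from the top of a VCF file.

    meta_lines holds the text of each '##' row without the prefix;
    column_names holds the tab-separated fields of the '#CHROM' row.
    """
    meta_lines = []
    column_names = None
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("##"):
            meta_lines.append(line[2:])
        elif line.startswith("#"):
            column_names = line[1:].split("\t")
            break  # data rows start after the column-header row
    return meta_lines, column_names


# Made-up header for illustration only.
EXAMPLE_VCF = (
    "##fileformat=VCFv4.0\n"
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample_A\n"
    "NA\t0\tSNP_1\tA\tG\t.\tPASS\t.\tGT\t0/1\n"
)

meta, columns = split_vcf_header(io.StringIO(EXAMPLE_VCF))
for entry in meta:
    print(entry)
print(columns)
```

The same call on an open handle to `PSPPC.vcf.txt` or `PSPPC+fulvums.vcf.txt` would list the field explanations shipped with each file.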

Access & Use Information

Public: This dataset is intended for public access and use. License: us-pd

Dates

Metadata Created Date March 30, 2024
Metadata Updated Date March 30, 2024

Metadata Source

Harvested from USDA JSON

Additional Metadata

Resource Type Dataset
Publisher Agricultural Research Service
Identifier 10.15482/USDA.ADC/1347137
Data Last Modified 2024-02-08
Public Access Level public
Bureau Code 005:18
Metadata Context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id 77077659-0093-41b7-bfd4-0e6252af8ae6
Harvest Source Id d3fafa34-0cb9-48f1-ab1d-5b5fdc783806
Harvest Source Title USDA JSON
License https://www.usa.gov/publicdomain/label/1.0/
Old Spatial {"type": "Polygon", "coordinates": -166.640625, -59.987997631212, -166.640625, 83.254516804633, 194.765625, 83.254516804633, 194.765625, -59.987997631212, -166.640625, -59.987997631212}
Program Code 005:040
Source Datajson Identifier True
Source Hash 98f42082f1c7cb1bcc54a4e39fa1beeddcac0593db9fe9534e936d6dc0e9a82f
Source Schema Version 1.1
Spatial {"type": "Polygon", "coordinates": -166.640625, -59.987997631212, -166.640625, 83.254516804633, 194.765625, 83.254516804633, 194.765625, -59.987997631212, -166.640625, -59.987997631212}
