Included in this dataset are SNP and fasta data for the Pea Single Plant Plus Collection (PSPPC) and the PSPPC augmented with 25 P. fulvum accessions.
These 6 datasets can be roughly divided into two groups. Group 1 consists of three datasets labeled PSPPC which refer to SNP data pertaining to the USDA Pea Single Plant Plus Collection. Group 2 consists of three datasets labeled PSPPC + P. fulvum which refer to SNP data pertaining to the USDA PSPPC with 25 accessions of Pisum fulvum added. SNPs for each of these groups were called independently; therefore SNP names that are shared between the PSPPC and PSPPC + P. fulvum groups should NOT be assumed to refer to the same locus.
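Because the two groups were called independently, one cautious way to handle both in a single analysis is to tag each SNP name with its group before combining anything. The following is a minimal sketch of that idea; the SNP names and the group labels used as prefixes are hypothetical and are not taken from the data files.

```python
def qualify(snp_name: str, group: str) -> str:
    """Prefix a SNP name with its dataset group so the name stays unambiguous."""
    return f"{group}:{snp_name}"


def shared_names(psppc_names, fulvum_names):
    """Return names that occur in both groups; these are NOT the same loci."""
    return sorted(set(psppc_names) & set(fulvum_names))


# Hypothetical SNP names, for illustration only.
psppc = ["SNP_1", "SNP_2", "SNP_3"]
psppc_fulvum = ["SNP_2", "SNP_3", "SNP_4"]

collisions = shared_names(psppc, psppc_fulvum)
print("Names used in both groups:", collisions)
print(qualify("SNP_2", "PSPPC"), "vs", qualify("SNP_2", "PSPPC+fulvum"))
```

Any name reported as a collision should be treated as two unrelated markers, one per group.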
For analysis, the SNP data are available in two widely used formats: hapmap and VCF. Files in both formats were loaded successfully into TASSEL v. 5.2.25 (http://www.maizegenetics.net/tassel). Explanations of the fields (columns) in the VCF files are given in the commented (##) rows at the top of each file.
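To read the field explanations without opening the whole file in an editor, the commented rows can be pulled out programmatically. The sketch below assumes the usual VCF layout, in which meta-information rows start with "##" and a single column-header row starts with "#CHROM". The example content is invented for illustration and does not reproduce the actual files.

```python
import io


def read_vcf_header(handle):
    """Collect the '##' meta rows and the column names from a VCF file handle."""
    meta, columns = [], []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("##"):
            meta.append(line[2:])            # field explanations
        elif line.startswith("#"):
            columns = line[1:].split("\t")   # column header row
            break                            # genotype records follow; stop here
    return meta, columns


# Hypothetical VCF content, for illustration only.
example = io.StringIO(
    "##fileformat=VCFv4.0\n"
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tacc_1\n"
    "0\t0\tSNP_1\tA\tG\t.\tPASS\t.\tGT\t0/0\n"
)
meta, columns = read_vcf_header(example)
for entry in meta:
    print(entry)
print(columns)
```

For a real file, replace the io.StringIO object with an opened file, for example open("PSPPC.vcf.txt").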
Descriptions of the first 11 columns in the hapmap file are as follows:
rs#- Name of locus (i.e. SNP name)
alleles- The alleles (nucleotides) observed at the locus
chrom- Irrelevant for these datasets, since markers are unordered.
pos- Irrelevant for these datasets, since markers are unordered.
strand- Irrelevant for these datasets, since markers are unordered.
assembly#- Required field for hapmap format. NA for these datasets.
center- Required field for hapmap format. NA for these datasets.
protLSID- Required field for hapmap format. NA for these datasets.
assayLSID- Required field for hapmap format. NA for these datasets.
panel- Required field for hapmap format. NA for these datasets.
QCcode- Required field for hapmap format. NA for these datasets.
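The column layout above can be sketched in code as follows. This is a minimal reader that assumes a tab-delimited file whose first 11 columns are the fixed fields listed above and whose remaining columns hold one genotype call per accession. The accession names and genotype values in the example are hypothetical.

```python
import io

N_FIXED = 11  # rs#, alleles, chrom, pos, strand, assembly#, center,
              # protLSID, assayLSID, panel, QCcode


def read_hapmap(handle):
    """Yield (snp_name, alleles, {accession: call}) for each marker row."""
    header = next(handle).rstrip("\n").split("\t")
    samples = header[N_FIXED:]               # accession names follow the fixed fields
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= N_FIXED:
            continue                         # skip blank or truncated rows
        yield fields[0], fields[1], dict(zip(samples, fields[N_FIXED:]))


# Hypothetical hapmap content, for illustration only.
example = io.StringIO(
    "rs#\talleles\tchrom\tpos\tstrand\tassembly#\tcenter\tprotLSID\t"
    "assayLSID\tpanel\tQCcode\tacc_1\tacc_2\n"
    "SNP_1\tA/G\t0\t0\t+\tNA\tNA\tNA\tNA\tNA\tNA\tA\tG\n"
)
records = list(read_hapmap(example))
print(records)
```

Only rs# and alleles are kept from the fixed fields because the others carry no information for these unordered markers.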
The fasta sequences containing the SNPs are also available for such downstream applications as development of primers for platform-specific markers.
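As a starting point for such downstream work, the sequences can be loaded with a few lines of code. The sketch below is a generic FASTA reader; the record names and sequences in the example are hypothetical, and the actual header format in the FASTA files should be checked before relying on it.

```python
import io


def read_fasta(handle):
    """Yield (record_name, sequence) pairs from a FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)  # emit the previous record
            name, chunks = line[1:], []
        else:
            chunks.append(line)              # sequences may span several lines
    if name is not None:
        yield name, "".join(chunks)          # emit the final record


# Hypothetical FASTA content, for illustration only.
example = io.StringIO(
    ">SNP_1_allele_1\nACGTAC\nGTTA\n"
    ">SNP_1_allele_2\nACGTGC\nGTTA\n"
)
sequences = dict(read_fasta(example))
print(sequences)
```

The resulting dictionary maps each record name to its full sequence, which can then be passed to a primer-design tool of choice.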
For more information about this dataset, contact Clarice Coyne at Clarice.Coyne@usda.gov or coynec@wsu.edu.
Resources in this dataset:

Resource Title: PSPPC SNPs in hapmap format.
File Name: PSPPC.hmp.txt
Resource Description: 66591 unanchored SNPs for the PSPPC collection in hapmap format.
Resource Software Recommended: TASSEL, url: http://www.maizegenetics.net/tassel

Resource Title: PSPPC SNP FASTA Sequences.
File Name: PSPPC.fa.txt
Resource Description: FASTA sequences for each allele of the PSPPC SNP dataset.

Resource Title: PSPPC + P. fulvum SNPs in hapmap format.
File Name: PSPPC+fulvums.hmp.txt
Resource Description: 67400 SNPs from the PSPPC augmented with 25 P. fulvum accessions in hapmap format. SNP names are independent and unrelated to plain PSPPC SNP files.
Resource Software Recommended: TASSEL, url: http://www.maizegenetics.net/tassel

Resource Title: PSPPC + P. fulvum SNP FASTA Sequences.
File Name: PSPPC+fulvums.fa.txt
Resource Description: FASTA sequences for each allele of the PSPPC + P. fulvum SNP dataset. SNP names are independent and unrelated to plain PSPPC SNP files.

Resource Title: PSPPC + P. fulvum SNPs in vcf format.
File Name: PSPPC+fulvums.vcf.txt
Resource Description: 67400 SNPs from the PSPPC augmented with 25 P. fulvum accessions in vcf format. SNP names are independent and unrelated to plain PSPPC SNP files.
Resource Software Recommended: TASSEL, url: http://www.maizegenetics.net/tassel

Resource Title: PSPPC SNPs in vcf format.
File Name: PSPPC.vcf.txt
Resource Description: 66591 SNPs from the PSPPC in vcf format.
Resource Software Recommended: TASSEL, url: http://www.maizegenetics.net/tassel

Resource Title: README.
File Name: Data Dictionary.docx
Resource Description: These data are for the Pea Single Plant Plus Collection (PSPPC) and the PSPPC augmented with 25 P. fulvum accessions.
The 6 datasets can be divided into two groups. Group 1 consists of 3 datasets labeled “PSPPC” which refer to SNP data pertaining to the USDA Pea Single Plant Plus Collection. Group 2 consists of 3 datasets labeled “PSPPC + P. fulvum” which refer to SNP data pertaining to the PSPPC with 25 accessions of Pisum fulvum added. SNPs for each of these groups were called independently; therefore any SNP name that is shared between the PSPPC and PSPPC + P. fulvum groups should NOT be assumed to refer to the same locus.
For analysis, SNP data is available in two widely used formats: hapmap and vcf. These files were successfully loaded into the standalone version of TASSEL v. 5.2.25 (http://www.maizegenetics.net/tassel).
Explanations of fields (columns) in the VCF files are contained within commented (##) rows at the top of the file.
The first 11 columns required for the hapmap format are as follows:
rs#- Name of locus (i.e. SNP name)
alleles- The alleles (nucleotides) observed at the locus
chrom- N/A, since markers are unordered.
pos- N/A, since markers are unordered.
strand- N/A, since markers are unordered.
assembly#- N/A
center- N/A
protLSID- N/A
assayLSID- N/A
panel- N/A
QCcode- N/A
The fasta sequences containing the SNPs are also available here for such downstream applications as development of primers for platform-specific markers.