
Data from: Genotypic characterization of the U.S. peanut core collection

Metadata Updated: March 30, 2024

This collection contains supplementary data for the manuscript "Genotypic characterization of the U.S. Peanut Core Collection", which describes genotyping results for the USDA peanut core collection. Each accession was genotyped with the Arachis_Axiom2 SNP array, yielding 14,430 high-quality, informative SNPs across the collection. Additionally, a subset of the core collection was genotyped in replicate, using between two and five seeds per accession, to assess heterogeneity within an accession. Supplementary files include: descriptive information about the genotyped accessions, SNP genotype calls in several formats, a phylogenetic tree calculated from the genotype data, Structure analysis, PCA analysis, and comparisons with the diploid progenitors. This research was co-funded by the National Institute of Food and Agriculture and the National Peanut Board.

Resources in this dataset:

Resource Title: Structure membership breakdown.
File Name: SF10_K5_membership.pdf
Resource Description: The proportion of accessions assigned to clusters 1-5 in a Structure analysis (manuscript Figure 3), for K=5 clusters.

Resource Title: Structure membership assignments for accessions.
File Name: SF11_K5_cluster_assignment.xlsx
Resource Description: The proportional assignments of each cluster to all accessions (relative to the Structure diagram shown in manuscript Figure 3).

Resource Title: Principal components analysis.
File Name: SF12_pca_34.pdf
Resource Description: Principal Component Analysis of 1120 samples based on 2063 unlinked SNP markers. The X-axis represents PC 3 and the Y-axis represents PC 4. Samples are colored and grouped according to: A. clade membership as defined in the phylogenetic and network analyses, B. botanical varieties, C. market type, D. growth habit, E. pod shape, and F. collection type.

Resource Title: Pod images for PI 497426.
File Name: SF14_PI497426_pods.jpg
Resource Description: Pods from accession PI 497426 (clade 4), illustrating the distinctive reticulation pattern seen in some accessions in this clade.

Resource Title: Data dictionary.
File Name: data_dictionary_KNWV.txt
Resource Description: Description of all files in this Dataset. Changes were made to this file on 4/15/202, to update some file names to indicate new versions.

Resource Title: Main descriptive information about genotyped accessions.
File Name: SF01_peanut_core_v14.xlsx
Resource Description: The main descriptive information about the genotyped accessions, including: information about replicate similarity; phylogenetic clades, geographic origin, and phenotype; and summaries of phenotypic and country information relative to clade assignments. Changes were made to this file on 4/15/2020: Added INDEX worksheet and corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2

Resource Title: SNPs as called by the Axiom suite.
File Name: SF02_SNPs_whole_Axiom_Arachis2_txt.gz
Resource Description: The original genotype calls for the Axiom array (for poly-high resolution SNPs). Changes were made to this file on 4/15/2020: Corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2

Resource Title: Genotyping calls in VCF format.
File Name: SF03_SNPs_whole_Axiom_Arachis2_vcf.gz
Resource Description: The Axiom array genotype calls, in VCF format. Changes were made to this file on 4/15/2020: Corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2

Resource Title: DNA variants for all accessions, including from genome assemblies, in TSV format.
File Name: SF04_SNPs_w_4_genomes_tsv.gz
Resource Description: The predominant DNA variants at each SNP location, for all accessions, including variants inferred from four available genome assemblies: A. duranensis and A. ipaensis together, and A. hypogaea accessions Tifrunner, Shitouqi, and Fuhuasheng. The format is a simple tab-separated table, with 14431 columns (SNP positions). Changes were made to this file on 4/15/2020: Corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2

Resource Title: DNA variants for all accessions, including from genome assemblies, in fasta format.
File Name: SF05_SNPs_w_4_gnm_mrgd_fas.gz
Resource Description: The predominant DNA variants at each SNP location, for all accessions, including variants inferred from four available genome assemblies: A. duranensis and A. ipaensis together, and A. hypogaea accessions Tifrunner, Shitouqi, and Fuhuasheng. In fasta format. Changes were made to this file on 4/15/2020: Corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2

Resource Title: Base-calls for selected accessions, relative to A- and B-genome progenitors.
File Name: SF06_chip_and_genome_samples_v05.xlsx
Resource Description: DNA base-calls for 16 selected, diverse accessions, with comparisons to the variants observed in the A. duranensis and A. ipaensis genomes, and inferences regarding the likely progenitor for the DNA, i.e. A-genome (A. duranensis) or B-genome (A. ipaensis). Changes were made to this file on 4/15/2020: Added INDEX worksheet and corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2

Resource Title: Reduced fasta alignments, at 98% identity.
File Name: SF07_SNPs_w_4_gnm_mrgd_cen98_fas.gz
Resource Description: Reduced fasta alignments (relative to the complete alignment file, S5). File S7 has the centroid representatives at 98% identity. This file has 518 sequences. Changes were made to this file on 4/15/2020: Corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2

Resource Title: Reduced fasta alignments, at 99% identity.
File Name: SF08_SNPs_w_4_gnm_mrgd_cen99_fas.gz
Resource Description: Reduced fasta alignments (relative to the complete alignment file, S5). File S8 has the centroid representatives at 99% identity. This file has 680 sequences. Changes were made to this file on 4/15/2020: Corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2

Resource Title: Phylogenetic tree of genotype data.
File Name: SF09_SNPs_w_4_gnm_mrgd_rt3_nh_txt.gz
Resource Description: Phylogenetic tree (Newick format) calculated from the alignment in S5, and corresponding with the phylogenetic tree shown in manuscript Figure 1. Changes were made to this file on 4/15/2020: Corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2

Resource Title: Subgenome origins of SNPs relative to the A-genome and B-genome progenitors.
File Name: SF13_chip_and_genome_GFFs.xlsx
Resource Description: Inferred subgenome origins of SNPs relative to the A-genome and B-genome progenitors (A. duranensis and A. ipaensis). This data is in GFF format, derived from S6, and used as the basis for the plots in Figure 7 (showing regions of possible subgenome invasions). Changes were made to this file on 4/15/2020: Added INDEX worksheet and corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2

Resource Title: Peruvian Moche-era peanut necklace.
File Name: SF15_Sipan_neclkace_Donnan_Einstein.jpg
Resource Description: Picture of necklace of peanuts, sculpted in gold and silver, from the Moche-era tomb at Sipán (c. AD 250) in coastal Peru. Photograph by Susan Einstein, courtesy of Christopher Donnan. Changes were made to this file on 4/15/2020: Replaced black-and-white derived image with original color image.
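The resource descriptions state a few figures that can be checked after download, such as the 518 and 680 sequences in the reduced fasta alignments and the 14431 columns in the TSV table. The following is a minimal Python sketch of such a check, not part of the dataset itself. It assumes the .gz files are ordinary gzip-compressed text and that the first line of the TSV file is representative of its column layout. The function names are hypothetical, chosen for illustration.

```python
import gzip


def count_fasta_records(path):
    """Count sequences in a gzip-compressed fasta file.

    Each fasta record starts with a header line beginning with '>',
    so counting those lines gives the number of sequences.
    """
    count = 0
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.startswith(">"):
                count += 1
    return count


def first_row_field_count(path):
    """Return the number of tab-separated fields on the first line
    of a gzip-compressed TSV file (0 if the file is empty)."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        first = handle.readline()
    if not first:
        return 0
    return len(first.rstrip("\r\n").split("\t"))


# Example usage (assumes the files were downloaded to the working directory):
#   count_fasta_records("SF07_SNPs_w_4_gnm_mrgd_cen98_fas.gz")
#   count_fasta_records("SF08_SNPs_w_4_gnm_mrgd_cen99_fas.gz")
#   first_row_field_count("SF04_SNPs_w_4_genomes_tsv.gz")
```

If the counts differ from those in the descriptions, the data dictionary (data_dictionary_KNWV.txt) is the place to check whether a newer file version applies.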

Access & Use Information

Public: This dataset is intended for public access and use. License: us-pd

Dates

Metadata Created Date March 30, 2024
Metadata Updated Date March 30, 2024

Metadata Source

Harvested from USDA JSON

Additional Metadata

Resource Type Dataset
Publisher Agricultural Research Service
Maintainer
Identifier 10.15482/USDA.ADC/1518508
Data Last Modified 2024-02-15
Public Access Level public
Bureau Code 005:18
Metadata Context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id ec7e3a18-09d2-4605-abc2-418b928a4172
Harvest Source Id d3fafa34-0cb9-48f1-ab1d-5b5fdc783806
Harvest Source Title USDA JSON
License https://www.usa.gov/publicdomain/label/1.0/
Program Code 005:040
Source Datajson Identifier True
Source Hash 3a0fcf4be67b17b7b64e9c1940c6669d79825ba2177f784186b86e7592ab5109
Source Schema Version 1.1
