{"@type": "dcat:Dataset", "accessLevel": "public", "bureauCode": ["005:18"], "contactPoint": {"fn": "Brown, Anne V.", "hasEmail": "mailto:anne.brown@usda.gov"}, "description": "<p dir=\"ltr\">These data are from the manuscript titled \"Genetic variation among 481 diverse soybean accessions, inferred from genomic re-sequencing\". SNP calls were obtained from resequencing 481 diverse soybean lines comprising 52 wild (<i>Glycine soja</i>) and 429 cultivated (<i>Glycine max</i>) accessions. This dataset contains 6 gzipped VCF (Variant Call Format) files with variant calls for all 481 USB accessions, all <i>G. max</i> accessions, <i>G. soja</i> accessions, accessions sequenced at 15x coverage, accessions sequenced at 40x coverage, and 106 accessions re-sequenced from a previous study (Valliyodan et al. 2016). SNPs were called using the HaplotypeCaller algorithm from the Genome Analysis Toolkit (GATK) version gatk-2.5-2-gf57256b. A total of 7.8 million SNPs were identified among the 481 re-sequenced accessions. SNPs were assigned IDs using the script \"assign_name.awk\", available at <a href=\"https://github.com/soybase/SoySNP-Names\">https://github.com/soybase/SoySNP-Names</a>. 
SNP effects were predicted using SnpEff 3.0.</p><p dir=\"ltr\">The dataset is also available at <a href=\"https://soybase.org/data/v2/Glycine/max/diversity/Wm82.gnm2.div.Valliyodan_Brown_2021/\">https://soybase.org/data/v2/Glycine/max/diversity/Wm82.gnm2.div.Valliyodan_Brown_2021/</a></p><p dir=\"ltr\">Funding support was provided by the United Soybean Board for the large-scale sequencing of soybean genomes (project #1320-532-5615), Bayer (previously Monsanto and Bayer), and Corteva (previously Dow AgroSciences), with in-kind support for analysis from USDA Agricultural Research Service project 5030-21000-069-00-D.</p><p dir=\"ltr\">Resources in this dataset:</p><ul><li>Resource Title: Data Dictionary. File Name: Data_Dictionary_USB481.csv Resource Description: Lists each data file by name, with its data type, a description of its content, its corresponding SoyBase Data Store file, and its file size.</li><li><br></li><li>Resource Title: List_of_Accessions.txt.gz. File Name: glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.USB481_list.txt.gz Resource Description: Table listing all the accessions that were re-sequenced and the metadata associated with each accession.</li><li><br></li><li>Resource Title: Alignment_used_for_Phylogenetic_trees.fna.gz. File Name: glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.sampled_25Kpos.fna_.gz Resource Description: Aligned SNP data for USB481 accessions, based on SNPs sampled at one SNP per 25kb</li><li><br></li><li>Resource Title: Phylogenetic_tree.nh.txt.gz. File Name: glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.sampled_25Kpos.nh_.txt.gz Resource Description: Phylogenetic tree (Newick format) of the USB481 SNP data, based on SNPs sampled at one SNP per 25kb</li><li><br></li><li>Resource Title: Phylogenetic_tree.pxml.txt.gz. File Name: glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.sampled_25Kpos.pxml_.txt.gz Resource Description: Phylogenetic tree (phyloXML format; colored) of the USB481 SNP data, based on SNPs sampled at one SNP 
per 25kb</li><li><br></li><li>Resource Title: SNP_Effect_predictions.gff3.gz. File Name: glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.snpEff.gff3_.gz Resource Description: Output from the SnpEff program using the SNPs from the full USB481.vcf file as input.</li><li><br></li><li>Resource Title: Soja_SNP_calls.vcf.gz. File Name: glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.Soja_.vcf.gz Resource Description: Genotype information in VCF format for 45 Soja lines from USB-funded project.</li><li><br></li><li>Resource Title: Soy106.vcf.gz. File Name: glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.Soy106.vcf.gz Resource Description: Genotype information in VCF format for 106 accessions from USB-funded project; from Valliyodan et al., Sci Rep 2016.</li><li><br></li><li>Resource Title: USB481_index.vcf.gz.tbi. File Name: glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.USB481_list.txt.gz Resource Description: Binary index for USB481.vcf.gz, produced using tabix.</li><li><br></li><li>Resource Title: USB-40x.vcf.gz. File Name: glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.USB-40x.vcf.gz Resource Description: Genotype information in VCF format for 46 accessions sequenced at 40x coverage from USB-funded project.</li><li><br></li><li>Resource Title: SnpEff_predictions_Gmax_Accessions.gff.gz. File Name: glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.snpEff_Gmax.gff_.gz Resource Description: SnpEff output in GFF format using the USB481_nosoja.vcf file as input.</li><li><br></li><li>Resource Title: SnpEff_predictions_Gsoja_Accessions.gff.gz. File Name: glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.snpEff_Gsoja.gff_.gz Resource Description: SnpEff output in GFF format using Soja_SNP_Calls.vcf.gz as input.</li><li><br></li><li>Resource Title: USB-15x.vcf.gz. File Name: glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.USB-15x.vcf.gz Resource Description: Genotype information in VCF format for 284 accessions sequenced at 15x coverage from USB-funded project.</li><li><br></li><li>Resource Title: USB481.vcf.gz.File Name: 
glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.USB481.vcf.gz Resource Description: Genotype information in VCF format for all 481 accessions from USB-funded project <a href=\"https://soybase.org/data/public/Glycine_max/Wm82.gnm2.div.G787/glyma.Wm82.gnm2.div.G787.USB481.vcf.gz\" target=\"_blank\">https://soybase.org/data/public/Glycine_max/Wm82.gnm2.div.G787/glyma.Wm82.gnm2.div.G787.USB481.vcf.gz</a> </li><li>Resource Title: USB481_nosoja.vcf.gz. File Name: glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.USB481_nosoja.vcf.gz Resource Description: Combined genotype information, in VCF format, for all USB lines excluding the Soja lines from USB-funded project. https://soybase.org/data/public/Glycine_max/Wm82.gnm2.div.G787/glyma.Wm82.gnm2.div.G787.USB481_nosoja.vcf.gz</li></ul><p><br></p>", "distribution": [{"@type": "dcat:Distribution", "downloadURL": "https://ndownloader.figshare.com/files/44566043", "format": "gz", "mediaType": "application/gzip", "title": "glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.sampled_25Kpos.nh_.txt.gz"}, {"@type": "dcat:Distribution", "downloadURL": "https://ndownloader.figshare.com/files/44566046", "format": "gz", "mediaType": "application/gzip", "title": "glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.USB481_list.txt.gz"}, {"@type": "dcat:Distribution", "downloadURL": "https://ndownloader.figshare.com/files/44566049", "format": "gz", "mediaType": "application/gzip", "title": "glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.sampled_25Kpos.pxml_.txt.gz"}, {"@type": "dcat:Distribution", "downloadURL": "https://ndownloader.figshare.com/files/44566052", "format": "gz", "mediaType": "application/gzip", "title": "glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.sampled_25Kpos.fna_.gz"}, {"@type": "dcat:Distribution", "downloadURL": "https://ndownloader.figshare.com/files/44566055", "format": "gz", "mediaType": "application/gzip", "title": "glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.snpEff_Gmax.gff_.gz"}, {"@type": "dcat:Distribution", "downloadURL": 
"https://ndownloader.figshare.com/files/44566058", "format": "gz", "mediaType": "application/gzip", "title": "glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.snpEff.gff3_.gz"}, {"@type": "dcat:Distribution", "downloadURL": "https://ndownloader.figshare.com/files/44566061", "format": "gz", "mediaType": "application/gzip", "title": "glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.snpEff_Gsoja.gff_.gz"}, {"@type": "dcat:Distribution", "downloadURL": "https://ndownloader.figshare.com/files/44566067", "format": "gz", "mediaType": "application/gzip", "title": "glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.USB-40x.vcf.gz"}, {"@type": "dcat:Distribution", "downloadURL": "https://ndownloader.figshare.com/files/44566076", "format": "gz", "mediaType": "application/x-gzip", "title": "glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.Soy106.vcf.gz"}, {"@type": "dcat:Distribution", "downloadURL": "https://ndownloader.figshare.com/files/44566082", "format": "gz", "mediaType": "application/gzip", "title": "glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.Soja_.vcf.gz"}, {"@type": "dcat:Distribution", "downloadURL": "https://ndownloader.figshare.com/files/44566112", "format": "gz", "mediaType": "application/x-gzip", "title": "glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.USB-15x.vcf.gz"}, {"@type": "dcat:Distribution", "downloadURL": "https://ndownloader.figshare.com/files/44566157", "format": "gz", "mediaType": "application/zip", "title": "glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.USB481_nosoja.vcf.gz"}, {"@type": "dcat:Distribution", "downloadURL": "https://ndownloader.figshare.com/files/44566160", "format": "gz", "mediaType": "application/zip", "title": "glyma.Wm82.gnm2_.div_.Valliyodan_Brown_2021.USB481.vcf.gz"}, {"@type": "dcat:Distribution", "downloadURL": "https://ndownloader.figshare.com/files/44577088", "format": "csv", "mediaType": "application/csv", "title": "Data_Dictionary_USB481.csv"}], "identifier": "10.15482/USDA.ADC/1518301", "keyword": ["ARS", "NP301", "SNPs", "SoyBase", "data.gov", 
"genetic variation", "resequencing", "soybean"], "license": "https://www.usa.gov/publicdomain/label/1.0/", "modified": "2024-02-21", "programCode": ["005:040"], "publisher": {"@type": "org:Organization", "name": "Agricultural Research Service"}, "title": "Data from: Genetic variation among 481 diverse soybean accessions"}