Data from: Phased Genotyping-by-Sequencing Enhances Analysis of Genetic Diversity and Reveals Divergent Copy Number Variants in Maize

Metadata Updated: March 30, 2024

High-throughput sequencing (HTS) of reduced representation genomic libraries has ushered in an era of genotyping-by-sequencing (GBS), where genome-wide genotype data can be obtained for nearly any species. However, there remains a need for imputation-free GBS methods for genotyping large samples taken from heterogeneous populations of heterozygous individuals. This requires that a number of issues encountered with GBS be considered, including the sequencing of nonoverlapping sets of loci across multiple GBS libraries, a common missing data problem that results in low call rates for markers per individual, and a tendency for applicability only in inbred line samples with sufficient linkage disequilibrium for accurate imputation. We addressed these issues while developing and validating a new, comprehensive platform for GBS. This study supports the notion that GBS can be tailored to particular aims, and using Zea mays our results indicate that large samples of unknown pedigree can be genotyped to obtain complete and accurate GBS data. Optimizing size selection to sequence a high proportion of shared loci among individuals in different libraries and using simple in silico filters, a GBS procedure was established that produces high call rates per marker (>85%) with accuracy exceeding 99.4%. Furthermore, by capitalizing on the sequence-read structure of GBS data (stacks of reads), a new tool for resolving local haplotypes and scoring phased genotypes was developed, a feature that is not available in many GBS pipelines. Using local haplotypes reduces the marker dimensionality of the genotype matrix while increasing the informativeness of the data. Phased GBS in maize also revealed the existence of reproducibly inaccurate (apparent accuracy) genotypes that were due to divergent copy number variants (CNVs) unobservable in the underlying single nucleotide polymorphism (SNP) data.

Resources in this dataset:

Resource Title: Supplementary Data.
File Name: Web Page, url: https://academic.oup.com/g3journal/article/7/7/2161/6053605#supplementary-data
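
The description above reports call rates per marker above 85%. As a rough illustration of what that metric measures, the sketch below computes the per-marker call rate for a small genotype matrix and keeps the markers that exceed a threshold. It is not the authors' pipeline: the function names, the toy genotypes, the use of None for a missing call, and the reuse of 0.85 as a filter cutoff are all assumptions made for this example.

```python
from typing import List, Optional

# Assumption for this sketch: a genotype is a string such as "A/G",
# and None marks a missing call. Rows are individuals, columns are markers.
Genotype = Optional[str]


def call_rate_per_marker(matrix: List[List[Genotype]]) -> List[float]:
    """Return, for each marker (column), the fraction of individuals with a call."""
    if not matrix:
        return []
    n_individuals = len(matrix)
    n_markers = len(matrix[0])
    rates = []
    for j in range(n_markers):
        called = sum(1 for row in matrix if row[j] is not None)
        rates.append(called / n_individuals)
    return rates


def filter_markers(matrix: List[List[Genotype]], min_call_rate: float = 0.85) -> List[int]:
    """Return the column indices of markers whose call rate exceeds min_call_rate."""
    rates = call_rate_per_marker(matrix)
    return [j for j, rate in enumerate(rates) if rate > min_call_rate]


# Toy data (hypothetical): 4 individuals x 4 markers.
genotypes = [
    ["A/A", "A/G", None, "C/C"],
    ["A/A", None, None, "C/T"],
    ["A/G", "G/G", "T/T", "C/C"],
    ["A/A", "A/G", None, "C/C"],
]

rates = call_rate_per_marker(genotypes)
kept = filter_markers(genotypes)
print(rates)
print(kept)
```

In this toy matrix only the first and last markers are called in every individual, so they are the only ones that pass the cutoff. A real analysis would read genotypes from the supplementary data rather than from a hard-coded list.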

Access & Use Information

Public: This dataset is intended for public access and use.
License: Creative Commons Attribution

Dates

Metadata Created Date March 30, 2024
Metadata Updated Date March 30, 2024

Metadata Source

Harvested from USDA JSON

Additional Metadata

Resource Type Dataset
Publisher Agricultural Research Service
Identifier 10.1534/g3.117.042036
Data Last Modified 2024-02-13
Public Access Level public
Bureau Code 005:18
Metadata Context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id aa537b3b-dbdc-485b-a432-39488ada909b
Harvest Source Id d3fafa34-0cb9-48f1-ab1d-5b5fdc783806
Harvest Source Title USDA JSON
License https://creativecommons.org/licenses/by/4.0/
Old Spatial {"type": "Polygon", "coordinates": -531.5625, -83.164278290951, -531.5625, 85.287916121237, -161.71875, 85.287916121237, -161.71875, -83.164278290951, -531.5625, -83.164278290951}
Program Code 005:040
Source Datajson Identifier True
Source Hash f2a4646e35237592d7782783891a6e7f7e6179e5ea78e74d5277d94a2f40b952
Source Schema Version 1.1
Spatial {"type": "Polygon", "coordinates": -531.5625, -83.164278290951, -531.5625, 85.287916121237, -161.71875, 85.287916121237, -161.71875, -83.164278290951, -531.5625, -83.164278290951}
