Data from: A High-Quality Genome Assembly from a Single, Field-collected Spotted Lanternfly (Lycorma delicatula) using the PacBio Sequel II System

Metadata Updated: April 21, 2025

A high-quality reference genome is an essential tool for applied and basic research on arthropods. Long-read sequencing technologies may be used to generate more complete and contiguous genome assemblies than alternate technologies; however, long-read methods have historically had greater input DNA requirements and higher costs than next-generation sequencing, which are barriers to their use on many samples. Here, we present a 2.3 Gb de novo genome assembly of a field-collected adult female Spotted Lanternfly (Lycorma delicatula) using a single PacBio SMRT Cell. The Spotted Lanternfly is an invasive species recently discovered in the northeastern United States, threatening to damage economically important crop plants in the region. The DNA from one individual female specimen collected in Reading, Berks County, Pennsylvania was used to make one standard, size-selected library with an average DNA fragment size of ~20 kb. The library was run on one Sequel II SMRT Cell 8M, generating a total of 132 Gb of long-read sequences, of which 82 Gb were from unique library molecules, representing approximately 38x coverage of the genome. The assembly had high contiguity (contig N50 length = 1.5 Mb), completeness, and sequence-level accuracy as estimated by conserved gene set analysis (96.8% of conserved genes both complete and without frameshift errors). Further, it was possible to segregate more than half of the diploid genome into the two separate haplotypes. The assembly also recovered two microbial symbiont genomes known to be associated with L. delicatula, each microbial genome being assembled into a single contig. We demonstrate that field-collected arthropods can be used for the rapid generation of high-quality genome assemblies, an attractive approach for projects on emerging invasive species, disease vectors, or conservation efforts of endangered species.
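The contig N50 quoted above is the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total assembly length. The following is a minimal sketch of that calculation, not the authors' own tooling; the function names `fasta_lengths` and `n50` are hypothetical and chosen only for illustration.

```python
def fasta_lengths(lines):
    """Yield one sequence length per FASTA record from an iterable of lines."""
    length = None  # None until the first header line is seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contig lengths given")
```

Assuming the final assembly archive has been unpacked, the primary contigs could be summarized with `n50(fasta_lengths(open("slf.8M.final.primary.fasta")))`.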
Supporting files for the manuscript "A High-Quality Genome Assembly from a Single, Field-collected Spotted Lanternfly (Lycorma delicatula) using the PacBio Sequel II System" include several intermediate versions of the assembly (raw output from Falcon, raw output from Falcon unzip, etc.) as well as the final assembly primary contigs and haplotigs (for the regions of the genome that were phased).

Resources in this dataset:

Resource Title: Final Assembly file. File Name: FinalAssembly.zip. Resource Description: Primary contigs and haplotigs in fasta format. File slf.8M.final.primary.fasta contains the primary contigs, and slf.8M.final.haplotigs.fasta contains the haplotigs.

Resource Title: Falcon Raw assembly, polished with arrow. File Name: FalconAssembly.zip. Resource Description: Raw primary contig assembly prior to Falcon unzip. Contigs were polished with all subreads using the arrow polishing tool.

Resource Title: Fasta file of contig assemblies of the two symbiont genomes. File Name: Symbiont.zip. Resource Description: Contains contig fasta files for the Sulcia (Sulciamuelleri.fa) and Vidania (vidania.fa) symbiont genomes recovered from the de novo assembly.

Resource Title: Haplotig placement file in PAF format. File Name: slf.haplotigPlacement.paf.zip. Resource Description: Final assembly placement file, describing the placement of haplotigs on the primary contig assembly.

Resource Title: Falcon Unzip assembly, polished with arrow. File Name: FalconUnzipAssembly.zip. Resource Description: Falcon unzip assembly, both the primary contigs and haplotigs, unfiltered.
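The haplotig placement file is described as PAF, a tab-separated alignment format whose first 12 columns are mandatory (query name, length, start, end; strand; target name, length, start, end; matching residues; alignment block length; mapping quality). Below is a minimal sketch of reading such a file and grouping placements by primary contig. It assumes, without confirmation from the dataset description, that haplotigs appear as the query and primary contigs as the target; the names `PafRecord`, `parse_paf_line`, and `haplotigs_by_primary` are hypothetical.

```python
from collections import defaultdict, namedtuple

# The 12 mandatory PAF columns, in order.
PafRecord = namedtuple(
    "PafRecord",
    "query query_len query_start query_end strand "
    "target target_len target_start target_end matches block_len mapq",
)

# Indices of the mandatory columns that hold integers.
_INT_COLUMNS = (1, 2, 3, 6, 7, 8, 9, 10, 11)


def parse_paf_line(line):
    """Parse one PAF line, ignoring any optional tags after column 12."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        raise ValueError(f"expected at least 12 PAF columns, got {len(fields)}")
    core = fields[:12]
    for i in _INT_COLUMNS:
        core[i] = int(core[i])
    return PafRecord(*core)


def haplotigs_by_primary(lines):
    """Group records by target name (assumed here to be the primary contig),
    ordering each group by start position on the target."""
    placements = defaultdict(list)
    for line in lines:
        if line.strip():  # skip blank lines
            rec = parse_paf_line(line)
            placements[rec.target].append(rec)
    for recs in placements.values():
        recs.sort(key=lambda r: r.target_start)
    return dict(placements)
```

After unpacking slf.haplotigPlacement.paf.zip, the extracted file could be opened and passed to `haplotigs_by_primary` as an iterable of lines; the name of the extracted file is not stated in the description, so check the archive contents first.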

Access & Use Information

Public: This dataset is intended for public access and use. License: us-pd

Dates

Metadata Created Date March 30, 2024
Metadata Updated Date April 21, 2025

Metadata Source

Harvested from USDA JSON

Additional Metadata

Resource Type Dataset
Publisher Agricultural Research Service
Maintainer
Identifier 10.15482/USDA.ADC/1503745
Data Last Modified 2023-12-18
Public Access Level public
Bureau Code 005:18
Metadata Context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id 8d3f012f-7cd3-4a6d-9e30-06dfa862c1d0
Harvest Source Id d3fafa34-0cb9-48f1-ab1d-5b5fdc783806
Harvest Source Title USDA JSON
License https://www.usa.gov/publicdomain/label/1.0/
Old Spatial {"type": "Polygon", "coordinates": [[[-75.915994048119, 40.335385813355], [-75.915994048119, 40.346376494447], [-75.897797942162, 40.346376494447], [-75.897797942162, 40.335385813355], [-75.915994048119, 40.335385813355]]]}
Program Code 005:040
Source Datajson Identifier True
Source Hash 27ee026bbb41285434ee5cf0800517b64ebecf71ede5f674f1c73bad61847ab1
Source Schema Version 1.1
Spatial {"type": "Polygon", "coordinates": [[[-75.915994048119, 40.335385813355], [-75.915994048119, 40.346376494447], [-75.897797942162, 40.346376494447], [-75.897797942162, 40.335385813355], [-75.915994048119, 40.335385813355]]]}
Temporal 2018-08-26/2018-08-26
