Data from: Pathogen webs in collapsing honey bee colonies

Published by Agricultural Research Service | Department of Agriculture | Catalog Last Checked: May 05, 2026 at 11:41 PM | Dataset Last Updated: May 02, 2025

Here we explore the incidence and abundance of currently known honey bee pathogens in colonies suffering from Colony Collapse Disorder (CCD), otherwise weak colonies, and strong colonies from across the United States. This data set was generated using deep RNA sequencing to further characterize microbial diversity in CCD and non-CCD hives. We identified novel strains of the recently described Lake Sinai viruses (LSV) and found evidence of a shift in gut bacterial composition that may be a biomarker of CCD. The results are discussed with respect to host-parasite interactions and other environmental stressors of honey bees.

RNA was pooled by combining equal aliquots from each CCD or non-CCD colony described above. Five µg of RNA from the “CCD−” pool was used to generate cDNA using a cocktail of random heptamer primers. cDNA was size-selected from agarose and end-polished with End Repair Enzyme (Illumina) following the manufacturer's protocols. A 3′ polyadenine tract was then added with Klenow fragment (Invitrogen) and the products were purified with a QIAquick DNA purification column (Qiagen). Illumina adapters were ligated to cDNA with T4 DNA ligase and the products were amplified under the following thermocycler conditions: an initial denaturing step at 98°C for 30 seconds, followed by 14 cycles at 98°C for 30 seconds, 65°C for 30 seconds, and 72°C for 30 seconds. Final products of 100–300 bp were size-selected from agarose and sequenced on an Illumina Genome Analyzer by the Institute for Genome Sciences, University of Maryland, Baltimore.

Equivalently prepared cDNA from the “CCD+” pool was sequenced using a paired-end strategy with a 350-bp fragment size. A paired-end approach facilitates the assembly of longer contigs, and therefore may provide more diagnostic sequences for annotation, but at a cost of reduced read length (67 bp).
Both sequencing runs were quality-trimmed by retaining only the longest contiguous sequence of each read with a minimum (Phred-equivalent) quality score of 15, excepting at most one ambiguous base. Reads shorter than 50 bp after this trimming step were discarded. A small number of reads were removed because they matched Illumina primer sequence in the UniVec database (www.ncbi.nlm.nih.gov/VecScreen/UniVec.html).

Reads were assembled into contigs using the Velvet assembly package [24]. CCD− reads were assembled into contigs using multiple iterations of Velvet with successive hash lengths of 21, 31, 41, 51, and 61. Contigs shorter than 100 bp or with less than 3X coverage were discarded. This assembly strategy was chosen to accommodate the broad spectrum of RNA sources in the sample (viruses, a diverse bacterial community, and eukaryotic pathogens as well as the host genome) that are likely to have different optimal hash lengths for assembly. CCD+ reads were assembled in a similar fashion without read-pair information; in addition, a single paired-end assembly was performed with Velvet using a hash length of 21 and an expected fragment length of 350. Contigs from all intermediate assemblies were then merged using the BlastClust component of the Basic Local Alignment Search Tool (BLAST) at 98% identity and 90% nonreciprocal overlap. Because substantial redundancy of contigs remained after this step, we input the contigs to CAP3 [25] for more aggressive assembly, requiring a 60-bp overlap with 92% identity.

Raw reads are available as accessions SRX028143 and SRX028145 of the National Center for Biotechnology Information (NCBI) Sequence Read Archive; however, the resulting contigs were not submitted because of an NCBI policy against hosting assemblies from mixed sources.
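
The quality-trimming rule described above (keep the longest contiguous stretch with quality of at least 15, allow at most one ambiguous base, discard reads shorter than 50 bp) can be illustrated with a minimal Python sketch. This is not the authors' code. The function name `trim_read`, its parameters, and the reading of the rule are assumptions: here an ambiguous base is taken to be `N`, it is exempt from the quality threshold, and at most one may fall inside the retained stretch.

```python
def trim_read(seq, quals, min_q=15, max_ambiguous=1, min_len=50):
    """Return the longest contiguous (sequence, qualities) stretch, or None.

    Hypothetical helper. Assumed interpretation of the trimming rule:
    every called base in the stretch has quality >= min_q, and the
    stretch may contain at most max_ambiguous 'N' bases regardless of
    their quality. Stretches shorter than min_len are rejected.
    """
    best_start, best_end = 0, 0   # best window so far, half-open [start, end)
    start = 0                     # left edge of the current window
    n_count = 0                   # ambiguous bases inside the current window

    for end, (base, q) in enumerate(zip(seq, quals)):
        if base.upper() == "N":
            n_count += 1
            # Too many ambiguous bases: advance the left edge past the
            # oldest 'N' still inside the window.
            while n_count > max_ambiguous:
                if seq[start].upper() == "N":
                    n_count -= 1
                start += 1
        elif q < min_q:
            # A low-quality called base ends the window; restart after it.
            start = end + 1
            n_count = 0
            continue

        if end + 1 - start > best_end - best_start:
            best_start, best_end = start, end + 1

    if best_end - best_start < min_len:
        return None               # read would be discarded
    return seq[best_start:best_end], quals[best_start:best_end]


if __name__ == "__main__":
    quals = [30] * 70
    quals[10] = 5                 # one low-quality base splits the read
    trimmed = trim_read("C" * 70, quals)
    print(len(trimmed[0]) if trimmed else "discarded")
```

Qualities are passed as a list of integers, so decoding the FASTQ quality string (whose offset depends on the Illumina pipeline version) is left to the caller.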
Highlight photo credit: Image D2368-2 - Honey bee landing on a watermelon flower (http://www.ars.usda.gov/is/graphics/photos/may12/d2368-2.htm). Copyright-free, public domain photo by Stephen Ausmus.

Resources in this dataset:

Resource Title: cDNA contigs resulting from assembly of Illumina sequence reads for Pathogen Webs in Collapsing Honey Bee Colonies.

File Name: Web Page, url: http://journals.plos.org/plosone/article/asset?unique&id=info:doi/10.1371/journal.pone.0043562.s003

This is File S2 of the supplemental data accompanying Cornman, R. S., Tarpy, D. R., Chen, Y., Jeffreys, L., Lopez, D., Pettis, J. S., … Evans, J. D. (2012). Pathogen webs in collapsing honey bee colonies. PLoS ONE, 7(8), e43562. doi:10.1371/journal.pone.0043562 http://handle.nal.usda.gov/10113/60548

Complete Metadata

@type	dcat:Dataset
accessLevel	public
bureauCode	[ "005:18" ]
contactPoint	{ "fn": "Evans, Jay D.", "hasEmail": "mailto:jay.evans@ars.usda.gov" }
distribution	[ { "@type": "dcat:Distribution", "title": "http://journals.plos.org/plosone/article/asset?unique&id=info:doi/10.1371/journal.pone.0043562.s003", "mediaType": "text/html", "downloadURL": "http://journals.plos.org/plosone/article/asset?unique&id=info:doi/10.1371/journal.pone.0043562.s003" } ]
identifier	10.1371/journal.pone.0043562.s003
keyword	[ "ARS", "Insects", "NP305", "data.gov", "pollinators" ]
license	https://creativecommons.org/publicdomain/zero/1.0/
modified	2025-05-02
programCode	[ "005:040" ]
publisher	{ "name": "Agricultural Research Service", "@type": "org:Organization" }
temporal	2007-01-01/2007-01-01
title	Data from: Pathogen webs in collapsing honey bee colonies
