HoloBee Database v2016.1

Metadata Updated: April 21, 2025

Organisms living in honey bees and honey bee colonies form large associative holobiont communities that are integral to bee biology. High-throughput sequencing approaches to characterize these holobiont communities from honey bees in various states of health and disease are now commonplace, producing large amounts of nucleotide sequence data that must be accurately and consistently analyzed in order to produce reliable and comparable reports. In addition, new species designations and revisions are actively being made from honey bee holobiont communities, complicating nomenclature in larger databases where taxonomic descriptions associated with archived sequences can quickly become outdated and misleading.
To improve the accuracy and consistency of honey bee holobiont research, we have developed HoloBee: a curated database of publicly accessioned nucleotide sequences from the honey bee holobiont community. With rare, noted exceptions made by curators, sequences used in HoloBee were obtained from, or in association with, Apis mellifera (Western honey bee) as well as other honey bee species where available (e.g. Apis cerana, Apis dorsata, Apis laboriosa, Apis koschevnikovi, Apis florea, Apis andreniformis and Apis nigrocincta). Sources include the interior or surface of honey bees (adult, pupae, larvae, egg), corbicular pollen, bee bread, royal jelly, honey, comb, hive surfaces (e.g. bottom board debris, frames, landing platforms), and isolates of microbes, parasites and pathogens from honey bees.

HoloBee contains two non-overlapping sets of sequence data, HoloBee-Barcode and HoloBee-Mop, each of which has a distinct intended use.

HoloBee-Barcode is a non-redundant database of taxonomically informative barcoding loci for all viruses, bacteria, fungi, protozoans and metazoans associated with honey bees (Apis spp.). It was created from an exhaustive master sequence archive of all valid holobiont sequences. Redundancy was removed from this master archive using a clustering algorithm that grouped sequences with ≥ 99% identity and retained the longest sequence from each cluster as the representative accession for that sequence type (“centroid”). These centroid sequences were concatenated into a fasta formatted file to create the HoloBee-Barcode database. The taxonomy associated with each centroid, from Superkingdom through Species and Strain/Isolate, was individually reviewed by a curator and corrected when necessary.
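
The redundancy-removal step described above (group sequences at ≥ 99% identity, keep the longest as the centroid) can be illustrated with a minimal greedy sketch. The description does not name the clustering tool that was used, so this is not the curators' actual pipeline. The function names `toy_identity` and `cluster_centroids` are hypothetical, and the identity measure is a deliberately simple stand-in (fraction of the shorter sequence covered by matching blocks from Python's `difflib`), not a true alignment-based percent identity.

```python
from difflib import SequenceMatcher


def toy_identity(a: str, b: str) -> float:
    """Fraction of the shorter sequence covered by difflib matching blocks.

    ASSUMPTION: a toy stand-in for alignment-based percent identity;
    real clustering tools compute identity from sequence alignments.
    """
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0
    # autojunk=False so that frequent nucleotides are never treated as junk.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / shorter


def cluster_centroids(seqs: dict, threshold: float = 0.99) -> dict:
    """Greedy clustering that maps each centroid accession to its members.

    Sequences are visited longest first, so the first sequence placed in a
    cluster (its centroid) is also the longest sequence in that cluster.
    """
    clusters = {}
    for acc in sorted(seqs, key=lambda k: len(seqs[k]), reverse=True):
        for centroid in clusters:
            if toy_identity(seqs[acc], seqs[centroid]) >= threshold:
                clusters[centroid].append(acc)  # redundant with this centroid
                break
        else:
            clusters[acc] = []  # no centroid is similar enough: new cluster
    return clusters
```

Calling `cluster_centroids({"acc1": seq1, "acc2": seq2})` returns a dictionary whose keys are the retained centroid accessions and whose values list the redundant accessions folded into each one, which mirrors the "number and identity of redundant sequences" columns that the cross reference tables are described as providing.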
Cross reference tables (separated according to 5 major taxonomic groups) provide a user-friendly outline of information for each centroid accession within HoloBee-Barcode, including taxonomy, gene/product name, sequence length, the unaltered NCBI definition line, the number and identity of redundant sequences clustered within each centroid, and any additional information provided by the curator. HoloBee-Barcode centroid counts are: Viruses = 86; Bacteria = 496; Fungi = 41; Protozoa = 4; Metazoa = 60.

HoloBee-Barcode is intended to improve and standardize quantitative and qualitative metagenomic descriptions of holobiont communities associated with honey bees by providing a curated set of barcode sequences. The goal of genetic barcoding is to associate a sampled nucleotide sequence with a taxonomically valid species.

Genomic regions targeted for barcoding vary by taxonomic group. The small subunit (SSU) ribosomal RNA, or 16S rRNA, is the most commonly used barcode for bacteria and is used in HoloBee-Barcode. These 16S rRNA sequences will support the analysis of data generated with the widely used approach of amplicon-based 16S rRNA deep sequencing to study microbiota communities. Although barcode markers for fungi are less definitive than those for bacteria, HoloBee-Barcode defaults to the ribosomal RNA internal transcribed spacer region (ITS), which typically includes ITS-1, 5.8S, and ITS-2. For some clades that cannot be resolved by this region, other barcode markers were selected. The majority of barcodes for metazoan taxa are the mitochondrial locus cytochrome c oxidase subunit I (COI). Complete mitochondrial DNA (mtDNA) sequences for Apis cerana (Asian honey bee) and Galleria mellonella (Greater wax moth) are included as barcodes for these species. We note that A. cerana mtDNA is included because this species is considered a potentially invasive honey bee and its occurrence is monitored regionally, including in Australia, New Zealand and the USA.
Protozoan barcodes include cytochrome b (Cytb), SSU, or ITS, while entire genomes are used for viral barcoding.

HoloBee-Mop is a database composed mostly of chromosomal, mitochondrial and plasmid genome assemblies, built to aggregate as much honey bee holobiont genomic sequence information as possible. For a few organisms without genome assembly data, transcriptome data are included (e.g. Aethina tumida, small hive beetle). Unlike HoloBee-Barcode, redundancy removal was not performed on the HoloBee-Mop database, and thus this resource provides an archive of nucleotide sequence assemblies from honey bee holobionts. However, since full viral genomes are used in HoloBee-Barcode, only redundant viral sequences occur in HoloBee-Mop. All accessions within each of these assemblies were concatenated into a single fasta formatted file to create the HoloBee-Mop database.

The intended purpose of HoloBee-Mop is to improve honey bee genome and transcriptome assemblies by “mopping-up” as much viral, bacterial, fungal, protozoan and non-honey bee metazoan sequence data as possible. Therefore, reads that remain after processing through both HoloBee-Barcode and HoloBee-Mop and that do not map to the honey bee genome may contain unique data from taxonomic variants or novel species. Details for each sequence assembly within HoloBee-Mop are tabulated in cross reference tables according to each major taxonomic group. HoloBee-Mop assembly counts are: Viruses = 2; Bacteria = 55; Fungi = 5; Protozoa = 1; Metazoa = 6.

Follow the HoloBee database on Twitter at: https://twitter.com/HoloBee_db

For questions about the HoloBee database, contact:

  • HoloBee database team: holobee.db@gmail.com
  • Jay Evans: Jay.Evans@ars.usda.gov
  • Anna Childers: Anna.Childers@ars.usda.gov

Resources in this dataset:

Resource Title: HoloBee_v2016.1 sequence database. File Name: HB_v2016.1.zip
Resource Description: This compressed file contains two fasta sequence files:

  1. HB_Bar_v2016.1.fasta (HoloBee-Barcode database)
  2. HB_Mop_v2016.1.fasta (HoloBee-Mop database)
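
Once unpacked, either fasta file can be inspected with a few lines of standard-library Python. The sketch below is illustrative only: `read_fasta` and `summarize` are hypothetical helper names, and it assumes the files follow the usual fasta convention of a `>` header line followed by one or more sequence lines. The centroid counts in the description sum to 687 (86 + 496 + 41 + 4 + 60); whether HB_Bar_v2016.1.fasta holds exactly that many records is an assumption that this kind of count would let a user check.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an iterable of fasta lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]  # drop the leading ">"
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence may wrap over several lines
    if header is not None:
        yield header, "".join(chunks)


def summarize(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (number of records, total sequence length)."""
    count = 0
    total_length = 0
    for _header, sequence in read_fasta(lines):
        count += 1
        total_length += len(sequence)
    return count, total_length


# Example usage (assumes the file has been extracted to the working directory):
# with open("HB_Bar_v2016.1.fasta") as handle:
#     print(summarize(handle))
```

Reading from any iterable of lines, rather than from a path, keeps the parser usable with open file handles and with in-memory test data alike.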

md5 values:

  • HB_v2016.1.zip: 6e372e443744282128eb51488176503f
  • HB_Bar_v2016.1.fasta: 109e1f686a690c70ef78fc4b5066a01f
  • HB_Mop_v2016.1.fasta: ced8c3f5987dce69e800c8c491471eba
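
A downloaded file can be checked against the md5 values listed above before use. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `check_file` are hypothetical, and the file paths in the usage comment assume the files sit in the current working directory.

```python
import hashlib

# md5 values copied from the list above.
EXPECTED_MD5 = {
    "HB_v2016.1.zip": "6e372e443744282128eb51488176503f",
    "HB_Bar_v2016.1.fasta": "109e1f686a690c70ef78fc4b5066a01f",
    "HB_Mop_v2016.1.fasta": "ced8c3f5987dce69e800c8c491471eba",
}


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_file(path: str, expected: str) -> bool:
    """Compare a file's md5 digest with an expected value, ignoring case."""
    return md5_of_file(path) == expected.lower()


# Example usage (assumes the files are in the current directory):
# for name, expected in EXPECTED_MD5.items():
#     print(name, "OK" if check_file(name, expected) else "MISMATCH")
```

Reading in chunks keeps memory use flat regardless of file size. A mismatch most likely indicates an incomplete or corrupted download.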

Resource Title: data dictionary for HoloBee_v2016.1. File Name: Data_Dictionary_HoloBee_v2016.1.xlsx

Resource Title: HoloBee_v2016.1 cross reference tables. File Name: HB_v2016.1_crossref.zip
Resource Description: This compressed file contains ten spreadsheet files (.xlsx) tabulating detailed information for all centroids (HoloBee-Barcode database) and sequence assemblies (HoloBee-Mop database) used in HoloBee v2016.1:

  1. HB_Bar_v2016.1_bacteria_crossref_2016-05-18.xlsx
  2. HB_Bar_v2016.1_fungi_crossref_2016-05-20.xlsx
  3. HB_Bar_v2016.1_metazoa_crossref_2016-05-16.xlsx
  4. HB_Bar_v2016.1_protozoa_crossref_2016-05-20.xlsx
  5. HB_Bar_v2016.1_viruses_crossref_2016-05-17.xlsx
  6. HB_Mop_v2016.1_bacteria_crossref_2016-05-12.xlsx
  7. HB_Mop_v2016.1_fungi_crossref_2016-05-12.xlsx
  8. HB_Mop_v2016.1_metazoa_crossref_2016-04-15.xlsx
  9. HB_Mop_v2016.1_protozoa_crossref_2016-04-11.xlsx
  10. HB_Mop_v2016.1_viruses_crossref_2016-05-12.xlsx

md5 value:

  • HB_v2016.1_crossref.zip: a8a57d92830eb77904743afc95980465

Resource Title: data dictionary for HoloBee_v2016.1. File Name: Data_Dictionary_HoloBee_v2016.1.csv

Access & Use Information

Public: This dataset is intended for public access and use. License: Creative Commons CCZero

Dates

Metadata Created Date March 30, 2024
Metadata Updated Date April 21, 2025

Metadata Source

Harvested from USDA JSON

Additional Metadata

Resource Type Dataset
Publisher Agricultural Research Service
Maintainer
Identifier 10.15482/USDA.ADC/1255217
Data Last Modified 2024-02-08
Public Access Level public
Bureau Code 005:18
Metadata Context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id 1249b620-cfea-4339-bf00-71f1730ce670
Harvest Source Id d3fafa34-0cb9-48f1-ab1d-5b5fdc783806
Harvest Source Title USDA JSON
License https://creativecommons.org/publicdomain/zero/1.0/
Program Code 005:040
Source Datajson Identifier True
Source Hash b5c1471e8c3bad47cec89433fcce89b8b80d88a982958aa9942467efff55c293
Source Schema Version 1.1
