The ESAT-6 gene cluster of

Metadata Updated: September 6, 2025

Background
The genome of Mycobacterium tuberculosis H37Rv has five copies of a cluster of genes known as the ESAT-6 loci. These clusters contain members of the CFP-10 (lhp) and ESAT-6 (esat-6) gene families (encoding secreted T-cell antigens that lack detectable secretion signals) as well as genes encoding secreted, cell-wall-associated subtilisin-like serine proteases, putative ABC transporters, ATP-binding proteins and other membrane-associated proteins. These membrane-associated and energy-providing proteins may function to secrete members of the ESAT-6 and CFP-10 protein families, and the proteases may be involved in processing the secreted peptide.

Results
Finished and unfinished sequencing data from 98 publicly available microbial genomes were analyzed for the presence of orthologs of the ESAT-6 loci. The multiple duplicates of the ESAT-6 gene cluster found in the genome of M. tuberculosis H37Rv are also conserved in the genomes of other mycobacteria, for example M. tuberculosis CDC1551, M. tuberculosis 210, M. bovis, M. leprae, M. avium, and the avirulent strain M. smegmatis. Phylogenetic analyses of the resulting sequences have established the duplication order of the gene clusters and demonstrated that the gene cluster known as region 4 (Rv3444c-3450c) is ancestral. Region 4 is also the only region for which an ortholog could be found in the genomes of Corynebacterium diphtheriae and Streptomyces coelicolor.

Conclusions
Comparative genomic analysis revealed that the presence of the ESAT-6 gene cluster is a feature of some high-G+C Gram-positive bacteria. Multiple duplications of this cluster have occurred and are maintained only within the genomes of members of the genus Mycobacterium.

Access & Use Information

Public: This dataset is intended for public access and use.
License: No license information was provided. If this work was prepared by an officer or employee of the United States government as part of that person's official duties, it is considered a U.S. Government Work.

Dates

Metadata Created Date July 24, 2025
Metadata Updated Date September 6, 2025

Metadata Source

Harvested from Healthdata.gov

Additional Metadata

Resource Type Dataset
Publisher National Institutes of Health
Maintainer NIH
Identifier https://healthdata.gov/api/views/hekv-rqwx
Data First Published 2025-07-14
Data Last Modified 2025-09-06
Category NIH
Public Access Level public
Bureau Code 009:25
Metadata Context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
Metadata Catalog ID https://healthdata.gov/data.json
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id d513f8b4-764d-46ca-afe2-35c23b31f584
Harvest Source Id 651e43b2-321c-4e4c-b86a-835cfc342cb0
Harvest Source Title Healthdata.gov
Homepage URL https://healthdata.gov/d/hekv-rqwx
Program Code 009:033
Source Datajson Identifier True
Source Hash 20fd7b143ae2f02d16c754f8878ea350c3d6a70334ff128d215b81e355507a3f
Source Schema Version 1.1
