Skip to main content
U.S. flag

An official website of the United States government

Official websites use .gov
A .gov website belongs to an official government organization in the United States.

Secure .gov websites use HTTPS
A lock ( ) or https:// means you’ve safely connected to the .gov website. Share sensitive information only on official, secure websites.

Skip to content

A functional update of the

Metadata Updated: September 7, 2025

Background Since the genome of Escherichia coli K-12 was initially annotated in 1997, additional functional information based on biological characterization and functions of sequence-similar proteins has become available. On the basis of this new information, an updated version of the annotated chromosome has been generated.

      Results
      The E. coli K-12 chromosome is currently represented by 4,401 genes encoding 116 RNAs and 4,285 proteins. The boundaries of the genes identified in the GenBank Accession U00096 were used. Some protein-coding sequences are compound and encode multimodular proteins. The coding sequences (CDSs) are represented by modules (protein elements of at least 100 amino acids with biological activity and independent evolutionary history). There are 4,616 identified modules in the 4,285 proteins. Of these, 48.9% have been characterized, 29.5% have an imputed function, 2.1% have a phenotype and 19.5% have no function assignment. Only 7% of the modules appear unique to E. coli, and this number is expected to be reduced as more genome data becomes available. The imputed functions were assigned on the basis of manual evaluation of functions predicted by BLAST and DARWIN analyses and by the MAGPIE genome annotation system.


      Conclusions
      Much knowledge has been gained about functions encoded by the E. coli K-12 genome since the 1997 annotation was published. The data presented here should be useful for analysis of E. coli gene products as well as gene products encoded by other genomes.

Access & Use Information

Public: This dataset is intended for public access and use. License: No license information was provided. If this work was prepared by an officer or employee of the United States government as part of that person's official duties it is considered a U.S. Government Work.

Downloads & Resources

Dates

Metadata Created Date July 24, 2025
Metadata Updated Date September 7, 2025

Metadata Source

Harvested from Healthdata.gov

Additional Metadata

Resource Type Dataset
Metadata Created Date July 24, 2025
Metadata Updated Date September 7, 2025
Publisher National Institutes of Health
Maintainer
NIH
Identifier https://healthdata.gov/api/views/xgqu-4qrg
Data First Published 2025-07-14
Data Last Modified 2025-09-06
Category NIH
Public Access Level public
Bureau Code 009:25
Metadata Context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
Metadata Catalog ID https://healthdata.gov/data.json
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id 290c5047-be5f-4043-953f-5eb1f86fdd8f
Harvest Source Id 651e43b2-321c-4e4c-b86a-835cfc342cb0
Harvest Source Title Healthdata.gov
Homepage URL https://healthdata.gov/d/xgqu-4qrg
Program Code 009:033
Source Datajson Identifier True
Source Hash f866fd090d278e358f1230c8d8a71b8d6179eaec7a20613d4aca61fbf8a35895
Source Schema Version 1.1

Didn't find what you're looking for? Suggest a dataset here.