Deep Green Unannotated Protein Structures

Metadata Updated: March 21, 2026

The Deep Green list is based on the identification and curation of conserved unannotated proteins in three green lineage (Viridiplantae) model organisms: Arabidopsis thaliana, Chlamydomonas reinhardtii, and Setaria viridis. Preliminary characterization of Deep Green proteins and genes was done using various informatics tools and published data sets and is presented in Knoshaug, Sun, et al. (2023, submitted). The structures of these unannotated proteins were also predicted using AlphaFold (Jumper et al., 2021). The data deposited here are the AlphaFold structural predictions with the highest pLDDT score for each protein, and thus identified as the best folded structure (ranked_0). These data enable others to carry out in-depth structural characterization in support of functional characterization, leading to a deeper understanding of plant biology.

References:

Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., Back, T., Petersen, S., Reiman, D., Clancy, E., Zielinski, M., Steinegger, M., Pacholska, M., Berghammer, T., Bodenstein, S., Silver, D., Vinyals, O., Senior, A. W., Kavukcuoglu, K., Kohli, P., and Hassabis, D. (2021) Highly accurate protein structure prediction with AlphaFold. Nature, 596:583-589.

Knoshaug, E. P., Sun, P., Nag, A., Nguyen, H., Mattoon, E. M., Zhang, N., Liu, J., Chen, C., Cheng, J., Zhang, R., St. John, P., and Umen, J. (submitted) Identification and preliminary characterization of conserved uncharacterized proteins from Chlamydomonas reinhardtii, Arabidopsis thaliana, and Setaria viridis.
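The pLDDT-based selection described in this record can be checked on any downloaded structure, because AlphaFold writes the per-residue pLDDT into the B-factor column of its PDB output. The following is a minimal sketch, not the authors' pipeline: the function names are hypothetical, the file layout of this dataset is not assumed, and averaging over C-alpha atoms is one common convention for summarizing pLDDT.

```python
from statistics import mean


def plddt_per_residue(pdb_text: str) -> list[float]:
    """Return one pLDDT value per residue, read from the CA atom B-factor field."""
    scores = []
    for line in pdb_text.splitlines():
        # Fixed-column PDB format: atom name in columns 13-16,
        # B-factor (pLDDT in AlphaFold output) in columns 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


def mean_plddt(pdb_text: str) -> float:
    """Average pLDDT over all residues of one predicted structure."""
    return mean(plddt_per_residue(pdb_text))


def best_model(models: dict[str, str]) -> str:
    """Return the name of the model with the highest mean pLDDT.

    `models` maps a hypothetical model name to the text of its PDB file.
    """
    return max(models, key=lambda name: mean_plddt(models[name]))
```

A typical use would be to read each candidate PDB file into a string, pass the resulting dictionary to `best_model`, and compare the result with the deposited ranked_0 structure.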

Access & Use Information

Public: This dataset is intended for public access and use. License: Creative Commons Attribution

Dates

Metadata Created Date January 12, 2025
Metadata Updated Date March 21, 2026

Metadata Source

Harvested from OpenEI data.json

Additional Metadata

Resource Type Dataset
Publisher National Renewable Energy Laboratory
Maintainer
Identifier https://data.openei.org/submissions/8267
Data First Published 2023-04-20T16:14:18Z
Data Last Modified 2026-03-12T18:10:31Z
Public Access Level public
Bureau Code 019:20
Metadata Context https://openei.org/data.json
Metadata Catalog ID https://openei.org/data.json
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Data Quality True
Harvest Object Id 76a756d1-f416-456d-ada9-140aa06acd26
Harvest Source Id 7cbf9085-0290-4e9f-bec1-91653baeddfd
Harvest Source Title OpenEI data.json
Homepage URL https://data.nlr.gov/submissions/216
License https://creativecommons.org/licenses/by/4.0/
Program Code 019:005
Projectnumber ERW9098
Projecttitle Deep Green: Structural and Functional Genomic Characterization of Conserved Unannotated Green Lineage Proteins
Source Datajson Identifier True
Source Hash 079fade3b93e2a061cba55f00afc4568fc76f945e47293b91b8128d94d1cb22c
Source Schema Version 1.1
