A study of quality measures for protein threading models

Metadata Updated: September 6, 2025

Background
Prediction of protein structures is one of the fundamental challenges in biology today. To fully understand how well different prediction methods perform, it is necessary to use measures that evaluate their performance. Every two years, starting in 1994, the CASP (Critical Assessment of protein Structure Prediction) process has been organized to evaluate the ability of different predictors to blindly predict the structure of proteins. To capture different features of the models, several measures have been developed during the CASP processes. However, these measures have not been examined in detail before. In an attempt to develop fully automatic measures that can be used in CASP, as well as in other types of benchmarking experiments, we have compared twenty-one measures. These include the measures used in CASP3 and CASP2 as well as measures introduced later. We have studied their ability to distinguish between the better and worse models submitted to CASP3, and the correlation between them.

Results
Using a set of 1340 models for 23 different targets, we show that most measures correlate with each other. Most pairs of measures show a correlation coefficient of about 0.5. The correlation is slightly higher for measures of similar types. We found that a significant problem when developing automatic measures is how to deal with proteins of different lengths. Comparison between different measures is also complicated because many measures depend on the size of the target. We show that the manual assessment can be reproduced to about 70% using automatic measures. Alignment-independent measures detect slightly more of the models with the correct fold, while alignment-dependent measures agree better when selecting the best models for each target. Finally, we show that using automatic measures would, to a large extent, reproduce the assessors' ranking of the predictors at CASP3.
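As an illustration of the pairwise comparison described above, the following is a minimal sketch of how correlation coefficients between quality measures could be computed over a shared set of models. It assumes the Pearson coefficient, which the abstract does not specify, and the measure names and scores are hypothetical toy values, not data from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(scores):
    """Return {(measure_a, measure_b): r} for every unordered pair of measures."""
    return {
        (a, b): pearson(scores[a], scores[b])
        for a, b in combinations(sorted(scores), 2)
    }


# Hypothetical scores: one value per model, same model order in every list.
scores = {
    "measure_a": [0.10, 0.35, 0.50, 0.80, 0.90],
    "measure_b": [0.20, 0.30, 0.55, 0.70, 0.95],
    "measure_c": [0.90, 0.40, 0.60, 0.20, 0.30],
}

correlations = pairwise_correlations(scores)
for pair, r in correlations.items():
    print(pair, round(r, 2))
```

With 21 measures, this yields 210 pairs, so a summary such as the typical coefficient per measure type is easier to read than the full table.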


Conclusions
We show that, given a sufficient number of targets, the manual and automatic measures would have given almost identical results at CASP3. If the intent is to reproduce the type of scoring done by the manual assessor in CASP3, the best approach might be to use a combination of alignment-independent and alignment-dependent measures, as used in several recent studies.
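The abstract does not say how the two types of measure would be combined. As one possible sketch, and not the method used in the study, the scores for the models of a single target can be standardized per measure and then averaged, which also keeps targets of different size from dominating the comparison. All names and values below are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize scores within one target; constant scores map to 0.0."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def combined_ranking(models, independent, dependent):
    """Rank one target's models by the mean of two standardized measures.

    Both measures are assumed to be oriented so that higher is better.
    """
    if not (len(models) == len(independent) == len(dependent)):
        raise ValueError("all inputs must have the same length")
    z_ind = zscores(independent)
    z_dep = zscores(dependent)
    combined = [(a + b) / 2 for a, b in zip(z_ind, z_dep)]
    return sorted(zip(models, combined), key=lambda pair: pair[1], reverse=True)


# Hypothetical scores for three models of a single target.
models = ["model_1", "model_2", "model_3"]
independent = [0.40, 0.70, 0.55]   # assumed alignment-independent score
dependent = [30.0, 50.0, 55.0]     # assumed alignment-dependent score

ranking = combined_ranking(models, independent, dependent)
for name, score in ranking:
    print(name, round(score, 2))
```

Equal weighting is an arbitrary choice here; any real combination would need to be tuned against the manual assessment.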

Access & Use Information

Public: This dataset is intended for public access and use. License: No license information was provided. If this work was prepared by an officer or employee of the United States government as part of that person's official duties it is considered a U.S. Government Work.

Dates

Metadata Created Date	July 24, 2025
Metadata Updated Date	September 6, 2025

Metadata Source

Harvested from Healthdata.gov

Additional Metadata

Resource Type	Dataset
Publisher	National Institutes of Health
Maintainer	NIH
Identifier	https://healthdata.gov/api/views/ekrd-x6ie
Data First Published	2025-07-14
Data Last Modified	2025-09-06
Category	NIH
Public Access Level	public
Bureau Code	009:25
Metadata Context	https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
Metadata Catalog ID	https://healthdata.gov/data.json
Schema Version	https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby	https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id	e1b97a57-d3cd-442b-8bf7-735741863833
Harvest Source Id	651e43b2-321c-4e4c-b86a-835cfc342cb0
Harvest Source Title	Healthdata.gov
Homepage URL	https://healthdata.gov/d/ekrd-x6ie
Program Code	009:033
Source Datajson Identifier	True
Source Hash	8f9a6054c3f5dc25cbbc23b91690d61bfbd298a909afa5783fba31d90570d2b6
Source Schema Version	1.1
