Free R: Difference between revisions

From Proteopedia
Eric Martz (talk | contribs)
No edit summary
Eric Martz (talk | contribs)
 
(35 intermediate revisions by the same user not shown)
Line 1: Line 1:
[[Free R]] "is generally considered the most useful global measure of model-to-data agreement"<ref>PMID: 22000512</ref>. It is a statistical quantity introduced in 1992 by Axel T. Brünger<ref>Brünger AT. Free R value: a novel statistical quantity for assessing the accuracy of crystal structures. Nature. 355:472-5. [http://www.ncbi.nlm.nih.gov/pubmed/18481394 PubMed]</ref> to assess the quality of a model from X-ray crystallographic data. It is calculated in the same manner as the [[R value]], but from a subset of the data set aside for the calculation of free R, and not used in the refinement of the model. It is a more reliable tool for assessing the model than the R value because it is not self-referential -- that is, as an estimation of errors, free R is free of any bias that may have been introduced during refinement. Free R is an example of the application of the more general method of [http://en.wikipedia.org/wiki/Cross-validation_(statistics) cross validation].
[[Free R]] (also called R<sub>free</sub>) "is generally considered the most useful global measure of model-to-data agreement"<ref name="validation4pdb">PMID: 22000512</ref>. It is a statistical quantity introduced in 1992 by Axel T. Brünger<ref name="bruenger">PMID: 18481394</ref> to assess the quality of a model from X-ray crystallographic data. It is calculated in the same manner as the [[R value]], but from a subset of the data that is set aside for the calculation of free R and not used in the refinement of the model. It is a more reliable tool for assessing the model than the R value because it is not self-referential: as an estimate of error, free R is free of any bias that may have been introduced during refinement. Free R is an example of the application of the more general method of [http://en.wikipedia.org/wiki/Cross-validation_(statistics) cross validation].


As a rule of thumb, free R should not exceed the R value by more than 0.05; that is, if the R value is 0.20, free R should not significantly exceed 0.25. Free R values exceeding 0.40 raise serious doubts about the model.
==Help Interpreting Free R==
When a crystallographic protein structure is visualized in [[FirstGlance/Index|FirstGlance in Jmol]], its Free R will be automatically interpreted as ''Reliability'' with the following levels, each relative to its [[Resolution]]:
*<span style="color:#00e000;font-weight:bold;">Much better than average</span>
*<span style="color:green;font-weight:bold;">Better than average</span>
*<span style="color:black;font-weight:bold;">Average</span>
*<span style="color:#a0a000;font-weight:bold;">Worse than average</span>
*<span style="color:red;font-weight:bold;">Unreliable</span>
 
==Good Values for Free R==
 
As a rule of thumb, for models with a [[resolution]] of 2.0 &Aring; or better (that is, &Aring; values of 2.0 or lower), free R should not exceed (resolution/10) by more than 0.05; for example, at a resolution of 2.0 &Aring;, free R should not significantly exceed 0.25. For resolutions around 3.0 &Aring;, free R should not exceed (resolution/10), or about 0.30. Free R values in the worst 25% for the resolution of the model (see table below) raise doubts about the quality of the model.
 
Free R is positively correlated with the numerical [[resolution]] value<ref name="validation4pdb" />; that is, free R tends to rise as resolution worsens (larger &Aring; values):
 
:{| cellpadding="4" style="text-align: center; border: 1px solid darkgray;"
|- style="background: #e0e0e0;"
! Resolution, &Aring;
! Median Free R<ref name="validation4pdb" />
! Median Free R<br>- (Resolution/10)
! Free R<br>Best 25%
! Free R<br>Worst 25%
|-
| 1.0
| 0.15
| 0.05
| <0.14
| >0.17
|-
| 1.5
| 0.21
| 0.06
| <0.19
| >0.23
|-
| 1.8
| 0.23
| 0.05
| <0.21
| >0.245
|-
| 2.0
| 0.24
| 0.04
| <0.22
| >0.26
|-
| 2.2
| 0.25
| 0.03
| <0.23
| >0.265
|-
| 2.5
| 0.26
| 0.01
| <0.245
| >0.28
|-
| 3.0
| 0.28
| <font color="red">- 0.02</font>
| <0.26
| >0.30
|-
| 3.5
| 0.30
| <font color="red">- 0.05</font>
| <0.275
| >0.33
|-
| ~4.0
| 0.31
| <font color="red">- 0.09</font>
| <0.28
| >0.35
|}
:*Data in the above table for resolutions <= 3.5 &Aring; are approximated from Figure 3C in <ref name="validation4pdb" />, based on X-ray entries deposited in the [[PDB]] since 1990.
:*Data for resolution ~4.0 &Aring; are based on 199 X-ray entries, for which free R values are given, with resolutions of 3.9-4.1 &Aring; (Feb 2013).
:*The median resolution for X-ray entries in the [[PDB]] is 2.07 &Aring; (Feb 2013).
 
==Bad Values for Free R==
<!--In October, 2011, there are 79 X-ray crystallographic entries in the [[PDB]] deposited 1990 or later, with free R values of >= 0.40. The median resolution of these 79 entries is 4.2 &Aring;.-->
As of October 2011, there were 182 X-ray crystallographic entries in the [[PDB]] with free R values >= 0.37. Of these, 45 had a resolution <= 3.0 &Aring;, and 5 had a resolution <= 2.0 &Aring;. It seems likely that most of these 45 models contain serious errors.
 
In one amazing case, [[1zff]], the resolution is given as 0.94 &Aring;, the R value as 0.428, and the free R as 0.444.
 
==No Value for Free R==
As of October 2011, there were 5,908 X-ray crystallographic entries in the PDB for which no free R value is given, representing 8.8% of all 66,847 X-ray entries. Most of these were deposited before the free R methodology was described by Brünger in 1992<ref name="bruenger" />.
 
:{| cellpadding="4" style="text-align: center; border: 1px solid darkgray;"
|- style="background: #e0e0e0;"
! Deposition Date
! X-ray entries
! No Free R
|-
| Through 1990
| 666
| 666 (100%)
|-
| 1991-1995
| 3,209
| 2,852 (89%)
|-
| 1996-2000
| 8,832
| 1,858 (21%)
|-
| 2001-2005
| 18,522
| 342 (1.8%)
|-
| 2006-<br>Oct 2011
| 35,618
| 190 (0.5%)
|}


==See Also==

Latest revision as of 01:49, 31 December 2021

Free R (also called Rfree) "is generally considered the most useful global measure of model-to-data agreement"[1]. It is a statistical quantity introduced in 1992 by Axel T. Brünger[2] to assess the quality of a model from X-ray crystallographic data. It is calculated in the same manner as the R value, but from a subset of the data that is set aside for the calculation of free R and not used in the refinement of the model. It is a more reliable tool for assessing the model than the R value because it is not self-referential: as an estimate of error, free R is free of any bias that may have been introduced during refinement. Free R is an example of the application of the more general method of cross validation.
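
The calculation described above can be sketched in a few lines of Python. This is only an illustration, assuming the usual definition of the R value as the sum of | |Fobs| - |Fcalc| | divided by the sum of |Fobs|; the function names, the flag list, and the amplitudes are hypothetical, and the scaling between observed and calculated amplitudes that refinement programs apply is omitted.

```python
def r_value(f_obs, f_calc):
    """Return sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) for paired amplitudes."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_r_free(f_obs, f_calc, free_flags):
    """Apply the same formula to two disjoint sets of reflections.

    free_flags[i] is True when reflection i was set aside before refinement
    (the "free" set) and False when it was used in refinement (the "work" set).
    """
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    free = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_value(*zip(*work))
    r_free = r_value(*zip(*free))
    return r_work, r_free


# Hypothetical amplitudes, for illustration only.
f_obs = [100.0, 50.0, 80.0, 40.0]
f_calc = [90.0, 55.0, 80.0, 50.0]
free_flags = [False, False, False, True]

r_work, r_free = r_work_and_r_free(f_obs, f_calc, free_flags)
print(f"R = {r_work:.3f}, free R = {r_free:.3f}")
```

Because the free reflections never influence refinement, the model cannot be fitted to them, which is why free R is the less biased of the two numbers.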

Help Interpreting Free R

When a crystallographic protein structure is visualized in FirstGlance in Jmol, its Free R will be automatically interpreted as Reliability with the following levels, each relative to its Resolution:

  • Much better than average
  • Better than average
  • Average
  • Worse than average
  • Unreliable

Good Values for Free R

As a rule of thumb, for models with a resolution of 2.0 Å or better (that is, Å values of 2.0 or lower), free R should not exceed (resolution/10) by more than 0.05; for example, at a resolution of 2.0 Å, free R should not significantly exceed 0.25. For resolutions around 3.0 Å, free R should not exceed (resolution/10), or about 0.30. Free R values in the worst 25% for the resolution of the model (see table below) raise doubts about the quality of the model.
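
The rule of thumb can be written as a small Python check. This is a sketch, not a validated criterion: the text only states the rule for resolutions of 2.0 Å or better and for resolutions around 3.0 Å, so the linear interpolation between 2.0 and 3.0 Å and the use of (resolution/10) for everything beyond 3.0 Å are assumptions made here, and the function names are hypothetical.

```python
def free_r_limit(resolution):
    """Approximate upper limit for free R at a given resolution (in Angstroms)."""
    if resolution <= 2.0:
        # Stated rule: free R should not exceed (resolution/10) by more than 0.05.
        return resolution / 10 + 0.05
    if resolution >= 3.0:
        # Stated for "around 3.0"; applying it to all poorer resolutions
        # is an assumption of this sketch.
        return resolution / 10
    # ASSUMPTION: the 0.05 allowance shrinks linearly to 0 between 2.0 and 3.0.
    allowance = 0.05 * (3.0 - resolution)
    return resolution / 10 + allowance


def exceeds_rule_of_thumb(resolution, free_r):
    """Return True when free R is above the approximate limit."""
    return free_r > free_r_limit(resolution)


# Entry 1zff from this page: resolution 0.94, free R 0.444.
print(exceeds_rule_of_thumb(0.94, 0.444))
```

Since the rule says free R should not "significantly" exceed the limit, a value just above it is a prompt to look at the model more closely rather than a verdict.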

Free R is positively correlated with the numerical resolution value[1]; that is, free R tends to rise as resolution worsens (larger Å values):

Resolution, Å   Median Free R[1]   Median Free R - (Resolution/10)   Free R Best 25%   Free R Worst 25%
1.0             0.15                0.05                             <0.14             >0.17
1.5             0.21                0.06                             <0.19             >0.23
1.8             0.23                0.05                             <0.21             >0.245
2.0             0.24                0.04                             <0.22             >0.26
2.2             0.25                0.03                             <0.23             >0.265
2.5             0.26                0.01                             <0.245            >0.28
3.0             0.28               -0.02                             <0.26             >0.30
3.5             0.30               -0.05                             <0.275            >0.33
~4.0            0.31               -0.09                             <0.28             >0.35
  • Data in the above table for resolutions <= 3.5 Å are approximated from Figure 3C in [1], based on X-ray entries deposited in the PDB since 1990.
  • Data for resolution ~4.0 Å are based on 199 X-ray entries, for which free R values are given, with resolutions of 3.9-4.1 Å (Feb 2013).
  • The median resolution for X-ray entries in the PDB is 2.07 Å (Feb 2013).
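
As an illustration of how the table above might be used, the following Python sketch places a free R value relative to the tabulated quartiles. The quartile limits are copied from the table; choosing the nearest tabulated resolution instead of interpolating, the three labels, and the function name are assumptions of this sketch. It is not the scheme used by FirstGlance in Jmol.

```python
# (resolution in Angstroms, best-25% upper limit, worst-25% lower limit),
# copied from the table above; the ~4.0 row is entered as 4.0.
QUARTILES = [
    (1.0, 0.14, 0.17),
    (1.5, 0.19, 0.23),
    (1.8, 0.21, 0.245),
    (2.0, 0.22, 0.26),
    (2.2, 0.23, 0.265),
    (2.5, 0.245, 0.28),
    (3.0, 0.26, 0.30),
    (3.5, 0.275, 0.33),
    (4.0, 0.28, 0.35),
]


def classify_free_r(resolution, free_r):
    """Label free R using the row whose resolution is nearest (an assumption)."""
    _, best_limit, worst_limit = min(
        QUARTILES, key=lambda row: abs(row[0] - resolution)
    )
    if free_r < best_limit:
        return "best 25%"
    if free_r > worst_limit:
        return "worst 25%"
    return "middle 50%"


print(classify_free_r(2.0, 0.21))
print(classify_free_r(2.45, 0.26))
print(classify_free_r(0.94, 0.444))
```

Resolutions outside the tabulated range fall back to the first or last row, so results for values far below 1.0 Å or above 4.0 Å should be treated with caution.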

Bad Values for Free R

As of October 2011, there were 182 X-ray crystallographic entries in the PDB with free R values >= 0.37. Of these, 45 had a resolution <= 3.0 Å, and 5 had a resolution <= 2.0 Å. It seems likely that most of these 45 models contain serious errors.

In one amazing case, 1zff, the resolution is given as 0.94 Å, the R value as 0.428, and the free R as 0.444.

No Value for Free R

As of October 2011, there were 5,908 X-ray crystallographic entries in the PDB for which no free R value is given, representing 8.8% of all 66,847 X-ray entries. Most of these were deposited before the free R methodology was described by Brünger in 1992[2].

Deposition Date   X-ray entries   No Free R
Through 1990      666             666 (100%)
1991-1995         3,209           2,852 (89%)
1996-2000         8,832           1,858 (21%)
2001-2005         18,522          342 (1.8%)
2006-Oct 2011     35,618          190 (0.5%)
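
The totals and percentages quoted in this section can be reproduced from the table rows. The short Python sketch below does only that arithmetic; the counts are copied from the table, and the variable names are illustrative.

```python
# (deposition period, X-ray entries, entries with no free R), from the table above.
ROWS = [
    ("Through 1990", 666, 666),
    ("1991-1995", 3209, 2852),
    ("1996-2000", 8832, 1858),
    ("2001-2005", 18522, 342),
    ("2006-Oct 2011", 35618, 190),
]

total_entries = sum(entries for _, entries, _ in ROWS)
total_missing = sum(missing for _, _, missing in ROWS)

# Percentage of entries without a free R value in each deposition period.
for period, entries, missing in ROWS:
    print(f"{period}: {100 * missing / entries:.1f}% without free R")

overall = 100 * total_missing / total_entries
print(f"Overall: {total_missing} of {total_entries} ({overall:.1f}%)")
```

The row sums agree with the 5,908 and 66,847 entries given in the text.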

See Also

Content Donors

The initial contents of this page were adapted from the Glossary of ProteinExplorer.Org, with the permission of the principal author, Eric Martz. The Free R entry there was written primarily by Diana Ditmore.

Literature Cited

  1. Read RJ, Adams PD, Arendall WB 3rd, Brunger AT, Emsley P, Joosten RP, Kleywegt GJ, Krissinel EB, Lutteke T, Otwinowski Z, Perrakis A, Richardson JS, Sheffler WH, Smith JL, Tickle IJ, Vriend G, Zwart PH. A new generation of crystallographic validation tools for the protein data bank. Structure. 2011 Oct 12;19(10):1395-412. PMID:22000512 doi:10.1016/j.str.2011.08.006
  2. Brunger AT. Free R value: a novel statistical quantity for assessing the accuracy of crystal structures. Nature. 1992 Jan 30;355(6359):472-5. PMID:18481394

Proteopedia Page Contributors and Editors

Eric Martz, Eran Hodis