Free R: Difference between revisions

Eric Martz
 
[[Free R]] (also called R<sub>free</sub>) "is generally considered the most useful global measure of model-to-data agreement"<ref name="validation4pdb">PMID: 22000512</ref>. It is a statistical quantity introduced in 1992 by Axel T. Brünger<ref name="bruenger">PMID: 18481394</ref> to assess the quality of a model from X-ray crystallographic data. It is calculated in the same manner as the [[R value]], but from a subset of the data set aside for the calculation of free R, and not used in the refinement of the model. It is a more reliable tool for assessing the model than the R value because it is not self-referential -- that is, as an estimation of errors, free R is free of any bias that may have been introduced during refinement. Free R is an example of the application of the more general method of [http://en.wikipedia.org/wiki/Cross-validation_(statistics) cross validation].
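
The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular refinement program: it assumes the conventional definition R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|), assumes the calculated amplitudes are already scaled to the observed ones, and uses made-up amplitudes and a made-up free-set flag. The function names <code>r_value</code> and <code>r_work_and_free</code> are hypothetical.

```python
def r_value(f_obs, f_calc):
    """Conventional R value: sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|).

    Assumes f_calc has already been scaled to f_obs.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, is_free):
    """Split reflections by the free-set flag and compute R on each subset.

    Reflections flagged True are the ones set aside and never refined against.
    """
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if not free]
    free = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if free]
    r_work = r_value([o for o, _ in work], [c for _, c in work])
    r_free = r_value([o for o, _ in free], [c for _, c in free])
    return r_work, r_free


# Toy amplitudes (illustrative only, not real data).
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0]
f_calc = [95.0, 84.0, 57.0, 46.0, 22.0]
is_free = [False, False, False, True, True]

r_work, r_free = r_work_and_free(f_obs, f_calc, is_free)
print(f"R = {r_work:.3f}, free R = {r_free:.3f}")
```

In this toy case the free-set reflections agree less well with the model than the working-set reflections, so free R comes out higher than R, which is the usual pattern.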
 
==Help Interpreting Free R==
When a crystallographic protein structure is visualized in [[FirstGlance/Index|FirstGlance in Jmol]], its free R is automatically interpreted as a ''Reliability'' rating, judged relative to the model's [[Resolution]], at one of the following levels:
*<span style="color:#00e000;font-weight:bold;">Much better than average</span>
*<span style="color:green;font-weight:bold;">Better than average</span>
*<span style="color:black;font-weight:bold;">Average</span>
*<span style="color:#a0a000;font-weight:bold;">Worse than average</span>
*<span style="color:red;font-weight:bold;">Unreliable</span>


==Good Values for Free R==


As a rule of thumb, for models with a [[resolution]] of 2.0 &Aring; or better (that is, &Aring; values of 2.0 or lower), free R should not exceed (resolution/10) by more than 0.05; for example, if the resolution is 2.0 &Aring;, free R should not significantly exceed 0.25. For resolutions around 3.0 &Aring;, free R should not exceed (resolution/10), that is, about 0.30. Free R values in the worst 25% for the resolution of the model (see table below) raise doubts about the quality of the model.
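
The rule of thumb above can be written as a small Python check. This is only a sketch: the text gives a limit for resolutions of 2.0 &Aring; or better and for resolutions "around 3.0 &Aring;" but nothing in between, so the function returns <code>None</code> there rather than guessing. The width of "around 3.0 &Aring;" (<code>near_3_tolerance</code>) and the function names are assumptions made for this illustration.

```python
from typing import Optional


def free_r_limit(resolution: float, near_3_tolerance: float = 0.25) -> Optional[float]:
    """Upper limit for free R suggested by the rule of thumb, or None.

    near_3_tolerance is an assumed width for "around 3.0 Angstroms".
    """
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    if resolution <= 2.0:
        # 2.0 Angstroms or better: resolution/10 plus an allowance of 0.05.
        return resolution / 10 + 0.05
    if abs(resolution - 3.0) <= near_3_tolerance:
        # Around 3.0 Angstroms: resolution/10 with no allowance.
        return resolution / 10
    # The rule of thumb does not cover this resolution.
    return None


def exceeds_rule_of_thumb(free_r: float, resolution: float) -> Optional[bool]:
    """True if free R is above the limit, False if not, None if no limit applies."""
    limit = free_r_limit(resolution)
    if limit is None:
        return None
    return free_r > limit


for res, rfree in [(1.8, 0.22), (2.0, 0.28), (2.5, 0.27), (3.0, 0.31)]:
    print(res, rfree, free_r_limit(res), exceeds_rule_of_thumb(rfree, res))
```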


Free R is positively correlated with [[resolution]]<ref name="validation4pdb" />:


:{| cellpadding="4" style="text-align: center; border: 1px solid darkgray;"
|- style="background: #e0e0e0;"
! Resolution, &Aring;
! Median Free R
! Median Free R<br>- (Resolution/10)
! Free R<br>Best 25%
! Free R<br>Worst 25%
|-
| 1.0
| …
| …
| …
| …
|-
| …
| …
| …
| <0.19
| >0.23
|-
| 1.8
| 0.23
| 0.05
| <0.21
| >0.245
|-
| 2.0
| …
| …
| <0.22
| >0.26
|-
| 2.2
| 0.25
| 0.03
| <0.23
| >0.265
|-
| 2.5
| 0.26
| 0.01
| <0.245
| >0.28
|-
| 3.0
| 0.28
| <font color="red">- 0.02</font>
| <0.26
| >0.30
|-
| 3.5
| 0.30
| <font color="red">- 0.05</font>
| <0.275
| >0.33
|-
| ~4.0
| 0.31
| <font color="red">- 0.09</font>
| <0.28
| >0.35
|}
:*Data in the above table for resolutions <= 3.5 &Aring; are approximated from Figure 3C in <ref name="validation4pdb" />, based on X-ray entries deposited in the [[PDB]] since 1990.
:*Data for resolution ~4.0 &Aring; are based on the 199 X-ray entries with resolutions of 3.9-4.1 &Aring; for which free R values are given (Feb 2013).
:*The median resolution for X-ray entries in the [[PDB]] is 2.07 &Aring; (Feb 2013).
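
As an illustration of how the quartile columns of the table might be used, the following Python sketch places a free R value in the best 25%, middle 50%, or worst 25% for its resolution. It is not the algorithm used by FirstGlance in Jmol. It uses only the rows whose resolution and quartile cutoffs are both shown above, picks the row nearest to the model's resolution (a simplification assumed here; interpolation would be another option), and refuses resolutions outside the range those rows cover. The names <code>QUARTILE_ROWS</code> and <code>classify_free_r</code> are hypothetical.

```python
# (resolution in Angstroms, best-25% cutoff, worst-25% cutoff), copied from the table.
QUARTILE_ROWS = [
    (1.8, 0.21, 0.245),
    (2.0, 0.22, 0.26),
    (2.2, 0.23, 0.265),
    (2.5, 0.245, 0.28),
    (3.0, 0.26, 0.30),
    (3.5, 0.275, 0.33),
    (4.0, 0.28, 0.35),
]


def classify_free_r(free_r: float, resolution: float) -> str:
    """Place free R in the best 25%, middle 50%, or worst 25% for its resolution.

    Uses the nearest tabulated resolution, which is an assumption of this sketch.
    """
    if not 1.7 <= resolution <= 4.1:
        raise ValueError("resolution is outside the range covered by the rows used here")
    _, best, worst = min(QUARTILE_ROWS, key=lambda row: abs(row[0] - resolution))
    if free_r < best:
        return "best 25%"
    if free_r > worst:
        return "worst 25%"
    return "middle 50%"


print(classify_free_r(0.20, 2.0))
print(classify_free_r(0.27, 3.0))
print(classify_free_r(0.29, 2.5))
```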


==Bad Values for Free R==
<!--In October, 2011, there are 79 X-ray crystallographic entries in the [[PDB]] deposited 1990 or later, with free R values of >= 0.40. The median resolution of these 79 entries is 4.2 &Aring;.-->
As of October 2011, there were 182 X-ray crystallographic entries in the [[PDB]] with free R values >= 0.37. Of these, 45 had a resolution <= 3.0 &Aring;, and 5 had a resolution <= 2.0 &Aring;. It seems likely that most of these 45 models contain serious errors.


In one amazing case, [[1zff]], the resolution is given as 0.94 &Aring;, the R value as 0.428, and the free R as 0.444.
 
==No Value for Free R==
As of October 2011, there were 5,908 X-ray crystallographic entries in the PDB for which no free R value is given, representing 8.8% of all 66,847 X-ray entries. Most of these were deposited before the free R methodology was described by Brünger in 1992<ref name="bruenger" />.
 
:{| cellpadding="4" style="text-align: center; border: 1px solid darkgray;"
|- style="background: #e0e0e0;"
! Deposition Date
! X-ray entries
! No Free R
|-
| Through 1990
| 666
| 666 (100%)
|-
| 1991-1995
| 3,209
| 2,852 (89%)
|-
| 1996-2000
| 8,832
| 1,858 (21%)
|-
| 2001-2005
| 18,522
| 342 (1.8%)
|-
| 2006-<br>Oct 2011
| 35,618
| 190 (0.5%)
|}
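
The totals and percentages quoted in this section follow directly from the table above, as this short Python check shows. The counts are copied from the table; the name <code>summarize</code> is a hypothetical helper for this illustration.

```python
# (deposition period, X-ray entries, entries with no free R), copied from the table.
ROWS = [
    ("Through 1990", 666, 666),
    ("1991-1995", 3209, 2852),
    ("1996-2000", 8832, 1858),
    ("2001-2005", 18522, 342),
    ("2006-Oct 2011", 35618, 190),
]


def summarize(rows):
    """Return (total entries, entries with no free R, percentage with no free R)."""
    total = sum(entries for _, entries, _ in rows)
    missing = sum(no_free_r for _, _, no_free_r in rows)
    return total, missing, 100.0 * missing / total


total, missing, percent = summarize(ROWS)
print(f"{missing:,} of {total:,} X-ray entries ({percent:.1f}%) have no free R")

# Per-period percentages, for comparison with the last column of the table.
for period, entries, no_free_r in ROWS:
    print(f"{period}: {100.0 * no_free_r / entries:.1f}%")
```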


==See Also==

Proteopedia Page Contributors and Editors

Eric Martz, Eran Hodis