Practical Guide to Homology Modeling: Difference between revisions
Eric Martz (talk | contribs) |
'''Full-length templates are unlikely to be found''' for larger proteins (>~200 residues). 89% of structures in the [[Protein Data Bank]] were determined by [[X-ray crystallography]]. Most crystallographic structures represent fragments of full-length proteins, because fragments generally give higher crystallization success<ref>The overall success rate for solving the 3D structure of a given protein sequence is about 5%. Failures commonly occur because the expressed protein is not sufficiently soluble (about half of expressed sequences), because soluble proteins fail to crystallize, or because crystals are not well ordered.</ref>. 10% of structures in the Protein Data Bank were determined by solution [[NMR]], but these tend to be small proteins or single domains.
The median molecular mass of structures determined by NMR is 10 kDa<ref name="mmm">Median molecular masses in the PDB were determined in December 2014.</ref> (about 90 amino acids<ref>The average mass of an amino acid is 111.4 Daltons, weighted according to the frequencies of occurrences in proteins.</ref>). NMR is generally not able to determine atomic-resolution structures for proteins >30 kDa.
In contrast, the median molecular mass of [[Asymmetric unit|asymmetric units]] determined by X-ray crystallography is 50 kDa<ref name="mmm" />, and a few are very large, such as virus capsids (e.g. 4qyk, ~2 million Daltons; 4v99, 10 million Daltons) or ribosomes (e.g. 4w2i, 4.5 million Daltons).
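
The conversion between molecular mass and residue count used above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name <code>approx_residue_count</code> is hypothetical, and the only input taken from the text is the frequency-weighted average residue mass of 111.4 Daltons. Note that an asymmetric unit may contain several chains, so its residue estimate is a total, not a chain length.

```python
# Frequency-weighted average mass of an amino acid residue, in Daltons,
# as quoted in the text above.
AVG_RESIDUE_MASS_DA = 111.4


def approx_residue_count(mass_kilodaltons: float) -> int:
    """Estimate the number of residues from a mass given in kilodaltons.

    Hypothetical helper for illustration; it simply divides the mass by
    the average residue mass and rounds to the nearest whole residue.
    """
    return round(mass_kilodaltons * 1000 / AVG_RESIDUE_MASS_DA)


# Masses (in kilodaltons) mentioned in the paragraph above.
examples = [
    ("median NMR structure", 10),
    ("approximate upper limit for NMR", 30),
    ("median X-ray asymmetric unit", 50),
]

for label, mass in examples:
    print(f"{label}: {mass} kilodaltons is roughly {approx_residue_count(mass)} residues")
```

Under this assumption, the 10-kilodalton NMR median corresponds to about 90 residues, matching the figure given in the text.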
===Errors and uncertainties in the sequence alignment produce errors in the homology model===