Interpreting ConSurf Results: Difference between revisions

Eric Martz (talk | contribs)
Eric Martz (talk | contribs)
Line 15: Line 15:
At the ConSurf Server, click on ''Gallery'', then ''MHC Class I heavy chain'' (2VAA). In the finished results for chain A of 2VAA, under the subheading ''Sequence Data'', click on '''Sequences Used'''.


The APD is '''0.99'''. The MSA has 150 sequences, largely limited to sequences for major histocompatibility complex class I proteins. The labels of 101 sequences (67% of 150) contain "class I" or "class 1". There is only one class II protein sequence. Three sequences are labeled "zinc-alpha-2-glycoprotein", clearly a different function. There are 22 sequences labeled "uncharacterized protein" which nevertheless have high similarity to the query. 19 sequences are labeled "UPI000... related cluster". (This result was obtained in April, 2019.)
The APD is '''0.99'''. The MSA has 150 sequences, largely limited to sequences for major histocompatibility complex class I (MHC-I) proteins. The labels of 101 sequences (67% of 150) contain "class I" or "class 1". There is only one class II protein sequence. Three sequences are labeled "zinc-alpha-2-glycoprotein", a protein with a clearly different function. There are 22 sequences labeled "uncharacterized protein" which nevertheless have high similarity to the query, and 19 sequences are labeled "UPI000... related cluster". If the uncharacterized and "UPI000..." sequences are in fact class I sequences, then '''up to 142/150 (95%) of the sequences could be MHC-I''' (101 + 22 + 19 = 142). (This ConSurf result was calculated in April 2019.)
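
The percentages quoted above for the ConSurf Server MSA are simple arithmetic on the label tallies. The following is a minimal sketch that reproduces them; the dictionary keys, variable names and the <code>percent</code> helper are illustrative choices made for this example and are not ConSurf output fields.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Label tallies reported above for the ConSurf Server MSA of 2VAA chain A.
# Keys are shorthand chosen for this sketch, not ConSurf field names.
server_counts = {
    "class I": 101,
    "class II": 1,
    "zinc-alpha-2-glycoprotein": 3,
    "uncharacterized protein": 22,
    "UPI000... related cluster": 19,
}
total_sequences = 150

# Sequences whose labels explicitly say "class I" or "class 1".
labeled_mhc1 = server_counts["class I"]

# Upper bound: assume every uncharacterized and "UPI000..." sequence is class I.
possible_mhc1 = (
    labeled_mhc1
    + server_counts["uncharacterized protein"]
    + server_counts["UPI000... related cluster"]
)

print(f"labeled class I: {labeled_mhc1}/{total_sequences} "
      f"({percent(labeled_mhc1, total_sequences)}%)")
print(f"possible MHC-I:  {possible_mhc1}/{total_sequences} "
      f"({percent(possible_mhc1, total_sequences)}%)")
```

The second figure is only an upper bound, because it rests on the assumption that all unlabeled sequences are class I.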


In contrast, ConSurfDB used 300 sequences for its 2VAA chain A result (obtained October, 2019). The APD is '''1.62''', suggesting that a number of non-MHC-I proteins were included in the MSA. The MSA has 146 sequences (49% of 300 total) whose labels include "class I" (excluding the count with "class II"). The MSA includes 62 sequences labeled "Ig-like domain-containing protein", 20 "T-cell surface glycoprotein" sequences of the CD1 family, 17 apparently unrelated proteins (one or a few each), 14 histocompatibility class II proteins, 8 sequences for "hereditary hemochromatosis protein", 8 for "zinc-alpha-2-glycoprotein", and 11 uncharacterized proteins.
In contrast, ConSurfDB used 300 sequences for its 2VAA chain A result (calculated in October 2019). The APD is '''1.62''', suggesting that a number of non-MHC-I proteins were included in the MSA. '''Only 146/300 sequences (49%) in the MSA have labels that include "class I"''' (not counting labels that contain "class II"). The MSA includes 62 sequences labeled "Ig-like domain-containing protein", 20 "T-cell surface glycoprotein" sequences of the CD1 family, 17 apparently unrelated proteins (one or a few each), 14 histocompatibility class II proteins, 8 sequences for "hereditary hemochromatosis protein", 8 for "zinc-alpha-2-glycoprotein", and 11 uncharacterized proteins. Excluding the 11 uncharacterized proteins, that leaves 129 sequences (62 + 20 + 17 + 14 + 8 + 8; 43% of 300) that do not or may not function as MHC-I proteins.
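
The ConSurfDB figures above can be checked the same way. This is a minimal sketch under the same caveat: the category keys and variable names are shorthand for this example, not ConSurfDB field names.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Label tallies reported above for the ConSurfDB MSA of 2VAA chain A.
total_sequences = 300
labeled_class1 = 146
uncharacterized = 11

# Categories that do not, or may not, function as MHC-I proteins.
non_mhc1_counts = {
    "Ig-like domain-containing protein": 62,
    "T-cell surface glycoprotein (CD1 family)": 20,
    "apparently unrelated proteins": 17,
    "histocompatibility class II": 14,
    "hereditary hemochromatosis protein": 8,
    "zinc-alpha-2-glycoprotein": 8,
}
non_mhc1_total = sum(non_mhc1_counts.values())

# Sum of every tally that the text itemizes.
itemized_total = labeled_class1 + non_mhc1_total + uncharacterized

print(f"labeled class I: {labeled_class1}/{total_sequences} "
      f"({percent(labeled_class1, total_sequences)}%)")
print(f"non-MHC-I:       {non_mhc1_total}/{total_sequences} "
      f"({percent(non_mhc1_total, total_sequences)}%)")
print(f"itemized:        {itemized_total}/{total_sequences}")
```

The itemized tallies sum to 286 of the 300 sequences; the text does not break down the remainder.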

Proteopedia Page Contributors and Editors

Eric Martz