CRISPR-Cas9: Difference between revisions

No edit summary
Michal Harel (talk | contribs)
No edit summary
 
(23 intermediate revisions by 2 users not shown)
Line 1: Line 1:
<StructureSection load='' size='350' side='right' scene='74/742625/Cv3/12' caption=''>
<StructureSection load='' size='350' side='right' scene='74/742625/Cv3/13' caption='Cas9 complex with RNA (PDB code [[4zt0]])'>
SEE ALSO [[CRISPR-Cas]]
'''Part I'''
 
SEE ALSO  
*[[CRISPR-Cas9 Part II]]
*[[CRISPR-Cas]]
'''Background'''


Highlights


*'''CRISPR-Cas9''' is a powerful tool to modulate transcription in wide range of cell types.
*'''CRISPR-Cas9 (CRISPR type II-A)''' is a powerful tool to modulate transcription in wide range of cell types.
*An expanding set of CRISPR-based transcription effectors is available.
*Gene networks can be efficiently probed and modified for biotechnology applications.<ref name="Did">PMID:27344519</ref>
Line 44: Line 48:


'''Cas9-sgRNA-target DNA complexes from Streptococcus pyogenes:'''
*<scene name='74/746096/Cv3/1'>Domain organization of nuclease lobe of Cas9 from S. pyogenes</scene>, [[4zt0]])
*<scene name='74/746096/Cv3/1'>Domain organization of nuclease lobe of Cas9 from S. pyogenes</scene> ([[4zt0]])
*<scene name='74/742625/Cv/42'>Cas9-sgRNA-target DNA complex from Streptococcus pyogenes</scene> ([[5fw2]]).
*<scene name='74/742625/Cv2/12'>Cas9-sgRNA-target DNA complex from Streptococcus pyogenes</scene> ([[5b2s]]).
Line 51: Line 55:
Other representatives: [[5y36]], [[4un3]].


====Crystal Structure of Staphylococcus aureus Cas9<ref name="P5CZZ">PMID:26317473</ref>====
'''For continuation please see [[CRISPR-Cas9 Part II]]'''  
 
The RNA-guided DNA endonuclease Cas9 cleaves double-stranded DNA targets with a protospacer adjacent motif (PAM) and complementarity to the guide RNA. ''Staphylococcus aureus'' Cas9 (''Sa''Cas9), which is significantly smaller than ''Streptococcus pyogenes'' Cas9 (''Sp''Cas9), has been harnessed to facilitate efficient ''in vivo'' genome editing. The crystal structures of ''Sa''Cas9 in complex with a single guide RNA (sgRNA) and its double-stranded DNA targets, containing the 5'-TTGAAT-3' PAM and the 5'-TTGGGT-3' PAM, were reported at 2.6 and 2.7 Å resolution, respectively. The structures revealed the mechanism of the relaxed recognition of the 5'-NNGRRT-3' PAM by ''Sa''Cas9. A structural comparison of ''Sa''Cas9 with ''Sp''Cas9 highlighted both structural conservation and divergence, explaining their distinct PAM specificities and orthologous sgRNA recognition.
 
'''Overall Structure of the SaCas9–sgRNA–Target DNA Complex'''
 
The <scene name='74/742625/Cv/48'>crystal structure of full-length SaCas9</scene> (residues 1–1053; N580A/C946A) in complex with a 73-nucleotide (nt) sgRNA, a 28-nt target DNA strand and an 8-nt non-target DNA strand, containing the 5'-TTGAAT-3' PAM was solved ([[5czz]]). ''Sa''Cas9 adopts a <scene name='74/742625/Cv6/1'>bilobed architecture</scene> consisting of a REC lobe (residues 41–425) and a NUC lobe (residues 1–40 and 435–1053). The <scene name='74/742625/Cv6/2'>two lobes are connected</scene> by an arginine-rich bridge helix (residues 41–73) and a linker loop (residues 426–434). The <scene name='74/742625/Cv6/3'>NUC lobe consists of</scene> the RuvC (residues 1–40, 435–480 and 650–774), HNH (residues 520–628), WED (residues 788–909), and PI (residues 910–1053) domains. The <scene name='74/742625/Cv6/4'>PI domain</scene> can be divided into a Topoisomerase-homology (TOPO) domain and a C-terminal domain (CTD). The <scene name='74/742625/Cv6/5'>RuvC domain consists of</scene> three separate motifs (RuvC-I–III) and interacts with the HNH and PI domains. The <scene name='74/742625/Cv6/6'>HNH domain</scene> is connected to RuvC-II and RuvC-III by the L1 (residues 481–519) and L2 (residues 629–649) linker regions, respectively. The active site of the HNH domain is distant from the cleavage site in the target DNA strand (the phosphodiester linkage between dC3 and dA4), indicating that the present structure represents the inactive state. The WED and RuvC domains are connected by a <scene name='74/742625/Cv6/7'>“phosphate lock”</scene> loop (residues 775–787). Previous structural studies revealed that SpCas9 undergoes conformational rearrangements upon guide RNA binding, to form the central channel between the REC and NUC lobes. In the absence of the guide RNA, SpCas9 and AnCas9 (''Actinomyces naeslundii'' Cas9) adopt a closed conformation, where the active site of the HNH domain is covered by the RuvC domain. 
In contrast, the ternary and quaternary complex structures of SpCas9 adopt an open conformation and have the central channel, which accommodates the guide RNA–target DNA heteroduplex (referred to as the guide:target heteroduplex). The present <scene name='74/742625/Cv6/9'>quaternary complex structure of SaCas9 adopts a similar open conformation to form the central channel, which accommodates the guide:target heteroduplex</scene>, suggesting that the guide RNA-induced conformational activation is conserved between SaCas9 and SpCas9. A structural comparison between SaCas9 and SpCas9 revealed that, although their overall architectures are similar, there are notable differences in their REC, WED, and PI domains, as described in detail below, thereby explaining the significant sequence and size differences of the two Cas9 orthologs. 
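
The residue ranges listed above can be collected into a small lookup table. The following is a minimal sketch in Python, not part of the original page; the names <code>SACAS9_REGIONS</code> and <code>regions_of</code> are hypothetical, and the ranges are transcribed from the paragraph above (the bridge helix is listed alongside the REC lobe because the text gives overlapping ranges for them).

```python
# Residue ranges transcribed from the paragraph above (SaCas9 numbering, inclusive).
# The dictionary and function names are illustrative assumptions, not an established API.
SACAS9_REGIONS = {
    "REC lobe": [(41, 425)],
    "bridge helix": [(41, 73)],
    "linker loop": [(426, 434)],
    "RuvC": [(1, 40), (435, 480), (650, 774)],
    "L1 linker": [(481, 519)],
    "HNH": [(520, 628)],
    "L2 linker": [(629, 649)],
    "phosphate lock loop": [(775, 787)],
    "WED": [(788, 909)],
    "PI": [(910, 1053)],
}


def regions_of(residue):
    """Return the names of every listed region that contains the residue number."""
    return [
        name
        for name, spans in SACAS9_REGIONS.items()
        if any(start <= residue <= end for start, end in spans)
    ]
```

For example, <code>regions_of(580)</code> places the mutated residue of the N580A construct in the HNH domain, and <code>regions_of(50)</code> returns both the REC lobe and the bridge helix.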
 
'''Structure of the sgRNA–Target DNA Complex'''
 
The SaCas9 sgRNA consists of the guide region (G1–C20), repeat region (G21–G34), tetraloop (G35–A38), anti-repeat region (C39–C54), stem loop 1 (A56–G68), and single-stranded linker (U69–U73), with A55 connecting the anti-repeat region and stem loop 1. U73 at the 3' end is disordered in the present structure. The guide region (G1–C20) and the target DNA strand (dG1–dC20) form the <scene name='74/742625/Cv6/11'>guide:target heteroduplex</scene>, whereas the target DNA strand (dC(−8)–dA(−1)) and the non-target DNA strand (dT1*–dG8*) form a <scene name='74/742625/Cv6/12'>PAM-containing duplex</scene> (referred to as the PAM duplex). The repeat (G21–G34) and anti-repeat (C39–C54) regions form a distorted duplex (referred to as the <scene name='74/742625/Cv6/13'>repeat:anti-repeat duplex</scene>) via 13 Watson-Crick base pairs. <scene name='74/742625/Cv6/16'>Stem loop 1</scene> is formed via three Watson-Crick base pairs (G57:C67–C59:G65) and two non-canonical base pairs (A56:G68 and A60:A63). U64 does not base pair with A60 and is flipped out of the stem loop. The N1 and N6 of A63 hydrogen bond with the 2'-OH and N3 of A60, respectively. G68 stacks with G57:C67, with the G68 N2 interacting with the backbone phosphate group between A55 and A56. A55 adopts the ''syn'' conformation, and its adenine base stacks with U69. In addition, the N1 of A55 hydrogen bonds with the 2'-OH of G68, thus stabilizing the basal region of stem loop 1. An adenosine residue immediately after the repeat:anti-repeat duplex is highly conserved among CRISPR-Cas9 systems, and the equivalent adenosine in the SpCas9 sgRNA, A51, also adopts the ''syn'' conformation, suggesting that these adenosine residues play conserved key roles in connecting the repeat:anti-repeat duplex and stem loop 1.
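
The sgRNA region boundaries given above tile the 73-nt molecule without gaps, which can be checked mechanically. The sketch below is illustrative only; <code>SGRNA_REGIONS</code>, <code>tiles_contiguously</code> and <code>split_sgrna</code> are hypothetical names, and the coordinates are the 1-based positions quoted in the paragraph above.

```python
# 1-based, inclusive nucleotide ranges transcribed from the paragraph above.
# All names here are illustrative assumptions.
SGRNA_REGIONS = [
    ("guide", 1, 20),
    ("repeat", 21, 34),
    ("tetraloop", 35, 38),
    ("anti-repeat", 39, 54),
    ("A55 connector", 55, 55),
    ("stem loop 1", 56, 68),
    ("linker", 69, 73),
]


def tiles_contiguously(regions, total_length=73):
    """Check that the regions cover positions 1..total_length with no gap or overlap."""
    expected_start = 1
    for _name, start, end in regions:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start - 1 == total_length


def split_sgrna(sequence):
    """Split a 73-nt sgRNA sequence into the named regions."""
    if len(sequence) != 73:
        raise ValueError("expected a 73-nt sgRNA sequence")
    return {name: sequence[start - 1:end] for name, start, end in SGRNA_REGIONS}
```

Passing an actual 73-nt sgRNA sequence to <code>split_sgrna</code> returns a 20-nt guide, a 14-nt repeat, a 16-nt anti-repeat, and so on; no sequence is given here because the page does not list one.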
 
'''Recognition Mechanism of the Guide:Target Heteroduplex'''
 
The <scene name='74/742625/Cv6/17'>guide:target heteroduplex is accommodated in the central channel</scene> formed between the REC and NUC lobes. The sugar-phosphate backbone of the PAM-distal region (A3–U6) of the sgRNA interacts with the <scene name='74/742625/Cv6/20'>REC lobe (Thr238, Tyr239, Lys248, Tyr256, Arg314, Asn394, and Gln414)</scene>. In SpCas9 and SaCas9, the RNA–DNA base pairing in the 8 bp PAM-proximal “seed” region in the guide:target heteroduplex is critical for Cas9-catalyzed DNA cleavage. Consistent with this, the phosphate backbone of the sgRNA seed region (C13–C20) is extensively recognized <scene name='74/742625/Cv6/23'>by the bridge helix (Asn44, Arg48, Arg51, Arg55, Arg59, and Arg60) and the REC lobe (Arg116, Gly117, Arg165, Gly166, Asn169, Arg208, Arg209, and Tyr211)</scene>, as in the case of SpCas9. These structural observations explain the RNA-guided DNA targeting mechanism of SaCas9. The C-terminal region of the REC lobe interacts with the PAM-distal region of the heteroduplex, whereas the N-terminal region of the REC lobe interacts with the repeat:anti-repeat duplex and the PAM-proximal region of the heteroduplex. Notably, the C-terminal region of the REC lobe of SaCas9 shares structural similarity with those of SpCas9 (PDB: [[4un3]], 26% identity, rmsd of 1.9 Å for 177 equivalent Cα atoms) and AnCas9 (PDB: [[4oge]], 16% identity, rmsd of 3.2 Å for 167 equivalent Cα atoms). These structural findings suggested that the Cas9 orthologs recognize the PAM-distal region of the guide:target heteroduplex in a similar manner.
 
'''Recognition Mechanism of the sgRNA Scaffold'''
 
The <scene name='74/742625/Cv6/32'>repeat:anti-repeat duplex is recognized by the REC and WED domains, primarily through interactions between the protein and the sugar-phosphate backbone</scene>. Consistent with our data showing that the distorted repeat:anti-repeat duplex is critical for Cas9-catalyzed DNA cleavage, the <scene name='74/742625/Cv6/33'>internal loop is recognized by the WED domain</scene>. The 2'-OH of C30 hydrogen bonds with <scene name='74/742625/Cv6/34'>Tyr868</scene>, and the backbone phosphate groups of U31, C45, and U46 interact with <scene name='74/742625/Cv6/35'>Lys870, Arg792, and Lys881</scene>, respectively. These structural observations explain the structure-dependent recognition of the repeat:anti-repeat duplex by SaCas9. Stem loop 1 is recognized by the bridge helix and the REC lobe. The phosphate backbone of <scene name='74/742625/Cv6/39'>stem loop 1</scene> interacts with the bridge helix (<scene name='74/742625/Cv6/40'>Arg47, Arg54, Arg55, Arg58, and Arg59</scene>) and the REC lobe (<scene name='74/742625/Cv6/41'>Arg209, Gly216, and Ser219</scene>). The <scene name='74/742625/Cv6/42'>2'-OH of A63 hydrogen bonds with His62</scene>. The flipped-out <scene name='74/742625/Cv6/43'>U64 is recognized by Arg209 and Glu213 via stacking and hydrogen-bonding interactions</scene>, respectively. A55 is extensively recognized by the phosphate lock loop. The <scene name='74/742625/Cv7/3'>N6, N7, and 2'-OH of A55 hydrogen bond with Asn780/Arg781, Leu783, and Lys906</scene>, respectively. <scene name='74/742625/Cv7/4'>Lys57 interacts with the backbone phosphate group between C54 and A55, and the side chain of Leu783 forms hydrophobic contacts with the nucleobases of A55 and A56</scene>. 
The phosphate backbone of the <scene name='74/742625/Cv7/5'>linker region electrostatically interacts with the RuvC domain (Arg452, Lys459, and Arg774) and the phosphate lock loop (Arg781)</scene>, and the nucleobase of <scene name='74/742625/Cv7/6'>G70 stacks with the side chain of Arg47</scene> on the bridge helix.
 
 
'''Recognition Mechanism of the 5'-NNGRRT-3' PAM'''
 
SaCas9 recognizes the 5'-NNGRRN-3' PAM, with a preference for a thymine base at the 6th position, which is distinct from the 5'-NGG-3' PAM of SpCas9. In the present structures containing either the <scene name='74/742625/Cv8/3'>5'-TTGAAT-3' PAM</scene> or the 5'-TTGGGT-3' PAM, the PAM duplex is sandwiched between the WED and PI domains, and the PAM in the non-target DNA strand is read from the major groove side by the PI domain. dT1* and dT2* do not directly contact the protein. Consistent with the observed requirement for the 3rd G in the 5'-NNGRRT-3' PAM, the O6 and N7 of dG3* form bidentate hydrogen bonds with the side chain of Arg1015, which is anchored via salt bridges with Glu993 in both complexes. In the 5'-TTGAAT-3' PAM complex, the <scene name='74/742625/Cv8/4'>N7 atoms of dA4* and dA5* form direct and water-mediated hydrogen bonds with Asn985 and Asn985/Asn986/Arg991</scene>, respectively. In addition, the N6 of dA5* forms a water-mediated hydrogen bond with Asn985. Similarly, in the 5'-TTGGGT-3' PAM complex, the N7 atoms of dG4* and dG5* form direct and water-mediated hydrogen bonds with Asn985 and Asn985/Asn986/Arg991, respectively. The O6 of dG5* forms a water-mediated hydrogen bond with Asn985. These structural features explain the ability of SaCas9 to recognize the purine nucleotides at positions 4 and 5 in the 5'-NNGRRT-3' PAM. The O4 of dT6* hydrogen bonds with Arg991, explaining the preference of SaCas9 for the 6th T in the 5'-NNGRRT-3' PAM. Single alanine mutations of these PAM-interacting residues reduced the cleavage activity ''in vivo'', and double mutations abolished the activity, confirming the importance of Asn985, Asn986, Arg991, Glu993, and Arg1015 for PAM recognition. In addition, the phosphate backbone of the PAM duplex is recognized from the minor groove side by the WED domain (Tyr789, Tyr882, Lys886, Asn888, Ala889, and Leu909), in a distinct manner from that in SpCas9. 
Together, these structural and functional data have revealed the mechanism underlying the relaxed recognition of the 5'-NNGRRT-3' PAM by SaCas9.
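
The PAM notation used above follows the IUPAC nucleotide codes (N = any base, R = A or G). As a rough illustration of what the 5'-NNGRRT-3' consensus means at the sequence level, the following Python sketch scans a non-target-strand sequence for matching sites. The function names are hypothetical, only the codes needed for the PAMs on this page are included, and the sketch says nothing about cleavage efficiency, which depends on the structural contacts described above.

```python
import re

# IUPAC codes needed for the PAMs discussed here (N = any base, R = purine).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}


def pam_to_regex(pam):
    """Translate an IUPAC PAM string such as 'NNGRRT' into a regex fragment."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_pam_sites(non_target_strand, pam="NNGRRT"):
    """Return 0-based start positions of every PAM match, including overlapping ones."""
    # A lookahead is used so that overlapping matches are all reported.
    pattern = re.compile("(?=" + pam_to_regex(pam) + ")")
    return [m.start() for m in pattern.finditer(non_target_strand.upper())]
```

Both PAMs present in the crystal structures (TTGAAT and TTGGGT) match the default pattern, whereas a sequence with a pyrimidine at position 4 or 5 does not. Passing <code>pam="NGG"</code> gives the corresponding scan for the SpCas9 PAM.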
=See also=
*[[Cas9]]
*[[Endonuclease]]
*[[Cas9 (hebrew)]]
</StructureSection>
</StructureSection>
== References ==

Proteopedia Page Contributors and Editors

Alexander Berchansky, Michal Harel