Thermostability of membrane protein helix–helix interaction elucidated by statistical analysis
Edited by Gunnar von Heijne
Department of Molecular Biophysics and Biochemistry, Yale University, P.O. Box 208114, New Haven, CT 06520-8114, USA
Received 14 October 2002; revised 1 November 2002; accepted 4 November 2002; available online 14 November 2002.
Dirk Schneider1, Yang Liu1, Mark Gerstein and Donald M. Engelman
A prerequisite for the survival of (micro)organisms at high temperatures is an adaptation of protein stability to extreme environmental conditions. In contrast to soluble proteins, where many factors have already been identified, the mechanisms by which the thermostability of membrane proteins is enhanced are almost unknown. The hydrophobic membrane environment constrains possible stabilizing factors for transmembrane domains, so that a difference might be expected between soluble and membrane proteins. Here we present sequence analysis of predicted transmembrane helices of the genomes from eight thermophilic and 12 mesophilic organisms. A comparison of the amino acid compositions indicates that more polar residues can be found in the transmembrane helices of thermophilic organisms. Particularly, the amino acids aspartic acid and glutamic acid replace the corresponding amides. Cysteine residues are found to be significantly decreased by about 70% in thermophilic membrane domains suggesting a non-specific function of most cysteine residues in transmembrane domains of mesophilic organisms. By a pair-motif analysis of the two sets of transmembrane helices, we found that the small residues glycine and serine contribute more to transmembrane helix–helix interactions in thermophilic organisms. This may result in a tighter packing of the helices allowing more hydrogen bond formation.
Author Keywords: Thermostability; Sequence analysis;
Transmembrane helix; Genome; Mesophilic; Thermophilic
In contrast to soluble proteins, no rigorous analysis has been performed on integral membrane proteins of thermophiles. As with mesophiles, most isolated soluble enzymes from hyperthermophiles show maximal activities at temperatures close to the optimal growth temperatures of the host organisms. Previous biophysical and computational studies have revealed many factors that contribute to the thermostability of soluble proteins, including hydrogen bonding, hydrophobic internal packing, tighter packing and salt bridge optimization (for a recent review see ). From these studies, it became obvious that there is no single or most preferred mechanism for increasing thermostability. The hydrophobic environment of the membrane restricts the possibilities for adaptation of membrane proteins to high temperatures. It has been shown that the loops connecting the individual α-helices of membrane proteins contribute to the stability of the interactions of the individual transmembrane helices (TMs); enhancement of the interactions within these loops could therefore increase the stability of helix–helix interactions without changing the helices themselves. It is also known that microorganisms adapt to high temperatures by changing their lipid composition. Since the lipid environment contributes to the stability of membrane protein helix–helix interactions [5 and 6], the different lipid composition might modulate the stability of the interaction of transmembrane α-helices. But what of the contribution of the helices themselves?
Here we present amino acid composition analysis and a statistical pair-motif analysis of the TMs from eight thermophilic and 12 mesophilic organisms. The results indicate that hydrogen bonding and tighter packing of the TMs add extra stability to TM–TM interactions at higher temperatures.
Open reading frames from eight (hyper)thermophilic and 12 mesophilic bacteria from a proteome analysis database  ( Table 1) were submitted to the TMHMM server  for TM prediction. To eliminate most signal sequences, predicted TM sequences were excluded if they had a charged residue within the first seven residues and a stretch of 14 residues with a GES hydrophobicity value under -1 kcal/mol [9 and 10].
Table 1. Thermophilic and mesophilic organisms whose proteomes were used for TM prediction and analysis
For the prediction of pair motifs, the TM sequences were screened for homology by eliminating each sequence that was highly similar to another sequence in the same TMs, as described in detail by Senes et al. . The pair analysis was performed on all combinations of amino acids separated by one to 10 residues. Pairs of amino acids at the positions i, i+k are separated by k-1 residues. In order to limit the pair analysis to the core amino acids of the TMs, the analysis was performed using a hydrophobic window of 18 amino acids instead of the entire TM.
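The pair enumeration described above can be illustrated with a minimal Python sketch. It only counts observed (i, i+k) pairs; it does not reproduce the homology filtering or the significance calculation of Senes et al. The function names and the choice of the central 18 residues as the window are assumptions made for illustration, since the text does not state how the hydrophobic window is positioned.

```python
from collections import Counter


def core_window(tm, width=18):
    """Return a `width`-residue core of a TM sequence.

    Assumption: the window is taken from the centre of the helix.
    Sequences shorter than `width` are returned unchanged.
    """
    if len(tm) <= width:
        return tm
    start = (len(tm) - width) // 2
    return tm[start:start + width]


def count_pairs(tms, max_k=10, width=18):
    """Count residue pairs at positions i, i+k for k = 1..max_k.

    A pair at i, i+k is separated by k-1 residues. Keys of the
    returned Counter are (first_residue, second_residue, k).
    """
    counts = Counter()
    for tm in tms:
        core = core_window(tm, width)
        for k in range(1, max_k + 1):
            for i in range(len(core) - k):
                counts[(core[i], core[i + k], k)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical toy sequence containing a Gly-x-x-x-Gly arrangement.
    toy = ["LLGAAAGLL"]
    pairs = count_pairs(toy)
    print(pairs[("G", "G", 4)])
```

Keeping the separation k in the key lets motifs such as G at i and G at i+4 be tallied separately from the same residue pair at other spacings.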
The statistical significance of the deviation of an identified amino acid pair from its average expected value was calculated by the two-tailed integral of the probability distribution of the pair, as discussed in detail previously .
The statistical significance of the difference between the thermophilic and mesophilic sample means of the amino acid composition was calculated by a one-tailed t-test with 18 degrees of freedom. The t-test assumes a normally distributed sample. A probability of less than 5% (|t|-values>1.73) that the difference between the means is due to chance (P<0.05) was accepted as significant. At the 1% level of significance (P<0.01), |t|-values of <2.55 were rejected.
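As a rough illustration of this comparison, the sketch below computes a per-proteome residue percentage and a two-sample t statistic, then applies the |t| thresholds quoted in the text. It assumes a pooled-variance (Student) t-test, which is consistent with 18 degrees of freedom for 8 + 12 samples but is not stated explicitly in the text. All function names and the numbers in the usage example are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def percent_composition(tms, residue):
    """Percentage of `residue` among all residues of a set of TM sequences."""
    total = sum(len(tm) for tm in tms)
    return 100.0 * sum(tm.count(residue) for tm in tms) / total


def pooled_t(sample_a, sample_b):
    """Two-sample t statistic with pooled variance (an assumption).

    Returns (t, degrees_of_freedom), where df = n_a + n_b - 2.
    """
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(sample_a)
                  + (nb - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Thresholds for 18 degrees of freedom, as quoted in the text.
T_CRIT_5PCT = 1.73
T_CRIT_1PCT = 2.55


def significance(t):
    """Classify |t| against the 5% and 1% thresholds."""
    if abs(t) > T_CRIT_1PCT:
        return "P<0.01"
    if abs(t) > T_CRIT_5PCT:
        return "P<0.05"
    return "not significant"


if __name__ == "__main__":
    # Hypothetical per-proteome percentages of one residue (not real data).
    thermo = [0.3, 0.5, 0.4, 0.6, 0.2, 0.5, 0.4, 0.3]
    meso = [1.2, 1.5, 0.9, 1.8, 1.1, 1.4, 1.6, 1.0, 1.3, 1.7, 1.2, 1.5]
    t, df = pooled_t(thermo, meso)
    print(df, significance(t))
```

With eight thermophilic and 12 mesophilic proteomes, each contributing one percentage per residue type, the pooled test has 8 + 12 - 2 = 18 degrees of freedom, matching the value given above.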
Using the TMHMM program, a total of 21473 TMs were predicted for the mesophilic set of proteins and 13340 for the thermophilic organisms. The average length of the predicted TMs was 22 residues for both sets of proteins (thermophilic and mesophilic), which is in good agreement with the average length of TMs predicted in other studies [11 and 12].
Previous studies of the amino acid composition of all proteins in an organism [2 and 13] as well as a study of the amino acid compositions of soluble proteins from mesophilic and thermophilic organisms have already indicated trends for the mechanism of thermostabilization of proteins. In these studies an increase in the content of charged residues was found in thermophiles, mostly at the expense of uncharged polar residues [2 and 14]. In thermophilic α-helical proteins the charged residues are preferentially arranged in an i, i+4 helical repeat pattern, suggesting stabilization of the thermophilic proteins via intrahelical salt bridges. However, this trend may not apply to thermophilic membrane proteins, since acidic and basic amino acids are often uncharged in the membrane environment and are therefore unlikely to form salt bridges. Nevertheless, the presence of uncharged basic and acidic residues could still result in strong interactions of TMs [15 and 16]. By analyzing the localization of polar residues in the TMs of solved membrane protein structures, it has been shown that these residues form hydrogen bonds between TMs, resulting in strong interactions between pairs of TMs [17 and 18]. Although the content of ionizable residues is generally very low in the hydrophobic membrane environment, the amino acid analysis of the sets of mesophilic and thermophilic TMs showed that the contents of the acidic residues, Asp and Glu, are increased in the TMs of thermophiles (Fig. 1), while those of the basic residues, Lys, His, and Arg, remain almost unchanged. As already observed in an overview of thermophilic proteins, the higher content of ionizable residues is coupled to a reduction of the uncharged polar residues Gln and Asn in the TMs. The reduction of the content of Asn and Gln could be explained by the fact that the amides are less stable under high temperature conditions and these residues can be deamidated to Asp and Glu.
On the other hand, although Asp and Glu cannot contribute to the membrane protein's stability by forming salt bridges, they can form stronger interhelical hydrogen bonds than Asn and Gln, resulting in a more stable packing between two TMs. The enhanced strength of the hydrogen bonds is most likely caused by the higher electronegativity of the oxygen atom in the acids compared to the nitrogen atom in the corresponding amides. Generally, TMs contain fewer ionizable residues, because the free energy of partitioning ionizable amino acids into the non-polar membrane environment is not favorable. Nevertheless, the membrane incorporation of polar residues is thermodynamically more favorable under higher temperature conditions, which may explain the increased occurrence of the ionizable residues in TM helices of thermophilic organisms compared to the mesophilic TM domains. Since most proteomes of thermophiles are from archaea, we also cannot completely exclude the possibility that the observed differences in the composition of the polar residues are due to generic effects between archaea and bacteria.
Fig. 1. Relative amino acid composition of TM domains from mesophilic and thermophilic proteomes. A: Amino acid composition (%) of mesophilic and thermophilic TM domains. The variations within the mesophilic and thermophilic proteomes are given as error bars. B: Variation of the amino acid composition in thermophilic relative to mesophilic TM domains. Columns of differences that are statistically significant (t-statistics) at the 5% level (P<0.05) are dashed. Columns of differences statistically significant at the 1% level (P<0.01) are shown in black.
A somewhat higher content of Tyr and Trp is found in the TMs of thermophiles (5.44%) versus the mesophiles (4.88%), and Tyr is used more than Trp. It is well known that these amino acids are frequently found near the boundary between the non-polar bilayer interior and the lipid headgroups, where they may stabilize TMs by a combination of non-polar and hydrogen bonding interactions with the bilayer [20, 21, 22 and 23]. Further, Tyr may contribute to individual helix stability, as it is able to form intrahelical hydrogen bonds [17 and 18].
We found that the content of Cys in TM helices from thermophilic organisms is strikingly decreased, to about 30% of its occurrence in mesophiles. Statistical analysis of the soluble proteins of mesophiles and thermophiles revealed that the occurrence of Cys is, on average, reduced by 10% in thermophiles. Since most thermophiles used in this study are archaea, we compared the Cys content of the thermophilic eubacterium Aquifex aeolicus to the average Cys content of the mesophilic eubacteria. Compared to the mesophilic group, the Cys content in the membrane protein domains of Aquifex is decreased by over 80%, indicating that the observed difference in the Cys content is not caused by comparing archaea with eubacteria. A further differentiation of the thermophilic group into moderately thermophilic (Methanobacterium thermoautotrophicum and Thermoplasma acidophilum) and hyperthermophilic organisms also did not show significant differences. Although there is a certain variation in the Cys content in mesophiles (0.87–2.98%) as well as in thermophiles (0.14–1.13%), the limits are strongly shifted to much lower percentages in the proteomes from thermophilic organisms. This general decrease may be explained by the higher reactivity and sensitivity of Cys at high temperatures. While in soluble proteins disulfide bonds can contribute to the stability of an extracytoplasmic protein by decreasing the entropy of the protein's unfolded state, the occurrence of a disulfide bridge in the membrane environment has not yet been observed, and pairs of Cys residues close enough in space to interact with each other can only rarely be found in the transmembrane domains of membrane proteins. In the membrane, Cys can form hydrogen bonds stabilizing a TM (by forming an intrahelical hydrogen bond) or a TM–TM interaction [17, 18 and 25].
Previous statistical analysis of the amino acid distribution within 22 protein families showed that a GC bias in the DNA sequence results in an enrichment of some amino acids while others are under-represented [26 and 27]. Though Cys is one of the amino acids found to be under-represented in the proteins from extreme thermophiles (Ile, Met, Phe, Ser, Thr, Cys, Trp) , only the content of Trp is also decreased in TMs from thermophilic organisms, which suggests that this factor does not generally affect the amino acid composition of TM domains. Since the thermolability of Cys alone cannot explain the different degree of Cys reduction in soluble and TM domains, its decreased occurrence might be explained by some specific function of this amino acid in the membrane environment. Besides contributing to the stability of a TM by forming hydrogen bonds, Cys also fulfills specific functions, like cofactor binding. A recent statistical analysis of Cys clustering in thermophilic versus mesophilic proteomes revealed that, even though the Cys residues involved in specific functions are conserved, the overall content of Cys residues was found to be decreased in the proteome of thermophiles . The high number of replaceable Cys residues in membrane proteins might reflect a minimal usage of this residue in specific functions in membrane proteins compared to soluble proteins.
While in soluble proteins the amino acid Pro is known not to be tolerated in β-sheets or in α-helices, this amino acid can frequently be found in TM α-helices. The relatively high occurrence of Pro in TMs can be explained by distinct roles of Pro in membranes as discussed in . In addition, recent data suggest that Pro residues in TMs are destabilizers of misfolded states of TM domains, helping to prevent misfolding of membrane proteins. In this context, the significantly higher content of Pro in TMs of thermophilic organisms (Fig. 1) can be explained by the greater tendency of hydrophobic proteins to aggregate and misfold under high temperature conditions. The higher occurrence of Pro in TMs could help to prevent TM protein misfolding in thermophiles.
To find possible pair-motifs that mediate TM–TM interactions in membrane domains from thermophiles vs. mesophiles, we analyzed frequently occurring pairs of residues in the two datasets of TM domains (Table 2). In the group of over-represented pair-motifs for mesophilic TMs, seven motifs can be found that contain a Cys residue, while in thermophilic TMs none can be found containing this residue. The occurrence of Cys residues in over-represented pair motifs suggests an architectural function of these residues in the TMs of mesophilic organisms, and the absence of Cys in the TM domains of thermophilic organisms suggests that the architectural function of the Cys residue containing motifs can be fulfilled by other motifs at high temperatures. This assumption again shows the strong disfavor of Cys in thermophilic proteins.
Table 2. The most significant over-represented pair-motifs of mesophilic and thermophilic TM helices sorted by significance down to 10⁻⁶
We also found that Ala, Gly and Ser residues have higher occurrences in pair motifs in thermophilic TM domains. At a 10⁻⁶ significance level, 24 of 36 pair motifs (67%) contain one of these three residues in thermophiles, whereas only 16 of 38 pair motifs (42%) do so in mesophiles. The role of Gly residues for the interaction of membrane proteins was recently described, and it was also shown that pair-motifs containing Gly residues stabilize proteins under higher temperature conditions. It is likely that the presence of small residues (Ala, Gly, Ser) favors a tighter packing of TM helices, which can allow the formation of more hydrogen bonds, especially with the backbone of a TM. Ser residues can also drive the interaction of TMs by forming hydrogen bonds [17 and 33], possibly explaining their higher occurrence in thermophilic helix motifs. Although seven out of the eight thermophilic proteomes are from archaea, while all mesophiles are bacteria, a generally higher preference for small residues due to generic effects of genetic drift between archaea and bacteria can be excluded, since t-statistics on the amino acid composition did not show any significant differences of these residues in thermophiles vs. mesophiles.
By analyzing eight thermophilic and 12 mesophilic proteomes, we found a significant increase in the relative amount of the charged residues Asp and Glu and a decrease of the polar, non-ionizable amino acids Asn and Gln. It is likely that the ionizable residues contribute to a network of hydrogen bonds for stabilizing the TM–TM interactions under high temperature conditions. The preference for the small residues Ala, Gly and Ser in pair-motifs suggests a more tightly packed TM interface in membrane proteins of thermophilic bacteria.
As observed already for soluble proteins, the content of Cys residues is
reduced in helical membrane proteins of thermophilic bacteria. This most likely
reflects a non-specific function of most Cys residues in helical membrane
proteins, allowing a replacement of this residue by less reactive residues.
We thank the members of the research groups for helpful comments and
critical reading of the manuscript, and especially A. Senes for his help and
advice. This work was supported by grants from the NIH, the NSF and the
Bundesministerium für Bildung und Forschung (Leopoldina fellowship to D.S.).
1. C. Vieille, D.S. Burdette and J.G. Zeikus Biotechnol. Annu. Rev. 2 (1996), pp. 1–83.
2. C. Vieille and G.J. Zeikus Microbiol. Mol. Biol. Rev. 65 (2001), pp. 1–43.
3. T.W. Kahn, J.M. Sturtevant and D.M. Engelman Biochemistry 31 (1992), pp. 8829–8839.
4. B. Tolner, B. Poolman and W.N. Konings Comp. Biochem. Physiol. A: Physiol. 118 (1997), pp. 423–428.
5. P. Lague, M.J. Zuckermann and B. Roux Biophys. J. 81 (2001), pp. 276–284.
6. S. Mall, R. Broadbridge, R.P. Sharma, J.M. East and A.G. Lee Biochemistry 40 (2001), pp. 12379–12386.
7. R. Apweiler et al. Nucleic Acids Res. 29 (2001), pp. 44–48.
8. A. Krogh, B. Larsson, G. von Heijne and E.L. Sonnhammer J. Mol. Biol. 305 (2001), pp. 567–580.
9. M. Gerstein Proteins 33 (1998), pp. 518–534.
10. D.M. Engelman, T.A. Steitz and A. Goldman Annu. Rev. Biophys. Biophys. Chem. 15 (1986), pp. 321–353.
11. A. Senes, M. Gerstein and D.M. Engelman J. Mol. Biol. 296 (2000), pp. 921–936.
12. E. Wallin and G. Von Heijne Protein Sci. 7 (1998), pp. 1029–1038.
13. D. Rajdeep and M. Gerstein Funct. Integr. Genomics 1 (2000), pp. 76–88.
14. S. Chakravarty and R. Varadarajan FEBS Lett. 470 (2000), pp. 65–69.
15. F.X. Zhou, M.J. Cocco, W.P. Russ, A.T. Brunger and D.M. Engelman Nat. Struct. Biol. 7 (2000), pp. 154–160.
16. F.X. Zhou, H.J. Merianos, A.T. Brunger and D.M. Engelman Proc. Natl. Acad. Sci. USA 98 (2001), pp. 2250–2255.
17. L. Adamian and J. Liang Proteins 47 (2002), pp. 209–218.
18. L. Adamian and J. Liang J. Mol. Biol. 311 (2001), pp. 891–907.
19. W.C. Wimley and S.H. White Nat. Struct. Biol. 3 (1996), pp. 842–848.
20. I.T. Arkin and A.T. Brunger Biochim. Biophys. Acta 1429 (1998), pp. 113–128.
21. R.A. Reithmeier Curr. Opin. Struct. Biol. 5 (1995), pp. 491–500.
22. H.A. Rinia et al. Biochemistry 41 (2002), pp. 2814–2824.
23. W.M. Yau, W.C. Wimley, K. Gawrisch and S.H. White Biochemistry 37 (1998), pp. 14713–14718.
24. M. Matsumura, G. Signor and B.W. Matthews Nature 342 (1989), pp. 291–293.
25. J.L. Popot and D.M. Engelman Annu. Rev. Biochem. 69 (2000), pp. 881–922.
26. M. Di Giulio Gene 261 (2000), pp. 189–195.
27. V. Wilquet and M. Van de Casteele Res. Microbiol. 150 (1999), pp. 21–32.
28. V. Rosato, N. Pucello and G. Giuliano Trends Genet. 18 (2002), pp. 278–281.
29. C.M. Deber and A.G. Therien Nat. Struct. Biol. 9 (2002), pp. 318–319.
30. W.C. Wigley et al. Nat. Struct. Biol. 9 (2002), pp. 381–388.
31. G. Kleiger, R. Grothe, P. Mallick and D. Eisenberg Biochemistry 41 (2002), pp. 5990–5997.
32. A. Senes, I. Ubarretxena-Belandia and D.M. Engelman Proc. Natl. Acad. Sci. USA 98 (2001), pp. 9056–9061.
33. J.P. Dawson, J.S. Weinger and D.M. Engelman J. Mol. Biol. 316 (2002), pp. 799–805.
1 These authors contributed equally to this work.
Corresponding author: Fax: (1)-203-432 6381; email: firstname.lastname@example.org
Volume 532, Issues 1-2, 4 December 2002, Pages 231-236