Calculating populations of subcellular compartments using density matrix formalism
Vadim Alexandrov 1*, Mark Gerstein 1,2
1 Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Ave., New Haven, Connecticut 06511
2 Department of Computer Science, Yale University, 266 Whitney Ave., New Haven, Connecticut 06511
email: Vadim Alexandrov (vadim.alexandrov@yale.edu)
*Correspondence to Vadim Alexandrov, Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Ave., New Haven, Connecticut 06511
Keywords: bioinformatics; quantum; localization; density matrix
In our earlier paper (Ref. [1]) we developed an integrated probabilistic system for predicting the subcellular localization of proteins and estimating the relative populations of the various compartments in yeast. To justify our formulas, we show here that there is a one-to-one correspondence between our previous calculations and the prediction of the state of a many-particle quantum system. The equivalence between these two types of predictions is easily established if one maps the probability of finding a particular protein in a certain subcellular compartment to the probability of measuring the corresponding quantum particle in one of the possible quantum states (the number of proteins being equal to the number of particles in the system, and the number of compartments being equal to the number of accessible quantum states). Once this correspondence is established, we can use a well-known formula from quantum statistical mechanics to calculate the overall occupation of a particular quantum state, associating the state with the corresponding subcellular compartment. Here we present the details of how we arrived at the formula for the compartment population, borrowing tools from quantum statistical mechanics. © 2001 John Wiley & Sons, Inc. Int J Quantum Chem, 2001
Received: 13 March 2001; Accepted: 24 May 2001
DOI: 10.1002/qua.1704
Introduction
In every living cell there is a complex machinery that sorts newly synthesized proteins and sends them to their final locations (subcellular compartments). Without an accurate estimate of the compartmental populations, it is hard to comprehend how a particular type of cell functions as a whole at the molecular level. However, with the advent of whole-genome sequencing, we now know the sequences of many proteins without knowing their localization.
In our earlier work [1] we developed a way of estimating the overall compartment population (i.e., the total number of proteins in a compartment) without rigidly assigning each protein to a single compartment. For each protein $n$ in the cell we constructed a discrete likelihood distribution that we called a probability state vector, $\mathbf{P}_n$, whose components are the probabilities of finding that protein in each of the subcellular compartments. To estimate the total protein population of a particular compartment, we proposed to extract the corresponding components from the probability state vectors and add them up. The integer closest to the result of this summation is then the sought occupation number (population) for the given compartment. This approach differs from those previously suggested [2-8] in that it estimates the overall population without requiring a definite prediction for every individual protein. While this summation of probabilities may appear intuitively obvious, its formal justification is not trivial. Fortunately, the mathematical formalism needed for the very similar task of estimating state occupation probabilities (and, thus, overall populations) was developed and successfully applied long ago in quantum mechanics [9-11]. We recognized this and established a one-to-one correspondence between the probability state vectors and the set of predictions regarding the results of measurements performed on a hypothetical quantum mechanical system.
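The summation described above can be sketched in a few lines of Python. The three compartments, their names, and the probability values below are invented purely for illustration and are not data from Ref. [1]; the function name `compartment_populations` is likewise hypothetical.

```python
# Hypothetical probability state vectors for four proteins over three
# compartments (illustrative numbers only, not data from Ref. [1]).
COMPARTMENTS = ["nucleus", "cytoplasm", "mitochondria"]

state_vectors = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
    [0.7, 0.2, 0.1],
]


def compartment_populations(vectors):
    """Sum the l-th component of every state vector, then round to an integer."""
    n_compartments = len(vectors[0])
    for v in vectors:
        # Each state vector must be a probability distribution.
        assert len(v) == n_compartments
        assert abs(sum(v) - 1.0) < 1e-9
    totals = [sum(v[l] for v in vectors) for l in range(n_compartments)]
    return totals, [round(t) for t in totals]


totals, populations = compartment_populations(state_vectors)
for name, total, population in zip(COMPARTMENTS, totals, populations):
    print(f"{name}: summed probability {total:.2f}, population {population}")
```

Because every state vector sums to one, the unrounded totals add up to the number of proteins; the rounded populations need not do so in general.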
Theory
CALCULATION OF STATE POPULATIONS IN STATISTICAL QUANTUM MECHANICS
In statistical quantum mechanics, incomplete information about a system usually presents itself in the following way: the state of the system may be either the state $|\psi_1\rangle$ with probability $p_1$, or the state $|\psi_2\rangle$ with probability $p_2$, etc., such that the probabilities $p_j$ form a distribution, i.e., $\sum_j p_j = 1$. We then say that we are dealing with a statistical mixture of states $|\psi_1\rangle, |\psi_2\rangle, \ldots$ with probabilities $p_1, p_2, \ldots$ (i.e., the initial state vector of the system is not perfectly known). Note that these states need not be eigenstates of any operator, nor be orthogonal, but they can always be chosen normalized.
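In the case of a so-called pure state $|\psi_l\rangle$ (when all probabilities $p_j$ are zero except $p_l = 1$), the probability $P(a_n)$ of obtaining the result $a_n$ in a measurement of an observable $A$ follows directly from the fourth postulate of quantum mechanics:
$$P(a_n) = \langle\psi_l|\,\Pi_n\,|\psi_l\rangle, \tag{1}$$
where $\Pi_n$ is the projector onto the eigensubspace associated with $a_n$. Introducing the density operator for the pure state, $\rho_l = |\psi_l\rangle\langle\psi_l|$, the above expression can also be written as
$$P(a_n) = \mathrm{Tr}\{\rho_l \Pi_n\}. \tag{2}$$
For a statistical mixture of states, the system is found in the state $|\psi_l\rangle$ with probability $p_l$; the result in (1) must therefore be weighted by $p_l$ and summed over all values of $l$, i.e.,
$$P(a_n) = \sum_l p_l\,\langle\psi_l|\,\Pi_n\,|\psi_l\rangle = \sum_l p_l\,\mathrm{Tr}\{\rho_l \Pi_n\} = \mathrm{Tr}\{\rho\,\Pi_n\}, \tag{3}$$
the formula widely used in quantum statistical mechanics. In the above equation we also introduced the density operator for the whole system,
$$\rho = \sum_l p_l\,\rho_l = \sum_l p_l\,|\psi_l\rangle\langle\psi_l|. \tag{4}$$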
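As a numerical check of the weighting argument above, the sketch below compares the weighted sum of pure-state probabilities with the trace of the density matrix times the projector. The two-dimensional real state vectors, the weights, and the helper names (`outer`, `matmul`, `trace`) are assumptions chosen for illustration; the two states are deliberately non-orthogonal.

```python
import math


def outer(u, v):
    """Outer product |u><v| for real vectors."""
    return [[ui * vj for vj in v] for ui in u]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal elements of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


# Two normalized, non-orthogonal states and their mixture weights (assumed values).
theta = math.pi / 6
states = [[1.0, 0.0], [math.cos(theta), math.sin(theta)]]
weights = [0.25, 0.75]

# Projector onto the first basis vector, standing in for the eigensubspace of a_n.
projector = [[1.0, 0.0], [0.0, 0.0]]

# Weighted sum of pure-state probabilities: sum_l p_l <psi_l| P |psi_l>.
weighted_sum = 0.0
for p, psi in zip(weights, states):
    p_psi = [sum(projector[i][j] * psi[j] for j in range(2)) for i in range(2)]
    weighted_sum += p * sum(psi[i] * p_psi[i] for i in range(2))

# Trace form: Tr{rho P} with rho = sum_l p_l |psi_l><psi_l|.
rho = [[0.0, 0.0], [0.0, 0.0]]
for p, psi in zip(weights, states):
    pure = outer(psi, psi)
    rho = [[rho[i][j] + p * pure[i][j] for j in range(2)] for i in range(2)]

trace_form = trace(matmul(rho, projector))
print(weighted_sum, trace_form)
```

The two printed numbers agree to within floating-point rounding, and the trace of `rho` is one because both states are normalized and the weights sum to one.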
Let us mention here that if the states $|\psi_l\rangle$ are orthogonal, then the density operator in this representation has a diagonal form.
OUR APPROACH: BUILDING THE BRIDGE
Now, we notice that the state vector for a particular protein $n$ (with the total number of proteins equal to $N$),
$$\mathbf{P}_n = \big(p_n(1),\, p_n(2),\, \ldots,\, p_n(M)\big), \tag{5}$$
representing the probabilities $p_n(l)$ of this protein being found in each of the $M$ subcellular compartments, can be thought of as a set of predictions concerning the results of measurements performed on a quantum mechanical system consisting of $N$ distinguishable particles, each of which can be found in one of $M$ states. Since the subcellular compartments do not overlap (a particular protein cannot simultaneously belong to two different compartments), we can claim that the possible states of the introduced hypothetical quantum mechanical system are orthogonal. Therefore, the density matrix corresponding to each individual protein will necessarily have the diagonal form
$$\rho_n = \begin{pmatrix} p_n(1) & 0 & \cdots & 0 \\ 0 & p_n(2) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & p_n(M) \end{pmatrix}. \tag{6}$$
Note that $\mathrm{Tr}\{\rho_n\} = 1$ for each $n$, manifesting the conservation of probability for each particle.
The projector onto a particular state subspace (corresponding to a particular subcellular compartment, for example, the first one) then assumes the following matrix form:
$$\Pi^{(1)} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}. \tag{7}$$
One can see how this projector onto the subspace spanned by the first state (compartment) effectively extracts the probability for particle $n$ (protein $n$) to be found in state one (compartment one):
$$\mathrm{Tr}\{\rho_n \Pi^{(1)}\} = p_n(1). \tag{8}$$
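The projector onto this subspace for the whole system of $N$ distinguishable particles will be just a direct sum (denoted by $\oplus$ below) of the projectors onto this subspace, taken over all particles,
$$\Pi^{(1)}_{\mathrm{tot}} = \bigoplus_{n=1}^{N} \Pi^{(1)}_n, \tag{9}$$
where $\Pi^{(1)}_n$ is the projector onto the first state of particle $n$, or, in the block-matrix notation,
$$\Pi^{(1)}_{\mathrm{tot}} = \begin{pmatrix} \Pi^{(1)}_1 & 0 & \cdots & 0 \\ 0 & \Pi^{(1)}_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \Pi^{(1)}_N \end{pmatrix}, \tag{10}$$
or, in the explicit matrix form,
$$\Pi^{(1)}_{\mathrm{tot}} = \mathrm{diag}\big(\underbrace{1, 0, \ldots, 0}_{M},\ \underbrace{1, 0, \ldots, 0}_{M},\ \ldots,\ \underbrace{1, 0, \ldots, 0}_{M}\big). \tag{11}$$
The corresponding occupation number (population) for this state will be just
$$N_1 = \mathrm{Tr}\{\rho_{\mathrm{tot}}\,\Pi^{(1)}_{\mathrm{tot}}\} = \sum_{n=1}^{N} \mathrm{Tr}\{\rho_n \Pi^{(1)}_n\} = \sum_{n=1}^{N} p_n(1), \tag{12}$$
where $\rho_{\mathrm{tot}} = \bigoplus_{n=1}^{N} \rho_n$ is the block-diagonal density matrix of the whole system.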
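The direct-sum construction can be checked numerically with a small sketch. It assumes that the density matrix of the whole system is the block-diagonal direct sum of the single-protein diagonal matrices; the probability values and the helper names (`diag`, `direct_sum`, `trace_of_product`, `occupation`) are hypothetical, and compartments are indexed from zero in the code.

```python
def diag(values):
    """Square matrix with `values` on the diagonal and zeros elsewhere."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def direct_sum(blocks):
    """Block-diagonal matrix built from a list of square blocks."""
    size = sum(len(b) for b in blocks)
    total = [[0.0] * size for _ in range(size)]
    offset = 0
    for block in blocks:
        for i, row in enumerate(block):
            for j, x in enumerate(row):
                total[offset + i][offset + j] = x
        offset += len(block)
    return total


def trace_of_product(a, b):
    """Tr{a b}, computed without storing the full matrix product."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))


# Illustrative state vectors for N = 4 proteins and M = 3 compartments.
state_vectors = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
    [0.7, 0.2, 0.1],
]
M = len(state_vectors[0])

# Assumed total density matrix: direct sum of the diagonal single-protein matrices.
rho_total = direct_sum([diag(v) for v in state_vectors])


def occupation(compartment):
    """Trace of rho_total times the direct-sum projector for one compartment."""
    unit = [1.0 if l == compartment else 0.0 for l in range(M)]
    projector_total = direct_sum([diag(unit) for _ in state_vectors])
    return trace_of_product(rho_total, projector_total)


for l in range(M):
    plain_sum = sum(v[l] for v in state_vectors)
    print(l, occupation(l), plain_sum)
```

For each compartment the trace agrees with the plain sum of the corresponding components, which is the content of the occupation-number formula; the trace of `rho_total` itself equals the number of proteins.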
What might seem to be natural even without using the density matrix formalism, i.e., summing over the probabilities related only to the first compartment in (12) to get the total occupancy for this compartment, is now rigorously proved to be a valid mathematical procedure justifying our heuristic protein localization approach described earlier.
One can easily think of further extensions of the discussed formalism to study bulk protein traffic among compartments. For instance, if we assume that proteins in the cell are not necessarily confined to a particular compartment (e.g., a particular protein may contain several different localization signal sequences), the states of the corresponding quantum system will no longer be orthogonal, which introduces nondiagonal elements into the density matrix. By diagonalizing the density matrix it should then be possible to find the natural basis vectors of intercompartmental traffic, which may give a biologist further insight into the dynamics of cell function.
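A toy illustration of this extension is given below: a mixture of two non-orthogonal states produces a density matrix with nonzero off-diagonal elements, which is then diagonalized in closed form. The two-state system, the angle, and the weights are assumptions made only for this sketch, and no biological interpretation of the resulting eigenvectors is implied.

```python
import math

# Toy mixture of two non-orthogonal real states (all values are assumptions).
theta = math.pi / 4
psi_1 = [1.0, 0.0]
psi_2 = [math.cos(theta), math.sin(theta)]
p_1, p_2 = 0.5, 0.5

# rho = p_1 |psi_1><psi_1| + p_2 |psi_2><psi_2|, a real symmetric 2x2 matrix
# written as [[a, b], [b, d]].
a = p_1 * psi_1[0] ** 2 + p_2 * psi_2[0] ** 2
b = p_1 * psi_1[0] * psi_1[1] + p_2 * psi_2[0] * psi_2[1]
d = p_1 * psi_1[1] ** 2 + p_2 * psi_2[1] ** 2

# Closed-form eigenvalues of a real symmetric 2x2 matrix.
mean = (a + d) / 2.0
radius = math.hypot((a - d) / 2.0, b)
eigenvalues = [mean + radius, mean - radius]

# Rotation angle of the eigenbasis; the first eigenvector belongs to the
# larger eigenvalue, the second to the smaller one.
phi = 0.5 * math.atan2(2.0 * b, a - d)
eigenvectors = [
    [math.cos(phi), math.sin(phi)],
    [-math.sin(phi), math.cos(phi)],
]

print("off-diagonal element:", b)
print("eigenvalues:", eigenvalues)
print("eigenvectors:", eigenvectors)
```

The eigenvalues sum to one, so they can again be read as occupation probabilities, now in the rotated basis rather than in the original one.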
Conclusion
By establishing a one-to-one correspondence between the original bioinformatics problem and a well-known formula for the occupation probability in statistical quantum mechanics, we justified our previously proposed method for calculating populations of subcellular compartments. Intrinsically probabilistic, quantum theory turns out to be useful in contemplating the nascent statistical laws of bioinformatics and in providing a new perspective for the exploration of this emerging field.
[1] Drawid, A.; Gerstein, M. J Mol Biol 2000, 301, 1059-1075.
[2] Nakai, K.; Kanehisa, M. Prot Struct Funct Genet 1991, 11, 95-110.
[3] Nakai, K.; Kanehisa, M. Genomics 1992, 14, 897-911.
[4] Nakai, K.; Horton, P. Trends Biochem Sci 1999, 24, 34-36.
[5] Nakai, K.; Horton, P. Intelligent Systems for Molecular Biology 1996, 4, 109-115.
[6] Nakai, K.; Horton, P. Intelligent Systems for Molecular Biology 1997, 5, 147-152.
[7] Reinhardt, A.; Hubbard, T. Nucleic Acids Res 1998, 26, 2230-2236.
[8] Andrade, M.; O'Donoghue, S.; Rost, B. J Mol Biol 1998, 276, 517-525.
[9] Bohm, A. in Quantum Mechanics: Foundations and Applications, 2nd ed.; Springer: New York, 1986.
[10] Cohen-Tannoudji, C.; Diu, B.; Laloë, F. in Mécanique Quantique, 2nd ed.; Hermann: Paris, 1977.
[11] Landau, L. D.; Lifshitz, E. in Kvantovaia Mehanika (Course of Theoretical Physics; Vol. 3); Nauka: Moscow, 1980.
Copyright © 1999 by John Wiley & Sons, Inc. All rights reserved.