Collections under which this article appears: Computational methods |
1 Department of Molecular Biophysics and Biochemistry, 2 Computer Science, 266 Whitney Avenue, Yale University, PO Box 208114, New Haven, CT 06520, USA, 3 Department of Biological Sciences and Center for Computational Biology and Bioinformatics, Columbia University, 1212 Amsterdam Avenue MC2441, New York, NY 10027, USA
Ronald Jansen, Computational Biology Center, Memorial Sloan-Kettering Cancer Center, 307 East 63rd Street, New York, NY 10021, USA
Received August 5, 2002; Revised January 23, 2003; Accepted February 18, 2003
ABSTRACT
INTRODUCTION
In 1987, the codon adaptation index (CAI) was proposed as a quantitative way of predicting the expression level of a gene based on its codon sequence (1). More recently, the codon usage was introduced as an alternative quantitative indicator (3). It also uses the occurrence of codons in a gene sequence to predict whether genes are likely to be highly expressed, although the formalism is quite different from the one used for the CAI. A related method, the codon bias formalism, is based on similar principles (6).
Expression level indicators such as these are widely used and are important in a variety of contexts. First, there is the annotation of genome sequences. The expression level indicators can serve as one of the variables to determine how likely the transcription and translation of an open reading frame (ORF) into a protein product is. Secondly, in heterologous gene expression, the codon-based expression indicators are helpful for finding the codon sequences that are most likely to yield high expression. The codon-based expression indicators and related methods are also often used as convenient rules of thumb in other applications.
Given that the codon-based expression models have these important applications, it is perhaps surprising that they are still based on rather qualitative assumptions about gene expression. For instance, the parameters underlying the CAI model rely on the codon composition of only a limited set of highly expressed genes; to define the parameters in the CAI model (see below), Sharp and Li counted the codon frequency in only 24 highly expressed genes (1). About half of these genes are ribosomal; the remaining ones are mostly metabolic enzymes.
In the codon usage model, the parameters are based on a somewhat broader set of highly expressed genes. The codon usage model has mainly been applied to fast growing bacteria, for which, as Karlin et al. have shown, it is a reasonable assumption that ribosomal genes, chaperones, and translation processing factors are highly expressed (7,8).
In summary, the codon-based expression models are based on qualitative estimates of the expression levels of limited gene sets. But since these models were first proposed, several quantitative expression data sets, covering the majority of genes in a genome, have become available. This raises the natural question whether we could improve the parameters of the codon-based expression indicators by considering larger sets of genes with more accurate expression data. We present the results of such a procedure here, using the expression information available for the organism yeast.
In the following sections we briefly recap the CAI and codon usage formalisms. Later, we show how to calculate new parameters for these models. We also propose an alternative linear model to predict the expression levels from the codon composition of genes.
The CAI model
The CAI model assigns a parameter, termed relative adaptiveness by Sharp and Li, to each of the 61 codons (stop codons excluded) (1). The relative adaptiveness of a codon is defined as its frequency relative to the most often used synonymous codon; note that this parameter is computed from a set of highly expressed genes G (we leave aside the question of how to define this set of genes for now). It is given by:
$$w_{aa,i} = \frac{f_{aa,i}}{f_{aa,\max}} \qquad (1)$$
where faa,i is the frequency of codon i (which encodes amino acid aa), and faa,max the frequency of the codon most often used for encoding amino acid aa in a set of highly expressed genes G. The relative adaptiveness parameter waa,i ranges from 0 to 1, with 0 indicating that a codon is not present at all in G, and 1 indicating the codon that occurs most often in G for a given amino acid.
The CAI of a gene g is then simply the geometric average of the relative adaptiveness of all codons in a gene sequence:
$$\mathrm{CAI}(g) = \left( \prod_{i=1}^{N} w_i \right)^{1/N} \qquad (2)$$
Here, wi is the relative adaptiveness of the ith codon in a gene with N codons. Grouping identical codons, this formula can be transformed into:
$$\mathrm{CAI}(g) = \prod_{k=1}^{61} w_k^{X_{k,g}} \qquad (3)$$
where wk now represents the relative adaptiveness of the kth out of the 61 codons in the genetic code (excluding stop codons); Xk,g is the fraction of codon k among the total number of codons in gene g:
$$X_{k,g} = \frac{C_{k,g}}{\sum_{j=1}^{61} C_{j,g}} \qquad (4)$$
where Ck,g is the number of times codon k appears in gene g. Note that wk = wk(G) in equation 3 is dependent on the set of highly expressed genes G.
Like the relative adaptiveness, the CAI also ranges from 0 to 1. Higher CAI values indicate genes that are more likely to be highly expressed.
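The CAI calculation described above can be illustrated with a minimal sketch. It assumes a toy synonymous-codon table with only two amino acids and an invented one-gene reference set; the function names (`relative_adaptiveness`, `cai_geometric_mean`, `cai_from_fractions`) are hypothetical and chosen for this illustration only. The two CAI functions compute the geometric mean over codon positions and the equivalent product over codon types weighted by codon fractions, which should agree.

```python
import math
from collections import Counter

# Assumption: a toy synonymous-codon table with two amino acids only;
# a real analysis would list all 61 sense codons.
SYNONYMS = {"Lys": ["AAA", "AAG"], "Arg": ["AGA", "CGT"]}


def split_codons(seq):
    """Split a coding sequence into consecutive triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_genes):
    """w = codon count / count of the most used synonymous codon in the set G."""
    counts = Counter(c for gene in reference_genes for c in split_codons(gene))
    w = {}
    for codons in SYNONYMS.values():
        f_max = max(counts[c] for c in codons)
        for c in codons:
            w[c] = counts[c] / f_max if f_max else 0.0
    return w


def cai_geometric_mean(gene, w):
    """Geometric mean of w over all scored codon positions of the gene."""
    scored = [c for c in split_codons(gene) if c in w]
    if not scored:
        raise ValueError("gene contains no codons covered by the table")
    if any(w[c] == 0.0 for c in scored):
        return 0.0  # a codon absent from G drives the geometric mean to zero
    return math.exp(sum(math.log(w[c]) for c in scored) / len(scored))


def cai_from_fractions(gene, w):
    """Same quantity as a product over codon types, weighted by codon fractions."""
    scored = [c for c in split_codons(gene) if c in w]
    if not scored:
        raise ValueError("gene contains no codons covered by the table")
    n = len(scored)
    return math.prod(w[k] ** (count / n) for k, count in Counter(scored).items())


# Invented reference set G: AAG x3, AAA x1, AGA x3, CGT x1.
reference = ["".join(["AAG"] * 3 + ["AAA"] + ["AGA"] * 3 + ["CGT"])]
w = relative_adaptiveness(reference)
gene = "AAGAAAAGACGT"  # one codon of each type
print(w, cai_geometric_mean(gene, w), cai_from_fractions(gene, w))
```

Working in log space in `cai_geometric_mean` avoids underflow for long genes; the explicit zero check is needed because the logarithm of a zero weight is undefined.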
The codon usage model
Karlin et al. define the codon bias of a gene g relative to a gene set G as (4):
$$B(g|G) = \sum_{aa} p_{aa}(g) \left[ \sum_{(x,y,z)=aa} \left| f(x,y,z) - g(x,y,z) \right| \right] \qquad (5)$$
where paa(g) is the fraction of amino acid aa in gene g; f(x, y, z) is the frequency of a codon triplet (x, y, z) in gene g, normalized such that the frequencies of the synonymous codons of each amino acid sum to 1; and g(x, y, z) is the corresponding normalized codon frequency in gene set G. Equation 5 is written in the notation of Karlin et al. We can rewrite equation 5 in our own notation as follows:
where Xk,g and Xk,G are defined as in equation 4. Note that k has replaced (x, y, z) as the summation index. Given these definitions, Karlin et al. define an expression level measure E(g) as follows (8):
$$E(g) = \frac{B(g|C)}{\tfrac{1}{2} B(g|RP) + \tfrac{1}{4} B(g|Ch) + \tfrac{1}{4} B(g|Tf)} \qquad (7)$$
where the gene set C comprises all genes in the genome, RP the ribosomal proteins, Ch chaperones, and Tf translation processing factors. E(g) is close to zero if gene g has a codon composition close to the average composition of the genome [E(g) ≈ 0 because B(g|C) ≈ 0], while E(g) would take on very large values if the codon composition of gene g is close to the composition of ribosomal genes, chaperones and translation processing factors [E(g) >> 1 because B(g|RP), B(g|Ch), B(g|Tf) ≈ 0]. The idea is that highly expressed genes tend to have higher values of E than lowly expressed genes.
Karlin et al. have shown that highly expressed genes can best be differentiated from lowly expressed genes in the multidimensional space of the different codon bias terms B(g|RP), B(g|Ch) and B(g|Tf) (8). However, in this study, we use the simplified expression measure E(g|G), defined as:
$$E(g|G) = \frac{B(g|C)}{B(g|G)} \qquad (8)$$
where G is a set of highly expressed genes. Thus, E is dependent on the set G that can be chosen in different ways. In other words, the parameters of the model are the 61 codon fractions Xk,G in the gene set G (see equation 6).
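As an illustration of the codon usage formalism, the sketch below computes a codon bias B(g|G) as the amino-acid-weighted sum of absolute differences between the normalized synonymous codon frequencies of a gene and of a gene set, and then a simplified expression measure as the ratio of the bias relative to the whole genome to the bias relative to a highly expressed set. The two-amino-acid codon table, the example sequences and the function names are assumptions made for this example, and the ratio form of E(g|G) is our reading of the simplified measure rather than a confirmed transcription.

```python
from collections import Counter

# Assumption: toy codon table; a real analysis would cover all 61 sense codons.
SYNONYMS = {"Lys": ["AAA", "AAG"], "Arg": ["AGA", "CGT"]}


def codon_counts(genes):
    """Pooled codon counts over a list of coding sequences."""
    counts = Counter()
    for seq in genes:
        counts.update(seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    return counts


def normalized_freqs(counts):
    """Frequencies of synonymous codons, summing to 1 within each amino acid."""
    freqs = {}
    for codons in SYNONYMS.values():
        total = sum(counts[c] for c in codons)
        for c in codons:
            freqs[c] = counts[c] / total if total else 0.0
    return freqs


def codon_bias(gene, gene_set):
    """B(g|G): amino-acid-weighted L1 distance between codon usage of g and G."""
    gene_counts = codon_counts([gene])
    f = normalized_freqs(gene_counts)
    g = normalized_freqs(codon_counts(gene_set))
    n = sum(gene_counts[c] for codons in SYNONYMS.values() for c in codons)
    bias = 0.0
    for codons in SYNONYMS.values():
        p_aa = sum(gene_counts[c] for c in codons) / n  # amino acid fraction in g
        bias += p_aa * sum(abs(f[c] - g[c]) for c in codons)
    return bias


def expression_measure(gene, genome, high_set):
    """Simplified E(g|G) = B(g|C) / B(g|G); infinite if g matches G exactly."""
    denom = codon_bias(gene, high_set)
    return float("inf") if denom == 0 else codon_bias(gene, genome) / denom


gene = "AAGAAGAGACGT"          # AAG, AAG, AGA, CGT
genome = ["AAAAAGAGAAGA"]      # invented "genome-wide" codon usage
high_set = ["AAGAGA"]          # invented highly expressed set
print(codon_bias(gene, genome), expression_measure(gene, genome, high_set))
```

A gene whose codon usage resembles the highly expressed set and differs from the genome average gets a large value, which matches the qualitative behavior described in the text.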
Given this formal description of the CAI and the codon usage, the question is how we can use the genome-wide expression data to optimize the 61 parameters in the two models with respect to the prediction of expression levels.
MATERIALS AND METHODS
Parameterization of the CAI and codon usage models with whole-genome expression data
Figure 1 schematically shows the procedure we used to parameterize the CAI and codon usage models with the expression data. We start by selecting one of the three populations mentioned above as an evaluation set. The evaluation set is later used to evaluate how well the CAI or codon usage model predicts actual expression levels. We also need to define a parameterization set. The parameterization set is the set of highly expressed genes G (see Introduction); it is used to calculate the parameters wk(G) for the CAI (see equation 3) and the parameters Xk,G for the codon usage (see equation 6). To define the parameterization set, we choose one of the three populations and an expression level threshold T. We only include those genes of the population in the parameterization set whose expression level exceeds this threshold. With the parameters in hand, we are able to compute CAI and codon usage values for all genes in the evaluation set. We evaluate how well the CAI and codon usage models predict expression levels with two figures of merit: the Pearson correlation and the Spearman rank correlation. (Given a vector a of abundance levels in the evaluation set and a vector C of CAI or codon usage values, we calculate the Pearson correlation as corr[log(a), log(C)] and the rank correlation as corr[rank(a), rank(C)].)
We can iterate the procedure by changing the expression level threshold T and repeating the subsequent steps until we arrive at an optimal figure of merit. This gives us optimal parameters for the CAI and codon usage models.
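The threshold scan can be sketched as follows. This is an illustration only, not the code used in the study: the `fit` and `predict` callables stand in for whichever model is being parameterized (for example, computing relative adaptiveness values and then a CAI), and all names, including `scan_thresholds`, are hypothetical. The Pearson correlation is taken on log values and the rank correlation is computed as the Pearson correlation of average ranks.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns NaN if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def scan_thresholds(param_pop, eval_pop, fit, predict, thresholds):
    """param_pop and eval_pop are lists of (sequence, abundance) pairs.

    For each threshold T, genes with abundance > T form the parameterization
    set; the threshold with the highest Pearson correlation is returned.
    """
    abundances = [a for _, a in eval_pop]
    log_a = [math.log(a) for a in abundances]
    best = None
    for T in thresholds:
        G = [seq for seq, a in param_pop if a > T]
        if not G:
            continue
        params = fit(G)
        scores = [predict(seq, params) for seq, _ in eval_pop]
        if min(scores) <= 0:
            continue  # logarithm undefined for non-positive scores
        r = pearson(log_a, [math.log(s) for s in scores])
        if math.isnan(r):
            continue
        if best is None or r > best["pearson"]:
            rho = pearson(ranks(abundances), ranks(scores))
            best = {"T": T, "pearson": r, "rank": rho, "n_genes": len(G)}
    return best
```

In this sketch the Pearson correlation is the figure of merit being maximized; to optimize the rank correlation instead, the comparison would be made on `rho`.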
Example of the CAI parameterization
Figure 2 shows a specific example of the parameterization of the CAI with [GProt, aProt] as both the parameterization and evaluation population and illustrates how the figure of merit (Pearson correlation of the CAI values and the evaluation set) changes as a function of the expression level threshold T. When the threshold reaches T = 66 200 protein copies/cell the Pearson correlation reaches a maximum. At this point, there are only 21 genes in the parameterization set. The maximum correlation is slightly greater than the correlation between the CAI based on the original parameters by Sharp and Li (1) and the same evaluation set.
The CAI formalism itself, slightly modified, suggests a multivariate linear model for doing this. Starting with equation 3, we can take the logarithm on both sides to obtain:
$$\log \mathrm{CAI}(g) = \sum_{k=1}^{61} X_{k,g} \log w_k \qquad (9)$$
If we introduce vk ≡ log(wk) and consider that the log(CAI) is related to the logarithm of the gene expression levels, we can suggest the following linear model to predict the expression level ag of a gene g:
$$y_g = v_0 + \sum_{k=1}^{61} v_k X_{k,g} \qquad (10)$$
with the residuals
$$\varepsilon_g = \log(a_g) - y_g \qquad (11)$$
In equation 10, yg is the predicted (logarithmic) expression level, the codon fractions Xk,g are the predictor variables and v0, ..., v61 the parameters. Note that we have introduced an intercept parameter v0 in equation 10, for which there is no equivalent in equation 9. We can then perform a standard multivariate linear regression to estimate the model parameters v0, ..., v61 by minimizing the deviance:
$$D = \sum_g \varepsilon_g^2 \qquad (12)$$
Reducing the number of parameters in the linear model. One problem of this regression approach is obviously the large number of parameters. This may result in overfitting, even when the regression is applied to the largest population [GmRNA, amRNA], which contains 4270 data points.
We avoided this problem by deriving a linear model that consists of fewer parameters. This is done via a forward selection of parameters, adding one predictor variable at a time (16). A similar procedure has previously been used in finding significant promoter sequence motifs (17).
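We start with a model of just one predictor variable (codon fraction Xk):
$$y_g^{(k)} = v_0 + v_k X_{k,g} \qquad (13)$$
which gives the residuals:
$$\varepsilon_g^{(k)} = \log(a_g) - y_g^{(k)} \qquad (14)$$
and the deviance:
$$D_k = \sum_g \left( \varepsilon_g^{(k)} \right)^2 \qquad (15)$$
Note that the deviance is dependent on the codon k. This allows us to find the codon that produces the smallest error and thus select the first predictor variable. We add this codon to a model set M.
Then we iterate this procedure. Given a model set M of codons with optimal parameter estimates, the linear model is:
$$y_g = v_0 + \sum_{k \in M} v_k X_{k,g} \qquad (16)$$
This model gives the new residuals:
$$\varepsilon_g = \log(a_g) - v_0 - \sum_{k \in M} v_k X_{k,g} \qquad (17)$$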
We then choose the next predictor variable by finding the codon k that minimizes:
This codon is then added to model set M, and we iterate the procedure described in equations 16–18. Note that the interpretation of equation 18 is that the optimal predictor variable is orthogonal to the linear model of equation 16.
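A forward selection of this kind can be sketched in a few lines. The version below is one plausible reading, not necessarily the exact criterion used here: at each step it refits an ordinary least-squares model (with intercept) for the current model set plus each candidate codon, and keeps the candidate that gives the smallest residual sum of squares. The helper names `ols` and `forward_selection` and the toy data are assumptions; the least-squares solve uses the normal equations with Gaussian elimination to keep the example free of external libraries.

```python
def ols(columns, y):
    """Least-squares fit of y on an intercept plus the given predictor columns.

    Returns (coefficients, residual sum of squares). Assumes the predictors
    are not collinear; otherwise a zero pivot raises ZeroDivisionError.
    """
    n = len(y)
    cols = [[1.0] * n] + [list(c) for c in columns]
    p = len(cols)
    # Normal equations A v = b with A = X^T X and b = X^T y.
    A = [[sum(ci[g] * cj[g] for g in range(n)) for cj in cols] for ci in cols]
    b = [sum(ci[g] * y[g] for g in range(n)) for ci in cols]
    for i in range(p):  # forward elimination with partial pivoting
        piv = max(range(i, p), key=lambda r: abs(A[r][i]))
        A[i], A[piv] = A[piv], A[i]
        b[i], b[piv] = b[piv], b[i]
        for r in range(i + 1, p):
            factor = A[r][i] / A[i][i]
            for c in range(i, p):
                A[r][c] -= factor * A[i][c]
            b[r] -= factor * b[i]
    v = [0.0] * p
    for i in reversed(range(p)):  # back substitution
        v[i] = (b[i] - sum(A[i][c] * v[c] for c in range(i + 1, p))) / A[i][i]
    rss = sum((y[g] - sum(v[j] * cols[j][g] for j in range(p))) ** 2 for g in range(n))
    return v, rss


def forward_selection(X, y, max_terms):
    """X maps codon -> per-gene codon fractions; y holds log expression levels."""
    model, history = [], []
    while len(model) < max_terms:
        candidates = [k for k in X if k not in model]
        if not candidates:
            break
        scored = [(ols([X[j] for j in model + [k]], y)[1], k) for k in candidates]
        rss, best = min(scored)
        model.append(best)
        history.append((best, rss))
    return model, history


# Invented toy data: y depends strongly on x1, weakly on x2, not on x3.
X = {"x1": [0, 1, 2, 3, 4, 5], "x2": [1, 0, 1, 0, 1, 0], "x3": [0, 0, 1, 1, 0, 0]}
y = [1 + 2 * a + 0.5 * b for a, b in zip(X["x1"], X["x2"])]
model, history = forward_selection(X, y, max_terms=2)
print(model, history)
```

Because the 61 codon fractions of a gene sum to one, a model containing all codons plus an intercept would be collinear, so a sketch like this is only meaningful for reduced model sets.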
Significance of predictor variables. Each time we add a new predictor variable to the model, we need to check whether the corresponding parameter is significant. We can do this by observing the t statistic for a parameter estimate vk. The ratio of a parameter estimate to its standard error follows a t-distribution, and a P-value based on this distribution can be used for testing the hypothesis that vk = 0. The t statistic and its corresponding P-values can be gathered from the standard output of a linear regression when performed in various statistical software packages (here, we used the publicly available R statistical computing environment, http://www.r-project.org/, as well as MATLAB for these computations).
To accept a predictor variable as significant we required that the P-value of the t statistic stay below α = 0.05. Since we were choosing from several possible predictor variables at each step, a Bonferroni correction is necessary for this statistical test. This is equivalent to multiplying the P-value for a parameter by the number of remaining possible predictor variables. Given that there are already NM parameters in the model set M, we have a choice of 61 − NM remaining predictor variables, and the condition for significance thus becomes:
$$P' = (61 - N_M)\,P < \alpha \qquad (19)$$
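The Bonferroni-corrected acceptance rule is simple arithmetic, shown here as a small sketch. The function name `passes_bonferroni` and the example P-values are invented for illustration; the P-value itself would come from the t statistic reported by the regression software.

```python
def passes_bonferroni(p_value, n_in_model, alpha=0.05, n_codons=61):
    """Accept a candidate codon if its corrected P-value stays below alpha.

    The raw P-value is multiplied by the number of codons not yet in the
    model, i.e. the number of candidates that were compared at this step.
    """
    remaining = n_codons - n_in_model
    return remaining * p_value < alpha


# With an empty model, 61 candidates are compared at once:
print(passes_bonferroni(0.0005, n_in_model=0))  # 61 * 0.0005 = 0.0305
print(passes_bonferroni(0.001, n_in_model=0))   # 61 * 0.001  = 0.061
print(passes_bonferroni(0.001, n_in_model=41))  # 20 * 0.001  = 0.02
```

The same raw P-value can therefore fail early in the selection and pass later, because fewer candidates remain to be corrected for.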
RESULTS
Table 2 generalizes the example shown in Figure 2 by listing all possible evaluation and parameterization populations for both the CAI and the codon usage. Note that the parameters of the CAI and the codon usage are in each case dependent on the parameterization population and the expression level threshold T. (The threshold T defines the number of ORFs with expression levels greater than T.) The table shows the maximum Pearson and rank correlations that can be achieved by varying T, the increase of the correlation compared with the original models (Δ correlation), and the size of the parameterization set at the maximum (rank) correlation, measured in number of ORFs.
A mixed picture emerges from this comprehensive collection of statistics. In many cases the new parameters improve the performance of the CAI and the codon usage (gray and black shaded squares in Table 2), but sometimes the performance is also slightly lower.
The codon usage models with the new parameters generally perform better than the model with the original parameters (Δ correlation is >1% six out of nine times for both the Pearson and rank correlations), whereas the improvements for the CAI are less obvious (Δ correlation is >1% three out of nine times for both the Pearson and rank correlations).
One important observation is that the parameterization sets for which we found optimal parameters are usually very small (on the order of 100 genes or less) for both the CAI and the codon usage. This is despite the fact that we used whole-genome expression data in our calculations. An extreme example is the codon usage with parameterization population [GProt, aProt] and evaluation population [GProt, amRNA]: here, the optimal parameterization set contains only one gene (the phosphopyruvate hydratase ENO2). This alone yields a rank correlation of 0.66 with the expression data.
Linear model
We fitted the linear model of equation 16 to the population [GmRNA, amRNA] according to the iterative procedure described in the Materials and Methods. We tested models ranging from one to 61 codons (= predictor variables). The largest model for which all parameters were significant was a model with 20 codons. (The results for each model are shown in the Supplementary Material.) The values of these 20 codon parameters are shown in Figure 3. We have only used [GmRNA, amRNA] as the parameterization set because the other possible populations are too small (150 genes) relative to the possible number of parameters. When we used the reduced parameter procedure with [GProt, aProt] or [GProt, amRNA] as the parameterization populations, we found that linear models with only two predictor variables already exceed the critical P-value of 5% (see Materials and Methods), thus making them of little use for predicting expression levels.
The bottom of Table 2 shows the performance of the linear model compared with the CAI and codon usage. There is no possible comparison to a set of original parameters, as in the case of the CAI and the codon usage. Instead, we compared the performance of the linear model with the performance of the original CAI and codon usage models on the same evaluation sets. The left half of the Δ correlation column in Table 2 refers to the difference with the CAI correlation, whereas the right half gives the difference with the codon usage correlation. (There are three possible choices for the evaluation set.) It is clear from the results that the best performance is obtained when the parameterization and evaluation populations are both [GmRNA, amRNA]. (This should be expected, given that the model parameters were optimized on this set.)
When [GmRNA, amRNA] is both the parameterization and evaluation population, the Pearson correlation of the linear model with the expression data is 0.75. This is slightly higher than the best Pearson correlations for the CAI and codon usage models. (The CAI has a maximum Pearson correlation of 0.72, while the codon usage has a maximum Pearson correlation of 0.71.) In terms of the rank correlation, the best codon usage model is somewhat better than the linear model (0.60 versus 0.56), while the CAI performs worse than both of the other methods (0.46).
Preferential codons in yeast
As mentioned at the beginning, it is important for heterologous gene expression to encode proteins with sequences that yield optimal expression. A good rule of thumb for finding such an optimal sequence is to choose codons that are most frequent in highly expressed genes. The CAI model provides an explicit way of finding such codons; the most frequent codons simply have the highest relative adaptiveness values, and sequences with higher CAIs are preferred over those with lower CAIs. The codon usage formalism does not explicitly use relative adaptiveness values, but they can be easily calculated with equation 1 from the parameterization sets that yield optimal codon usage parameters. A third possibility is to look at the parameters of the linear regression with respect to which codons are more preferred. (This is of course only possible for those codons that are predictor variables in the linear model.)
Figure 3 shows the relative adaptiveness values for the CAI and codon usage (when the parameterization and evaluation populations are both [GProt, aProt], with the Pearson correlation as the figure of merit) together with the parameter values of the linear regression (LM) with [GmRNA, amRNA]. For comparison, we also show the relative adaptiveness values for the genome as a whole. Codons with relative adaptiveness values of 100% (= preferential codons) are shown in black. It is evident that both the CAI and the codon usage give the same preferential codons.
The relative adaptiveness values for the CAI are computed from the 21 most abundant proteins in aProt, whereas the codon usage values stem from the four most abundant proteins (see Table 2). Note that the preferential codons for both the CAI and the codon usage stay the same regardless of which parameterization and evaluation sets we choose (with the Pearson correlation as the figure of merit). The only exception is when we choose [GmRNA, amRNA] as both the parameterization and evaluation set for the codon usage. In that case, the optimal parameterization set becomes relatively large (253 ORFs) such that several of the preferential codons are the same as the ones for the genome as a whole.
The parameters of the linear model are shown in the third column for each codon in Figure 3. Note that the parameters vk give the expected change of expression level for an increase in the composition of the corresponding codon k, given that the composition of the other codons in the model stays the same:
$$\frac{\partial y_g}{\partial X_{k,g}} = v_k \qquad (20)$$
One would expect the regression parameters to roughly correlate with the relative adaptiveness values of the CAI and codon usage. Because the number of parameters in the linear model is less than the total number of codons, this comparison is only possible for synonymous codons of seven amino acids (see Fig. 3).
Contrary to our expectation, the rank order of the regression parameters differed from that of the relative adaptiveness values of the CAI and codon usage for three of these seven amino acids (Val, Cys and Arg). One (non-biological) explanation for this different order might be the sensitivity of the parameters. This is in fact the case for Val and Cys, where the 95% confidence intervals of the parameter values overlap (see Supplementary Material). However, parameter sensitivity does not explain the different codon order for arginine; the codon CGT has a much higher parameter value than the codon AGA (9.7 as opposed to 4.7), contrary to the ranking of relative adaptiveness values (see Fig. 3).
We suggest the following explanation: in contrast to the linear model parameters, the relative adaptiveness values describe the global enrichment of a codon in highly expressed genes with no restrictions on the compositions of the other codons. (This is confirmed by the fact that the Pearson correlation between the logarithms of amRNA and the codon composition of AGA is larger than that between amRNA and CGT.) Thus, in the case of arginine, the reason for the discrepancy between the linear model and the CAI/codon usage might be that yeast cells preferentially use AGA codons for arginine in highly expressed genes (explaining the CAI value), but that the supply of the corresponding tRNA is already strongly exhausted for fast growing cells. Thus, to achieve additional translation of arginine at high rates, the cell might need to use the supply of another tRNA for arginine (explaining the higher regression parameter for CGT). Note that the tRNA gene copy number is 11 for the AGA codon and 6 for the CGT codon (the highest and second highest among all arginine codons). This way, the cell would make optimal use of the supply of arginine tRNAs when it is already growing fast.
DISCUSSION
Small parameterization sets are sufficient
Furthermore, the parameterization sets that yielded optimal parameters for the CAI and codon usage are often very small compared to the number of genes in the genome, very much in the same way that the original parameterization sets were small (see Table 1). Thus, very few highly expressed genes seem to be sufficient to describe the overall codon bias in yeast. This shows that the original procedures for determining the parameters of the CAI and codon usage were indeed quite prescient. The CAI and codon usage models are relatively insensitive to the exact choice of highly expressed genes.
One explanation for this observation might be that although the optimal parameterization sets are small compared to the size of the genome, their share of the overall number of transcripts and protein copies in the cell is much larger; they may in fact dominate the overall codon composition of transcripts and proteins (18). This situation can be compared with the way a financial market index, composed of very few stocks with very high market capitalization, can be a very good approximation for the value of a total market, which consists of perhaps thousands of individual stocks.
Thus, to obtain robust parameters for the CAI and codon usage models, it often seems sufficient to infer them from rather qualitative information about gene expression levels. For instance, it may be enough to infer from information about biological function whether a group of genes is highly expressed. Note that, using our parameterization procedure, we achieved a Pearson correlation of 0.72 between the codon usage model and the expression data (when both the evaluation and parameterization population are [GmRNA, amRNA], see Table 2). This is only a marginal improvement over the original parameters (Pearson correlation 0.71, see Table 1) that were derived from the codon composition of the 128 ribosomal proteins in yeast.
Comparison of the CAI, codon usage and linear models
In contrast to the linear model and the codon usage, the parameters of the CAI are normalized by synonymous codon usage, a constraint that is not present in the other two models. It is therefore remarkable that the CAI model (given the best parameterization set) usually performs as well as the other two models. The only notable exception to this general rule is perhaps the relatively low rank correlation of the CAI with [GmRNA, amRNA], which is only 0.49 under the best circumstances (compared with 0.60 for the codon usage and 0.56 for the linear model).
The linear model achieves the highest Pearson correlation (0.75) with [GmRNA, amRNA], while the comparable values for the CAI and codon usage are slightly lower (0.72 and 0.71).
Can the models be improved?
The main motivation of our study was the question whether it would be possible to improve on existing and commonly used codon-based models for predicting expression levels. The results showed that the original models are relatively robust to the exact way they are parameterized. Perhaps such models could still be improved if other protein properties were included as additional features in the prediction.
We have explicitly tested whether one protein property, namely protein length, can aid in improving the prediction performance. It has previously been observed that longer proteins often tend to be less highly expressed than shorter ones (18,19). For instance, in the linear regression model one could explicitly consider protein length by replacing the codon fractions Xk with the number of codons (equation 16). However, we found that this severely decreases the correlation between the model predictions and actual expression data (data not shown).
Codon composition is often the strongest predictor of expression levels
Pavesi (20) proposed a model for predicting expression levels based on several different protein properties (the CAI, the codon bias index, an entropy score relating to synonymous codon usage, a TATA-box score and a pyrimidine bias index) (21). He showed in a regression analysis that the two significant parameters of the model were the CAI and the entropy score, both measures relating to synonymous codon usage. Pavesi reported a Pearson correlation of 0.76 with a select set of 621 expression levels derived from SAGE data.
Linear model
As an alternative to the CAI and codon usage models, we have proposed a simple linear model that relates codon fractions and expression levels of genes. An advantage of the linear model is that, unlike the numerical values from the CAI and the codon usage, the predicted expression levels have the same dimension as the logarithm of the actual expression levels and are directly comparable with them. The linear model predicts an expression level of 1.7 copies/cell for transcripts from sequences with average codon fractions; this is equal to the average expression level in amRNA. (This follows from equation 11 and the fact that the average residual in the model is equal to zero.)
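The statement about sequences with average codon fractions can be made explicit with a short derivation. It assumes the linear model with intercept $v_0$ and residuals defined on the logarithmic scale, $\varepsilon_g = \log(a_g) - y_g$, fitted by least squares over $n$ genes; the bar notation for averages over genes is introduced here for illustration only.

```latex
% Least squares with an intercept forces the mean residual to zero:
\frac{1}{n}\sum_{g} \varepsilon_g = 0
\quad\Longrightarrow\quad
\overline{\log a}
  = \frac{1}{n}\sum_{g} y_g
  = v_0 + \sum_{k \in M} v_k \, \bar{X}_k ,
\qquad
\bar{X}_k = \frac{1}{n}\sum_{g} X_{k,g} .
% The right-hand side is the model prediction for a gene whose codon
% fractions equal the population averages \bar{X}_k.
```

Under these assumptions the prediction for average codon fractions equals the average of the logarithmic expression levels, so on the original scale it corresponds to the geometric rather than the arithmetic mean of the abundances.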
We have suggested a natural, intuitive justification for the linear model, based on the CAI formalism. Of course there might be better alternatives than the linear model. From a mathematical standpoint, the linear regression is relatively simple and involves much less complex computations than non-linear regressions.
Applications
Overall, it seems justified to use the CAI, codon usage or related measures as rules of thumb in a variety of applications such as heterologous gene expression, either based on the original parameters or on our newly optimized ones. For the annotation of genomes, all three models seem to be useful; however, they should of course only be used in conjunction with other gene-finding criteria (22).
The 20-parameter linear model allows us to compare the codon parameters for seven amino acids. Surprisingly, the linear model parameters suggest a different rank order for the codons of the amino acid arginine. We have suggested the explanation that fast growing yeast cells have already exhausted the supply of the most abundant tRNA, and thus have to make use of the tRNA corresponding to the second best codon.
General issues of data quality
The value of the codon-based expression indicators can perhaps be appreciated by comparing them to the correlation of mRNA and protein abundance data in general. The correlation for the two populations [GProt, amRNA] and [GProt, aProt] is 0.67, well within the range of the correlations in Tables 1 and 2 (13–15). One interpretation of this is that the codon-based expression indicators are actually just as good as mRNA expression data as an approximation of protein abundance levels.
Of course, the codon-based expression indicators yield static values, whereas gene expression is a dynamic process, with very different expression levels under different conditions. The expression data that we used in this study stem from experiments under very similar conditions, that is, yeast cells in vegetative growth on rich media (9–12). Thus, the prediction of expression levels based on codon composition should work best for these physiological situations, but might work less well for others. Coghlan et al. have pointed to the example of ENO1 and ENO2, which both exhibit strong codon biases: the former is repressed by high glucose concentrations whereas the latter is strongly induced (19). In general, the regulation of translation might be less flexible than the regulation of transcription because the abundance of charged tRNAs cannot be changed as flexibly as the abundance of transcription factors [there are 33 cognate tRNAs in yeast, but perhaps hundreds of transcription factors (23,24)].
Of course, there are many limitations of the expression data itself that might confound the relationship between expression levels and codon composition. The 2D-gel data is subject to many biophysical and biochemical constraints (13,14,25). The situation is somewhat better for the mRNA expression data, where we have more data resources that we combined in this study.