Genes, ORFs, and 'omes. (Top) The initial published yeast genome claimed 6274 genes (22), but this number has been revised many times since then. The time series data on numbers of genes are based on the SGD and MIPS databases: http://genome-www.stanford.edu/Saccharomyces and http://mips.gsf.de/proj/yeast/CYGD/db. These databases use different criteria for ORF inclusion: MIPS adds all candidate ORFs, whereas SGD limits inclusion. Also shown are other estimates for the number of genes in the yeast genome (26-29). The central column shows the types of ORFs in the current yeast annotation. These include eORF (essential ORF) (13), kORF (known ORF with a well-characterized function), hORF (ORF validated by homology only), shORF (short ORF), tORF (transposon-identified ORF), qORF (questionable ORF), and dORF (disabled ORF or pseudogene) (21). (The numbers are based on the ORF classes defined in the MIPS database.) Compared with the initial annotation, the current estimate of 6128 ORFs reflects two opposing trends: (i) the addition of new shORFs (9) found either through transcription experiments (tORFs) or from sequence comparisons with proteins newly deposited in the databases (hORFs); (ii) the removal of qORFs with no evidence of being transcribed (that is, lacking SAGE or transposon tags, and not expressed on microarrays) and with no sequence similarity to any other protein. (For simplicity, we include in the qORFs 10 ORFs associated with Ty elements in the original annotation. Further information is at http://bioinfo.mbb.yale.edu/genome/yeast/orfome.) (Bottom left) The explosion in defining shORFs. The first bar depicts the potential ORFs in the raw DNA sequence of the yeast genome that are >15 codons. The second bar shows the large number that are also <100 codons in length. The third bar demonstrates that the number of ORFs is not reduced by requiring a high codon adaptation index (CAI > 0.11).
The remaining bars illustrate how the number of potential ORFs is radically reduced by selecting only those shORFs that show evidence of transcription (transposons and SAGE). (Bottom right) Functional genomics information is best used in a combined fashion. Illustrated is the number of ORFs in the yeast genome that are transcribed according to data from microarray hybridization, SAGE, and transposon tagging.
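The filtering logic behind the two bottom panels can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used to produce the figure: the names `scan_orfs` and `evidence_counts` are hypothetical, an ORF is assumed to run from the first ATG to the next in-frame stop codon, the codon count is assumed to exclude the stop codon, and the ORF identifiers in the usage note are made up. The CAI filter is omitted because it requires a reference codon-usage table.

```python
from typing import Dict, FrozenSet, Iterator, NamedTuple, Optional, Set

STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


class Orf(NamedTuple):
    strand: str  # "+" or "-"
    start: int   # 0-based offset of the ATG on the scanned strand
    codons: int  # codons from the ATG up to, but not including, the stop


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def scan_orfs(seq: str, min_codons: int = 16,
              max_codons: Optional[int] = None) -> Iterator[Orf]:
    """Yield ATG-to-stop ORFs in all six reading frames.

    min_codons=16 corresponds to ">15 codons"; max_codons=99 would
    correspond to "<100 codons" (both under the assumptions above).
    """
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None  # position of the first ATG since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START_CODON:
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    n = (i - start) // 3
                    if n >= min_codons and (max_codons is None or n <= max_codons):
                        yield Orf(strand, start, n)
                    start = None


def evidence_counts(evidence: Dict[str, Set[str]]) -> Dict[FrozenSet[str], int]:
    """Count ORFs by the exact combination of methods that detected them."""
    counts: Dict[FrozenSet[str], int] = {}
    for orf_id in set().union(*evidence.values()):
        key = frozenset(m for m, ids in evidence.items() if orf_id in ids)
        counts[key] = counts.get(key, 0) + 1
    return counts
```

As a usage example, `list(scan_orfs(genome, min_codons=16, max_codons=99))` would enumerate candidate shORFs in a sequence string `genome`, and `evidence_counts({"microarray": {"orf_a", "orf_b"}, "SAGE": {"orf_b"}, "transposon": {"orf_b", "orf_c"}})` would give the number of ORFs in each region of the corresponding three-way overlap. Keeping only candidates supported by at least one transcription data set mirrors the reduction shown in the remaining bars of the bottom-left panel.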