Genes that are differentially expressed in tumor tissues are potential diagnostic markers and drug targets. DNA sequence information available in public databases can be used to identify transcripts that are differentially expressed in cancer. Here we report the combined use of ORESTES sequences generated in the FAPESP/LICR Human Cancer Genome Project and information available in the UniGene and SAGE databases to characterize the transcriptome of normal and breast tumor cells. We identified 154 genes as candidates for overexpression in breast tumor cells. Of these, 28 have been shown by others to be overexpressed in breast or other tumors. Using RT-PCR, we tested 11 of the candidate genes and found that 9 were indeed overexpressed in breast tumor cells.