Objectives/hypothesis: The objective was to identify a genomic profile that predicts whether an unknown tissue sample is oral squamous cell carcinoma or normal oral mucosa.
Study design: Using a training set of tissue samples that were histologically classified as oral squamous cell carcinoma or normal mucosa, the authors applied principal component analysis to develop a genomic predictor for oral squamous cell carcinoma. They then used the predictor to classify a separate test set of samples, treated as unclassified, and evaluated its performance against the histological diagnosis.
Methods: The authors used a data set of messenger RNA extracted from 29 oral squamous cell carcinoma and 19 normal oral mucosa tissue samples and hybridized to Affymetrix oligonucleotide microarrays containing probe sets for 7070 genes and expressed sequence tags. The samples were divided into a training set of 15 oral squamous cell carcinoma and 10 normal samples and a test set of the remaining 23 samples (14 oral squamous cell carcinoma and 9 normal). Using principal component analysis on the training set, the authors found a composite gene expression vector (principal component vector), which they used to compute likelihood ratios for oral squamous cell carcinoma on the test set. By calculating the contribution of each gene to the principal component vector, the authors identified the genes with the greatest predictive value.
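The workflow described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the likelihood ratios were computed, so the sketch assumes that only the first principal component is used and that the component scores of each class follow a Gaussian distribution. The expression data are synthetic stand-ins, and all function and variable names are hypothetical.

```python
import numpy as np

def fit_pc1(X_train):
    """Return the training mean and the first principal component vector."""
    mean = X_train.mean(axis=0)
    # SVD of the centered matrix; rows of vt are principal component vectors.
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, vt[0]

def gaussian_logpdf(x, mu, sigma):
    """Log density of a normal distribution (assumed model for the scores)."""
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

def log_likelihood_ratio(scores, tumor_scores, normal_scores):
    """log[p(score | carcinoma) / p(score | normal)] under the Gaussian assumption."""
    log_num = gaussian_logpdf(scores, tumor_scores.mean(), tumor_scores.std(ddof=1))
    log_den = gaussian_logpdf(scores, normal_scores.mean(), normal_scores.std(ddof=1))
    return log_num - log_den

# Synthetic stand-in for the microarray data (assumption, not the study data).
rng = np.random.default_rng(0)
n_genes = 200
shift = np.zeros(n_genes)
shift[:20] = 3.0  # the first 20 simulated genes differ between the classes

y_train = np.array([1] * 15 + [0] * 10)  # 15 carcinoma, 10 normal
X_train = rng.normal(size=(25, n_genes))
X_train[y_train == 1] += shift

y_test = np.array([1] * 14 + [0] * 9)    # the remaining 23 samples
X_test = rng.normal(size=(23, n_genes))
X_test[y_test == 1] += shift

# Fit the principal component vector on the training set only.
mean, pc1 = fit_pc1(X_train)
train_scores = (X_train - mean) @ pc1
test_scores = (X_test - mean) @ pc1

# Classify each test sample as carcinoma when the likelihood ratio exceeds 1,
# which is equivalent to a log likelihood ratio above 0.
llr = log_likelihood_ratio(
    test_scores, train_scores[y_train == 1], train_scores[y_train == 0]
)
predicted = (llr > 0).astype(int)
accuracy = (predicted == y_test).mean()

# Rank genes by the magnitude of their contribution (loading) to the vector.
top_genes = np.argsort(np.abs(pc1))[::-1][:10]

print("test accuracy:", accuracy)
print("genes with the largest loadings:", top_genes)
```

Working with the logarithm of the likelihood ratio avoids numerical underflow when a sample lies far from one class; because the logarithm is monotonic, the resulting classification is the same as thresholding the ratio itself at 1.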
Results: Using the likelihood ratio, the authors correctly classified all 23 samples in the test set as either oral squamous cell carcinoma or normal. The authors found that many of the most predictive genes are known to be markers of squamous cell carcinoma or normal mucosa.
Conclusion: Principal component analysis can be used with genomic microarray data to correctly predict the presence of oral squamous cell carcinoma in unknown tissue samples.