Background: Providing quantitative microarray data that is sensitive to very small differences in target sequence would be a useful tool in any number of venues where a sample can consist of a multiple related sequences present in various abundances. Examples of such applications would include measurement of pseudo species in viral infections and the measurement of species of antibodies or T cell receptors that constitute immune repertoires. Difficulties that must be overcome in such a method would be to account for cross-hybridization and for differences in hybridization efficiencies between the arrayed probes and their corresponding targets. We have used the memory T cell repertoire to an influenza-derived peptide as a test case for developing such a method.
Results: The arrayed probes were corresponded to a 17 nucleotide TCR-specific region that distinguished sequences differing by as little as a single nucleotide. Hybridization efficiency between highly related Cy5-labeled subject sequences was normalized by including an equimolar mixture of Cy3-labeled synthetic targets representing all 108 arrayed probes. The same synthetic targets were used to measure the degree of cross hybridization between probes. Reconstitution studies found the system sensitive to input ratios as low as 0.5% and accurate in measuring known input percentages (R2 = 0.81, R = 0.90, p < 0.0001). A data handling protocol was developed to incorporate the differences in hybridization efficiency. To validate the array in T cell repertoire analysis, it was used to analyze human recall responses to influenza in three human subjects and compared to traditional cloning and sequencing. When evaluating the rank order of clonotype abundance determined by each method, the approaches were not found significantly different (Wilcoxon rank-sum test, p > 0.05).
Conclusion: This novel strategy appears to be robust and can be adapted to any situation where complex mixtures of highly similar sequences need to be quantitatively resolved.