With the emergence of combinatorial chemistry, whether based on parallel, mixture, solution, or solid-phase chemistry, it is now possible to generate large numbers of diverse or focused compound libraries. In this paper we aim to demonstrate that targeted libraries can be designed by applying nonparametric statistical methods, recursive partitioning in particular, to large data sets containing thousands of compounds and their associated biological data. Moreover, results obtained by applying this method to an experimental high-throughput screening (HTS) data set strongly suggest that it can improve the hit rate of our primary screens about 4- to 5-fold while increasing screening efficiency: less than one-fifth of the complete selection needs to be screened in order to identify about 75% of all actives present.
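
The two figures quoted above are linked by simple arithmetic, which the following minimal sketch makes explicit. It assumes that the hit rate is the number of actives divided by the number of compounds screened, so that the fold improvement over screening the complete selection equals the fraction of actives recovered divided by the fraction of compounds screened. The function names are hypothetical and chosen for illustration only; the sketch is not the recursive partitioning procedure itself. Under this assumption, recovering 75% of the actives with a 4- to 5-fold improvement corresponds to screening between 0.75/4 = 18.75% and 0.75/5 = 15% of the selection, both below one-fifth.

```python
def enrichment_factor(fraction_actives_found, fraction_screened):
    """Fold improvement in hit rate relative to screening everything.

    Assumes hit rate = actives / compounds screened, so the ratio of the
    subset hit rate to the full-selection hit rate reduces to
    fraction_actives_found / fraction_screened.
    """
    if not 0.0 < fraction_screened <= 1.0:
        raise ValueError("fraction_screened must be in (0, 1]")
    if not 0.0 <= fraction_actives_found <= 1.0:
        raise ValueError("fraction_actives_found must be in [0, 1]")
    return fraction_actives_found / fraction_screened


def fraction_screened_for(fraction_actives_found, fold_improvement):
    """Fraction of the selection screened for a given recovery and fold gain."""
    if fold_improvement <= 0.0:
        raise ValueError("fold_improvement must be positive")
    return fraction_actives_found / fold_improvement


if __name__ == "__main__":
    recovered = 0.75  # about 75% of all actives, as quoted in the text
    for fold in (4.0, 5.0):  # the quoted 4- to 5-fold range
        fraction = fraction_screened_for(recovered, fold)
        print(f"{fold:.0f}-fold: screen {fraction:.2%} of the selection")
```

Read the other way, screening exactly one-fifth of the selection while recovering 75% of the actives would give a 3.75-fold improvement, so the quoted 4- to 5-fold range is consistent with screening somewhat less than one-fifth.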