Unbiased assays such as shotgun proteomics and RNA-seq provide high-resolution molecular characterization of tumors. These assays measure molecules whose distributions vary widely, which makes interpretation and hypothesis testing challenging. Samples with the most extreme measurements for a molecule can reveal the most interesting biological insights, yet they are often excluded from analysis. Furthermore, rare disease subtypes are, by definition, underrepresented in cancer cohorts. To provide a strategy for identifying molecules aberrantly enriched in small sample cohorts, we present BlackSheep, a package for nonparametric description and differential analysis of genome-wide data, available from Bioconductor (https://www.bioconductor.org/packages/release/bioc/html/blacksheepr.html) and Bioconda (https://bioconda.github.io/recipes/blksheep/README.html). BlackSheep complements other differential expression analysis methods and is particularly useful for analyzing small subgroups within a larger cohort.
Keywords: differential expression; extreme values; outliers; phosphoproteomics; proteomics.