Understanding the quality of a screening collection is the first step to improving it and, as a result, the quality of the screening process. This article outlines how this issue was approached at GlaxoSmithKline and some of the hurdles that had to be overcome to achieve success. The article focuses specifically on the software and hardware infrastructure required, and on some of the additional benefits of such a project in terms of data mining and data modelling.