QSAR models have been under development for decades, but acceptance and use of their results have been slow, in part because there is no widely accepted metric for assessing their reliability. We adapt a method commonly used in quantitative epidemiology and medical decision-making to evaluate the results of screening tests, and apply it to assess the reliability of a QSAR model. The method quantifies the accuracy of a QSAR model, expressed as sensitivity and specificity, as the conditional probabilities of correct and incorrect classification of a chemical characteristic, given the true characteristic. Using Bayes' formula, these conditional probabilities are combined with prior information to generate a posterior distribution that gives the probability that a specific chemical has a particular characteristic, given a model prediction. As an example, we apply this approach to evaluate the predictive reliability of a CATABOL model and to base a "ready" versus "not ready" biodegradability classification on it. Finally, we show how the predictive capability can be improved by sequential use of two models, the first with high sensitivity and the second with high specificity.
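
The Bayesian update described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the prior, and the sensitivity and specificity values are hypothetical and are not taken from the CATABOL evaluation. The sequential step also assumes that the two models' errors are conditionally independent given the true characteristic, which is an assumption of this sketch rather than a claim of the paper.

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """P(chemical has the characteristic | model predicts it has it)."""
    true_pos = sensitivity * prior                   # P(predict +, truly +)
    false_pos = (1.0 - specificity) * (1.0 - prior)  # P(predict +, truly -)
    return true_pos / (true_pos + false_pos)


def posterior_given_negative(prior, sensitivity, specificity):
    """P(chemical has the characteristic | model predicts it does not)."""
    false_neg = (1.0 - sensitivity) * prior          # P(predict -, truly +)
    true_neg = specificity * (1.0 - prior)           # P(predict -, truly -)
    return false_neg / (false_neg + true_neg)


def sequential_posterior_both_positive(prior, model_a, model_b):
    """Posterior after two positive predictions in a row.

    Each model is a (sensitivity, specificity) pair. The posterior from
    the first model is used as the prior for the second, which is valid
    only under the conditional-independence assumption stated above.
    """
    after_a = posterior_given_positive(prior, *model_a)
    return posterior_given_positive(after_a, *model_b)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only for illustration.
    prior = 0.30               # assumed prior probability of the characteristic
    screen = (0.95, 0.60)      # first model: high sensitivity
    confirm = (0.70, 0.95)     # second model: high specificity

    print("after first model (+):", posterior_given_positive(prior, *screen))
    print("after first model (-):", posterior_given_negative(prior, *screen))
    print("after both models (+):",
          sequential_posterior_both_positive(prior, screen, confirm))
```

With these illustrative inputs, a negative result from the high-sensitivity model leaves only a small posterior probability that the chemical has the characteristic, while a positive result from the high-specificity model applied afterwards raises the posterior well above what the first model alone gives. That is the intuition behind ordering the two models this way.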