A range of good-quality local QSARs for mutagenicity and carcinogenicity have been assessed and challenged for their predictivity with respect to real external test sets (i.e., chemicals never considered by the authors while developing their models). The QSARs for potency (applicable only to toxic chemicals) gave predictions that were 30-70% correct, whereas the QSARs for discriminating between active and inactive chemicals were 70-100% correct in their external predictions; the latter can therefore be used with good reliability in practical applications. On the other hand, internal statistical validation methods, which are often assumed to be good diagnostics of predictivity, did not correlate well with the predictivity of the QSARs when challenged in external prediction tests. Nonlocal models for noncongeneric chemicals were considered as well; these results point to the critical role of an adequate definition of the applicability domain.
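
As an illustration of what "percent correct in external prediction" means for an active/inactive QSAR, the following minimal Python sketch scores a model's calls against experimental results for chemicals outside its training set. It is not the procedure used in the study: the function name `external_performance` and the activity labels are hypothetical, made up for the example, and sensitivity and specificity are added only as commonly reported companions to the overall fraction correct.

```python
def external_performance(observed, predicted):
    """Score active/inactive predictions on an external test set.

    observed, predicted: equal-length lists of booleans (True = active).
    Returns the overall fraction correct (concordance) together with
    sensitivity and specificity.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)          # actives called active
    tn = sum(1 for o, p in pairs if not o and not p)  # inactives called inactive
    fp = sum(1 for o, p in pairs if not o and p)      # inactives called active
    fn = sum(1 for o, p in pairs if o and not p)      # actives called inactive

    return {
        "concordance": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "specificity": tn / (tn + fp) if (tn + fp) else None,
    }


# Hypothetical external set of ten chemicals (illustrative labels, not real data).
observed = [True, True, True, False, False, False, True, False, True, False]
predicted = [True, True, False, False, False, True, True, False, True, False]

result = external_performance(observed, predicted)
print(result)
```

The point of the sketch is that the score depends only on chemicals the model never saw during development, which is why it can diverge from internal statistics such as cross-validated fit.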