Discovering structural alerts for mutagenicity using stable emerging molecular patterns

Jean-Philippe Métivier; Alban Lepailleur; Aleksey Buzmakov; Guillaume Poezevara; Bruno Crémilleux; Sergei O Kuznetsov; Jérémie Le Goff; Amedeo Napoli; Ronan Bureau; Bertrand Cuissart

doi:10.1021/ci500611v

J Chem Inf Model. 2015 May 26;55(5):925-40. doi: 10.1021/ci500611v. Epub 2015 May 7.

Authors

Jean-Philippe Métivier¹,², Alban Lepailleur¹,³, Aleksey Buzmakov⁴,⁵, Guillaume Poezevara¹,²,³, Bruno Crémilleux¹,², Sergei O Kuznetsov⁵, Jérémie Le Goff⁶, Amedeo Napoli⁴, Ronan Bureau¹,³, Bertrand Cuissart¹,²

Affiliations

¹ Normandie Université, Caen, France.
² UNICAEN, GREYC, UMR CNRS 6072, F-14032 Caen, France.
³ UNICAEN, CERMN, UPRES EA 4258, FR CNRS 3038, F-14032 Caen, France.
⁴ Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), University of Lorraine, Nancy, France.
⁵ National Research University Higher School of Economics (HSE), Moscow, Russia.
⁶ ADn'tox, Caen, France.

PMID: 25871768
DOI: 10.1021/ci500611v

Abstract

This study is dedicated to the introduction of a novel method that automatically extracts potential structural alerts from a data set of molecules. These triggering structures can be further used for knowledge discovery and classification purposes. Computation of the structural alerts results from an implementation of a sophisticated workflow that integrates a graph mining tool guided by growth rate and stability. The growth rate is a well-established measurement of contrast between classes. Moreover, the extracted patterns correspond to formal concepts; the most robust patterns, named the stable emerging patterns (SEPs), can then be identified thanks to their stability, a new notion originating from the domain of formal concept analysis. All of these elements are explained in the paper from the point of view of computation. The method was applied to a molecular data set on mutagenicity. The experimental results demonstrate its efficiency: it automatically outputs a manageable number of structural patterns that are strongly related to mutagenicity. Moreover, a part of the resulting structures corresponds to already known structural alerts. Finally, an in-depth chemical analysis relying on these structures demonstrates how the method can initiate promising processes of chemical knowledge discovery.
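
The two measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper mines molecular graphs, whereas this example assumes each molecule has already been reduced to a set of named fragments. It uses the usual definitions: growth rate as the ratio of a pattern's relative support in the two classes, and stability of a formal concept as the fraction of subsets of its extent whose common features are exactly its intent. All function names, fragment names, and data below are hypothetical, and computing stability over the combined data set is an assumption made for the example.

```python
from itertools import combinations
from math import inf


def support(pattern, dataset):
    """Fraction of molecules in `dataset` that contain every fragment of `pattern`."""
    if not dataset:
        return 0.0
    return sum(pattern <= mol for mol in dataset) / len(dataset)


def growth_rate(pattern, positives, negatives):
    """Ratio of relative supports; infinite when the pattern never occurs in `negatives`."""
    sp = support(pattern, positives)
    sn = support(pattern, negatives)
    if sn == 0:
        return inf if sp > 0 else 0.0
    return sp / sn


def common_features(molecules, all_features):
    """Fragments shared by all given molecules (all fragments for an empty selection)."""
    shared = set(all_features)
    for mol in molecules:
        shared &= mol
    return frozenset(shared)


def stability(intent, dataset):
    """Fraction of subsets of the extent whose common fragments equal `intent`.

    Brute force over all 2^n subsets, so only suitable for tiny examples.
    """
    all_features = frozenset().union(*dataset)
    extent = [mol for mol in dataset if intent <= mol]
    hits = 0
    for k in range(len(extent) + 1):
        for subset in combinations(extent, k):
            if common_features(subset, all_features) == intent:
                hits += 1
    return hits / 2 ** len(extent)


# Hypothetical toy data: each molecule is a set of fragment labels.
mutagens = [
    frozenset({"nitro", "aromatic"}),
    frozenset({"nitro", "aromatic", "amine"}),
    frozenset({"nitro", "aromatic"}),
    frozenset({"epoxide"}),
]
non_mutagens = [
    frozenset({"aromatic"}),
    frozenset({"aromatic", "amine"}),
    frozenset({"nitro", "aromatic", "hydroxyl"}),
    frozenset({"hydroxyl"}),
]

pattern = frozenset({"nitro", "aromatic"})
gr = growth_rate(pattern, mutagens, non_mutagens)   # 0.75 / 0.25 = 3.0
stab = stability(pattern, mutagens + non_mutagens)  # 13 of 16 subsets = 0.8125
print(gr, stab)
```

In this toy data the pattern occurs three times more often (in relative terms) among mutagens than among non-mutagens, and it remains the shared description for most subsets of the molecules that contain it. A high value on both measures is the kind of combination the abstract associates with a stable emerging pattern.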

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Data Mining / methods*
Drug Discovery*
Mutagens / chemistry*
Pattern Recognition, Automated / methods*

Substances

Mutagens