Flavonoids are widely recognized for their antidiabetic effects, in particular their inhibition of dipeptidyl peptidase-IV (DPP-IV) activity. However, natural flavonoid derivatives are highly diverse, and even subtle structural differences can change their inhibitory activity against DPP-IV by several orders of magnitude, which makes it challenging to find novel and potent anti-DPP-IV flavonoid derivatives experimentally. There is therefore an urgent need for an efficient screening pipeline that targets active natural products. Here, we propose a fusion strategy based on a QSAR model to simplify this process and apply it to the discovery of flavonoid derivatives with potent anti-DPP-IV activity. First, a high-quality QSAR model ( = 0.816, MAEtest = 0.14, MSEtest = 0.026) comprising seven key molecular property parameters was constructed with a genetic algorithm (GA) and passed leave-one-out cross-validation. Next, a total of 1,668 flavonoid derivatives were obtained by enriching natural products from NPCD based on molecular fingerprint similarity (> 0.8). The enriched flavonoid derivatives were then screened using the QED score combined with the QSAR model predictions, and a total of 33 flavonoid derivatives (IC50pre < 6.5 μM) were found. Subsequently, three flavonoid derivatives (5,7,3',5'-tetrahydroxyflavone, 3,7-dihydroxy-5,3',4'-trimethoxyflavone, and 5,7,2',5'-tetrahydroxyflavone) with highly effective anti-DPP-IV activity were obtained by ADMET analysis. Finally, the DPP-IV inhibitory potential of these three flavonoid derivatives was verified by 100 ns MD simulation and MM/PB(GB)SA.
Communicated by Ramaswamy H. Sarma.
Keywords: DPP-IV; LBDD; QSAR; Virtual screening; flavonoid compounds; machine learning.
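The fingerprint-similarity enrichment step (similarity > 0.8) can be illustrated with a minimal sketch. The abstract does not state which similarity metric or fingerprint type was used; the Tanimoto coefficient is assumed here as a common choice, and the fingerprints, compound names, and the helper functions `tanimoto` and `enrich` are hypothetical, chosen only for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def enrich(query_fp, library, threshold=0.8):
    """Return names of library compounds whose similarity to the query exceeds the threshold."""
    return [name for name, fp in library.items()
            if tanimoto(query_fp, fp) > threshold]


# Hypothetical toy fingerprints (sets of on-bit indices), not real compound data.
query = frozenset({1, 4, 7, 9, 12, 15, 18, 21, 30, 42})
library = {
    "candidate_A": frozenset({1, 4, 7, 9, 12, 15, 18, 21, 30}),              # 9 shared, 10 in union
    "candidate_B": frozenset({1, 4, 7, 9, 50, 51}),                          # 4 shared, 12 in union
    "candidate_C": frozenset({1, 4, 7, 9, 12, 15, 18, 21, 30, 42, 43, 44}),  # 10 shared, 12 in union
}

hits = enrich(query, library)
print(hits)
```

In this toy library only `candidate_A` and `candidate_C` pass the 0.8 cutoff. In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets, and the retained compounds would then go on to the QED and QSAR filters described in the abstract.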