Simplified molecular-input line-entry system (SMILES) notation and the built-in Monte Carlo algorithm of the CORAL software were employed to construct generative and predictive QSPR models for the power conversion efficiency (PCE) of 215 phenothiazine derivatives. The dataset was divided into four splits, and each split was further divided into four sets (training, invisible training, calibration, and validation). A hybrid descriptor, combining SMILES and the hydrogen-suppressed graph (HSG), was employed to build reliable and robust QSPR models. The role of the index of ideality of correlation (IIC) was also studied in depth. We performed a comparative study to predict PCE using two target functions (TF1 without IIC and TF2 with IIC). Eight QSPR models were developed, and the models developed with TF2 were shown to be robust and reliable. The QSPR model generated from split 4 was considered the leading model. Several statistical benchmarks were computed for the lead model, including IIC (training set) = 0.8590; IIC (invisible training set) = 0.8297; IIC (calibration set) = 0.8796; and IIC (validation set) = 0.8293. The promoters of increase and decrease of the endpoint PCE were also extracted.
Keywords: CORAL; IIC; PCE; QSPR; phenothiazine.
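
As an illustration of the IIC criterion referred to in the abstract, the following is a minimal sketch, not the CORAL implementation. It assumes the commonly published definition, IIC = r × min(MAE⁻, MAE⁺) / max(MAE⁻, MAE⁺), where r is the correlation between observed and calculated values and MAE⁻ and MAE⁺ are the mean absolute errors over compounds with negative and non-negative residuals (observed minus calculated), respectively. The function names `pearson_r` and `iic` are hypothetical, and the handling of a set in which all residuals share one sign is an assumption of this sketch.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def iic(observed, calculated):
    """Index of ideality of correlation for one set (e.g. the calibration set).

    Residuals are observed - calculated. The correlation coefficient is
    scaled by the ratio of the smaller to the larger of the two one-sided
    mean absolute errors, so unbalanced over/under-prediction lowers IIC.
    """
    r = pearson_r(observed, calculated)
    residuals = [o - c for o, c in zip(observed, calculated)]
    negative = [abs(d) for d in residuals if d < 0]
    non_negative = [abs(d) for d in residuals if d >= 0]
    if not negative or not non_negative:
        # Assumption of this sketch: with residuals of one sign only,
        # one of the MAEs is undefined, so IIC is reported as 0.
        return 0.0
    mae_neg = sum(negative) / len(negative)
    mae_pos = sum(non_negative) / len(non_negative)
    return r * min(mae_neg, mae_pos) / max(mae_neg, mae_pos)


if __name__ == "__main__":
    # Made-up numbers for illustration only; not PCE data from the study.
    observed = [1.0, 2.0, 3.0, 4.0]
    calculated = [1.1, 1.5, 3.1, 3.5]
    print("r   =", pearson_r(observed, calculated))
    print("IIC =", iic(observed, calculated))
```

When the errors on both sides of the diagonal are balanced, IIC equals r; the more one-sided the errors, the further IIC falls below r. The weights with which IIC enters the target function TF2 are not given in the abstract and are therefore not sketched here.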