In this presentation, we report the implementation of the consensus naïve Bayesian classifier ISIDA_NB in the ISIDA_QSPR program package. The ensemble modeling workflow implemented in ISIDA_NB includes (i) generation of an ensemble of individual classification models, (ii) selection of the most predictive models using internal and external cross-validation, and (iii) consensus application of the selected models to test-set compounds. Ensembles of individual models are generated by varying the parameters of the naïve Bayesian model (weight of the Laplace correction, score threshold), the variable selection techniques (variable filtering, exclusion of concatenated fragments), and the types of molecular fragments used as descriptors. The program runs under the Windows operating system. The graphical interface guides the user through this workflow and supports analysis of the results. The Consensus Predictor tool performs property prediction and virtual screening using previously obtained consensus models. The program was used to classify organic ligands (L) as weak or strong binders of different metal cations (M) in water. Binders were classified according to a threshold on the stability constant (logK) of the 1:1 (M:L) complexes: ligands with logK greater than 5.5 were considered strong binders. The ISIDA_NB consensus models show good predictive performance in fivefold cross-validation: the balanced accuracy of predictions varies from 0.8 to 0.9 for the studied metal cations: Ag+, Ba2+, Ca2+, Cd2+, Ce3+, Co2+, Cu2+, Dy3+, Er3+, Eu3+, Fe2+, Gd3+, Ho3+, La3+, Lu3+, Mg2+, Mn2+, Nd3+, Ni2+, Pb2+, Pr3+, Sm3+, Sr2+, Tb3+, Tm3+, UO22+, VO2+, Yb3+, Y3+, Zn2+.
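The ideas named in the abstract (labeling by a logK threshold, a Laplace-corrected naïve Bayesian model with a score threshold, consensus application of several models, and balanced accuracy) can be illustrated with a minimal Python sketch. This is not the ISIDA_NB code. It assumes a Bernoulli (fragment present/absent) event model and a simple majority vote, neither of which the abstract specifies. The fragment names, logK values, and function names are hypothetical toy choices. The cross-validation-based model selection step is omitted.

```python
import math
from collections import Counter

LOGK_THRESHOLD = 5.5  # from the abstract: strong binders have logK > 5.5


def label_binder(log_k):
    """Return 1 for a strong binder (logK > 5.5), otherwise 0 (weak)."""
    return 1 if log_k > LOGK_THRESHOLD else 0


def train_nb(fragment_sets, labels, laplace_weight=1.0):
    """Fit a Bernoulli naive Bayes model (an assumed event model).

    laplace_weight must be positive so that no probability is 0 or 1.
    """
    vocab = sorted(set().union(*fragment_sets))
    n = {c: labels.count(c) for c in (0, 1)}
    counts = {c: Counter() for c in (0, 1)}
    for frags, y in zip(fragment_sets, labels):
        counts[y].update(frags)
    # Laplace-corrected estimate of P(fragment present | class)
    prob = {
        c: {f: (counts[c][f] + laplace_weight) / (n[c] + 2 * laplace_weight)
            for f in vocab}
        for c in (0, 1)
    }
    prior = {c: (n[c] + laplace_weight) / (len(labels) + 2 * laplace_weight)
             for c in (0, 1)}
    return vocab, prob, prior


def nb_score(model, frags):
    """Log-odds of 'strong' versus 'weak' for one compound."""
    vocab, prob, prior = model
    score = math.log(prior[1] / prior[0])
    for f in vocab:
        p1, p0 = prob[1][f], prob[0][f]
        if f in frags:
            score += math.log(p1 / p0)
        else:
            score += math.log((1 - p1) / (1 - p0))
    return score


def consensus_predict(models, thresholds, frags):
    """Majority vote over models (assumed rule); a tie counts as weak."""
    votes = [1 if nb_score(m, frags) > t else 0
             for m, t in zip(models, thresholds)]
    return 1 if 2 * sum(votes) > len(votes) else 0


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity; both classes must be present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


# Toy data, invented for illustration only (not from the study).
toy_fragments = [{"N-C", "C=O"}, {"N-C", "O-H"}, {"C-C"}, {"C-C", "C=O"}]
toy_logk = [7.1, 6.0, 2.3, 4.0]
toy_labels = [label_binder(k) for k in toy_logk]

# A small ensemble obtained by varying the Laplace correction weight.
models = [train_nb(toy_fragments, toy_labels, w) for w in (0.5, 1.0, 2.0)]
thresholds = [0.0, 0.0, 0.0]  # one score threshold per model

print(consensus_predict(models, thresholds, {"N-C"}))
print(balanced_accuracy([1, 1, 0, 0], [1, 0, 0, 0]))
```

In this sketch the ensemble varies only the Laplace correction weight. The workflow described above also varies the score threshold, the variable selection, and the fragment types, and keeps only the models that perform best in cross-validation.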