A novel method for predicting the regulatory potential of single-nucleotide variants (SNVs) was proposed earlier during the CAGI 2018 "Regulation Saturation" challenge (Penzar et al., 2018). The contest data, which cover 9 promoters and 5 enhancers and include the gene expression change and a confidence value for every possible SNV, were used to train a Random Forest regressor on top of DeepSEA-processed data (the baseline). This baseline achieved the top prediction quality among the CAGI solutions. Here, we developed a set of features that improve the prediction quality and reduce overfitting of the baseline model, making it more robust.
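To make the baseline setup concrete, the following is a minimal sketch of training a Random Forest regressor on per-SNV feature vectors and checking for overfitting by holding out whole regulatory regions. It is an illustration under stated assumptions, not the authors' pipeline: the data are synthetic placeholders rather than DeepSEA outputs or CAGI measurements, and both the use of confidence values as sample weights and the region-wise cross-validation are assumptions that the abstract does not confirm.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumptions, NOT the CAGI data or real DeepSEA features).
n_regions = 14          # 9 promoters + 5 enhancers, as in the abstract
snvs_per_region = 20    # hypothetical; real regions have many more SNVs
n_snvs = n_regions * snvs_per_region
n_features = 20         # hypothetical number of DeepSEA-derived features

X = rng.normal(size=(n_snvs, n_features))                 # per-SNV feature matrix
y = 0.5 * X[:, 0] + rng.normal(scale=0.1, size=n_snvs)    # placeholder expression change
confidence = rng.uniform(0.1, 1.0, size=n_snvs)           # placeholder confidence values
regions = np.repeat(np.arange(n_regions), snvs_per_region)  # region label of each SNV

# Hold out whole regions per fold so that the score reflects generalization
# to unseen regulatory elements rather than memorization of a region.
fold_corrs = []
for train_idx, test_idx in GroupKFold(n_splits=7).split(X, y, groups=regions):
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    # Assumption: confidence values are used as sample weights during fitting.
    model.fit(X[train_idx], y[train_idx], sample_weight=confidence[train_idx])
    preds = model.predict(X[test_idx])
    fold_corrs.append(np.corrcoef(preds, y[test_idx])[0, 1])

mean_corr = float(np.mean(fold_corrs))
print(f"Mean held-out Pearson correlation: {mean_corr:.3f}")
```

Comparing the held-out correlation with the correlation on the training regions gives a rough measure of overfitting; a new feature set would be expected to narrow that gap if it makes the model more robust.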