Loss landscapes have been actively studied for parametric models such as deep neural networks, offering theoretical and practical insights. However, another popular class of machine learning algorithms, ensembles of decision trees (e.g., GBDT), lacks such analysis due to the complex nature of training and the absence of parameters optimized by gradient descent. To overcome this challenge, we consider an optimization problem in which the leaf weights of a fixed set of trees are optimized by gradient descent, which reveals several surprising phenomena about tree-based ensembles. First, we show that optimizing the leaf weights of decision trees by gradient descent from a random initialization attains the same or better test performance than the originally trained models while uncovering a new set of weights. Furthermore, we identify that the intrinsic dimension, i.e., the smallest number of parameters sufficient to reach a solution, is much smaller than the number of leaves, and optimizing in this reduced space often yields performance superior to that of the trained ensemble. By comparing intrinsic dimensions across ensembles, we find that models with decision trees of various depths preserve the high quality of the solution while offering a significant training speedup. Finally, contrary to the common belief that the first trees of gradient boosting are more powerful than the last ones, we argue that all trees are created equal for GBDT instances trained with gradient descent. This has a profound implication: different ensembles explore different families of decision trees, and once the right family has been chosen by the model, it is trivial to set the optimal weights for the trees. This latter result suggests new ways of designing ensembles and elucidates differences between state-of-the-art decision tree models.
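
To make the underlying optimization problem concrete, the following minimal sketch (not the authors' implementation) re-learns the leaf weights of a fixed ensemble by plain gradient descent on a squared-error loss. The leaf-assignment matrix, the targets, and all sizes below are hypothetical illustration data; in practice the leaf assignments would come from an already trained GBDT model.

import numpy as np

# Hypothetical problem sizes and data for illustration only.
rng = np.random.default_rng(0)
n_samples, n_trees, n_leaves = 1000, 50, 8

# leaf_index[i, t] = index of the leaf that sample i reaches in tree t
# (in practice obtained from a trained ensemble, not generated randomly).
leaf_index = rng.integers(0, n_leaves, size=(n_samples, n_trees))
y = rng.normal(size=n_samples)                        # regression targets

# Random starting point for the leaf weights, one weight per leaf per tree.
w = rng.normal(scale=0.1, size=(n_trees, n_leaves))

lr = 0.5
for step in range(200):
    # Ensemble prediction: sum over trees of the weight of the leaf each sample falls into.
    pred = w[np.arange(n_trees), leaf_index].sum(axis=1)
    resid = pred - y                                  # dL/dpred for the 0.5*MSE loss
    # Gradient w.r.t. each leaf weight: accumulate residuals of the samples in that leaf.
    grad = np.zeros_like(w)
    for t in range(n_trees):
        np.add.at(grad[t], leaf_index[:, t], resid)
    w -= lr * grad / n_samples                        # gradient-descent update

final_mse = np.mean((w[np.arange(n_trees), leaf_index].sum(axis=1) - y) ** 2)
print("final train MSE:", final_mse)

The intrinsic-dimension experiments described above can be viewed as the same procedure with w constrained to a low-dimensional subspace (optimizing d parameters that are linearly mapped onto all leaf weights); that variant is omitted here for brevity.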