Search engine optimization (SEO) is the process of affecting the visibility of a web page in a search engine's results. SEO specialists must understand how search engines work and which features of a web page affect its position in the search results. This paper employs machine-learning ranking algorithms to construct the rank model of a web search engine. Ranking a set of retrieved documents according to their relevance to a given query has become a popular problem at the intersection of web search, machine learning, and information retrieval. Feature selection in learning to rank has recently emerged as a crucial issue. Recent work on ranking has focused on a number of different paradigms, namely point-wise, pair-wise, and list-wise approaches, for which several preprocessing feature selection methods have been proposed. Unfortunately, only a few works have focused on integrating feature selection into the learning process, and all of these embedded methods are based on the l1 regularization technique. This type of regularization lacks many properties essential for SEO, such as unbiasedness, the grouping effect, and the oracle property. In this paper we suggest a new Bayesian framework for feature selection in the learning-to-rank problem. The proposed approach gives a strong probabilistic statement of the shrinkage criterion for feature selection. The proposed regularization is unbiased, has the grouping and oracle properties, and its maximal risk tends to a finite value. Experimental results show that the proposed framework is competitive on both artificial data and the publicly available LETOR data sets.
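For orientation, the l1-regularized embedded baseline that the abstract contrasts itself with can be sketched as follows. This is a minimal illustration, not the Bayesian method proposed in the paper: it assumes a pair-wise logistic loss over a single query, fits it by proximal gradient descent, and uses hypothetical names (`soft_threshold`, `pairwise_l1_rank`, `make_toy_data`) and synthetic data. The soft-thresholding step moves every weight toward zero by the same amount, which is one way to see why l1 shrinkage is biased.

```python
import numpy as np


def soft_threshold(z, lam):
    """Proximal operator of lam * ||z||_1: shrink each entry toward zero by lam."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def pairwise_l1_rank(X, y, lam=0.1, lr=0.1, n_iter=500):
    """Fit a linear scoring function w with a pair-wise logistic loss and an l1 penalty.

    X: (n_docs, n_features) feature matrix for ONE query (assumption of this sketch).
    y: (n_docs,) relevance scores; a pair (i, j) is used whenever y[i] > y[j].
    """
    # Build difference vectors for every ordered pair where doc i should outrank doc j.
    i, j = np.nonzero(y[:, None] > y[None, :])
    D = X[i] - X[j]

    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        margins = D @ w
        # sigmoid(-margin), written with tanh for numerical stability.
        s = 0.5 * (1.0 - np.tanh(margins / 2.0))
        # Gradient of mean(log(1 + exp(-margin))) with respect to w.
        grad = -(D * s[:, None]).mean(axis=0)
        # Gradient step on the smooth loss, then the l1 proximal step.
        w = soft_threshold(w - lr * grad, lr * lam)
    return w


def make_toy_data(n_docs=200, n_features=5, seed=0):
    """Synthetic data: only the first two features determine relevance."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_docs, n_features))
    y = 2.0 * X[:, 0] - X[:, 1]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    w = pairwise_l1_rank(X, y)
    print("learned weights:", np.round(w, 3))
```

In this toy setting the penalty is expected to keep the weights of the three irrelevant features at or near zero, while the two informative weights are themselves shrunk. The unbiasedness, grouping, and oracle properties named in the abstract are the properties that this kind of penalty does not provide.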