Egor Samosvat
Yandex
Publication
Featured research published by Egor Samosvat.
workshop on algorithms and models for the web graph | 2013
Liudmila Ostroumova; Alexander Ryabchenko; Egor Samosvat
We propose a common framework for analysis of a wide class of preferential attachment models, which includes LCD, Buckley–Osthus, Holme–Kim and many others. The class is defined in terms of constraints that are sufficient for the study of the degree distribution and the clustering coefficient. We also consider a particular parameterized model from the class and illustrate the power of our approach as follows. Applying our general results to this model, we show that both the parameter of the power-law degree distribution and the clustering coefficient can be controlled via variation of the model parameters. In particular, the model turns out to be able to reflect realistically these two quantitative characteristics of a real network, thus performing better than previous preferential attachment models. All our theoretical results are illustrated empirically.
workshop on algorithms and models for the web graph | 2014
Liudmila Ostroumova Prokhorenkova; Egor Samosvat
In this paper, we analyze the behavior of the global clustering coefficient in scale-free graphs. We are especially interested in the case of a degree distribution with an infinite variance, since such degree distributions are usually observed in real-world networks of diverse nature. There are two common definitions of the clustering coefficient of a graph: global clustering and average local clustering. It is widely believed that in real networks both clustering coefficients tend to some positive constant as the networks grow. There are several models for which the average local clustering coefficient tends to a positive constant. On the other hand, there are no models of scale-free networks with an infinite variance of the degree distribution and with a constant global clustering. In this paper we prove that if the degree distribution obeys a power law with an infinite variance, then the global clustering coefficient tends to zero with high probability as the size of the graph grows.
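The two definitions contrasted in this abstract can be made concrete with a small sketch. It is not taken from the paper: the function name `clustering_coefficients` and the convention that nodes of degree below two contribute a local coefficient of zero are assumptions made for illustration.

```python
from itertools import combinations


def clustering_coefficients(edges):
    """Return (global, average local) clustering of an undirected simple graph.

    Assumption: nodes with fewer than two neighbours get local clustering 0.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    closed = 0       # neighbour pairs that are linked (each triangle counted 3 times)
    triples = 0      # all neighbour pairs, i.e. paths of length two by centre node
    local_sum = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        pairs = k * (k - 1) // 2
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        triples += pairs
        closed += links
        local_sum += links / pairs if pairs else 0.0

    global_c = closed / triples if triples else 0.0
    avg_local = local_sum / len(adj) if adj else 0.0
    return global_c, avg_local


# One triangle (0, 1, 2) whose node 0 is also a hub with seven extra leaves.
edges = [(0, 1), (1, 2), (0, 2)] + [(0, leaf) for leaf in range(3, 10)]
global_c, avg_local = clustering_coefficients(edges)
```

On this toy graph the global coefficient is 3/38 (about 0.079) while the average local coefficient is about 0.203: the single high-degree node contributes 36 of the 38 connected triples, so it dominates the global value but counts as only one node in the average.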
conference on information and knowledge management | 2013
Damien Lefortier; Liudmila Ostroumova; Egor Samosvat; Pavel Serdyukov
In this paper, we study the problem of promptly finding and crawling ephemeral new pages, i.e., pages for which user traffic grows very quickly right after they appear but lasts only a few days (e.g., news, blog and forum posts). Traditional crawling policies do not give any particular priority to such pages and may thus crawl them too late, or even crawl content that is already obsolete. We therefore propose a new metric designed for this task, which takes into account the decrease of user interest in ephemeral pages over time. We show that most ephemeral new pages can be found at a relatively small set of content sources and suggest a method for finding such a set. Our idea is to periodically recrawl content sources and crawl newly created pages linked from them, focusing on high-quality (in terms of user interest) content. One of the main difficulties here is to divide resources between these two activities in an efficient way. We find the adaptive balance between crawls and recrawls by maximizing the proposed metric. Further, we incorporate search engine click logs to give our crawler insight into current user demand. The effectiveness of our approach is finally demonstrated experimentally on real-world data.
workshop on algorithms and models for the web graph | 2013
Damien Lefortier; Liudmila Ostroumova; Egor Samosvat
We present a detailed study of the part of the Web related to media content, i.e., the Media Web. Using publicly available data, we analyze the evolution of incoming and outgoing links to and from media pages. Based on our observations, we propose a new class of models for the appearance of new media content on the Web, in which different attractiveness functions of nodes are possible, including ones taken from well-known preferential attachment and fitness models. We analyze these models theoretically and empirically and show which ones realistically predict both the incoming degree distribution and the so-called recency property of the Media Web, something that existing models did not do well. Finally, we compare these models by estimating the likelihood of the real-world link graph from our data set given each model, and find that the models we introduce are significantly more likely than previously proposed ones. One of the most surprising results is that in the Media Web the probability for a post to be cited is determined, most likely, by its quality rather than by its current popularity.
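As a rough illustration of what an attractiveness function could look like, the sketch below picks the page that a new link points to with probability proportional to its attractiveness. The three functional forms, the exponential decay with time constant `tau`, and every name in the code are assumptions made for illustration; the abstract does not give the exact functions used in the paper.

```python
import math
import random


def attractiveness(page, now, kind, tau=1.0):
    """Hypothetical attractiveness of a page at time `now`."""
    age = now - page["created"]
    if kind == "degree":            # preferential-attachment style: current popularity
        return page["in_degree"] + 1.0
    if kind == "quality":           # fitness style: intrinsic quality only
        return page["quality"]
    if kind == "quality_recency":   # quality discounted by age (assumed exponential decay)
        return page["quality"] * math.exp(-age / tau)
    raise ValueError(f"unknown attractiveness kind: {kind}")


def pick_target(pages, now, kind, rng, tau=1.0):
    """Choose the page a new link cites, proportionally to attractiveness."""
    weights = [attractiveness(p, now, kind, tau) for p in pages]
    return rng.choices(pages, weights=weights, k=1)[0]


pages = [
    {"id": "old-popular", "created": 0.0, "in_degree": 50, "quality": 0.2},
    {"id": "fresh-good", "created": 9.0, "in_degree": 1, "quality": 0.9},
]
rng = random.Random(0)
target = pick_target(pages, now=10.0, kind="quality_recency", rng=rng, tau=2.0)
```

Under the `"degree"` form the old, heavily linked page is favoured, whereas under `"quality_recency"` the fresh, higher-quality page receives almost all of the weight, which is the kind of contrast the abstract describes between popularity-driven and quality-driven citation.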
workshop on algorithms and models for the web graph | 2018
Aleksandr Dorodnykh; Liudmila Ostroumova Prokhorenkova; Egor Samosvat
Various models have recently been proposed to reflect and predict different properties of complex networks. However, the community structure, which is one of the most important properties, is not well studied and modeled. In this paper, we suggest a principle called “preferential placement”, which allows one to model a realistic community structure. We provide an extensive empirical analysis of the obtained structure as well as some theoretical heuristics.
Computational Social Networks | 2016
Akmal Artikov; Aleksandr Dorodnykh; Yana Kashinskaya; Egor Samosvat
Background: Several models for producing scale-free networks have been suggested; most of them are based on the preferential attachment approach. In this article, we suggest a new approach for generating scale-free networks with an alternative source of the power-law degree distribution. Methods: The model derives from matrix factorization methods and geographical threshold models that were recently proven to show good results in generating scale-free networks. We associate each node with a vector having latent features distributed over a unit sphere and with a weight variable sampled from a Pareto distribution. We join two nodes by an edge if they are spatially close and/or have large weights. Results and conclusion: The network produced by this approach is scale free and has a power-law degree distribution with an exponent of 2. In addition, we propose an extension of the model that allows us to generate directed networks with tunable power-law exponents.
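The construction described under Methods can be sketched as follows. The abstract does not state the exact connection rule, so the rule used here (join two nodes when the product of their weights and the inner product of their unit vectors reaches a threshold) is an assumption, as are the function names and the parameters `dim`, `alpha` and `threshold`.

```python
import math
import random


def sample_unit_vector(dim, rng):
    """Uniform point on the unit sphere, obtained by normalising a Gaussian vector."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:
            return [x / norm for x in v]


def generate_graph(n, dim=3, alpha=1.0, threshold=5.0, seed=0):
    """Generate an undirected graph from latent positions and Pareto weights."""
    rng = random.Random(seed)
    positions = [sample_unit_vector(dim, rng) for _ in range(n)]
    weights = [rng.paretovariate(alpha) for _ in range(n)]  # heavy-tailed, all >= 1

    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            closeness = sum(a * b for a, b in zip(positions[i], positions[j]))
            # Assumed rule: an edge appears when the nodes are close and/or heavy.
            if weights[i] * weights[j] * closeness >= threshold:
                edges.append((i, j))
    return positions, weights, edges


positions, weights, edges = generate_graph(n=200, seed=42)
```

With this rule, two nodes with modest weights connect only if their vectors point in nearly the same direction, while a node with a very large weight connects to most nodes on its side of the sphere. The quadratic pair loop is kept for clarity and is only practical for small `n`.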
european conference on information retrieval | 2015
Liudmila Ostroumova Prokhorenkova; Yury Ustinovskiy; Egor Samosvat; Damien Lefortier; Pavel Serdyukov
In this paper, we study the problem of caching search results that degrade rapidly. We suggest a new caching algorithm, which is based on query frequencies and the predicted staleness of cached results. We also introduce a new performance metric for caching algorithms called staleness degree, which measures the level of degradation of a cached result. In the case of frequently changing search results, this metric is more sensitive to those changes than the previously used stale traffic ratio.
Journal of Complex Networks | 2016
Liudmila Ostroumova Prokhorenkova; Egor Samosvat
web search and data mining | 2016
Liudmila Ostroumova Prokhorenkova; Petr Vladislavovich Prokhorenkov; Egor Samosvat; Pavel Serdyukov
Archive | 2015
Liudmila Ostroumova Prokhorenkova; Egor Samosvat; Petr Vladislavovich Prokhorenkov; Pavel Serdyukov