Filippo Menczer | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Filippo Menczer is active.

Explore More

Publication

Featured researches published by Filippo Menczer.

international world wide web conferences | 2011

Truthy: mapping the spread of astroturf in microblog streams

Jacob Ratkiewicz; Michael Conover; Mark R. Meiss; Bruno Gonçalves; Snehal Patil; Alessandro Flammini; Filippo Menczer

Online social media are complementing and in some cases replacing person-to-person social interaction and redefining the diffusion of information. In particular, microblogs have become crucial grounds on which public relations, marketing, and political battles are fought. We demonstrate a web service that tracks political memes in Twitter and helps detect astroturfing, smear campaigns, and other misinformation in the context of U.S. political elections. We also present some cases of abusive behaviors uncovered by our service. Our web service is based on an extensible framework that will enable the real-time analysis of meme diffusion in social media by mining, visualizing, mapping, classifying, and modeling massive streams of public microblogging events.

Scientific Reports | 2012

Competition among memes in a world with limited attention

Lilian Weng; Alessandro Flammini; Alessandro Vespignani; Filippo Menczer

The wide adoption of social media has increased the competition among ideas for our finite attention. We employ a parsimonious agent-based model to study whether such a competition may affect the popularity of different memes, the diversity of information we are exposed to, and the fading of our collective interests for specific topics. Agents share messages on a social network but can only pay attention to a portion of the information they receive. In the emerging dynamics of information diffusion, a few memes go viral while most do not. The predictions of our model are consistent with empirical data from Twitter, a popular microblogging platform. Surprisingly, we can explain the massive heterogeneity in the popularity and persistence of memes as deriving from a combination of the competition for our limited attention and the structure of the social network, without the need to assume different intrinsic values among ideas.

international world wide web conferences | 2009

Evaluating similarity measures for emergent semantics of social tagging

Benjamin Markines; Ciro Cattuto; Filippo Menczer; Dominik Benz; Andreas Hotho; Gerd Stumme

Social bookmarking systems are becoming increasingly important data sources for bootstrapping and maintaining Semantic Web applications. Their emergent information structures have become known as folksonomies. A key question for harvesting semantics from these systems is how to extend and adapt traditional notions of similarity to folksonomies, and which measures are best suited for applications such as community detection, navigation support, semantic search, user profiling and ontology learning. Here we build an evaluation framework to compare various general folksonomy-based similarity measures, which are derived from several established information-theoretic, statistical, and practical measures. Our framework deals generally and symmetrically with users, tags, and resources. For evaluation purposes we focus on similarity between tags and between resources and consider different methods to aggregate annotations across users. After comparing the ability of several tag similarity measures to predict user-created tag relations, we provide an external grounding by user-validated semantic proxies based on WordNet and the Open Directory Project. We also investigate the issue of scalability. We find that mutual information with distributional micro-aggregation across users yields the highest accuracy, but is not scalable; per-user projection with collaborative aggregation provides the best scalable approach via incremental computations. The results are consistent across resource and tag similarity.

ACM Transactions on Internet Technology | 2004

Topical web crawlers: Evaluating adaptive algorithms

Filippo Menczer; Gautam Pant; Padmini Srinivasan

Topical crawlers are increasingly seen as a way to address the scalability limitations of universal search engines, by distributing the crawling process across users, queries, or even client computers. The context available to such crawlers can guide the navigation of links with the goal of efficiently locating highly relevant target pages. We developed a framework to fairly evaluate topical crawling algorithms under a number of performance metrics. Such a framework is employed here to evaluate different algorithms that have proven highly competitive among those proposed in the literature and in our own previous research. In particular we focus on the tradeoff between exploration and exploitation of the cues available to a crawler, and on adaptive crawlers that use machine learning techniques to guide their search. We find that the best performance is achieved by a novel combination of explorative and exploitative bias, and introduce an evolutionary crawler that surpasses the performance of the best nonadaptive crawler after sufficiently long crawls. We also analyze the computational complexity of the various crawlers and discuss how performance and complexity scale with available resources. Evolutionary crawlers achieve high efficiency and scalability by distributing the work across concurrent agents, resulting in the best performance/cost ratio.

Scientific Reports | 2013

Virality Prediction and Community Structure in Social Networks

Lilian Weng; Filippo Menczer; Yong-Yeol Ahn

How does network structure affect diffusion? Recent studies suggest that the answer depends on the type of contagion. Complex contagions, unlike infectious diseases (simple contagions), are affected by social reinforcement and homophily. Hence, the spread within highly clustered communities is enhanced, while diffusion across communities is hampered. A common hypothesis is that memes and behaviors are complex contagions. We show that, while most memes indeed spread like complex contagions, a few viral memes spread across many communities, like diseases. We demonstrate that the future popularity of a meme can be predicted by quantifying its early spreading pattern in terms of community concentration. The more communities a meme permeates, the more viral it is. We present a practical method to translate data about community structure into predictive knowledge about what information will spread widely. This connection contributes to our understanding in computational social science, social media analytics, and marketing applications.

Communications of The ACM | 2016

The rise of social bots

Emilio Ferrara; Onur Varol; Clayton A. Davis; Filippo Menczer; Alessandro Flammini

Todays social bots are sophisticated and sometimes menacing. Indeed, their presence can endanger online ecosystems as well as our society.

international world wide web conferences | 2005

Algorithmic detection of semantic similarity

Ana Gabriela Maguitman; Filippo Menczer; Heather Roinestad; Alessandro Vespignani

Automatic extraction of semantic information from text and links in Web pages is key to improving the quality of search results. However, the assessment of automatic semantic measures is limited by the coverage of user studies, which do not scale with the size, heterogeneity, and growth of the Web. Here we propose to leverage human-generated metadata --- namely topical directories --- to measure semantic relationships among massive numbers of pairs of Web pages or topics. The Open Directory Project classifies millions of URLs in a topical ontology, providing a rich source from which semantic relationships between Web pages can be derived. While semantic similarity measures based on taxonomies (trees) are well studied, the design of well-founded similarity measures for objects stored in the nodes of arbitrary ontologies (graphs) is an open problem. This paper defines an information-theoretic measure of semantic similarity that exploits both the hierarchical and non-hierarchical structure of an ontology. An experimental study shows that this measure improves significantly on the traditional taxonomy-based approach. This novel measure allows us to address the general question of how text and link analyses can be combined to derive measures of relevance that are in good agreement with semantic similarity. Surprisingly, the traditional use of text similarity turns out to be ineffective for relevance ranking.

knowledge discovery and data mining | 2000

Feature selection in unsupervised learning via evolutionary search

YeongSeog Kim; W. Nick Street; Filippo Menczer

Feature subset selection is an important problem in knowledge discovery, not only for the insight gained from determining relevant modeling variables but also for the improved understandability, scalabilit y, and possibly , accuracy of the resulting models. In this paper w e consider the problem of feature selection for unsupervised learning. A number of heuristic criteria can be used to estimate the quality of clusters built from a giv en featuresubset. Rather than combining such criteria, we use ELSA, an evolutionary local selection algorithm that maintains a diverse population of solutions that approximate the Pareto front in a multidimensional objectiv espace. Eac hevolved solution represents a feature subset and a number of clusters; a standard K-means algorithm is applied to form the given n umber of clusters based on the selected features. Preliminary results on both real and synthetic data show promise in nding P areto-optimal solutions through which we can identify the signi cant features and the correct number of clusters.

international acm sigir conference on research and development in information retrieval | 2001

Evaluating topic-driven web crawlers

Filippo Menczer; Gautam Pant; Padmini Srinivasan; Miguel E. Ruiz

Due to limited bandwidth, storage, and computational resources, and to the dynamic nature of the Web, search engines cannot index every Web page, and even the covered portion of the Web cannot be monitored continuously for changes. Therefore it is essential to develop effective crawling strategies to prioritize the pages to be indexed. The issue is even more important for topic-specific search engines, where crawlers must make additional decisions based on the relevance of visited pages. However, it is difficult to evaluate alternative crawling strategies because relevant sets are unknown and the search space is changing. We propose three different methods to evaluate crawling strategies. We apply the proposed metrics to compare three topic-driven crawling algorithms based on similarity ranking, link analysis, and adaptive agents.

Physical Review Letters | 2010

Characterizing and modeling the dynamics of online popularity

Jacob Ratkiewicz; Santo Fortunato; Alessandro Flammini; Filippo Menczer; Alessandro Vespignani

Online popularity has an enormous impact on opinions, culture, policy, and profits. We provide a quantitative, large scale, temporal analysis of the dynamics of online content popularity in two massive model systems: the Wikipedia and an entire countrys Web space. We find that the dynamics of popularity are characterized by bursts, displaying characteristic features of critical systems such as fat-tailed distributions of magnitude and interevent time. We propose a minimal model combining the classic preferential popularity increase mechanism with the occurrence of random popularity shifts due to exogenous factors. The model recovers the critical features observed in the empirical analysis of the systems analyzed here, highlighting the key factors needed in the description of popularity dynamics.

Explore More