Fabian Flöck
Karlsruhe Institute of Technology
Publications
Featured research published by Fabian Flöck.
international world wide web conferences | 2014
Philipp Singer; Fabian Flöck; Clemens Meinhart; Elias Zeitfogel; Markus Strohmaier
In the past few years, Reddit -- a community-driven platform for submitting, commenting on and rating links and text posts -- has grown exponentially, from a small community of users into one of the largest online communities on the Web. To the best of our knowledge, this work represents the most comprehensive longitudinal study of Reddit's evolution to date, studying both (i) how user submissions have evolved over time and (ii) how the community's allocation of attention and its perception of submissions have changed over 5 years, based on an analysis of almost 60 million submissions. Our work reveals an ever-increasing diversification of topics accompanied by a simultaneous concentration towards a few selected domains, both in terms of posted submissions and in terms of perception and attention. By and large, our investigations suggest that Reddit has transformed itself from a dedicated gateway to the Web into an increasingly self-referential community that focuses on and reinforces its own user-generated image and text content over external sources.
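The "concentration towards a few selected domains" mentioned above is, at its core, a statement about how unevenly submissions are distributed over link targets in a given year. As a rough illustration only, and not the paper's actual methodology, such concentration could be quantified with a normalized entropy over hypothetical (year, domain) records:

```python
# Hypothetical sketch: quantifying how strongly submissions concentrate on a few
# domains per year, using normalized Shannon entropy over domain counts.
# The input format and the metric choice are assumptions, not the paper's method.
from collections import Counter, defaultdict
from math import log2

def yearly_domain_concentration(submissions):
    """submissions: iterable of (year, domain) tuples."""
    by_year = defaultdict(Counter)
    for year, domain in submissions:
        by_year[year][domain] += 1

    concentration = {}
    for year, counts in by_year.items():
        total = sum(counts.values())
        probs = [c / total for c in counts.values()]
        entropy = -sum(p * log2(p) for p in probs)
        max_entropy = log2(len(counts)) if len(counts) > 1 else 1.0
        # 1.0 = all submissions point to a single domain, 0.0 = evenly spread
        concentration[year] = 1.0 - entropy / max_entropy
    return concentration

print(yearly_domain_concentration([
    (2008, "imgur.com"), (2008, "nytimes.com"), (2008, "self.reddit"),
    (2012, "imgur.com"), (2012, "imgur.com"), (2012, "self.reddit"),
]))
```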
web science | 2011
Fabian Flöck; Denny Vrandečić; Elena Simperl
Wikipedia is a top-ten Web site providing a free encyclopedia created by an open community of volunteer contributors. As various studies over the past years have shown, contributors have different backgrounds, mindsets and biases; however, the effects - positive and negative - of this diversity on the quality of Wikipedia content and on the sustainability of the overall project are as yet only partially understood. In this paper we discuss these effects through an analysis of the existing scholarly literature in the area and identify directions for future research and development. We also present an approach for diversity-minded content management within Wikipedia that combines techniques from semantic technologies, data and text mining, and quantitative social dynamics analysis. The approach aims to create greater awareness of diversity-related issues within the Wikipedia community, to give readers access to indicators and metrics for understanding biases and their impact on the quality of Wikipedia articles, and to support editors in achieving balanced versions of these articles that leverage the wealth of knowledge and perspectives inherent to large-scale collaboration.
international world wide web conferences | 2014
Fabian Flöck; Maribel Acosta
Revisioned text content is present in numerous collaboration platforms on the Web, most notably wikis. Tracking the authorship of text tokens in such systems has many potential applications, such as identifying the main authors of a document for licensing reasons or tracing collaborative writing patterns over time. In this context, two main challenges arise. First, an authorship tracking system must be precise in its attributions to be reliable for further processing. Second, it has to run efficiently even on very large datasets, such as Wikipedia. As a solution, we propose a graph-based model for representing revisioned content and an algorithm over this model that tackles both issues effectively. We describe the implementation and the design choices made when tuning it to a wiki environment. We further present a gold standard of 240 tokens from English Wikipedia articles annotated with their origin. This gold standard was created manually and confirmed by multiple independent users of a crowdsourcing platform. It is the first gold standard of this kind and quality, and our solution achieves an average of 95% precision on this data set. We also perform a first-ever precision evaluation of the state-of-the-art algorithm for the task, which our approach exceeds by over 10% on average. Our approach also outperforms the execution time of the state of the art by one order of magnitude, as we demonstrate on a sample of over 240 English Wikipedia articles. We argue that the roughly 10% increase in the size of an optional materialization of our results compared to the baseline is a favorable trade-off, given the large advantage in runtime performance.
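To make the task concrete: token-level authorship attribution credits every word of the current revision to the revision that introduced it. The sketch below is a deliberately naive diff-based baseline under assumed inputs, not the graph-based model proposed in the paper:

```python
# Minimal, simplified sketch of token-level authorship attribution: each token is
# credited to the first revision that introduced it, using a plain diff to align
# successive revisions. Illustrative baseline only, not the paper's graph model.
import difflib

def attribute_tokens(revisions):
    """revisions: list of (author, text) in chronological order.
    Returns a list of (token, author) for the latest revision."""
    attributed = []  # (token, author) pairs for the current revision
    for author, text in revisions:
        new_tokens = text.split()
        old_tokens = [tok for tok, _ in attributed]
        matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
        result = []
        for op, i1, i2, j1, j2 in matcher.get_opcodes():
            if op == "equal":
                result.extend(attributed[i1:i2])                       # keep original author
            elif op in ("replace", "insert"):
                result.extend((t, author) for t in new_tokens[j1:j2])  # newly added content
            # "delete": tokens simply vanish from the current revision
        attributed = result
    return attributed

revs = [("alice", "the quick fox"), ("bob", "the quick brown fox jumps")]
print(attribute_tokens(revs))
```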
acm conference on hypertext | 2012
Fabian Flöck; Denny Vrandečić; Elena Simperl
Wikipedia is commonly used as a proving ground for research in collaborative systems. This is likely due to its popularity and scale, but also to the fact that large amounts of data about its formation and evolution are freely available to inform and validate theories and models of online collaboration. As part of the development of such approaches, revert detection is often performed as an important pre-processing step in tasks as diverse as the extraction of implicit networks of editors, the analysis of edit or editor features, and the removal of noise when analyzing the emergence of the content of an article. The current state of the art in revert detection is based on a rather naive approach that identifies revision duplicates based on MD5 hash values. This is an efficient, but not very precise, technique that forms the basis for the majority of research on revert relations in Wikipedia. In this paper we show that this method has a number of important drawbacks: it only detects a limited number of reverts, while simultaneously misclassifying too many edits as reverts, and it does not distinguish between complete and partial reverts. This is very likely to hamper the accurate interpretation of the findings of revert-related research. We introduce an improved algorithm for the detection of reverts, based on added and deleted word tokens, that addresses these drawbacks. We report on the results of a user study and other tests demonstrating the considerable gains in accuracy and coverage achieved by our method, and argue that, in certain research scenarios, these improvements constitute a positive trade-off against our algorithm's increased runtime.
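For context, the hash-based baseline the abstract criticizes can be illustrated as follows. This is a simplified sketch of that naive identity-revert detection, not the improved token-based algorithm the paper introduces:

```python
# Sketch of naive hash-based revert detection: a revision is flagged as a revert
# if its full text hashes to the same MD5 value as an earlier revision of the
# article. By construction this only catches exact restorations and misses
# partial reverts, which is the limitation discussed in the abstract.
import hashlib

def detect_identity_reverts(revision_texts):
    """revision_texts: article revision strings in chronological order.
    Returns (reverting revision index, index of the revision it restores)."""
    seen = {}
    reverts = []
    for idx, text in enumerate(revision_texts):
        digest = hashlib.md5(text.encode("utf-8")).hexdigest()
        if digest in seen:
            reverts.append((idx, seen[digest]))
        else:
            seen[digest] = idx
    return reverts

print(detect_identity_reverts(["A B C", "A B C D", "A B C"]))  # -> [(2, 0)]
```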
international conference on knowledge capture | 2015
Maribel Acosta; Elena Simperl; Fabian Flöck; Maria-Esther Vidal
Due to the semi-structured nature of RDF data, missing values affect the completeness of answers to queries posed against RDF data sets. To overcome this limitation, we present HARE, a novel hybrid query processing engine that brings together machine and human computation to execute SPARQL queries. We propose a model that exploits the characteristics of RDF in order to estimate the completeness of portions of a data set. This completeness model, complemented by crowd knowledge, is used by the HARE query engine to decide on the fly which parts of a query should be executed against the data set and which should be resolved via crowd computing. To evaluate HARE, we created and executed a collection of 50 SPARQL queries against the DBpedia data set. Experimental results clearly show that our solution accurately enhances answer completeness.
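The core routing idea can be pictured as a threshold decision per triple pattern. The sketch below is a hedged illustration of that idea under assumed names and a toy completeness estimator; it is not HARE's actual completeness model or engine API:

```python
# Hedged sketch of hybrid query routing: triple patterns whose estimated
# completeness falls below a threshold are sent to the crowd, the rest to the
# RDF store. Threshold, estimator and function names are illustrative assumptions.
THRESHOLD = 0.8

def route_triple_patterns(patterns, estimate_completeness):
    """patterns: list of triple patterns (s, p, o), with None for variables."""
    to_dataset, to_crowd = [], []
    for pattern in patterns:
        if estimate_completeness(pattern) >= THRESHOLD:
            to_dataset.append(pattern)
        else:
            to_crowd.append(pattern)
    return to_dataset, to_crowd

# Toy completeness estimate: pretend birthPlace is well covered, spouse is not.
coverage = {"dbo:birthPlace": 0.95, "dbo:spouse": 0.4}
est = lambda p: coverage.get(p[1], 0.5)

print(route_triple_patterns(
    [("dbr:Alice", "dbo:birthPlace", None), ("dbr:Alice", "dbo:spouse", None)], est))
```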
Handbook of Human Computation | 2013
Elena Simperl; Maribel Acosta; Fabian Flöck
In this chapter, we will analyze a number of essential knowledge engineering activities that, for technical or principled reasons, can hardly be optimally executed through automatic processing approaches, thus remaining heavily reliant on human intervention. Human computation methods can be applied to this field in order to overcome these limitations in terms of accuracy, while still being able to fully take advantage of the scalability and performance of machine-driven capabilities. For each activity, we will explain how this symbiosis can be achieved by giving a short overview of the state of the art and several examples of systems and applications such as games-with-a-purpose, microtask crowdsourcing projects, and community-driven collaborative initiatives that showcase the benefits of the general idea.
international world wide web conferences | 2015
Fabian Flöck; Maribel Acosta
The visualization of editor interaction dynamics and of the provenance of content in revisioned, collaboratively written documents has the potential to allow for more transparency and a more intuitive understanding of the intricate mechanisms inherent to collective content production. Although approaches exist to build editor interaction networks from individual word changes in Wikipedia articles, they do not allow users to inquire into individual interactions and have yet to be implemented as usable end-user tools. We thus present whoVIS, a web tool to mine and visualize editor interactions in Wikipedia over time. whoVIS integrates novel features with existing methods, tailoring them to the use case of understanding intra-article disagreement between editors. Using real Wikipedia examples, our system demonstrates how a combination of visualization techniques can identify different social dynamics and support exploring the evolution of an article in ways that would be particularly hard for end-users to investigate otherwise.
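As a rough illustration of what "building editor interactions from individual word changes" can mean, the sketch below derives a weighted interaction graph from word-level deletions. The input format and the "deletion = disagreement" heuristic are assumptions for illustration, not whoVIS's exact mining procedure:

```python
# Illustrative sketch: derive an editor interaction network from word-level
# provenance, weighting an edge (A -> B) by how many of B's words A deleted.
from collections import defaultdict

def build_interaction_graph(deletions):
    """deletions: iterable of (deleting_editor, original_author), one per deleted word."""
    edges = defaultdict(int)
    for deleter, author in deletions:
        if deleter != author:              # self-deletions are not interactions
            edges[(deleter, author)] += 1
    return dict(edges)

print(build_interaction_graph([
    ("bob", "alice"), ("bob", "alice"), ("alice", "bob"), ("carol", "carol"),
]))
# -> {('bob', 'alice'): 2, ('alice', 'bob'): 1}
```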
Companion of The Web Conference 2018 (WWW '18) | 2018
Maribel Acosta; Elena Simperl; Fabian Flöck; Maria-Esther Vidal
We propose HARE, a SPARQL query engine that combines human and machine query processing to augment the completeness of query answers. We empirically assess the effectiveness of HARE on 50 SPARQL queries over DBpedia. Experimental results clearly show that our solution accurately enhances answer completeness.
web science | 2017
Olga Zagovora; Fabian Flöck; Claudia Wagner
Previous research has shown the existence of gender biases in the depiction of professions and occupations in search engine results. Such an unbalanced presentation might just as likely occur on Wikipedia, one of the most popular knowledge resources on the Web, since the encyclopedia has already been found to exhibit such tendencies in past studies. Under this premise, our work assesses gender bias in the content of German Wikipedia articles about professions and occupations along three dimensions: the male vs. female titles used (and their redirects), the images of persons included, and the names of professionals mentioned in the articles. We further use German labor market data to assess the potential misrepresentation of a gender for each specific profession. Our findings provide evidence for a systematic over-representation of men on all three dimensions. For instance, for professional fields dominated by women, the respective articles on average still feature almost twice as many images of men; and on average, 83% of the mentioned names of professionals were male and only 17% female.
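The comparison against labor market data boils down to a per-profession tally. The following is a hedged sketch of that kind of computation under assumed field names and data layout, not the study's actual pipeline:

```python
# Hedged sketch: for each profession article, compare the share of men among
# depicted persons or mentioned names with the male share in the corresponding
# labor-market statistics. Field names and data layout are illustrative assumptions.
def gender_gap(articles, labor_market_male_share):
    """articles: dict profession -> {"male": count, "female": count};
    labor_market_male_share: dict profession -> share of men in that occupation."""
    gaps = {}
    for profession, counts in articles.items():
        total = counts["male"] + counts["female"]
        if total == 0:
            continue
        article_male_share = counts["male"] / total
        gaps[profession] = article_male_share - labor_market_male_share[profession]
    return gaps

print(gender_gap(
    {"Nurse": {"male": 6, "female": 4}},   # images/names counted in the article
    {"Nurse": 0.2},                        # share of men in the occupation
))  # -> {'Nurse': 0.4}, i.e. men over-represented by 40 percentage points
```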
Archive | 2016
Paul Groth; Elena Simperl; Alasdair J. G. Gray; Marta Sabou; Markus Krötzsch; Freddy Lécué; Fabian Flöck; Yolanda Gil
Table of contents (excerpt):
- Meaning Representations as Linked Data (Gully A. Burns, Ulf Hermjakob, and José Luis Ambite)
- Interoperability for Smart Appliances in the IoT World (Laura Daniele, Monika Solanki, Frank den Hartog, and Jasper Roes)
- An Ontology of Soil Properties and Processes (Heshan Du, Vania Dimitrova, Derek Magee, Ross Stirling, Giulio Curioni, Helen Reeves, Barry Clarke, and Anthony Cohn)
- LODStats: The Data Web Census Dataset (Ivan Ermilov, Jens Lehmann, Michael Martin, and Sören Auer)
- Zhishi.lemon: On Publishing Zhishi.me as Linguistic Linked Open Data (Zhijia Fang, Haofen Wang, Jorge Gracia, Julia Bosque-Gil, and Tong Ruan)
- Linked Disambiguated Distributional Semantic Networks (Stefano Faralli, Alexander Panchenko, Chris Biemann, and Simone P. Ponzetto)
- BESDUI: A Benchmark for End-User Structured Data User Interfaces (Roberto García, Rosa Gil, Juan Manuel Gimeno, Eirik Bakke, and David R. Karger)
- SPARQLGX: Efficient Distributed Evaluation of SPARQL with Apache Spark (Damien Graux, Louis Jachiet, Pierre Genevès, and Nabil Layaïda)
- Querying Wikidata: Comparing SPARQL, Relational and Graph Databases (Daniel Hernández, Aidan Hogan, Cristian Riveros, Carlos Rojas, and Enzo Zerega)
- Clinga: Bringing Chinese Physical and Human Geography in Linked Open Data (Wei Hu, Haoxuan Li, Zequn Sun, Xinqi Qian, Lingkun Xue, Ermei Cao, and Yuzhong Qu)
- LinkGen: Multipurpose Linked Data Generator (Amit Krishna Joshi, Pascal Hitzler, and Guozhu Dong)
- OntoBench: Generating Custom OWL 2 Benchmark Ontologies (Vincent Link, Steffen Lohmann, and Florian Haag)
- Linked Data (in Low-Resource) Platforms: A Mapping for Constrained Application Protocol (Giuseppe Loseto, Saverio Ieva, Filippo Gramegna, Michele Ruta, Floriano Scioscia, and Eugenio Di Sciascio)
- TripleWave: Spreading RDF Streams on the Web (Andrea Mauri, Jean-Paul Calbimonte, Daniele Dell'Aglio, Marco Balduini, Marco Brambilla, Emanuele Della Valle, and Karl Aberer)
- Conference Linked Data: The ScholarlyData Project (Andrea Giovanni Nuzzolese, Anna Lisa Gentile, Valentina Presutti, and Aldo Gangemi)
- The OWL Reasoner Evaluation (ORE) 2015 Resources (Bijan Parsia, Nicolas Matentzoglu, Rafael S. Gonçalves, Birte Glimm, and Andreas Steigmiller)
- FOOD: FOod in Open Data (Silvio Peroni, Giorgia Lodi, Luigi Asprino, Aldo Gangemi, and Valentina Presutti)
- YAGO: A Multilingual Knowledge Base from Wikipedia, Wordnet, and Geonames (Thomas Rebele, Fabian Suchanek, Johannes Hoffart, Joanna Biega, Erdal Kuzey, and Gerhard Weikum)
- A Collection of Benchmark Datasets for Systematic Evaluations of Machine Learning on the Semantic Web (Petar Ristoski, Gerben Klaas Dirk de Vries, and Heiko Paulheim)
- Enabling Combined Software and Data Engineering at Web-Scale: The ALIGNED Suite of Ontologies (Monika Solanki, Bojan Božić, Markus Freudenberg, Dimitris Kontokostas, Christian Dirschl, and Rob Brennan)
- A Replication Study of the Top Performing Systems in SemEval Twitter Sentiment Analysis (Efstratios Sygkounas, Giuseppe Rizzo, and Raphaël Troncy)