Publication


Featured research published by Ran Yu.


international semantic web conference | 2016

Towards Entity Summarisation on Structured Web Markup

Ran Yu; Ujwal Gadiraju; Xiaofei Zhu; Besnik Fetahu; Stefan Dietze

Embedded markup based on Microdata, RDFa, and Microformats has become prevalent on the Web and constitutes an unprecedented source of data. However, statements extracted from markup are fundamentally different from traditional RDF graphs: entity descriptions are flat, facts are highly redundant and granular, and co-references are very frequent yet explicit links are missing. Therefore, typical entity-centric tasks such as retrieval and summarisation cannot be tackled sufficiently with state-of-the-art methods. We present an entity summarisation approach that overcomes such issues through a combination of entity retrieval and summarisation techniques geared towards the specific challenges associated with embedded markup. We perform a preliminary evaluation on a subset of the Web Data Commons dataset and show improvements over existing entity retrieval baselines. In addition, an investigation into the coverage and complementarity of facts from the constructed entity summaries shows potential for aiding tasks such as knowledge base population.


international conference on data engineering | 2017

FuseM: Query-Centric Data Fusion on Structured Web Markup

Ran Yu; Ujwal Gadiraju; Besnik Fetahu; Stefan Dietze

Embedded markup based on Microdata, RDFa, and Microformats has become prevalent on the Web and constitutes an unprecedented source of data. However, RDF statements extracted from markup are fundamentally different from traditional RDF graphs: entity descriptions are flat, facts are highly redundant, and despite very frequent co-references explicit links are missing. Therefore, typical entity-centric tasks such as retrieval and summarisation cannot be tackled sufficiently with state-of-the-art methods and require preliminary data fusion. Given the scale and dynamics of Web markup, the applicability of general data fusion approaches is limited. We present a novel query-centric data fusion approach which overcomes such issues through a combination of entity retrieval and fusion techniques geared towards the specific challenges associated with embedded markup. To ensure precise and diverse entity descriptions, we follow a supervised learning approach and train a classifier for data fusion of a pool of candidate facts relevant to a given query and obtained through a preliminary entity retrieval step. We perform a thorough evaluation on a subset of the Web Data Commons dataset and show significant improvement over existing baselines. In addition, an investigation into the coverage and complementarity of facts from the constructed entity descriptions compared to DBpedia shows potential for aiding tasks such as knowledge base population.


International Workshop on Semantic, Analytics, Visualization | 2016

Analysing Structured Scholarly Data Embedded in Web Pages

Pracheta Sahoo; Ujwal Gadiraju; Ran Yu; Sriparna Saha; Stefan Dietze

Web pages increasingly embed structured data in the form of microdata, microformats and RDFa. Through efforts such as schema.org, such embedded markup has become prevalent, with current studies estimating an adoption by about 26% of all web pages. Similar to the early adoption of Linked Data principles by publishers, libraries and other providers of bibliographic data, such organisations have been among the early adopters, providing an unprecedented source of structured data about scholarly works. Such data, however, is fundamentally different from traditional Linked Data, being very sparsely linked and consisting of a large number of coreferences and redundant statements. So far, the scale and nature of embedded scholarly data on the Web have not been investigated. In this work, we provide a study on embedded scholarly data to answer research questions about the depth, syntactic and semantic characteristics and distribution of extracted data, thereby investigating challenges and opportunities for using embedded data as a structured knowledge graph of scholarly information.
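
To make the notion of embedded markup concrete, the following is a minimal sketch of pulling flat property/value statements out of Microdata attributes using only the Python standard library. It is not the extraction pipeline used in the study. The HTML snippet is made up for illustration, the class name ItempropCollector is hypothetical, and the parser ignores nesting, itemscope boundaries and URL-valued properties, which a real Microdata extractor would need to handle.

```python
from html.parser import HTMLParser


class ItempropCollector(HTMLParser):
    """Collects (itemprop, value) pairs from Microdata attributes (simplified)."""

    def __init__(self):
        super().__init__()
        self.pairs = []         # extracted (property, value) pairs
        self._open_prop = None  # property whose text content is still to be read

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        prop = attr.get("itemprop")
        if prop is None:
            return
        if "content" in attr:
            # e.g. <meta itemprop="..." content="...">: value is in the attribute
            self.pairs.append((prop, attr["content"]))
        else:
            # otherwise assume the value is the element's text content
            self._open_prop = prop

    def handle_data(self, data):
        text = data.strip()
        if self._open_prop and text:
            self.pairs.append((self._open_prop, text))
            self._open_prop = None


# Made-up page fragment describing a scholarly work with schema.org terms.
html = """
<div itemscope itemtype="https://schema.org/ScholarlyArticle">
  <h1 itemprop="name">Adaptive Focused Crawling of Linked Data</h1>
  <span itemprop="author">Ran Yu</span>
  <meta itemprop="datePublished" content="2015">
</div>
"""

collector = ItempropCollector()
collector.feed(html)
pairs = collector.pairs
```

The result is a flat list of property/value pairs with no links to other entities, which matches the sparsely linked character of embedded data that the abstract describes.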


conference on human information interaction and retrieval | 2018

Analyzing Knowledge Gain of Users in Informational Search Sessions on the Web

Ujwal Gadiraju; Ran Yu; Stefan Dietze; Peter Holtz

Web search is frequently used by people to acquire new knowledge and to satisfy learning-related objectives, but little is known about how a user's knowledge evolves through the course of a search session. We present a study addressing the knowledge gain of users in informational search sessions. Using crowdsourcing, we recruited 500 distinct users and orchestrated real-world search sessions spanning 10 different topics and information needs. By using scientifically formulated knowledge tests we calibrated the knowledge of users before and after their search sessions, quantifying their knowledge gain. We investigated the impact of information needs on the search behavior and knowledge gain of users, revealing a significant effect of information need on user queries and navigational patterns, but no direct effect on the knowledge gain. Users on average exhibited a higher knowledge gain through search sessions pertaining to topics they were less familiar with. Our findings in this paper contribute important groundwork towards advancing current research in understanding user knowledge gain through web search sessions.
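
As a rough illustration of quantifying knowledge gain from tests taken before and after a session, the sketch below assumes the gain is the plain difference between post-session and pre-session test scores. The abstract does not state the exact measure, so this definition, the record layout and all values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (user_id, topic, pre_test_score, post_test_score),
# with scores given as the fraction of test items answered correctly.
sessions = [
    ("u1", "topic_a", 0.20, 0.60),
    ("u2", "topic_a", 0.50, 0.70),
    ("u3", "topic_b", 0.80, 0.85),
]


def knowledge_gain(pre: float, post: float) -> float:
    """Assumed measure: post-session score minus pre-session score."""
    return post - pre


def mean_gain_by_topic(records):
    """Average the per-user gains within each topic."""
    gains = defaultdict(list)
    for _user, topic, pre, post in records:
        gains[topic].append(knowledge_gain(pre, post))
    return {topic: mean(values) for topic, values in gains.items()}


gain_by_topic = mean_gain_by_topic(sessions)
```

In this made-up data, the topic where users start with lower scores shows the larger average gain, which is the kind of pattern the study reports for less familiar topics.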


International Conference on Cultural Heritage | 2016

Enrichment and Preservation of Architectural Knowledge

Jakob Beetz; Ina Blümel; Stefan Dietze; Besnik Fetahu; Ujwal Gadiraju; Martin Hecher; Thomas Krijnen; Michelle Lindlar; Martin Tamke; Raoul Wessel; Ran Yu

In the context of the EU FP7 DURAARK project (2013–2016), inter-disciplinary methods, technologies and tools have been researched and developed that support the long-term preservation of semantically enriched digital representations of built structures. The results of the research efforts include approaches for semi-automatically deriving building models from point cloud data sets acquired from laser scans, and the integration and overlay of such representations with explicit Building Information Models (BIM). We introduce novel ways for the further semantic enrichment of such hybrid building models with contextual data and vocabularies from external resources using Linked Data (LD) and the recognition of relevant features and building components. A special focus of the research reported here lies on strategies and policies for their long-term archival, information retrieval based on rich semantic metadata, and the use of such archival systems in research and commercial scenarios. We introduce a set of prototypical, open-source tools implementing these features that have been integrated into a modular preservation framework called the “DURAARK Workbench”.


international acm sigir conference on research and development in information retrieval | 2018

Predicting User Knowledge Gain in Informational Search Sessions

Ran Yu; Ujwal Gadiraju; Peter Holtz; Markus Rokicki; Philipp Kemkes; Stefan Dietze

Web search is frequently used by people to acquire new knowledge and to satisfy learning-related objectives. In this context, informational search missions with an intention to obtain knowledge pertaining to a topic are prominent. The importance of learning as an outcome of web search has been recognized. Yet, there is a lack of understanding of the impact of web search on a user's knowledge state. Predicting the knowledge gain of users can be an important step forward if web search engines that are currently optimized for relevance can be molded to serve learning outcomes. In this paper, we introduce a supervised model to predict a user's knowledge state and knowledge gain from features captured during the search sessions. To measure and predict the knowledge gain of users in informational search sessions, we recruited 468 distinct users using crowdsourcing and orchestrated real-world search sessions spanning 11 different topics and information needs. By using scientifically formulated knowledge tests, we calibrated the knowledge of users before and after their search sessions, quantifying their knowledge gain. Our supervised models utilise and derive a comprehensive set of features from the current state of the art and compare performance of a range of feature sets and feature selection strategies. Through our results, we demonstrate the ability to predict and classify the knowledge state and gain using features obtained during search sessions, exhibiting superior performance to an existing baseline in the knowledge state prediction task.


web information systems engineering | 2015

Adaptive Focused Crawling of Linked Data

Ran Yu; Ujwal Gadiraju; Besnik Fetahu; Stefan Dietze

Given the evolution of publicly available Linked Data, crawling and preservation have become increasingly important challenges. Due to the scale of available data on the Web, efficient focused crawling approaches which are able to capture the relevant semantic neighborhood of seed entities are required. Here, determining relevant entities for a given set of seed entities is a crucial problem. Since the weights of seeds within a seed list vary significantly with respect to the crawl intent, we argue that an adaptive crawler is required, which considers such characteristics when configuring the crawling and relevance detection approach. To address this problem, we introduce a crawling configuration, which considers seed list-specific features as part of its crawling and ranking algorithm. We evaluate it through extensive experiments in comparison to a number of baseline methods and crawling parameters. We demonstrate that configurations which consider seed list features outperform the baselines and present further insights gained from our experiments.


international semantic web conference | 2016

A Survey on Challenges in Web Markup Data for Entity Retrieval

Ran Yu; Besnik Fetahu; Ujwal Gadiraju; Stefan Dietze


arXiv: Human-Computer Interaction | 2018

Detecting, Understanding and Supporting Everyday Learning in Web Search

Ran Yu; Ujwal Gadiraju; Stefan Dietze


Social Work | 2018

KnowMore – knowledge base augmentation with structured web markup

Ran Yu; Ujwal Gadiraju; Besnik Fetahu; Oliver Lehmberg; Dominique Ritze; Stefan Dietze

Collaboration


Dive into Ran Yu's collaborations.

Top Co-Authors

Stefan Dietze

Leibniz University of Hanover

Ina Blümel

German National Library of Science and Technology

Michelle Lindlar

German National Library of Science and Technology

Jakob Beetz

Eindhoven University of Technology

Thomas Krijnen

Eindhoven University of Technology

Martin Tamke

Royal Danish Academy of Fine Arts

Sriparna Saha

Indian Institute of Technology Patna
