Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Lydia Weiland is active.

Publication


Featured research published by Lydia Weiland.


International Conference on the Theory of Information Retrieval | 2016

Understanding the Message of Images with Knowledge Base Traversals

Lydia Weiland; Ioana Hulpus; Simone Paolo Ponzetto; Laura Dietz

The message of news articles is often supported by the pointed use of iconic images. These images, together with their captions, encourage emotional involvement of the reader. Current algorithms for understanding the semantics of news articles focus on their text, often ignoring the image. On the other hand, work that targets the semantics of images mostly focuses on recognizing and enumerating the objects that appear in the image. In this work, we explore the problem from another perspective: Can we devise algorithms to understand the message encoded by images and their captions? To answer this question, we study how well algorithms can describe an image-caption pair in terms of Wikipedia entities, thereby casting the problem as an entity-ranking task with an image-caption pair as the query. Our proposed algorithm brings together aspects of entity linking, subgraph selection, entity clustering, relatedness measures, and learning-to-rank. In our experiments, we focus on media-iconic image-caption pairs, which often reflect complex subjects such as sustainable energy and endangered species. Our test collection includes a gold standard of over 300 image-caption pairs about topics at different levels of abstraction. We show that the best results, with a MAP of 0.69, are obtained when aggregating content-based and graph-based features in a Wikipedia-derived knowledge base.
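
The MAP of 0.69 reported above is the standard Mean Average Precision over ranked entity lists, one list per image-caption query. The following is a minimal sketch of how MAP is typically computed for such an entity-ranking task; the function names and data layout are illustrative, not taken from the paper's implementation.

# Minimal sketch of Mean Average Precision (MAP) for an entity-ranking task.
# Each query (an image-caption pair) yields a ranked list of entity IDs;
# `relevant` holds the gold-standard entities for that query.
def average_precision(ranked, relevant):
    """Average precision of one ranked list against a set of relevant items."""
    hits, precision_sum = 0, 0.0
    for rank, entity in enumerate(ranked, start=1):
        if entity in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """`runs` is a list of (ranked_list, relevant_set) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Example with two queries: AP = 1.0 and 0.5, so MAP = 0.75.
runs = [(["solar", "wind"], {"solar", "wind"}),
        (["coal", "wind"], {"wind"})]
print(mean_average_precision(runs))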


Conference on Multimedia Modeling | 2014

A Novel Approach for Semantics-Enabled Search of Multimedia Documents on the Web

Lydia Weiland; Ansgar Scherp

We present an analysis of a large corpus of multimedia documents obtained from the web. From this corpus of documents, we have extracted the media assets and the relation information between the assets. In order to conduct our analysis, the assets and relations are represented using a formal ontology. The ontology allows not only for representing the structure of multimedia documents but also for connecting them with arbitrary background knowledge on the web. The ontology as well as the analysis serve as the basis for implementing a novel search engine for multimedia documents on the web.
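
As an illustration of how media assets and their relations can be represented with a formal ontology, here is a minimal sketch using rdflib; the namespace and property names are invented for the example and are not the ontology used in the paper.

# Minimal sketch of representing multimedia-document structure as RDF triples.
# The namespace and property names below are invented for illustration;
# they are not the ontology used in the paper.
from rdflib import Graph, Namespace, Literal

EX = Namespace("http://example.org/multimedia#")
g = Graph()
g.bind("ex", EX)

doc, image, text = EX.presentation1, EX.image1, EX.text1

g.add((doc, EX.hasAsset, image))         # the document contains an image asset
g.add((doc, EX.hasAsset, text))          # ...and a text asset
g.add((image, EX.spatiallyAbove, text))  # a spatial relation between assets
g.add((image, EX.depicts, Literal("wind turbine")))  # hook for background knowledge

print(g.serialize(format="turtle"))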


Ubiquitous Computing | 2016

Exploring a multi-sensor picking process in the future warehouse

Alexander Diete; Lydia Weiland; Timo Sztyler; Heiner Stuckenschmidt

Recognizing, validating, and optimizing the activities of workers in logistics is increasingly aided by smart devices such as glasses, gloves, and sensor-enhanced wristbands. We present a system that augments picking processes with smart glasses and a wristband incorporating different types of sensors, including ultrasonic, pressure, and inertial sensors. We focus on low barriers to adoption as well as on the combination of video and inertial sensors. For that purpose, we create a new semi-supervised dataset to evaluate the feasibility of our approach. The system recognizes and monitors activities such as grabbing and releasing objects that are essential for order-picking tasks.


Empirical Methods in Natural Language Processing | 2015

Image with a Message: Towards Detecting Non-Literal Image Usages by Visual Linking

Lydia Weiland; Laura Dietz; Simone Paolo Ponzetto

A key task in understanding an image and its corresponding caption is to find out not only what is shown in the picture and described in the text, but also what the exact relationship between these two elements is. The long-term objective of our work is to be able to distinguish different types of relationship, including literal vs. non-literal usages, as well as fine-grained non-literal usages (i.e., symbolic vs. iconic). Here, we approach this challenging problem by answering the question: ‘How can we quantify the degrees of similarity between the literal meanings expressed within images and their captions?’. We formulate this problem as a ranking task, where links between entities and potential regions are created and ranked for relevance. Using a Ranking SVM allows us to leverage the preference ordering of the links, which helps us in the similarity calculation in cases of visual or textual ambiguity, as well as misclassified data. Our experiments show that aggregating different features using a supervised ranker achieves better results than a baseline knowledge-base method. However, much work still lies ahead, and we accordingly conclude the paper with a detailed discussion of a short- and long-term outlook on how to push our work on relationship classification one step further.
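
A Ranking SVM can be trained via the standard reduction of pairwise preferences to binary classification over feature differences. The sketch below shows this reduction with scikit-learn; the features, hidden utility, and pair construction are synthetic stand-ins, not the paper's actual link features.

# Minimal sketch of the standard pairwise reduction behind a Ranking SVM:
# each preference "link a should rank above link b" becomes a training
# example with feature vector (x_a - x_b) and label +1, and the mirrored
# pair gets -1. Features and preferences here are synthetic.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))                        # features of candidate links
scores = X @ np.array([1.0, 0.5, 0.0, -0.5, 0.2])   # hidden true utility

pairs, labels = [], []
for i in range(len(X)):
    for j in range(len(X)):
        if scores[i] > scores[j]:
            pairs.append(X[i] - X[j]); labels.append(1)
            pairs.append(X[j] - X[i]); labels.append(-1)

clf = LinearSVC(C=1.0).fit(np.array(pairs), np.array(labels))

# Rank new candidate links by the learned linear utility w . x
w = clf.coef_.ravel()
ranking = np.argsort(-(X @ w))
print(ranking[:5])  # indices of the top-ranked links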


International Conference on Computational Linguistics | 2014

Weakly supervised construction of a repository of iconic images

Lydia Weiland; Wolfgang Effelsberg; Simone Paolo Ponzetto

We present a first attempt at semi-automatically harvesting a dataset of iconic images. Iconic images depict objects or scenes that arouse associations with abstract topics. Our method starts with representative topic-evoking images from Wikipedia, which are labeled with relevant concepts and entities found in their associated captions. These are used to query an online image repository (i.e., Flickr) in order to acquire additional examples of topic-specific iconic relations. To this end, we leverage a combination of visual similarity measures, image clustering, and matching algorithms to acquire clusters of iconic images that are topically connected to the original seed images, while also allowing for various degrees of diversity. Our first results are promising in that they indicate the feasibility of the task and show that we are able to build a first version of our resource with minimal supervision.
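
To give a concrete flavor of the image-clustering step, here is a minimal sketch that groups candidate images by visual feature similarity with k-means; the color-histogram features and cluster count are illustrative assumptions, and a real pipeline would use richer descriptors.

# Minimal sketch of grouping images into clusters by visual similarity.
# Real pipelines would use richer descriptors (e.g., SIFT or CNN features);
# the color-histogram features and cluster count here are illustrative only.
import numpy as np
from sklearn.cluster import KMeans

def color_histogram(image, bins=8):
    """`image` is an (H, W, 3) uint8 array; returns a normalized histogram."""
    hist, _ = np.histogramdd(image.reshape(-1, 3),
                             bins=(bins, bins, bins),
                             range=((0, 256),) * 3)
    return hist.ravel() / hist.sum()

# Synthetic stand-ins for downloaded candidate images.
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
          for _ in range(10)]

features = np.stack([color_histogram(img) for img in images])
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
print(labels)  # cluster assignment for each candidate image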


Data and Knowledge Engineering | 2018

Knowledge-rich image gist understanding beyond literal meaning

Lydia Weiland; Ioana Hulpus; Simone Paolo Ponzetto; Wolfgang Effelsberg; Laura Dietz

We investigate the problem of understanding the message (gist) conveyed by images and their captions as found, for instance, on websites or in news articles. To this end, we propose a methodology to capture the meaning of image-caption pairs on the basis of large amounts of machine-readable knowledge that has previously been shown to be highly effective for text understanding. Our method identifies the connotation of objects beyond their denotation: where most approaches to image understanding focus on the denotation of objects, i.e., their literal meaning, our work addresses the identification of connotations, i.e., iconic meanings of objects, to understand the message of images. We view image understanding as the task of representing an image-caption pair on the basis of a wide-coverage vocabulary of concepts such as the one provided by Wikipedia, and cast gist detection as a concept-ranking problem with image-caption pairs as queries. To enable a thorough investigation of the problem of gist understanding, we produce a gold standard of over 300 image-caption pairs and over 8,000 gist annotations covering a wide variety of topics at different levels of abstraction. We use this dataset to experimentally benchmark the contribution of signals from heterogeneous sources, namely image and text. The best result, with a Mean Average Precision (MAP) of 0.69, indicates that by combining both dimensions we are able to understand the meaning of our image-caption pairs better than when using language or vision information alone. We test the robustness of our gist detection approach when receiving automatically generated input, i.e., automatically generated image tags or captions, and demonstrate the feasibility of an end-to-end automated process.


Conference on Multimedia Modeling | 2017

Using Object Detection, NLP, and Knowledge Bases to Understand the Message of Images

Lydia Weiland; Ioana Hulpus; Simone Paolo Ponzetto; Laura Dietz

With the increasing amount of multimodal content from social media posts and news articles, there has been an intensified effort towards conceptual labeling and multimodal (topic) modeling of images and their affiliated texts. Nonetheless, the problem of identifying and automatically naming the core abstract message (gist) behind images has received less attention. This problem is especially relevant for the semantic indexing and subsequent retrieval of images. In this paper, we propose a solution that makes use of external knowledge bases such as Wikipedia and DBpedia. Its aim is to leverage complex semantic associations between the image objects and the textual caption in order to uncover the intended gist. The results of our evaluation demonstrate the ability of our proposed approach to detect gist with a best MAP score of 0.74 when assessed against human annotations. Furthermore, an automatic image tagging and caption generation API is compared to manually set image and caption signals. We show and discuss the difficulty of finding the correct gist, especially for abstract, non-depictable gists, as well as the impact of different types of signals on gist detection quality.
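
As a small illustration of drawing semantic associations from an external knowledge base, the sketch below queries DBpedia's public SPARQL endpoint for entities linked to a detected image object; the query and the choice of predicate are illustrative, not the traversal method used in the paper.

# Minimal sketch of querying DBpedia for entities related to a detected
# image object, one way to surface semantic associations between image
# objects and caption text. Query and predicate choice are illustrative.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?related WHERE {
        <http://dbpedia.org/resource/Polar_bear> dbo:wikiPageWikiLink ?related .
    }
    LIMIT 10
""")

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["related"]["value"])  # URIs of associated entities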


Procedia Computer Science | 2017

Recognizing grabbing actions from inertial and video sensor data in a warehouse scenario

Alexander Diete; Timo Sztyler; Lydia Weiland; Heiner Stuckenschmidt

Modern industries are increasingly adopting smart devices to aid and improve their productivity and workflow. This includes logistics in warehouses, where the validation of correct items per order can be enhanced with mobile devices. Since handling incorrect orders accounts for a large part of the costs of warehouse maintenance, errors like missed or wrong items should be avoided. Thus, early identification of picking procedures and picked items is beneficial for reducing these errors. By using data glasses and a smartwatch, we aim to reduce these errors while also enabling the picker to work hands-free. In this paper, we present an analysis of feature sets for the classification of grabbing actions in the order-picking process. For this purpose, we created a dataset containing inertial data and egocentric video from four participants performing picking tasks. As we previously worked with logistics companies, we modeled our test scenario close to real-world warehouse environments. Afterwards, we extracted time- and frequency-domain features from the inertial data and color and descriptor features from the image data to learn grabbing actions. We were able to show that the combination of inertial and video data enables us to recognize grabbing actions in a picking scenario. We also show that the combination of different sensors improves the results, yielding an F-measure of 85.3% for recognizing grabbing actions.
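
To illustrate what time- and frequency-domain features over a window of inertial data typically look like, here is a minimal sketch in numpy; the window size and feature set are assumptions for illustration, not the paper's exact configuration.

# Minimal sketch of time- and frequency-domain features over a window of
# inertial (e.g., accelerometer) samples. Window length and the feature
# set are illustrative assumptions, not the paper's exact configuration.
import numpy as np

def window_features(window):
    """`window` is an (n_samples, 3) array of x/y/z accelerometer readings."""
    feats = []
    for axis in range(window.shape[1]):
        sig = window[:, axis]
        # Time domain: basic statistics of the raw signal.
        feats += [sig.mean(), sig.std(), sig.min(), sig.max()]
        # Frequency domain: magnitude spectrum with the DC component removed.
        spectrum = np.abs(np.fft.rfft(sig - sig.mean()))
        feats += [spectrum.sum(), spectrum.argmax()]  # energy, dominant bin
    return np.array(feats)

# Example: a one-second window at 50 Hz of synthetic data.
rng = np.random.default_rng(0)
window = rng.normal(size=(50, 3))
print(window_features(window).shape)  # (18,) = 6 features x 3 axes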


IEEE International Conference on Semantic Computing | 2014

Requirements Elicitation Towards a Search Engine for Semantic Multimedia Content

Lydia Weiland; Felix Hanser; Ansgar Scherp

We investigate user requirements regarding the interface design of a semantic multimedia search and retrieval engine, based on a prototypical implementation of a search engine for multimedia content on the web. Unlike existing image and video search engines, we are interested in true multimedia content that combines different media assets into multimedia documents such as PowerPoint presentations and Flash files. In a user study with 20 participants, we conducted a formative evaluation based on the think-aloud method and semi-structured interviews in order to obtain requirements for a future web search engine for multimedia content. The interviews were complemented by a paper-and-pencil questionnaire to obtain quantitative information and by mockups demonstrating the user interface of a future multimedia search and retrieval engine.


IEEE International Conference on Semantic Computing | 2014

Fulgeo -- Towards an Intuitive User Interface for a Semantics-Enabled Multimedia Search Engine

D. Schneider; Denny Stohr; J. Tingvold; A. B. Amundsen; Lydia Weiland; Stephan Kopf; Wolfgang Effelsberg; Ansgar Scherp

Multimedia documents like PowerPoint presentations or Flash documents are widely used on the Internet and exist in the context of many different topics. However, so far there is no user-friendly way to explore and search for this content. The aim of this work is to address this issue by developing a new, easy-to-use user interface approach and a prototype search engine. Our system, called fulgeo, specifically focuses on a suitable multimedia interface for visualizing the query results of semantically enriched Flash documents.

Collaboration


Dive into Lydia Weiland's collaborations.

Top Co-Authors

Wolfgang Effelsberg (Technische Universität Darmstadt)
Laura Dietz (University of New Hampshire)
Denny Stohr (Technische Universität Darmstadt)