Marco Fisichella
Leibniz University of Hanover
Publications
Featured research published by Marco Fisichella.
Conference on Information and Knowledge Management | 2010
Marco Fisichella; Avaré Stewart; Kerstin Denecke; Wolfgang Nejdl
Recent pandemics such as Swine Flu have caused concern for public health officials. Given the ever-increasing pace at which infectious diseases can spread globally, officials must be prepared to react sooner and with greater epidemic intelligence gathering capabilities. However, state-of-the-art systems for Epidemic Intelligence have not kept pace with the growing need for more robust public health event detection. In this paper, we propose an approach in which public health events are detected in an unsupervised manner. We address the problems associated with adapting an unsupervised learner to the medical domain and, in doing so, propose an approach that combines aspects of different feature-based event detection methods. We evaluate our approach on a real-world dataset with respect to the quality of article clusters. Our results show that we achieve a precision of 66% and a recall of 81% when evaluated against manually annotated, real-world data, which is promising for the use of such techniques in this new problem setting.
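To make the clustering step concrete, here is a minimal sketch, assuming TF-IDF features and density-based clustering stand in for the paper's richer feature combination; the articles and parameters are invented for illustration.

```python
# A minimal, hypothetical sketch of unsupervised event detection:
# cluster medical news articles by TF-IDF similarity, so that each
# cluster approximates one candidate public health event. The paper's
# actual feature combination is richer; this only illustrates the idea.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import DBSCAN

articles = [
    "Swine flu cases reported in Mexico City hospitals",
    "H1N1 outbreak spreads to schools in Mexico",
    "New tax legislation debated in parliament",
]

# Represent each article as a TF-IDF vector.
vectors = TfidfVectorizer(stop_words="english").fit_transform(articles)

# Density-based clustering: articles about the same event end up together,
# unrelated articles are marked as noise (label -1). eps is illustrative.
labels = DBSCAN(eps=0.95, min_samples=2, metric="cosine").fit_predict(vectors)
print(labels)  # e.g. [0, 0, -1]: the two flu articles form one event cluster
```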
European Conference on Technology Enhanced Learning | 2010
Katja Niemann; Uta Schwertel; Marco Kalz; Alexander Mikroyannidis; Marco Fisichella; Martin Friedrich; Michele Dicerto; Kyung-Hun Ha; Philipp Holtkamp; Ricardo Kawase; Elisabetta Parodi; Jan M. Pawlowski; Henri Pirkkalainen; Vassilis Pitsilis; Aristides Vidalis; Martin Wolpers; Volker Zimmermann
Existing open educational resources in management have high potential for enterprises to address the increasing training needs of their employees. However, access barriers still prevent the full exploitation of this potential: users have to search a number of repositories with heterogeneous interfaces in order to retrieve the desired content, and search criteria related to skills, such as learning objectives and skill levels, are in most cases not supported. The demonstrator presented in this paper addresses these shortcomings by federating multiple repositories, integrating and enriching their metadata, and employing skill-based search for management-related content.
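As a rough illustration of the federation idea, the sketch below wraps each repository in an adapter that maps its native metadata onto one shared schema, so a single skill-aware query can span all sources; the adapters, fields, and records are invented, not the demonstrator's actual interfaces.

```python
# Hypothetical sketch: each repository exposes a different metadata schema,
# so an adapter normalizes its records before one uniform skill filter runs.

def adapter_repo_a(record):
    return {"title": record["name"], "skill_level": record["level"]}

def adapter_repo_b(record):
    return {"title": record["title"], "skill_level": record["difficulty"]}

repositories = [
    (adapter_repo_a, [{"name": "Project management basics", "level": "beginner"}]),
    (adapter_repo_b, [{"title": "Advanced risk management", "difficulty": "expert"}]),
]

def federated_search(skill_level):
    # Normalize every record, then apply a single skill-based filter.
    results = []
    for adapter, records in repositories:
        results += [adapter(r) for r in records
                    if adapter(r)["skill_level"] == skill_level]
    return results

print(federated_search("beginner"))  # hits from all repositories at once
```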
Systems, Man and Cybernetics | 2011
Alfredo Cuzzocrea; Marco Fisichella
One of the main advantages of Web services is that they can be composed into more complex processes in order to achieve a given business goal. However, this potential cannot be fully exploited until suitable methods and techniques enabling the automatic discovery of composed processes are available. Indeed, service discovery today still focuses on matching atomic services, typically by checking the similarity of functional parameters such as inputs and outputs. A more effective process discovery can be achieved if both the internal structure and the component services are taken into account. Based on this intuition, in this paper we describe a method for discovering composite OWL-S processes that rests on two main contributions: (i) a graph-based representation of composite OWL-S processes; and (ii) an algorithm that matches over such graph-based representations and computes their degree of matching by combining the similarity of the atomic services they comprise with the similarity of the control flow among them. Finally, we conducted a comprehensive experimental campaign in which we tested the proposed algorithm, deriving insightful trade-offs between the benefits and limitations of the overall framework for discovering Semantic Web services.
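The following sketch illustrates the matching intuition under simplifying assumptions: services are compared by Jaccard overlap of their I/O parameter names, control flow by overlap of directed edges, and the two scores are combined with a weight alpha. The paper's actual graph-matching algorithm is more sophisticated; everything here is illustrative.

```python
# Toy degree-of-match: a composite process is a small control-flow graph
# whose nodes are atomic services; the score combines node (service)
# similarity with edge (control-flow) overlap.

def service_similarity(s1, s2):
    # Jaccard overlap of input/output parameter names (illustrative).
    a = set(s1["inputs"]) | set(s1["outputs"])
    b = set(s2["inputs"]) | set(s2["outputs"])
    return len(a & b) / len(a | b) if a | b else 0.0

def flow_similarity(edges1, edges2):
    # Overlap of directed control-flow edges between service slots.
    return len(edges1 & edges2) / len(edges1 | edges2) if edges1 | edges2 else 0.0

def degree_of_match(p1, p2, alpha=0.5):
    # Pair services positionally for simplicity; real matching aligns nodes.
    n = max(len(p1["services"]), len(p2["services"]))
    node_sim = sum(service_similarity(a, b)
                   for a, b in zip(p1["services"], p2["services"])) / n
    return alpha * node_sim + (1 - alpha) * flow_similarity(p1["edges"], p2["edges"])

book = {"inputs": ["city", "date"], "outputs": ["reservation"]}
pay = {"inputs": ["reservation", "card"], "outputs": ["receipt"]}
p1 = {"services": [book, pay], "edges": {(0, 1)}}
p2 = {"services": [book, pay], "edges": {(0, 1)}}
print(degree_of_match(p1, p2))  # 1.0 for identical processes
```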
Web Information Systems Engineering | 2014
Tuan A. Tran; Andrea Ceroni; Mihai Georgescu; Kaweh Djafari Naini; Marco Fisichella
Much existing work in information extraction assumes the static nature of relationships in fixed knowledge bases. However, in collaborative environments such as Wikipedia, information and structures are highly dynamic over time. In this work, we introduce a new method to extract complex event structures from Wikipedia. We propose a new model that represents events by engaging multiple entities and is generalizable to arbitrary languages. The evolution of an event is captured effectively by analyzing the user edit history in Wikipedia. Our work provides a foundation for a novel class of evolution-aware, entity-based enrichment algorithms, and considerably increases the quality of entity accessibility and temporal retrieval for Wikipedia. We formalize this problem and introduce an efficient end-to-end platform as a solution. We conduct comprehensive experiments on a real dataset of 1.8 million Wikipedia articles to show the effectiveness of our proposed solution. Our results demonstrate that we are able to achieve a precision of 70% when evaluated using manually annotated data. Finally, we compare our work with the well-established Current Event Portal of Wikipedia and find that our system, WikipEvent, together with its Co-References method, can be used in a complementary way to deliver new and richer information about events.
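A toy sketch of the underlying signal, under the simplifying assumption (ours, for illustration) that entities whose pages receive synchronized bursts of edits participate in the same event; the edit histories and threshold below are invented.

```python
# Approximate an event as a set of entities whose pages were edited
# heavily in the same time window, since real-world events tend to
# trigger synchronized edit activity.
from collections import defaultdict
from datetime import date

# Hypothetical edit history: entity -> list of edit dates.
edits = {
    "Entity_A": [date(2011, 3, 11)] * 40 + [date(2011, 5, 2)] * 2,
    "Entity_B": [date(2011, 3, 11)] * 35,
    "Entity_C": [date(2011, 5, 2)] * 30,
}

def burst_days(history, threshold=10):
    counts = defaultdict(int)
    for d in history:
        counts[d] += 1
    return {d for d, c in counts.items() if c >= threshold}

# Entities sharing a burst day become candidate participants of one event.
events = defaultdict(set)
for entity, history in edits.items():
    for day in burst_days(history):
        events[day].add(entity)
print(dict(events))  # e.g. {2011-03-11: {A, B}, 2011-05-02: {C}}
```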
Proceedings of the International Symposium on Open Collaboration | 2014
Andrea Ceroni; Mihai Georgescu; Ujwal Gadiraju; Kaweh Djafari Naini; Marco Fisichella
The Web of data is constantly evolving based on the dynamics of its content. Current Web search engine technologies consider static collections and do not factor in explicitly or implicitly available temporal information that can be leveraged to gain insights into the dynamics of the data. In this paper, we hypothesize that by employing the temporal aspect as the primary means of capturing the evolution of entities, it is possible to provide entity-based access to Web archives. We empirically show that the edit activity on Wikipedia can be exploited to provide evidence of the evolution of Wikipedia pages over time, both in terms of their content and in terms of their temporally defined relationships, classified in the literature as events. Finally, we present results from an extensive analysis of a dataset consisting of 31,998 Wikipedia pages describing politicians, along with observations from in-depth case studies. Our findings reflect the usefulness of leveraging temporal information to study the evolution of entities and open promising grounds for further research.
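To illustrate the kind of evidence involved, here is a minimal sketch that flags months of unusually high edit activity on a single page as candidate evolution points; the counts and the mean-plus-two-standard-deviations threshold are invented for illustration.

```python
# Per-month edit counts for one page; months with abnormally high activity
# are taken as evidence that the entity changed (e.g., was involved in an
# event). The thresholding rule here is illustrative, not the paper's.
from statistics import mean, stdev

monthly_edits = [4, 5, 3, 6, 4, 48, 7, 5, 4, 3, 52, 6]  # invented counts
mu, sigma = mean(monthly_edits), stdev(monthly_edits)

flagged = [m for m, c in enumerate(monthly_edits, start=1)
           if c > mu + 2 * sigma]
print(flagged)  # [6, 11]: months whose activity suggests evolution points
```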
European Conference on Information Retrieval | 2014
Andrea Ceroni; Marco Fisichella
Event detection algorithms infer the occurrence of real-world events from natural language text and always require a ground truth for their validation. However, the lack of an annotated, comprehensive ground truth makes the evaluation onerous for humans, who have to search for events inside it manually. In this paper, we envision automating the evaluation process by defining the novel problem of Entity-based Automatic Event Validation. We propose a first approach that validates events by estimating the temporal relationships among their representative entities within documents on the Web. Our approach reached a Kappa statistic of 0.68 when compared with the evaluation of real-world events done by humans. This and other preliminary results motivate further research on this novel problem.
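For reference, the Kappa statistic quoted above measures chance-corrected agreement between the automatic validator and human judges. A minimal sketch of Cohen's kappa, with invented label vectors:

```python
# Cohen's kappa: observed agreement corrected for the agreement expected
# by chance. The verdict vectors below are invented for illustration.

def cohens_kappa(a, b):
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    labels = set(a) | set(b)
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected)

system = ["valid", "valid", "invalid", "valid", "invalid", "invalid"]
human  = ["valid", "invalid", "invalid", "valid", "invalid", "invalid"]
print(round(cohens_kappa(system, human), 2))  # 0.67 on this toy data
```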
Database and Expert Systems Applications | 2010
Marco Fisichella; Fan Deng; Wolfgang Nejdl
In this paper, we study the problem of detecting near-duplicates for high-dimensional data points in an incremental manner. For example, for an image-sharing website, it would be a desirable feature if near-duplicates could be detected whenever a user uploads a new image, so that the user can take action, such as stopping the upload or reporting an illegal copy. Specifically, whenever a new point arrives, our goal is to find all points within an existing point set that are close to the new point, based on a given distance function and distance threshold, before the new point is inserted into the data set. Building on a well-known indexing technique, Locality Sensitive Hashing (LSH), we propose a new approach that significantly speeds up LSH indexing while using only a small amount of extra space. The idea is to store a small fraction of the near-duplicate pairs within the existing point set, found when those points were inserted, and use them to prune the LSH candidate sets for the newly arrived point. Extensive experiments on three real-world data sets show that our method consistently outperforms the original LSH approach: to reach the same query response time, our method needs significantly less memory. Meanwhile, the LSH theoretical guarantee on the quality of the search result is preserved, and our approach is easy to implement on top of LSH.
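As background, here is a minimal sketch of the incremental query-then-insert pattern the paper builds on, using random-hyperplane LSH for cosine distance; the paper's key addition, caching previously found near-duplicate pairs to prune candidate sets, is omitted, and all parameters are illustrative.

```python
# Random-hyperplane LSH: each arriving point is first queried against
# existing points that share a bucket in any table, then inserted.
import random
from collections import defaultdict

DIM, TABLES, BITS = 8, 4, 6
random.seed(0)
planes = [[[random.gauss(0, 1) for _ in range(DIM)] for _ in range(BITS)]
          for _ in range(TABLES)]

def signature(p, table):
    # One bit per hyperplane: which side of the plane the point falls on.
    return tuple(sum(a * b for a, b in zip(h, p)) >= 0 for h in planes[table])

def cosine_dist(p, q):
    dot = sum(a * b for a, b in zip(p, q))
    norm = (sum(a * a for a in p) * sum(b * b for b in q)) ** 0.5
    return 1 - dot / norm

tables = [defaultdict(list) for _ in range(TABLES)]
points = []

def insert_and_query(p, threshold=0.1):
    # Gather candidates from every table, verify with the exact distance.
    candidates = set()
    for t in range(TABLES):
        candidates.update(tables[t][signature(p, t)])
    dups = [i for i in candidates if cosine_dist(points[i], p) <= threshold]
    points.append(p)
    for t in range(TABLES):
        tables[t][signature(p, t)].append(len(points) - 1)
    return dups  # indices of existing near-duplicates of p

print(insert_and_query([1.0] * DIM))  # [] on the first insertion
print(insert_and_query([1.0] * DIM))  # [0]: duplicate of the first point
```

Calling insert_and_query on each arriving point returns its near-duplicates among the points inserted so far, which is the incremental setting the abstract describes.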
International ACM SIGIR Conference on Research and Development in Information Retrieval | 2016
Andrea Ceroni; Ujwal Gadiraju; Jan Matschke; Simon Wingert; Marco Fisichella
Manually inspecting text in a document collection to assess whether an event occurs in it is a cumbersome task. Although manual inspection allows one to identify and discard false events, it becomes infeasible as the number of automatically detected events grows. In this paper, we present a system to automate event validation, defined as the task of determining whether a given event occurs in a given document or corpus. In addition to supporting users seeking information that corroborates a given event, event validation can also boost the precision of automatically detected event sets by discarding false events and preserving true ones. The system allows users to specify events, retrieves candidate web documents, and assesses whether the events occur in them. The validation results are shown to the user, who can revise the system's decisions. The validation method relies on a supervised model to predict the occurrence of events in a non-annotated corpus. The system can also be used to build ground truths for event corpora.
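A hedged sketch of what such a supervised model could look like: an (event, document) pair is reduced to simple entity-overlap features, and a classifier predicts occurrence. The features, training pairs, and model choice are illustrative, not the system's actual design.

```python
# Represent an (event, document) pair by overlap features between the
# event's entities and the document text, then train a binary classifier.
from sklearn.linear_model import LogisticRegression

def features(event_entities, doc_text):
    doc = doc_text.lower()
    hits = sum(e.lower() in doc for e in event_entities)
    return [hits, hits / len(event_entities)]

train = [  # invented training pairs: (features, occurs-in-document label)
    (features({"Obama", "Merkel", "Berlin"}, "Obama met Merkel in Berlin today."), 1),
    (features({"Obama", "Merkel", "Berlin"}, "Stock markets fell on Monday."), 0),
    (features({"flood", "Pakistan"}, "Severe floods hit Pakistan provinces."), 1),
    (features({"flood", "Pakistan"}, "A new museum opened in Paris."), 0),
]
X, y = [f for f, _ in train], [l for _, l in train]
model = LogisticRegression().fit(X, y)

pair = features({"Obama", "Merkel"}, "Chancellor Merkel welcomed Obama.")
print(model.predict([pair]))  # 1 if the event is predicted to occur
```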
International World Wide Web Conferences | 2013
Ricardo Kawase; Marco Fisichella; Katja Niemann; Vassilis Pitsilis; Aristides Vidalis; Philipp Holtkamp; Bernardo Pereira Nunes
Existing open educational resources in the field of Business and Management have high potential for enterprises to address the increasing training needs of their employees. However, it is difficult to act on OERs, as some of the data is hidden. Meanwhile, numerous repositories provide Linked Open Data in this field, yet users have to search a number of repositories with heterogeneous interfaces in order to retrieve the desired content. In this paper, we present strategies for gathering heterogeneous learning objects from the Web of Data, and we provide an overview of the benefits of the OpenScout platform. Although not all data repositories strictly follow Linked Data principles, OpenScout accommodates individual variations in order to harvest, align, and expose the data through a single endpoint. In the end, OpenScout provides a full-fledged environment that leverages the Linked Open Data available on the Web and exposes it in a homogeneous format.
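As a rough illustration of the alignment step, the sketch below maps properties from different vocabularies (e.g., dc:title vs. schema:name) onto one target schema before indexing; the mapping table and records are invented, not OpenScout's actual configuration.

```python
# Harvested records arrive in heterogeneous vocabularies; a property map
# normalizes them into one schema before exposing a single endpoint.
FIELD_MAP = {
    "dc:title": "title", "schema:name": "title",
    "dc:subject": "topic", "schema:about": "topic",
}

def align(record):
    aligned = {}
    for prop, value in record.items():
        if prop in FIELD_MAP:  # keep only properties we know how to map
            aligned[FIELD_MAP[prop]] = value
    return aligned

harvested = [  # invented records from two hypothetical sources
    {"dc:title": "Introduction to Marketing", "dc:subject": "Management"},
    {"schema:name": "Budget Planning 101", "schema:about": "Finance"},
]
print([align(r) for r in harvested])  # one homogeneous schema for both
```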
European Conference on Technology Enhanced Learning | 2012
Ricardo Kawase; Patrick Siehndel; Bernardo Pereira Nunes; Marco Fisichella; Wolfgang Nejdl
Competence annotations help learners retrieve learning objects and better understand the level of skill required to comprehend them. However, annotating learning objects with competence levels is a very time-consuming task; ideally, it should be performed by experts on the subjects of the educational resources. As a result, most educational resources available online do not include competence information. In this paper, we present a method for automatically assigning competence topics to an educational resource. To solve this problem, we exploit information extracted from external repositories available on the Web, which leads to a domain-independent approach. Results show that the automatically assigned competences are coherent and can be applied to automatically enrich learning object metadata.
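A minimal sketch of the assignment idea: compare a learning object's text against competence-topic descriptions and attach the best-matching topic. The topic descriptions here are invented; the paper draws its information from external Web repositories.

```python
# Rank competence topics for a learning object by TF-IDF cosine similarity
# between the resource text and each topic's description.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

topics = {  # invented topic descriptions standing in for external data
    "Project Management": "planning scheduling milestones risk stakeholders",
    "Accounting": "balance sheet ledger debit credit audit",
}
resource = "This course covers project planning, milestones and stakeholder risk."

vec = TfidfVectorizer().fit(list(topics.values()) + [resource])
topic_vecs = vec.transform(topics.values())
sims = cosine_similarity(vec.transform([resource]), topic_vecs)[0]

best = max(zip(topics, sims), key=lambda t: t[1])
print(best)  # ('Project Management', <similarity score>)
```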