Publication


Featured research published by Marko Brunzel.


Lecture Notes in Computer Science | 2006

Discovering multi terms and co-hyponymy from XHTML documents with XTREEM

Marko Brunzel; Myra Spiliopoulou

The Semantic Web needs ontologies as an integral component. Current methods for learning and enhancing ontologies need to be further improved to overcome the knowledge acquisition bottleneck. The identification of concepts and relations with only minimal user interaction is still a challenging objective. Current approaches to extracting semantics often use association rules or clustering on regular flat text. In this paper we describe an approach to extracting semantics from Web document collections which takes advantage of the semi-structured content within XHTML Web documents (XHTML is an XML dialect which can be obtained from traditional HTML documents). The XTREEM (Xhtml TREE Mining) method uses structural information, the mark-up in Web content, as an indicator of term boundaries and of co-hyponymy relations.


Applications of Natural Language to Data Bases | 2007

Domain relevance on term weighting

Marko Brunzel; Myra Spiliopoulou

The TFxIDF term weighting scheme is the standard approach to the vectorization of textual data. For a data set in which textual data stemming from web document structure is to be vectorized [2], the need for an enhanced term weighting scheme arose. In this publication we introduce a term weighting scheme which improves on the traditional TFxIDF scheme by adding a component based on the linguistically inspired notion of domain relevance. Domain relevance measures the degree to which a term is more relevant within a data set than within a reference data set. By means of this external component, a potential weakness of TFxIDF on data sets with a non-standard distribution is overcome. This weighting scheme favours domain-relevant terms, which can be regarded as more useful in settings where the clustering output is consumed by a human supervisor, e.g. for semi-automatic ontology learning.
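As an illustration only: the abstract does not state the formula, so the sketch below assumes that domain relevance is the ratio of a term's smoothed relative frequency in the domain data set to its smoothed relative frequency in the reference data set, and that this ratio is simply multiplied into TFxIDF. The function name `tf_idf_dr` and the add-one smoothing are hypothetical choices, not taken from the publication.

```python
import math
from collections import Counter


def tf_idf_dr(docs, reference_docs):
    """Weight terms by TF x IDF x domain relevance (assumed formulation).

    docs and reference_docs are lists of token lists.
    Returns one {term: weight} dict per document in docs.
    """
    n_docs = len(docs)
    # Document frequency: in how many domain documents each term occurs.
    df = Counter(t for d in docs for t in set(d))
    # Raw term counts in the domain and in the reference data set.
    dom_counts = Counter(t for d in docs for t in d)
    ref_counts = Counter(t for d in reference_docs for t in d)
    dom_total = sum(dom_counts.values())
    ref_total = sum(ref_counts.values())
    vocab = len(set(dom_counts) | set(ref_counts))

    def domain_relevance(term):
        # Add-one smoothing (an assumption) avoids division by zero for
        # terms that never occur in the reference data set.
        p_dom = (dom_counts[term] + 1) / (dom_total + vocab)
        p_ref = (ref_counts[term] + 1) / (ref_total + vocab)
        return p_dom / p_ref

    weights = []
    for d in docs:
        tf = Counter(d)
        weights.append(
            {t: tf[t] * math.log(n_docs / df[t]) * domain_relevance(t) for t in tf}
        )
    return weights


# Toy data: three "domain" documents and two "reference" documents.
domain = [["hotel", "beach", "the"], ["hotel", "pool", "the"], ["museum", "the"]]
reference = [["the", "news", "the"], ["the", "hotel", "market"]]
w = tf_idf_dr(domain, reference)
```

Under these assumptions, a term that is frequent in the domain data set but rare in the reference data set receives a higher weight than plain TFxIDF would give it, which matches the stated aim of favouring domain-relevant terms.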


European Semantic Web Conference | 2005

RELFIN – topic discovery for ontology enhancement and annotation

Markus Schaal; Roland M. Müller; Marko Brunzel; Myra Spiliopoulou

While classic information retrieval methods return whole documents as the result of a query, many information demands would be better satisfied by fine-grain access inside the documents. One way to support this goal is to make the semantics of small document regions explicit, e.g. as XML labels, so that query engines can exploit them. For this purpose, the topics of the small document regions must be discovered from the texts; unlike in document labelling applications, fine-grain topics cannot be listed in advance for arbitrary collections. Text-understanding approaches can derive the topic of a document region but are less appropriate for the construction of a small set of topics that can be used in queries. To address this challenge we propose the coupling of text mining, prior knowledge explicated in ontologies and human expertise, and present the system RELFIN, which is designed to assist the human expert in the discovery of topics appropriate for (i) ontology enhancement with additional concepts or relationships, (ii) semantic characterization and tagging of document regions. RELFIN performs data mining upon linguistically preprocessed corpora to group document regions by topic and to construct topic labels for them, so that the labels are characteristic of the regions and thus helpful in ontology-based search. We show our first results of applying RELFIN to a case study of text analysis and retrieval.


Knowledge Acquisition, Modeling and Management | 2006

Discovering semantic sibling groups from web documents with XTREEM-SG

Marko Brunzel; Myra Spiliopoulou

The acquisition of explicit semantics is still a research challenge. Approaches for the extraction of semantics focus mostly on learning hierarchical hypernym-hyponym relations. The extraction of co-hyponym and co-meronym sibling semantics is performed to a much lesser extent, though they are no less important in ontology engineering. In this paper we describe and evaluate the XTREEM-SG (Xhtml TREE Mining – for Sibling Groups) approach to finding sibling semantics in semi-structured Web documents. XTREEM takes advantage of the added value of mark-up, available in web content, for grouping text siblings. We show that this grouping is semantically meaningful. The XTREEM-SG approach has the advantage that it is domain and language independent; it does not rely on background knowledge, NLP software or training. We apply the XTREEM-SG approach and evaluate it against the reference semantics from two gold standard ontologies. We investigate how variations in input, parameters and reference influence the results obtained when structuring a closed vocabulary by sibling relations. Earlier methods that evaluate sibling relations against a gold standard report an F-measure of 14.18%. Our method improves this figure to 21.47%.
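The idea of using mark-up to group text siblings can be sketched as follows. This is a simplified illustration, not the XTREEM-SG procedure itself: it assumes that texts held by same-tag children of one parent element (for example the `li` items of one list) form a candidate sibling group. The function name `candidate_sibling_groups` and the `min_size` parameter are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def candidate_sibling_groups(xhtml, min_size=2):
    """Return lists of texts whose elements share a parent and a tag name.

    Assumes well-formed XHTML, so a plain XML parser can read it.
    """
    root = ET.fromstring(xhtml)
    groups = []
    for parent in root.iter():
        by_tag = defaultdict(list)
        for child in parent:
            # Concatenate all text inside the child element.
            text = "".join(child.itertext()).strip()
            if text:
                by_tag[child.tag].append(text)
        # Keep only tags that occur often enough under this parent.
        groups.extend(g for g in by_tag.values() if len(g) >= min_size)
    return groups


page = (
    "<html><body><h1>Hotels</h1>"
    "<ul><li>Sauna</li><li>Pool</li><li>Gym</li></ul>"
    "<p>Contact us</p></body></html>"
)
groups = candidate_sibling_groups(page)
```

Because the sketch relies only on document structure, it needs no linguistic preprocessing, which is in line with the domain and language independence claimed in the abstract. How the candidate groups are then filtered or clustered is not covered here.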


VISUAL '08 Proceedings of the 10th international conference on Visual Information Systems: Web-Based Visual Information Search and Management | 2008

Handling of Task Hierarchies on the Nepomuk Social Semantic Desktop

Marko Brunzel; Roland M. Mueller

The idea of the Semantic Web is increasingly becoming a reality. Recent developments in Semantic Desktop research bring semantically enriched data onto the knowledge worker's desktop environment. A crucial application for today's knowledge worker is task management, which goes beyond simple to-do lists. In this paper we explain why task management applications in social semantic environments need an appropriate user interface to take advantage of the opportunities ahead. We then propose the use of a UI which is tailored towards the underlying data schema: the task management ontology. One challenge for task management based on flexible semantic schemas in social environments is to determine what should be shared and what should not. Sharing and transfer of semantic, graph-represented information is highly desired. We propose WYSIWYS: What You See Is What You Share. This paradigm gives the human user an interface in which he/she maintains control over what is shared and what is not.


International Journal of Data Warehousing and Mining | 2007

Acquiring Semantic Sibling Associations from Web Documents

Marko Brunzel; Myra Spiliopoulou

The automated discovery of relationships among terms contributes to the automation of the ontology engineering process and allows for sophisticated query expansion in information retrieval. While there are many findings on the identification of direct hierarchical relations among concepts, less attention has been paid to the discovery of sibling terms. These are terms that share a common, a priori unknown parent, such as co-hyponyms and co-meronyms. In this study, we present our results on the discovery of pairs or groups of sibling terms with XTREEM-SA (Xhtml TREE mining for sibling associations), an algorithm that extracts semantics from Web documents. While conventional methods process an appropriately prepared corpus, XTREEM-SA takes as input an arbitrary collection of Web documents on a given topic and finds sibling relations between terms in this corpus. It is thus independent of domain and language, does not require linguistic preprocessing, and does not rely on syntactic or other rules of text formation. We describe XTREEM-SA and evaluate it against two reference ontologies. In this context, we also elaborate on the challenges of evaluating semantics extracted from the Web against handcrafted ontologies of high quality but possibly low coverage.


RIAO '04 Coupling approaches, coupling media and coupling languages for information retrieval | 2004

Coupling information extraction and data mining for ontology learning in PARMENIDES

Myra Spiliopoulou; Fabio Rinaldi; William J. Black; Gian Piero Zarri; Roland M. Mueller; Marko Brunzel; Babis Theodoulidis; Giorgos Orphanos; Michael Hess; James Dowdall; John McNaught; Maghi King; Andreas Persidis; Luc Bernard


Data Warehousing and Knowledge Discovery | 2007

Learning of semantic sibling group hierarchies - K-means vs. bi-secting-K-means

Marko Brunzel


Data Warehousing and Knowledge Discovery | 2006

Discovering semantic sibling associations from web documents with XTREEM-SP

Marko Brunzel; Myra Spiliopoulou


Journal on Data Semantics | 2008

Discovering Groups of Sibling Terms from Web Documents with XTREEM-SG

Marko Brunzel; Myra Spiliopoulou

Collaboration


Dive into Marko Brunzel's collaborations.

Top Co-Authors

Myra Spiliopoulou

Otto-von-Guericke University Magdeburg

Roland M. Mueller

Berlin School of Economics and Law

Roland M. Müller

Otto-von-Guericke University Magdeburg

John McNaught

University of Manchester

Luc Bernard

University of Manchester
