Dunja Mladenic | Researchain

Publication

Featured research published by Dunja Mladenic.

EWMF'05/KDO'05 Proceedings of the 2005 joint international conference on Semantics, Web and Mining | 2005

Semi-automatic construction of topic ontologies

Blaž Fortuna; Dunja Mladenic; Marko Grobelnik

In this paper, we review two techniques for topic discovery in collections of text documents (Latent Semantic Indexing and K-Means clustering) and present how we integrated them into a system for semi-automatic topic ontology construction. The OntoGen system supports the user during the construction process by suggesting topics and analyzing them in real time. It suggests names for the topics in two alternative ways, both based on extracting keywords from the set of documents inside the topic. The first set, of descriptive keywords, is extracted using document centroid vectors, while the second set, of distinctive keywords, is extracted from the SVM classification model dividing documents in the topic from the neighboring documents.
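
The centroid-based keyword step described above can be illustrated with a minimal sketch. It is not OntoGen's implementation: the TF-IDF weighting, the function names `tfidf_vectors` and `centroid_keywords`, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: tf-idf weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: c * math.log(n / df[t]) for t, c in tf.items()})
    return vectors

def centroid_keywords(vectors, top_k=3):
    """Average the topic's vectors and return the top_k heaviest terms."""
    centroid = Counter()
    for vec in vectors:
        for term, weight in vec.items():
            centroid[term] += weight / len(vectors)
    # Sort by descending weight, then alphabetically to break ties.
    ranked = sorted(centroid.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]

# Hypothetical toy corpus: the first two documents form the topic of interest.
corpus = [
    "svm svm svm kernel margin margin".split(),
    "svm margin classifier".split(),
    "ontology topic concept".split(),
    "ontology concept hierarchy".split(),
]
vectors = tfidf_vectors(corpus)
topic_keywords = centroid_keywords(vectors[:2], top_k=2)
print(topic_keywords)
```

Terms that are frequent inside the topic but rare elsewhere dominate the centroid, which is why they make plausible descriptive labels for it.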

WIT Transactions on Information and Communication Technologies | 2002

Feature Selection Using Support Vector Machines

Janez Brank; Marko Grobelnik; Natasa Milic-Frayling; Dunja Mladenic

Text categorization is the task of classifying natural language documents into a set of predefined categories. Documents are typically represented by sparse vectors under the vector space model, where each word in the vocabulary is mapped to one coordinate axis and its occurrence in the document gives rise to one nonzero component in the vector representing that document. When training classifiers on large collections of documents, both the time and memory requirements of processing these vectors may be prohibitive. This calls for using a feature selection method, not only to reduce the number of features but also to increase the sparsity of document vectors. We propose a feature selection method based on linear Support Vector Machines (SVMs). First, we train the linear SVM on a subset of training data and retain only those features that correspond to highly weighted components (in the absolute value sense) of the normal to the resulting hyperplane that separates positive and negative examples. This reduced feature space is then used to train a classifier over a larger training set because more documents now fit into the same amount of memory. In our experiments we compare the effectiveness of the SVM-based feature selection with that of more traditional feature selection methods, such as odds ratio and information gain, in achieving the desired tradeoff between the vector sparsity and the classification performance. Experimental results indicate that, at the same level of vector sparsity, feature selection based on SVM normals yields better classification performance than odds ratio- or information gain-based feature selection when linear SVM classifiers are used.
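
The selection step described above (train a linear SVM, then keep the features with the largest absolute weights in the hyperplane normal) might look roughly like the sketch below. The trainer is a plain batch subgradient descent on the regularized hinge loss, chosen here only to keep the example dependency-free; the paper does not say which solver was used, and the function names, hyperparameters and toy data are assumptions.

```python
def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Batch subgradient descent on the L2-regularized hinge loss (no bias)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [lam * wj for wj in w]  # gradient of the regularizer
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1:  # example violates the margin
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def select_top_features(w, k):
    """Indices of the k components of the normal with largest |weight|."""
    ranked = sorted(range(len(w)), key=lambda j: -abs(w[j]))
    return sorted(ranked[:k])

def project(X, keep):
    """Drop every feature column that was not selected."""
    return [[row[j] for j in keep] for row in X]

# Hypothetical toy data: feature 0 marks positives, feature 1 marks negatives,
# features 2 and 3 occur equally often in both classes.
X = [[1, 0, 1, 1], [1, 0, 1, 0], [0, 1, 1, 1], [0, 1, 1, 0]]
y = [1, 1, -1, -1]

weights = train_linear_svm(X, y)          # step 1: train on a (small) subset
selected = select_top_features(weights, k=2)
X_reduced = project(X, selected)          # step 2: reduced space for retraining
print(selected)
```

In the setting the abstract describes, `project` would be applied to the larger training set, whose sparser vectors then fit into the same amount of memory.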

Data Science and Classification | 2006

kNN Versus SVM in the Collaborative Filtering Framework

Miha Grcar; Blaž Fortuna; Dunja Mladenic; Marko Grobelnik

We present experimental results comparing the k-Nearest Neighbor (kNN) algorithm with the Support Vector Machine (SVM) in the collaborative filtering framework, using datasets with different properties. While kNN is usually used for collaborative filtering tasks, SVM is considered a state-of-the-art classification algorithm. Since collaborative filtering can also be interpreted as a classification/regression task, virtually any supervised learning algorithm (such as SVM) can also be applied. Experiments were performed on two standard, publicly available datasets and on a real-life corporate dataset that does not fit the profile of ideal data for collaborative filtering. We conclude that the quality of collaborative filtering recommendations is highly dependent on the quality of the data. Furthermore, kNN is dominant over SVM on the two standard datasets. On the real-life corporate dataset, with its high level of sparsity, kNN fails as it is unable to form reliable neighborhoods. In this case SVM outperforms kNN.
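
As a rough illustration of the kNN side of this comparison, the sketch below predicts a rating as a similarity-weighted average over the nearest users who rated the item. Cosine similarity over co-rated items, the function names and the toy ratings are assumptions; the paper's exact similarity measure and data are not given in the abstract.

```python
import math

def cosine(u, v):
    """Cosine similarity over the items both users rated; 0.0 if none."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def predict(ratings, user, item, k=2):
    """Similarity-weighted average rating of the k nearest neighbors."""
    candidates = [
        (cosine(ratings[user], other), other[item])
        for name, other in ratings.items()
        if name != user and item in other
    ]
    # Keep only neighbors with some overlap, most similar first.
    neighbors = sorted((c for c in candidates if c[0] > 0), reverse=True)[:k]
    if not neighbors:
        return None  # no reliable neighborhood can be formed
    total = sum(sim for sim, _ in neighbors)
    return sum(sim * rating for sim, rating in neighbors) / total

# Hypothetical toy ratings on a 1-5 scale.
ratings = {
    "ann": {"a": 5, "b": 1},
    "bob": {"a": 5, "b": 1, "c": 5},
    "cat": {"a": 1, "b": 5, "c": 1},
    "dan": {"d": 3},  # shares no rated item with anyone else
}
ann_pred = predict(ratings, "ann", "c", k=1)
dan_pred = predict(ratings, "dan", "c", k=1)
print(ann_pred, dan_pred)
```

The user `dan` mirrors the sparsity problem the abstract mentions: with no co-rated items, no neighborhood exists and the sketch returns `None` instead of a prediction.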

2009 13th International Conference Information Visualisation | 2009

Document Visualization Based on Semantic Graphs

Delia Rusu; Blaž Fortuna; Dunja Mladenic; Marko Grobelnik; Ruben Sipoš

In this paper, we present a document visualization technique for data analysis based on the semantic representation of text in the form of a directed graph, referred to as a semantic graph. It is derived using natural language processing as follows. First, subject-verb-object triplets are automatically extracted from the Penn Treebank parse tree obtained for each sentence in the document. Second, the triplets are further enhanced by linking them to their corresponding co-referenced named entity, by resolving pronominal anaphors, and by attaching the associated WordNet synset. Starting from the document's semantic graph and the list of extracted triplets, we automatically generate the document summary, for which we also derive the semantic representation.
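
To make the graph construction concrete, here is a minimal sketch that turns already-extracted subject-verb-object triplets into a directed graph with verb-labeled edges. Parsing, real anaphora resolution and WordNet linking are left out: the `coref` lookup table is a hypothetical stand-in for the resolution step, and the function names and sample triplets are assumptions.

```python
def resolve(term, coref):
    """Map a pronoun to its entity via a lookup table (stand-in for anaphora resolution)."""
    return coref.get(term.lower(), term)

def build_semantic_graph(triplets, coref=None):
    """Return {node: [(verb, target), ...]} for a list of (subject, verb, object)."""
    coref = coref or {}
    graph = {}
    for subj, verb, obj in triplets:
        subj, obj = resolve(subj, coref), resolve(obj, coref)
        graph.setdefault(subj, []).append((verb, obj))  # directed, labeled edge
        graph.setdefault(obj, [])                       # make sure the object is a node
    return graph

# Hypothetical triplets, as if extracted from two consecutive sentences.
triplets = [
    ("Marie Curie", "discovered", "radium"),
    ("She", "won", "the Nobel Prize"),
]
graph = build_semantic_graph(triplets, coref={"she": "Marie Curie"})
for node, edges in graph.items():
    print(node, edges)
```

Resolving the pronoun before inserting the edge is what merges both statements onto a single node, which is the effect the enhancement step in the abstract aims for.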

Archive | 2009

Capturing Document Semantics for Ontology Generation and Document Summarization

David Baxter; Bryan Klimt; Marko Grobelnik; David Schneider; Michael J. Witbrock; Dunja Mladenic

When dealing with a document collection, it is important to identify repeated information. In multi-document summarization, for example, it is important to retain widely repeated content, even if the wording is not exactly the same. Simplistic approaches simply look for the same strings, or the same syntactic structures (including words), across documents. Here we investigate semantic matching, applying background knowledge from a large, general knowledge base (KB) to identify such repeated information in texts. Automatic document summarization is the problem of creating a surrogate for a document that adequately represents its full content. Automatic ontology generation requires information about candidate types, roles and relationships gathered from across a document or document collection. We aim at a summarization system that can replicate the quality of summaries created by humans and ontology creation systems that significantly reduce the human effort required for construction. Both applications depend for their success on extracting the essence of a collection of text. The work reported here demonstrates the utility of using deep knowledge from Cyc for effectively identifying redundant information in texts by using both semantic and syntactic information.

Context and Semantics for Knowledge Management | 2011

Machine Learning Techniques for Understanding Context and Process

Marko Grobelnik; Dunja Mladenic; Gregor Leban; Tadej Štajner

This chapter discusses how machine learning techniques can be useful for modelling and understanding context and processes. Machine learning techniques that have been applied for understanding context and processes are briefly presented together with the setting in which they have been applied. An example application focusing on context understanding is described to illustrate results of applying the techniques on real-world data. Interpretation and understanding of context in the ACTIVE knowledge workspace is described in Chap. 5 and deployed at BT as described in Chap. 9, while optimizing and sharing of knowledge processes is addressed in Chap. 6.

Archive | 2009

Knowledge Discovery for Semantic Web

Dunja Mladenic; Marko Grobelnik; Blaž Fortuna; Miha Grcar

Knowledge Discovery is traditionally used for the analysis of large amounts of data and makes it possible to address a number of tasks that arise in the Semantic Web and require scalable solutions. Additionally, Knowledge Discovery techniques have been successfully applied not only to structured data (i.e., databases) but also to semi-structured and unstructured data, including text, graphs, images and video. Semantic Web technologies often call for dealing with text and sometimes also graphs or social networks. This chapter describes research approaches that adopt knowledge discovery techniques to address the Semantic Web and presents several publicly available tools that implement some of the described approaches.

Context and Semantics for Knowledge Management | 2011

Managing and Understanding Context

Igor Dolinšek; Marko Grobelnik; Dunja Mladenic

This chapter focuses on the interpretation and understanding of context and proposes methods for managing contexts in the area of personal and collaborative information management. The concept of user context is introduced in an informal way and its implementation in the ACTIVE knowledge workspace is described. We elaborate on two perspectives of context management: the top-down and the bottom-up perspectives. The use of context management is examined in the light of real-world deployment of the ACTIVE knowledge workspace. For this purpose, Microsoft Windows File Explorer, Internet Explorer and MS Office tools were extended into a context-aware collaboration platform, which was deployed at BT as described in Chap. 9. This software is also available from www.active-project.eu for research purposes.

2010 14th International Conference Information Visualisation | 2010

Visualization of Web Page Content Using Semantic Technologies

Lorand Dali; Dunja Mladenic

This paper presents a system for visualizing the information contained in the text of a web page. The goal of the visualization is to help users understand the text on a web page better and faster and/or find related content on the internet. These visualizations are possible due to the use of text mining, natural language processing and semantic web technologies. Our system tries to make these technologies instantly accessible to a wide variety of users reading a wide variety of web pages. This high coverage of both users and content can be achieved because the system is implemented as an extension to Firefox, one of the most popular browsers, and because the visualizations are computed on the fly for any page the user happens to be reading at a given moment.

Archive | 2017

Leadership and Balance in Research

Dunja Mladenic; Marko Grobelnik

Leadership and balance are challenges relevant for scientific work as well as in business, politics and the daily activities of individuals. Here we share our reflections based on the experience of building and leading a research group of over 40 people at a national research institute. Our first observation is that leading a research group towards success requires clear philosophical alignment: fundamentals shared between all the members. This includes maintaining a common vision and high enthusiasm towards achieving results (no nonsense rule). In order to be sustainable in the long run, we have to maintain the flow of: (a) knowledge/experience, (b) social network of partners, and (c) constant funding. Organization of the team should preferably be flat (but not too flat) with well-defined roles, but also as fluid as possible (no rigidness rule), facilitating personal and group progress. One of the fundamentals is to develop trust between people and maintain good human relationships within the team (no fighting rule).
