Blaž Fortuna
Ghent University
Publication
Featured research published by Blaž Fortuna.
EWMF'05/KDO'05 Proceedings of the 2005 joint international conference on Semantics, Web and Mining | 2005
Blaž Fortuna; Dunja Mladenic; Marko Grobelnik
In this paper, we review two techniques for topic discovery in collections of text documents (Latent Semantic Indexing and K-Means clustering) and present how we integrated them into a system for semi-automatic topic ontology construction. The OntoGen system supports the user during the construction process by suggesting topics and analyzing them in real time. It suggests names for the topics in two alternative ways, both based on extracting keywords from the set of documents inside the topic. The first set of descriptive keywords is extracted using document centroid vectors, while the second set of distinctive keywords is extracted from the SVM classification model that separates documents in the topic from the neighboring documents.
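A minimal sketch of the two keyword-extraction strategies described above, using scikit-learn; the vectorizer, classifier, and toy documents are assumptions for illustration, not the OntoGen implementation:

    # Sketch: descriptive vs. distinctive keywords for a topic.
    # Illustrative only; not OntoGen's actual code.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC

    topic_docs = ["ontology learning from text", "topic ontology construction"]
    other_docs = ["stock market prediction", "protein folding simulation"]

    vec = TfidfVectorizer()
    X = vec.fit_transform(topic_docs + other_docs)
    terms = np.array(vec.get_feature_names_out())

    # Descriptive keywords: highest-weighted terms of the topic centroid.
    centroid = np.asarray(X[:len(topic_docs)].mean(axis=0)).ravel()
    print("descriptive:", terms[centroid.argsort()[::-1][:3]])

    # Distinctive keywords: terms with the largest positive SVM weights
    # when separating topic documents from neighboring documents.
    y = [1] * len(topic_docs) + [0] * len(other_docs)
    svm = LinearSVC().fit(X, y)
    print("distinctive:", terms[svm.coef_.ravel().argsort()[::-1][:3]])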
web mining and web usage analysis | 2005
Miha Grcar; Dunja Mladenic; Blaž Fortuna; Marko Grobelnik
With the amount of information available on the Web growing rapidly each day, the need to automatically filter it for greater user efficiency has emerged. Within the fields of user profiling and Web personalization, several popular content filtering techniques have been developed. In this chapter we present one such technique: collaborative filtering. Apart from giving an overview of collaborative filtering approaches, we present experimental results comparing the k-Nearest Neighbor (kNN) algorithm with the Support Vector Machine (SVM) in the collaborative filtering framework, using datasets with different properties. While the k-Nearest Neighbor algorithm is usually used for collaborative filtering tasks, the Support Vector Machine is considered a state-of-the-art classification algorithm. Since collaborative filtering can also be interpreted as a classification/regression task, virtually any supervised learning algorithm (such as SVM) can be applied. Experiments were performed on two standard, publicly available datasets as well as on a real-life corporate dataset that does not fit the profile of ideal data for collaborative filtering. We conclude that the quality of collaborative filtering recommendations is highly dependent on the sparsity of the available data. Furthermore, we show that kNN is dominant on datasets with relatively low sparsity, while SVM-based approaches may perform better on highly sparse data.
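For illustration, a minimal user-based kNN sketch of the collaborative filtering setup compared here; the toy rating matrix, cosine similarity, and neighborhood size k are assumptions, not the paper's exact configuration:

    # Sketch: user-based kNN collaborative filtering on a small rating matrix.
    # Zeros denote unrated items; cosine similarity and k are assumptions.
    import numpy as np

    R = np.array([[5, 3, 0, 1],    # rows: users, cols: items
                  [4, 0, 0, 1],
                  [1, 1, 0, 5],
                  [0, 1, 5, 4]], dtype=float)

    def predict(user, item, k=2):
        # Neighbors: other users who have rated `item`.
        rated = np.where(R[:, item] > 0)[0]
        sims = [(v, np.dot(R[user], R[v]) /
                     (np.linalg.norm(R[user]) * np.linalg.norm(R[v])))
                for v in rated if v != user]
        top = sorted(sims, key=lambda s: -s[1])[:k]
        # Similarity-weighted average of the neighbors' ratings.
        return sum(s * R[v, item] for v, s in top) / sum(s for _, s in top)

    print(predict(user=0, item=2))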
Data Science and Classification | 2006
Miha Grcar; Blaž Fortuna; Dunja Mladenic; Marko Grobelnik
We present experimental results comparing the k-Nearest Neighbor (kNN) algorithm with the Support Vector Machine (SVM) in the collaborative filtering framework, using datasets with different properties. While k-Nearest Neighbor is usually used for collaborative filtering tasks, the Support Vector Machine is considered a state-of-the-art classification algorithm. Since collaborative filtering can also be interpreted as a classification/regression task, virtually any supervised learning algorithm (such as SVM) can be applied. Experiments were performed on two standard, publicly available datasets as well as on a real-life corporate dataset that does not fit the profile of ideal data for collaborative filtering. We conclude that the quality of collaborative filtering recommendations is highly dependent on the quality of the data. Furthermore, kNN is dominant over SVM on the two standard datasets. On the real-life corporate dataset with its high level of sparsity, kNN fails as it is unable to form reliable neighborhoods; in this case SVM outperforms kNN.
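To complement the kNN sketch above, here is how collaborative filtering can be recast as a per-item classification problem so that an SVM (or any supervised learner) applies; the feature encoding and toy data are assumptions:

    # Sketch: collaborative filtering as per-item classification.
    # Each user's ratings of the other items become the feature vector;
    # the rating of the target item is the label. Encoding is an assumption.
    import numpy as np
    from sklearn.svm import SVC

    R = np.array([[5, 3, 4, 1],
                  [4, 3, 5, 1],
                  [1, 1, 2, 5],
                  [2, 1, 1, 4]])

    target = 3                              # predict ratings for item 3
    X = np.delete(R, target, axis=1)        # features: ratings of other items
    y = R[:, target]                        # labels: ratings of the target item

    model = SVC(kernel="linear").fit(X[:3], y[:3])   # train on first 3 users
    print(model.predict(X[3:]))                      # predict for the 4th user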
international world wide web conferences | 2006
Blaž Fortuna; Marko Grobelnik; Dunja Mladenic
In this paper we describe a solution for incorporating background knowledge into the OntoGen system for semi-automatic ontology construction. This makes it easier for different users to construct different, more personalized ontologies for the same domain. To achieve this we introduce a word weighting schema to be used in the document representation. The weighting schema is learned from the background knowledge provided by the user. It is then used by OntoGen's machine learning and text mining algorithms.
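A hypothetical sketch of such a learned weighting schema: SVM feature weights from a user-provided background task re-weight the document representation before clustering. The data and learner choice are assumptions, not OntoGen's actual schema:

    # Sketch: learning a word-weighting schema from background knowledge
    # and applying it when building document vectors. Illustrative only.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC

    background_docs = ["neural network training", "stock price forecast",
                       "deep learning model", "market trading volume"]
    background_labels = [1, 0, 1, 0]   # user-provided grouping of interest

    vec = TfidfVectorizer()
    B = vec.fit_transform(background_docs)

    # Word importance: magnitude of SVM weights on the background task.
    svm = LinearSVC().fit(B, background_labels)
    word_weights = np.abs(svm.coef_.ravel())

    # Re-weight new document vectors before topic suggestion / clustering.
    X = vec.transform(["learning models for trading"]).multiply(word_weights)
    print(X.toarray())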
international semantic web conference | 2012
Lorand Dali; Blaž Fortuna; Thanh Tran Duc; Dunja Mladenic
The amount of structured data is growing rapidly. Given a structured query that asks for some entities, the number of matching candidate results is often very high. The problem of ranking these results has gained attention. Because results in this setting equally and perfectly match the query, existing ranking approaches often use features that are independent of the query. A popular one is based on the notion of centrality derived via PageRank. In this paper, we adopt a learning-to-rank approach to this structured query setting, provide a systematic categorization of query-independent features that can be used for it, and finally discuss how to leverage information in access logs to automatically derive the training data needed for learning. In experiments using real-world datasets and human evaluation based on crowdsourcing, we show the superior performance of our approach over two relevant baselines.
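A rough sketch of the two ingredients, under assumed data: PageRank as a query-independent feature, and clicked/skipped pairs from access logs turned into pairwise training examples. The graph, logs, and learner are all illustrative:

    # Sketch: query-independent PageRank feature plus pairwise preferences
    # mined from access logs. Data and learner are assumptions.
    import networkx as nx
    from sklearn.svm import LinearSVC

    g = nx.DiGraph([("a", "b"), ("b", "c"), ("c", "a"), ("a", "c")])
    pr = nx.pagerank(g)                     # centrality feature per entity

    # Access logs as (clicked, skipped) pairs: clicked should rank higher.
    log_pairs = [("c", "b"), ("a", "b")]

    # Pairwise transform: feature differences, positive class = clicked wins.
    X = [[pr[win] - pr[lose]] for win, lose in log_pairs]
    X += [[-x[0]] for x in X]               # mirrored negative examples
    y = [1] * len(log_pairs) + [0] * len(log_pairs)

    ranker = LinearSVC().fit(X, y)
    print(sorted(pr, key=lambda e: -ranker.decision_function([[pr[e]]])[0]))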
IEEE Internet Computing | 2009
Marko Grobelnik; Dunja Mladenic; Blaž Fortuna
Capturing information about employees can give organizations insight into underlying knowledge processes. Analyzing email communications, for example, can produce an informal structure that's flexible and that organizations can recalculate regularly to capture information flow among employees. The structure could also help institutions both identify collaboration patterns and predict changes in those patterns. Using semantic technologies in various domains, researchers have developed domain-specific ontologies to capture knowledge and enable reasoning. Organizations can support such knowledge management by capturing knowledge about their own people and their communication records, including email exchanges. The proposed approach analyzes an internal social network and uses the resulting information to produce an informal organizational structure. The authors evaluated their approach using a mid-size organization's actual data and compared the informal structure they obtained with the formal organizational structure. As the results show, the approach proved useful for modeling social structures based on real-world communication records.
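A minimal sketch of the general idea, assuming a toy email log: edge weights count exchanges, centrality flags informal hubs, and community detection approximates informal units. None of this is the paper's exact method:

    # Sketch: deriving an informal structure from email logs.
    # The log format and analysis choices are assumptions.
    import networkx as nx
    from networkx.algorithms.community import greedy_modularity_communities

    emails = [("alice", "bob"), ("alice", "carol"), ("bob", "carol"),
              ("dave", "erin"), ("erin", "frank"), ("dave", "frank")]

    g = nx.Graph()
    for sender, recipient in emails:
        if g.has_edge(sender, recipient):
            g[sender][recipient]["weight"] += 1   # repeated exchanges strengthen ties
        else:
            g.add_edge(sender, recipient, weight=1)

    print(nx.degree_centrality(g))                # candidate informal hubs
    print(greedy_modularity_communities(g))       # candidate informal units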
Archive | 2013
Klemen Kenda; Carolina Fortuna; Alexandra Moraru; Dunja Mladenic; Blaž Fortuna; Marko Grobelnik
The Web of Things (WoT), together with mashup-like applications, is gaining popularity as the Internet develops into a network of interconnected objects, ranging from cars and transportation cargo to electrical appliances. In this chapter we provide a brief architectural overview of technologies which can be used in WoT mashups, with emphasis on artificial intelligence technologies such as conceptualization and stream processing. We also look at data sources and existing WoT mashups. In the last part of the chapter we discuss the architecture and implementation of Videk, a prototype mashup for environmental intelligence.
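As a small illustration of the stream-processing side, a toy exponentially weighted moving average of the kind such a mashup might run over sensor readings; the feed is simulated and this is not Videk's actual pipeline:

    # Sketch: a toy stream aggregate over simulated sensor readings.
    def ewma(stream, alpha=0.2):
        # Exponentially weighted moving average: smooths noisy readings
        # in constant memory, a typical stream-processing primitive.
        avg = None
        for reading in stream:
            avg = reading if avg is None else alpha * reading + (1 - alpha) * avg
            yield avg

    temperatures = [21.0, 21.5, 29.0, 21.2, 20.8]   # a spike at the 3rd reading
    print(list(ewma(temperatures)))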
european conference on machine learning | 2010
Blaž Fortuna; Carolina Fortuna; Dunja Mladenic
In this demo we present a robust system for delivering real-time news recommendations to the user based on the user's history of past visits to the site, the current user context, and the popularity of stories. Our system runs live, providing real-time recommendations of news articles. The system handles overspecialization by recommending categories rather than individual items; it implicitly uses collaboration by taking into account user context and popular items; and it can handle new users by using context information. A unique characteristic of our system is that it prefers freshness over relevance, which is important for recommending news articles in the real-world setting addressed here. We experimentally compare the proposed approach as implemented in our system against several state-of-the-art alternatives and show that it significantly outperforms them.
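A toy sketch of category scoring that prefers freshness over relevance while mixing in popularity and user history, as described above; the weights and decay rate are assumptions, not the system's actual formula:

    # Sketch: scoring news categories by freshness, popularity and user
    # history. All constants are illustrative assumptions.
    import math, time

    def score(category, now, user_counts, clicks, last_pub):
        freshness = math.exp(-(now - last_pub[category]) / 3600.0)  # decay/hour
        popularity = clicks[category] / max(sum(clicks.values()), 1)
        affinity = user_counts.get(category, 0) / max(sum(user_counts.values()), 1)
        # Freshness dominates: stale categories are heavily penalized.
        return freshness * (0.6 + 0.2 * popularity + 0.2 * affinity)

    now = time.time()
    last_pub = {"sport": now - 300, "politics": now - 7200}
    clicks = {"sport": 120, "politics": 300}
    user_counts = {"politics": 9, "sport": 1}   # an empty dict handles new users

    print(max(last_pub, key=lambda c: score(c, now, user_counts, clicks, last_pub)))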
knowledge discovery and data mining | 2009
Delia Rusu; Blaž Fortuna; Dunja Mladenic; Marko Grobelnik; Ruben Sipos
In this paper, we present a technique for visual analysis of documents based on a semantic representation of text in the form of a directed graph, referred to as a semantic graph. This approach can aid data mining tasks such as exploratory data analysis, data description, and summarization. To derive the semantic graph, we take advantage of natural language processing and carry out a series of operations comprising a pipeline, as follows. First, named entities are identified and co-reference resolution is performed; moreover, pronominal anaphors are resolved for a subset of pronouns. Second, subject-predicate-object triplets are automatically extracted from the Penn Treebank parse tree obtained for each sentence in the document. The triplets are further enhanced by linking them to their corresponding co-referenced named entity, as well as attaching the associated WordNet synset, where available. Thus we obtain a semantic directed graph composed of connected triplets. The document's semantic graph is a starting point for automatically generating the document summary. The model for summary generation is obtained by machine learning, where the features are extracted from the semantic graph's structure and content. The summary also has an associated semantic representation. The size of the semantic graph, as well as the summary length, can be manually adjusted for enhanced visual analysis. We also show how to employ the proposed technique for the Visual Analytics challenge.
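A minimal sketch of the final assembly step, building a directed graph from subject-predicate-object triplets with networkx; the triplets here are hand-written rather than produced by the pipeline described above:

    # Sketch: assembling a semantic graph from extracted triplets.
    import networkx as nx

    triplets = [("Tom", "chased", "Jerry"),
                ("Jerry", "ate", "cheese"),
                ("Tom", "is", "cat")]

    g = nx.DiGraph()
    for subj, pred, obj in triplets:
        # Subjects and objects become nodes; the predicate labels the edge.
        g.add_edge(subj, obj, predicate=pred)

    # Graph structure (e.g., node degree) can feed summarization features.
    print(g.edges(data=True))
    print(sorted(g.degree, key=lambda n: -n[1]))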
Archive | 2009
Blaž Fortuna; Dunja Mladenic; Marko Grobelnik
In order to realize the Semantic Web vision, the creation of semantic annotations, the linking of web pages to ontologies, and the creation, evolution, and interrelation of ontologies must become automatic or semi-automatic processes. Natural Language Generation (NLG) takes structured data in a knowledge base as input and produces natural language text tailored to the presentational context and the target reader. NLG techniques build models of the context and the user and use them to select appropriate presentation strategies. In the context of the Semantic Web or knowledge management, NLG can be applied to provide automated documentation of ontologies and knowledge bases. Unlike human-written texts, an automatic approach will constantly keep the text up to date, which is vitally important in the Semantic Web context, where knowledge is dynamic and updated frequently. This chapter presents several NLG techniques that produce textual summaries from Semantic Web ontologies. The main contribution is in showing how existing NLG tools can be adapted to take Semantic Web ontologies as their input, in a way that minimizes the customization effort. A major factor in the quality of the generated text is the content of the ontology itself. For instance, the use of string datatype properties with implicit semantics leads to the generation of text with missing semantic information. Three approaches to overcoming this problem are presented, and users can choose the one that suits their application best.
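As a small illustration, a template-based verbalization of ontology statements, the simplest form of the adaptation described; the triples and templates are invented for the example:

    # Sketch: template-based verbalization of ontology statements.
    # Triples and templates are illustrative assumptions.
    triples = [("OntoGen", "rdf:type", "SoftwareTool"),
               ("OntoGen", "supports", "ontology construction"),
               ("OntoGen", "developedBy", "JSI")]

    templates = {
        "rdf:type": "{s} is a {o}.",
        "supports": "{s} supports {o}.",
        "developedBy": "{s} was developed by {o}.",
    }

    def verbalize(triples):
        # Regenerating the text whenever the ontology changes keeps the
        # documentation up to date, as the chapter argues.
        return " ".join(templates[p].format(s=s, o=o) for s, p, o in triples)

    print(verbalize(triples))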