Damir Vandic
Erasmus University Rotterdam
Publications
Featured research published by Damir Vandic.
Decision Support Systems | 2012
Damir Vandic; Jan Willem Van Dam; Flavius Frasincar
This paper presents a platform for multifaceted product search using Semantic Web technology. Online shops can use a ping service to submit their RDFa-annotated Web pages for processing. The platform is able to process these RDFa-annotated (X)HTML pages and aggregate product information coming from different Web stores. We propose solutions for the identification of products and the mapping of the categories in this process. Furthermore, when a loose vocabulary such as the Google RDFa vocabulary is used, the platform deals with the issue of heterogeneous information (e.g., currencies and rating scales).
Conference on Information and Knowledge Management | 2013
Damir Vandic; Flavius Frasincar; Uzay Kaymak
Multifaceted search is a commonly used interaction paradigm in e-commerce applications, such as Web shops. Because of the large number of possible product attributes, Web shops usually make use of static information to determine which facets should be displayed. Unfortunately, this approach does not take into account the user query, leading to a non-optimal facet drill-down process. In this paper, we focus on automatic facet selection, with the goal of minimizing the number of steps needed to find the desired product. We propose several algorithms for facet selection, which we evaluate against the state-of-the-art algorithms from the literature. We implement our approach in a Web application called faccy.net. The evaluation is based on simulations employing 1000 queries, 980 products, 487 facets, and three drill-down strategies. As evaluation metrics we use the average number of clicks, the average utility, and the top-10 promotion percentage. The results show that the Probabilistic Entropy algorithm significantly outperforms the other considered algorithms.
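To illustrate the general idea of entropy-driven facet selection, the sketch below ranks facets by the Shannon entropy of their value distribution over the current result set, so that facets which split the results most evenly come first. This is a generic illustration and not the Probabilistic Entropy algorithm from the paper, whose details are not given in the abstract; the product dictionaries, the facet names, and the functions `facet_entropy` and `rank_facets` are all hypothetical.

```python
import math
from collections import Counter


def facet_entropy(products, facet):
    """Shannon entropy (in bits) of one facet's value distribution.

    Products that lack the facet are ignored (an assumption of this sketch).
    """
    values = [p[facet] for p in products if facet in p]
    if not values:
        return 0.0
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def rank_facets(products, facets):
    """Order facets from highest to lowest entropy."""
    return sorted(facets, key=lambda f: facet_entropy(products, f), reverse=True)


# Hypothetical result set for a user query.
products = [
    {"brand": "A", "color": "black", "size": "M"},
    {"brand": "B", "color": "black", "size": "M"},
    {"brand": "C", "color": "white", "size": "M"},
    {"brand": "D", "color": "black", "size": "M"},
]

ranking = rank_facets(products, ["size", "color", "brand"])
print(ranking)
```

In this toy result set every product has the same size, so that facet cannot narrow the results and is ranked last, while the brand facet, which takes a different value for each product, is ranked first.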
ACM Symposium on Applied Computing | 2011
Damir Vandic; Jan Willem Van Dam; Frederik Hogenboom; Flavius Frasincar
Many of the existing cloud tagging systems are unable to cope with the syntactic and semantic tag variations during user search and browse activities. As a solution to this problem, in this paper, we propose the Semantic Tag Clustering Search, a framework able to cope with these needs. The framework consists of three parts: removing syntactic variations, creating semantic clusters, and utilizing the obtained clusters to improve search and exploration of tag spaces. For removing syntactic variations, we use the normalized Levenshtein distance, and the cosine similarity measure based on tag co-occurrences. For creating semantic clusters, we improve an existing non-hierarchical clustering technique. Using our framework, we are able to find more clusters and achieve a higher precision than the original method. The advantages of a cluster-based approach for searching and browsing through tag spaces have been exploited in Xplore-Flickr.com, the implementation of our framework.
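The two measures named above can be sketched as follows. This is a minimal illustration, assuming the Levenshtein distance is normalized by the length of the longer tag (one common choice; the abstract does not say which normalization is used) and that each tag is represented by a sparse vector of co-occurrence counts with other tags. The example tags and counts are made up.

```python
import math


def levenshtein(a, b):
    """Edit distance with unit-cost insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_levenshtein(a, b):
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse co-occurrence vectors (dicts)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical co-occurrence counts: how often each tag appears with others.
cooccurrence = {
    "flickr": {"photo": 5, "camera": 2},
    "flikr": {"photo": 4, "camera": 1},
    "sunset": {"beach": 3},
}

distance = normalized_levenshtein("flickr", "flikr")
similarity = cosine_similarity(cooccurrence["flickr"], cooccurrence["flikr"])
print(distance, similarity)
```

A small string distance together with a high co-occurrence similarity suggests that two tags are syntactic variations of each other; how the two scores are combined and thresholded is left open here.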
Expert Systems With Applications | 2015
Steven S. Aanen; Damir Vandic; Flavius Frasincar
Highlights: We propose an algorithm for automatic product taxonomy mapping in e-commerce. The algorithm uses word sense disambiguation techniques to handle heterogeneity. Our algorithm copes with composite categories in product category names. We compute path-similarities using lexical relatedness and structural information. Using real-world data, we show that we improve existing state-of-the-art methods.
Over the last few years, we have experienced a steady growth in e-commerce. This growth introduces many problems for services that want to aggregate product information and offerings. One of the problems that aggregation services face is the matching of product categories from different Web shops. This paper proposes an algorithm to perform this task automatically, making it possible to aggregate product information from multiple Web sites, in order to deploy it for search, comparison, or recommender systems applications. The algorithm uses word sense disambiguation techniques to address varying denominations between different taxonomies. Path similarity is assessed between source and candidate target categories, based on lexical relatedness and structural information. The main focus of the proposed solution is to improve the disambiguation procedure in comparison to an existing state-of-the-art approach, while coping with product taxonomy-specific characteristics, like composite categories, and re-examining lexical similarity and similarity aggregation in this context. The performance evaluation based on data from three real-world Web shops demonstrates that the proposed algorithm improves the benchmarked approach by 62% on average F1-measure.
Decision Support Systems | 2014
Lennart J. Nederstigt; Steven S. Aanen; Damir Vandic; Flavius Frasincar
With the vast amount of information available on the Web, there is an urgent need to structure Web data in order to make it available to both users and machines. E-commerce is one of the areas in which growing data congestion on the Web impedes data accessibility. This paper proposes FLOPPIES, a framework capable of semi-automatic ontology population of tabular product information from Web stores. By formalizing product information in an ontology, better product comparison or parametric search applications can be built, using the semantics of product attributes and their corresponding values. The framework employs both lexical and pattern matching for classifying products, mapping properties, and instantiating values. It is shown that the performance on instantiating TVs and MP3 players from Best Buy and Newegg.com looks promising, achieving an F1-measure of approximately 77%.
Conference on Advanced Information Systems Engineering | 2013
Marnix de Bakker; Flavius Frasincar; Damir Vandic
The detection of product duplicates is one of the challenges that Web shop aggregators are currently facing. In this paper, we focus on solving the problem of product duplicate detection on the Web. Our proposed method extends a state-of-the-art solution that uses the model words in product titles to find duplicate products. First, we employ the aforementioned algorithm in order to find matching product titles. If no matching title is found, our method continues by computing similarities between the two product descriptions. These similarities are based on the product attribute keys and on the product attribute values. Furthermore, instead of only extracting model words from the title, our method also extracts model words from the product attribute values. Based on our experimental results on real-world data gathered from two existing Web shops, we show that the proposed method, in terms of F1-measure, significantly outperforms the existing state-of-the-art title model words method and the well-known TF-IDF method.
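Several of the abstracts on this page report results as an F1-measure. As a reminder of how that score is computed for duplicate detection, the sketch below compares a set of predicted duplicate pairs with a set of true duplicate pairs. The product identifiers and the helper names `pair` and `f1_measure` are hypothetical, and the paper's actual evaluation protocol may differ in its details.

```python
def pair(a, b):
    """Unordered product pair, so that (a, b) and (b, a) count as the same."""
    return frozenset((a, b))


def f1_measure(predicted, actual):
    """Harmonic mean of precision and recall over duplicate pairs."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


# Hypothetical ground truth and detector output.
actual = {pair("p1", "p2"), pair("p3", "p4")}
predicted = {pair("p2", "p1"), pair("p5", "p6")}

score = f1_measure(predicted, actual)
print(score)
```

Here one of the two predicted pairs is correct and one of the two true pairs is found, so precision and recall are both 0.5.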
Web Intelligence | 2012
Emmanuelle-Anna Dietz; Damir Vandic; Flavius Frasincar
Building domain taxonomies is a crucial task in the domain of ontology construction. Domain taxonomy learning keeps getting more important as a form of automatically obtaining a knowledge representation of a certain domain. The alternative of manually developing domain taxonomies is not trivial. The main issues encountered when manually developing a taxonomy are the non-availability of a domain knowledge expert and the considerable amount of effort needed for this task. This paper proposes TaxoLearn, an approach to automatic construction of domain taxonomies. TaxoLearn is a new methodology that combines aspects from existing approaches, but also contains new steps in order to improve the quality of the resulting domain taxonomy. The contribution of this paper is threefold. First, we employ a word sense disambiguation step when detecting concepts in the text. Second, we show the use of semantics-based hierarchical clustering for the purpose of taxonomy learning. Third, we propose a novel dynamic labeling procedure for the concept clusters. We evaluate our approach by comparing the machine generated taxonomy with a manually constructed golden taxonomy. Based on a corpus of documents in the field of financial economics, TaxoLearn shows a high precision for the learned taxonomic concept relationships.
Decision Support Systems | 2012
Damir Vandic; Jan Willem Van Dam; Flavius Frasincar
In this paper we propose the Semantic Tag Clustering Search (STCS) framework for enhancing the user experience in interacting with tagging systems. This framework consists of three parts. The first part deals with syntactic variations by finding clusters of tags that are syntactic variations of each other and assigning labels to them. The second part of the framework addresses the problem of the lack of semantics in tagging systems by recognizing contexts and constructing semantic clusters for tags. The final part of the STCS framework utilizes the clusters obtained from the first two parts to improve the search and exploration of tag spaces. For removing syntactic variations, we use the normalized Levenshtein distance and the cosine similarity measure based on tag co-occurrences. For creating semantic clusters, we employ two non-hierarchical and two hierarchical clustering techniques. To evaluate the value of the semantic clusters, we develop a Web application called XploreFlickr.com for searching and browsing through Flickr resources.
International Semantic Web Conference | 2012
Steven S. Aanen; Lennart J. Nederstigt; Damir Vandic; Flavius Frăsincar
This paper proposes SCHEMA, an algorithm for automated mapping between heterogeneous product taxonomies in the e-commerce domain. SCHEMA utilises word sense disambiguation techniques, based on the ideas from the algorithm proposed by Lesk, in combination with the semantic lexicon WordNet. For finding candidate map categories and determining the path-similarity we propose a node matching function that is based on the Levenshtein distance. The final mapping quality score is calculated using the Damerau-Levenshtein distance and a node-dissimilarity penalty. The performance of SCHEMA was tested on three real-life datasets and compared with PROMPT and the algorithm proposed by Park & Kim. It is shown that SCHEMA improves considerably on both recall and F1
International Conference on Web Engineering | 2011
Joni Radelaar; Aart-Jan Boor; Damir Vandic; Jan-Willem van Dam; Frederik Hogenboom; Flavius Frasincar