Filip Ilievski
VU University Amsterdam
Publication
Featured research published by Filip Ilievski.
Sprachwissenschaft | 2017
Wouter Beek; Filip Ilievski; Jeremy Debattista; Stefan Schlobach; Jan Wielemaker
Quality is a complicated and multifarious topic in contemporary Linked Data research. The aspect of literal quality in particular has not yet been rigorously studied. Nevertheless, analyzing and improving the quality of literals is important since literals form a substantial (one in seven statements) and crucial part of the Semantic Web. Specifically, literals allow infinite value spaces to be expressed and they provide the linguistic entry point to the LOD Cloud. We present a toolchain that builds on the LOD Laundromat data cleaning and republishing infrastructure and that allows us to analyze the quality of literals on a very large scale, using a collection of quality criteria we specify in a systematic way. We illustrate the viability of our approach by lifting out two particular aspects in which the current LOD Cloud can be immediately improved by automated means: value canonization and language tagging. Since not all quality aspects can be addressed algorithmically, we also give an overview of other problems that can be used to guide future endeavors in tooling, training, and best practice formulation.
international semantic web conference | 2016
Filip Ilievski; Wouter Beek; Marieke van Erp; Laurens Rietveld; Stefan Schlobach
Finding relevant resources on the Semantic Web today is a dirty job: no centralized query service exists and the support for natural language access is limited. We present LOTUS: Linked Open Text UnleaShed, a text-based entry point to a massive subset of today's Linked Open Data Cloud. Recognizing the use case dependency of resource retrieval, LOTUS provides an adaptive framework in which a set of matching and ranking algorithms are made available. Researchers and developers are able to tune their own LOTUS index by choosing and combining the matching and ranking algorithms that suit their use case best. In this paper, we explain the LOTUS approach, its implementation and the functionality it provides. We demonstrate the ease with which LOTUS enables text-based resource retrieval at an unprecedented scale in concrete and domain-specific scenarios. Finally, we provide evidence for the scalability of LOTUS with respect to the LOD Laundromat, the largest collection of easily accessible Linked Open Data currently available.
language data and knowledge | 2017
Filip Ilievski; Piek Vossen; Marieke van Erp
The task of entity linking (EL) is often perceived as an algorithmic problem, where the novelty of systems lies in the decision-making process, while the knowledge is relatively fixed. As a consequence, we lack an understanding of the importance and relevance of diverse knowledge types in EL. However, knowledge and relevance are crucial: following the Gricean maxim, an author relies on assumptions about the knowledge of the reader and uses the most efficient and scarce, yet understandable, level of detail when conveying a message. In this paper, we seek to understand the EL task from a knowledge and relevance perspective. We define four categories of contextual knowledge relevant for EL and observe that two of these are systematically absent in existing entity linkers. Consequently, many contextual cases, in particular long-tail entities, can never be interpreted by existing systems. Finally, we present our ideas on developing knowledge-intensive systems and long-tail datasets.
empirical methods in natural language processing | 2016
Marten Postma; Filip Ilievski; Piek Vossen; M.G.J. van Erp
Entities and events in the world have no frequency, but our communication about them and the expressions we use to refer to them do have a strong frequency profile. Language expressions and their meanings follow a Zipfian distribution, featuring a small number of very frequent observations and a very long tail of low-frequency observations. Since our NLP datasets sample texts but do not sample the world, they are no exception to Zipf’s law. This causes a lack of representativeness in our NLP tasks, leading to models that can capture the head phenomena in language, but fail when dealing with the long tail. We therefore propose a referential challenge for semantic NLP that reflects a higher degree of ambiguity and variance and captures a large range of small real-world phenomena. To perform well, systems would have to show deep understanding of the linguistic tail.
12th International Summer School on Reasoning Web Summer School, RW 2016 | 2016
Wouter Beek; Laurens Rietveld; Filip Ilievski; Stefan Schlobach
With tens if not hundreds of billions of logical statements, the Linked Open Data (LOD) is one of the biggest knowledge bases ever built. As such it is a gigantic source of information for applications in various domains, but also, given its size, heterogeneous nature, and complexity, an ideal test-bed for knowledge representation and reasoning.
language resources and evaluation | 2016
Marieke van Erp; Pablo N. Mendes; Heiko Paulheim; Filip Ilievski; Julien Plu; Giuseppe Rizzo; Jörg Waitelonis
language resources and evaluation | 2016
Filip Ilievski; Giuseppe Rizzo; M.G.J. van Erp; Julien Plu; Raphaël Troncy
Proceedings of the Eighth Global WordNet Conference (Bucharest, Romania, January 27-30, 2016) | 2016
Roxane Segers; Egoitz Laparra; Marco Rospocher; Piek Vossen; German Rigau; Filip Ilievski
international conference on computational linguistics | 2016
Filip Ilievski; Marten Postma; Piek Vossen
Proceedings of NLP&DBpedia 2015 | 2015
Marieke van Erp; Filip Ilievski; Marco Rospocher; Piek Vossen