Arlind Kopliku | Researchain

Publication

Featured research published by Arlind Kopliku.

ACM Computing Surveys | 2014

Aggregated search: A new information retrieval paradigm

Arlind Kopliku; Karen Pinel-Sauvagnat; Mohand Boughanem

Traditional search engines return ranked lists of search results. It is up to the user to scroll through this list, scan within different documents, and assemble information that fulfills his/her information need. Aggregated search represents a new class of approaches where the information is not only retrieved but also assembled. This is the current evolution in Web search, where diverse content (images, videos, etc.) and relational content (similar entities, features) are included in search results. In this survey, we propose a simple analysis framework for aggregated search and an overview of existing work. We start with related work in related domains such as federated search, natural language generation, and question answering. Then we focus on more recent trends, namely cross-vertical aggregated search and relational aggregated search, which are already present in current Web search.

International Conference on Enterprise Information Systems | 2015

Implementing Multidimensional Data Warehouses into NoSQL

Max Chevalier; Mohammed El Malki; Arlind Kopliku; Olivier Teste; Ronan Tournier

Not only SQL (NoSQL) databases are becoming increasingly popular and have some interesting strengths such as scalability and flexibility. In this paper, we investigate the use of NoSQL systems for implementing OLAP (On-Line Analytical Processing) systems. More precisely, we are interested in instantiating OLAP systems (from the conceptual level to the logical level) and instantiating an aggregation lattice (optimization). We define a set of rules to map star schemas into two NoSQL models: column-oriented and document-oriented. The experimental part is carried out using the reference benchmark TPC. Our experiments show that our rules can effectively instantiate such systems (star schema and lattice). We also analyze differences between the two NoSQL systems considered. In our experiments, HBase (column-oriented) happens to be faster than MongoDB (document-oriented) in terms of loading time.

Conference on Information and Knowledge Management | 2011

Towards a framework for attribute retrieval

Arlind Kopliku; Mohand Boughanem; Karen Pinel-Sauvagnat

In this paper, we propose an attribute retrieval approach which extracts and ranks attributes from HTML tables. We distinguish between class attribute retrieval and instance attribute retrieval. On the one hand, given an instance (e.g. University of Strathclyde), we retrieve from the Web its attributes (e.g. principal, location, number of students). On the other hand, given a class (e.g. universities) represented by a set of instances, we retrieve common attributes of its instances. Furthermore, we show we can reinforce instance attribute retrieval if similar instances are available. Our approach uses HTML tables, which are probably the largest source for attribute retrieval. Three recall-oriented filters are applied over tables to check the following three properties: (i) whether the table is relational, (ii) whether the table has a header, and (iii) the conformity of its attributes and values. Candidate attributes are extracted from tables and ranked with a combination of relevance features. Our approach is shown to have a high recall and a reasonable precision. Moreover, it outperforms state-of-the-art techniques.

Research Challenges in Information Science | 2015

Benchmark for OLAP on NoSQL technologies comparing NoSQL multidimensional data warehousing solutions

Max Chevalier; Mohammed El Malki; Arlind Kopliku; Olivier Teste; Ronan Tournier

The plethora of data warehouse solutions has created a need to compare these solutions using experimental benchmarks. Existing benchmarks rely mostly on the relational data model and do not take into account other models. In this paper, we propose an extension to a popular benchmark (the Star Schema Benchmark or SSB) that considers non-relational NoSQL models. To avoid the data post-processing required for using this data with NoSQL systems, the data is generated in different formats. To best exploit horizontal scaling, data can be produced in a distributed file system, hence removing disk or partition sizes as a limit for the generated dataset. Experimental work proves improved performance of our new benchmark.

Advances in Databases and Information Systems | 2015

Implementation of Multidimensional Databases in Column-Oriented NoSQL Systems

Max Chevalier; Mohammed El Malki; Arlind Kopliku; Olivier Teste; Ronan Tournier

NoSQL (Not Only SQL) systems are becoming popular due to known advantages such as horizontal scalability and elasticity. In this paper, we study the implementation of multidimensional data warehouses with column-oriented NoSQL systems. We define mapping rules that transform the conceptual multidimensional data model to logical column-oriented models. We consider three different logical models and we use them to instantiate data warehouses. We focus on data loading, model-to-model conversion and OLAP cuboid computation.

International Conference on Enterprise Information Systems | 2015

How Can We Implement a Multidimensional Data Warehouse Using NoSQL

Max Chevalier; Mohammed El Malki; Arlind Kopliku; Olivier Teste; Ronan Tournier

Traditional OLAP (On-Line Analytical Processing) systems store data in relational databases. Unfortunately, it is difficult to manage big data volumes with such systems. As an alternative, NoSQL systems (Not-only SQL) provide scalability and flexibility for an OLAP system. We define a set of rules to map star schemas and their optimization structure, a precomputed aggregate lattice, into two logical NoSQL models: column-oriented and document-oriented. Using these rules we analyse and implement two decision support systems, one for each model (using MongoDB and HBase). We compare both systems during the phases of data loading (with data generated using the TPC-DS benchmark), lattice generation and querying.

International Conference on Big Data | 2015

Implementation of Multidimensional Databases with Document-Oriented NoSQL

Max Chevalier; Mohammed El Malki; Arlind Kopliku; Olivier Teste; Ronan Tournier

NoSQL (Not Only SQL) systems are becoming popular due to known advantages such as horizontal scalability and elasticity. In this paper, we study the implementation of data warehouses with document-oriented NoSQL systems. We propose mapping rules that transform the multidimensional data model to logical document-oriented models. We consider three different logical models and we use them to instantiate data warehouses. We focus on data loading, model-to-model conversion and OLAP cuboid computation.
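To make the idea of mapping a multidimensional model to a document-oriented one concrete, here is a minimal Python sketch. It is not one of the paper's three logical models, whose definitions are not given in the abstract; it assumes one plausible mapping in which each fact becomes a single document with its dimension attributes nested inside. The names `to_document` and `cuboid`, and the sample data, are hypothetical.

```python
from collections import defaultdict

def to_document(fact, dimensions):
    """Build one nested document from a fact row (assumed mapping, not the paper's rules)."""
    doc = {"measures": dict(fact["measures"])}
    for dim_name, key in fact["keys"].items():
        # Copy the referenced dimension row into the document (denormalization).
        doc[dim_name] = dict(dimensions[dim_name][key])
    return doc

def cuboid(docs, group_by, measure):
    """Sum `measure` over documents grouped by (dimension, attribute) pairs."""
    totals = defaultdict(float)
    for doc in docs:
        key = tuple(doc[dim][attr] for dim, attr in group_by)
        totals[key] += doc["measures"][measure]
    return dict(totals)

# Hypothetical star schema: two dimensions and three facts.
dimensions = {
    "customer": {1: {"city": "Toulouse"}, 2: {"city": "Paris"}},
    "date": {10: {"year": 2015}},
}
facts = [
    {"keys": {"customer": 1, "date": 10}, "measures": {"revenue": 5.0}},
    {"keys": {"customer": 2, "date": 10}, "measures": {"revenue": 3.0}},
    {"keys": {"customer": 1, "date": 10}, "measures": {"revenue": 2.0}},
]
docs = [to_document(f, dimensions) for f in facts]
by_city = cuboid(docs, [("customer", "city")], "revenue")
```

Nesting the dimensions trades storage for join-free aggregation: a cuboid is computed in a single pass over the documents.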

ACM/IEEE Joint Conference on Digital Libraries | 2011

Retrieving attributes using web tables

Arlind Kopliku; Karen Pinel-Sauvagnat; Mohand Boughanem

In this paper, we propose an attribute retrieval approach which extracts and ranks attributes from Web tables. We combine simple heuristics to filter out improbable attributes and we rank attributes based on frequencies and a table match score. Ranking is reinforced with external evidence from Web search, DBPedia and Wikipedia. Our approach can be applied to any instance (e.g. Canada) to retrieve its attributes (capital, GDP). It is shown to have a much higher recall than DBPedia and Wikipedia and to work better than lexico-syntactic rules for the same purpose.
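The filter-then-rank idea can be illustrated with a short Python sketch. The abstract does not give the paper's heuristics or scoring formula, so everything here is an assumption: the filter (drop empty, purely numeric, or very long strings), the score (sum of table match scores over the tables in which an attribute occurs), and the name `rank_attributes` are all hypothetical.

```python
from collections import defaultdict

def rank_attributes(tables, max_len=40):
    """Rank candidate attributes by an assumed frequency-weighted score."""
    scores = defaultdict(float)
    for table in tables:
        # Normalize and deduplicate so each table counts an attribute once.
        candidates = {a.strip().lower() for a in table["attributes"]}
        for attr in candidates:
            # Hypothetical filter for improbable attributes.
            if not attr or attr.isdigit() or len(attr) > max_len:
                continue
            # Each occurrence is weighted by how well the table matches the instance.
            scores[attr] += table["match_score"]
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical tables retrieved for the instance "Canada".
tables = [
    {"attributes": ["Capital", "GDP", "2010"], "match_score": 0.9},
    {"attributes": ["capital", "Population"], "match_score": 0.5},
    {"attributes": ["GDP"], "match_score": 0.2},
]
ranked = rank_attributes(tables)
```

In this sketch an attribute that appears in many well-matching tables rises to the top, which is one simple way to combine frequency with a table match score.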

Web Intelligence | 2011

Interest and Evaluation of Aggregated Search

Arlind Kopliku; Firas Damak; Karen Pinel-Sauvagnat; Mohand Boughanem

Major search engines perform what is known as Aggregated Search (AS). They integrate results coming from different vertical search engines (images, videos, news, etc.) with typical Web search results. Aggregated search is relatively new and its advantages need to be evaluated. Some existing works have already tried to evaluate the interest (usefulness) of aggregated search as well as the effectiveness of the existing approaches. However, most evaluation methodologies were based (i) on what we call relevance by intent (i.e. search results were not shown to real users), and (ii) on short text queries. In this paper, we conducted a user study which was designed to revisit and compare the interest of aggregated search, by exploiting both relevance by intent and relevance by content, and using both short text and fixed need queries. This user study allowed us to analyze the distribution of relevant results across different verticals, and to show that AS helps to identify complementary relevant sources for the same information need. Comparison between relevance by intent and relevance by content showed that relevance by intent introduces a bias in evaluation. Discussion of the results also allowed us to identify some useful thoughts concerning the evaluation of AS approaches.

International Conference on Enterprise Information Systems | 2016

Document-oriented Models for Data Warehouses

Max Chevalier; Mohammed El Malki; Arlind Kopliku; Olivier Teste; Ronan Tournier

There is an increasing interest in NoSQL (Not Only SQL) systems developed in the area of Big Data as candidates for implementing multidimensional data warehouses due to the capabilities of data structuration/storage they offer. In this paper, we study implementation and modeling issues for data warehousing with document-oriented systems, a class of NoSQL systems. We study four different mappings of the multidimensional conceptual model to document data models. We focus on formalization and cross-model comparison. Experiments go through important features of data warehouses including data loading, OLAP cuboid computation and querying. Document-oriented systems are also compared to relational systems.
