Alexandros Karakasidis
Hellenic Open University
Publication
Featured research published by Alexandros Karakasidis.
information quality in information systems | 2005
Alexandros Karakasidis; Panos Vassiliadis; Evaggelia Pitoura
Traditionally, the refreshment of data warehouses has been performed in an off-line fashion. Active Data Warehousing refers to a new trend where data warehouses are updated as frequently as possible, to accommodate the high demands of users for fresh data. In this paper, we propose a framework for the implementation of active data warehousing, with the following goals: (a) minimal changes in the software configuration of the source, (b) minimal overhead for the source due to the active nature of data propagation, (c) the possibility of smoothly regulating the overall configuration of the environment in a principled way. In our framework, we have implemented ETL activities over queue networks and employ queue theory for the prediction of the performance and the tuning of the operation of the overall refreshment process. Due to the performance overheads incurred, we explore different architectural choices for this task and discuss the issues that arise for each of them.
balkan conference in informatics | 2009
Alexandros Karakasidis; Vassilios S. Verykios
Phonetic codes such as Soundex and Metaphone have been used in the past to address the Record Linkage Problem. However, to the best of our knowledge, no particular effort was made within this context towards privacy assurance during the matching process. Phonetic codes have an interesting feature which can be a cornerstone for providing privacy: they are mappings of strings which do not exhibit the one-to-one property. In this paper, we present a novel protocol for achieving privacy preserving record linkage using phonetics, we provide proof of correctness for our approach, and finally we illustrate experimental results concerning performance and matching accuracy. The proposed protocol can be applied equally well, with comparable results, to codes other than phonetic ones that do not exhibit the one-to-one property, such as hash tables.
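The many-to-one property mentioned in this abstract can be illustrated with a minimal sketch of the classic American Soundex code. This is not the protocol proposed in the paper; the function name `soundex` and the sample names are chosen here purely for illustration.

```python
def soundex(name: str) -> str:
    """Return the 4-character American Soundex code of a name (illustrative sketch)."""
    # Consonant groups that share a digit; vowels, Y, H and W have no digit.
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit

    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""

    result = letters[0]                 # the first letter is kept as-is
    prev = codes.get(letters[0], "")    # digit of the previously coded letter
    for ch in letters[1:]:
        if ch in "HW":
            continue                    # H and W do not separate equal digits
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        prev = code                     # a vowel resets prev to ""
    return (result + "000")[:4]         # pad or truncate to 4 characters


# Different strings map to the same code, so the code alone
# does not reveal which original string produced it.
print(soundex("Robert"), soundex("Rupert"))
print(soundex("Smith"), soundex("Smyth"))
```

Because several names collapse to one code, two parties can compare codes instead of raw values; how the paper builds a provably correct protocol on top of this property is not shown here.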
International Journal of Data Mining, Modelling and Management | 2009
Vassilios S. Verykios; Alexandros Karakasidis; Vassilios K. Mitrogiannis
Privacy-preserving record linkage is a very important task, mostly because of the very sensitive nature of personal data. The main focus in this task is to find a way to match records across the data sets or databases of different organisations without revealing competitive or personal information to non-owners. Towards accomplishing this task, several methods and protocols have been proposed. In this work, we propose a methodology for preserving the privacy of various record linkage approaches and we implement, examine and compare four pairs of privacy preserving record linkage methods and protocols. Two of these protocols use n-gram based similarity comparison techniques, the third protocol uses the well-known edit distance and the fourth one implements the Jaro-Winkler distance metric. All of the protocols used are enhanced by private key cryptography and hash encoding. This paper also presents a blocking scheme as an extension to the privacy preserving record linkage methodology. Our comparison is backed up by an extensive experimental evaluation that demonstrates the performance achieved by each of the proposed protocols.
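As a rough illustration of how n-gram comparison can be combined with keyed hashing, the sketch below hashes each bigram with HMAC-SHA256 under a shared secret key and compares the resulting sets with the Dice coefficient. The abstract does not specify the paper's actual protocols, so the padding character, the choice of HMAC-SHA256, the Dice coefficient and the names `hashed_bigrams` and `dice` are all assumptions made for this example.

```python
import hashlib
import hmac


def hashed_bigrams(value: str, key: bytes) -> set:
    """Return the set of keyed hashes of the padded bigrams of a string."""
    padded = f"_{value.lower()}_"  # padding marks the start and end of the string
    grams = {padded[i:i + 2] for i in range(len(padded) - 1)}
    return {hmac.new(key, g.encode("utf-8"), hashlib.sha256).hexdigest()
            for g in grams}


def dice(a: set, b: set) -> float:
    """Dice coefficient of two sets: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Both data holders are assumed to share this key (hypothetical value).
shared_key = b"example-shared-secret"
left = hashed_bigrams("smith", shared_key)
right = hashed_bigrams("smyth", shared_key)
print(round(dice(left, right), 3))
```

Only parties holding the same key produce comparable hashes, so the comparison can be carried out on the hashed sets without exchanging the plain strings. Deterministic hashing of individual bigrams remains open to frequency analysis, which is one reason a sketch like this should not be read as a complete protocol.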
Journal of computing science and engineering | 2011
Alexandros Karakasidis; Vassilios S. Verykios
Performing approximate data matching has always been an intriguing problem for both industry and academia. This task becomes even more challenging when the requirement of data privacy arises. In this paper, we propose a novel technique to address the problem of efficient privacy-preserving approximate record linkage. The secure framework we propose consists of two basic components. First, we utilize a secure blocking component based on phonetic algorithms, statistically enhanced to improve security. Second, we use a secure matching component where the actual approximate matching is performed using a novel private variant of the Levenshtein Distance algorithm. Our goal is to combine the speed of private blocking with the increased accuracy of approximate secure matching. Category: Ubiquitous computing; Security and privacy
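For reference, the standard (non-private) Levenshtein distance that the matching component builds on can be sketched as follows. This is the textbook dynamic-programming algorithm on plain strings, not the private variant proposed in the paper; the function name `levenshtein` is chosen here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))      # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                      # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Keeping only two rows of the table keeps memory linear in the length of the second string, which matters when many candidate pairs survive blocking.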
DPM'11 Proceedings of the 6th international conference, and 4th international conference on Data Privacy Management and Autonomous Spontaneous Security | 2011
Alexandros Karakasidis; Vassilios S. Verykios; Peter Christen
In many aspects of everyday life, from education to health care and from economics to homeland security, information exchange involving companies or government agencies has become a common application. Locating the same real world entities within this information, however, is not trivial due to insufficient identifying information, misspellings, etc. The problem becomes even more complicated when privacy considerations arise. This introduction describes an informal approach to the privacy preserving record linkage problem. In this paper we provide a solution to this problem by examining the alternatives offered by phonetic codes, a range of algorithms which, despite their age, are still used for record linkage purposes. The main contribution of our work, as our extensive experimental evaluation indicates, is that our methodology manages to offer privacy guarantees for performing Privacy Preserving Record Linkage without the need for computationally expensive cryptographic methods.
acm symposium on applied computing | 2012
Alexandros Karakasidis; Vassilios S. Verykios
Privacy Preserving Record Linkage is an emerging field of research which attempts to deal with the classical linkage problem from a privacy preserving point of view. In this paper we propose a novel approach for performing Privacy Preserving Blocking in order to minimize the computational cost of Privacy Preserving Record Linkage. We achieve this without compromising privacy by using Nearest Neighbors clustering, a well-known clustering algorithm and by using a reference table. A reference table is a publicly known table the contents of which are used as intermediate references. The combination of Nearest Neighbors and a reference table offers our approach k-anonymity characteristics.
international conference on data mining | 2012
Alexandros Karakasidis; Vassilios S. Verykios
Privacy Preserving Record Linkage is an emerging field of research which aims to integrate data from heterogeneous data sources while respecting privacy. This task exhibits high computational complexity; therefore, Privacy Preserving Blocking has been introduced in order to improve performance by eliminating unrelated candidate pairs. In this paper we present a solution to this problem by introducing the Sorted Neighborhood for Encrypted Fields algorithm and combining it with a secure multidimensional privacy preserving blocking method. Our approach is applicable to all types of data fields and manages to significantly boost the Privacy Preserving Record Linkage process without sacrificing matching accuracy. We analytically prove that our method is secure and we also provide empirical evidence of its high performance by comparing it to other established methods.
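The classic sorted neighborhood idea that this work adapts can be sketched on plain (unencrypted) keys: records are sorted by a key and only records that fall within a sliding window of each other become candidate pairs. The sketch below is the generic technique, not the Sorted Neighborhood for Encrypted Fields algorithm itself; the function name `sorted_neighborhood_pairs`, its parameters and the sample records are assumptions made for illustration.

```python
def sorted_neighborhood_pairs(records, key, window):
    """Return candidate pairs of records that lie within `window`
    positions of each other after sorting by `key`."""
    if window < 2:
        raise ValueError("window must be at least 2")
    ordered = sorted(records, key=key)
    pairs = []
    for i, left in enumerate(ordered):
        # Pair each record with the next (window - 1) records in sort order.
        for right in ordered[i + 1:i + window]:
            pairs.append((left, right))
    return pairs


names = ["smith", "jones", "smyth", "johns", "brown"]
for pair in sorted_neighborhood_pairs(names, key=lambda r: r, window=2):
    print(pair)
```

With n records and window size w, this yields on the order of n·w candidate pairs instead of the n·(n−1)/2 pairs of an exhaustive comparison, which is the source of the performance gain that blocking aims for.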
international conference on distributed computing systems | 2002
Alexandros Karakasidis; Evaggelia Pitoura
In the near future, there will be increasingly powerful computers in smart cards, telephones, and other information appliances. This will create a massive infrastructure composed of highly diverse interconnected mobile entities. In this paper, we present a data-centric approach to storage and querying in such environments. At a first level, we view each entity as a miniature database; at a second level we maintain databases of metadata and services. We describe how information delivery and querying are performed in such architectures.
knowledge discovery and data mining | 2015
Alexandros Karakasidis; Georgia Koloniari; Vassilios S. Verykios
When dealing with sensitive and personal user data, the process of record linkage raises privacy issues. Thus, privacy preserving record linkage has emerged with the goal of identifying matching records across multiple data sources while preserving the privacy of the individuals they describe. The task is very resource demanding, considering the abundance of available data, which, in addition, are often dirty. Blocking techniques are deployed prior to matching to prune candidate records that are unlikely to match, so as to reduce processing time. However, when scaling to large datasets, such methods often result in quality loss. To this end, we propose Multi-Sampling Transitive Closure for Encrypted Fields (MS-TCEF), a novel privacy preserving blocking technique based on the use of reference sets. Our new method effectively prunes records based on redundant assignments to blocks, providing better fault-tolerance and maintaining result quality while scaling linearly with respect to the dataset size. We provide a theoretical analysis of the method's complexity and show how it outperforms state-of-the-art privacy preserving blocking techniques with respect to both recall and processing cost.
international conference on tools with artificial intelligence | 2012
Alexandros Karakasidis; Vassilios S. Verykios
Privacy Preserving Record Linkage is the process of securely integrating information without compromising the privacy of the individuals described by these data. While such an effort sounds appealing for both academic and business applications, it is complicated and computationally intensive. In this paper we aspire to provide a solution to this problem by presenting a highly secure multidimensional Privacy Preserving Blocking approach which is totally distributed and runs independently on each data holder, making it invulnerable to third party attacks. It is based on the idea of using publicly available corpora of data known as reference sets for creating k-anonymous clusters. We analytically prove that our method is secure and provide experimental results which evaluate the increased performance of our method in terms of matching accuracy and execution time.