Publication


Featured research published by Yücel Saygin.


international conference on management of data | 2004

State-of-the-art in privacy preserving data mining

Vassilios S. Verykios; Elisa Bertino; Igor Nai Fovino; Loredana Parasiliti Provenza; Yücel Saygin; Yannis Theodoridis

We provide an overview of the new and rapidly emerging research area of privacy preserving data mining. We also propose a classification hierarchy that sets the basis for analyzing the work that has been performed in this context. A detailed review of the work accomplished in this area is also given, along with the position of each work within the classification hierarchy. A brief evaluation is performed, and some initial conclusions are drawn.


IEEE Transactions on Knowledge and Data Engineering | 2004

Association rule hiding

Vassilios S. Verykios; Ahmed K. Elmagarmid; Elisa Bertino; Yücel Saygin; Elena Dasseni

Large repositories of data contain sensitive information that must be protected against unauthorized access. The protection of the confidentiality of this information has been a long-term goal for the database security research community and for the government statistical agencies. Recent advances in data mining and machine learning algorithms have increased the disclosure risks that one may encounter when releasing data to outside parties. A key problem that is still not sufficiently investigated is the need to balance the confidentiality of the disclosed data with the legitimate needs of the data users. Every disclosure limitation method affects, and in some way modifies, true data values and relationships. We investigate confidentiality issues of a broad category of rules, the association rules. In particular, we present three strategies and five algorithms for hiding a group of association rules that is characterized as sensitive. A rule is characterized as sensitive if its disclosure risk is above a certain privacy threshold. Sometimes, sensitive rules should not be disclosed to the public since, among other things, they may be used for inferring sensitive data, or they may provide business competitors with an advantage. We also perform an evaluation study of the hiding algorithms in order to analyze their time complexity and the impact that they have on the original database.
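
As a rough illustration of what hiding an association rule can mean, the sketch below computes the support and confidence of a rule over a toy transaction list and lowers the confidence of one sensitive rule by deleting a right-hand-side item from supporting transactions. It is a generic greedy illustration and not one of the five algorithms of the paper; the function names (`support`, `confidence`, `hide_rule`) and the victim-selection policy are assumptions made for this example.

```python
from typing import Iterable, List, Set


def support(transactions: List[Set[str]], itemset: Set[str]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    if not transactions:
        return 0.0
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(transactions: List[Set[str]], lhs: Set[str], rhs: Set[str]) -> float:
    """Confidence of the rule lhs -> rhs: support(lhs | rhs) / support(lhs)."""
    lhs_count = sum(1 for t in transactions if lhs <= t)
    if lhs_count == 0:
        return 0.0
    both_count = sum(1 for t in transactions if (lhs | rhs) <= t)
    return both_count / lhs_count


def hide_rule(transactions: Iterable[Set[str]], lhs: Set[str], rhs: Set[str],
              min_conf: float) -> List[Set[str]]:
    """Return a sanitized copy in which lhs -> rhs falls below `min_conf`.

    Hypothetical policy: remove one right-hand-side item from supporting
    transactions, one transaction at a time, until the confidence drops.
    """
    sanitized = [set(t) for t in transactions]  # never modify the input
    victim = sorted(rhs)[0]                     # arbitrary but deterministic
    for t in sanitized:
        if confidence(sanitized, lhs, rhs) < min_conf:
            break
        if (lhs | rhs) <= t:
            t.discard(victim)
    return sanitized


db = [{"a", "b"}, {"a", "b"}, {"a", "b"}, {"a"}, {"b"}]
released = hide_rule(db, {"a"}, {"b"}, min_conf=0.5)
print(confidence(db, {"a"}, {"b"}), confidence(released, {"a"}, {"b"}))
```

Removing only right-hand-side items leaves the support of the left-hand side untouched, which is one simple way to limit the side effects on the rest of the database.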


international conference on management of data | 2001

Using unknowns to prevent discovery of association rules

Yücel Saygin; Vassilios S. Verykios; Chris Clifton

Data mining technology has given us new capabilities to identify correlations in large data sets. This introduces risks when the data is to be made public, but the correlations are private. We introduce a method for selectively removing individual values from a database to prevent the discovery of a set of rules, while preserving the data for other applications. The efficacy and complexity of this method are discussed. We also present an experiment showing an example of this methodology.
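
One way to picture the effect of replacing values with unknowns: once some entries are blanked, the support of an itemset is no longer a single number but an interval between what is definitely present and what is possibly present. The sketch below is a minimal illustration under that assumption; the row encoding (1 = present, 0 = absent, `None` = unknown) and the names `support_interval` and `blank_until_uncertain` are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Row = Dict[str, Optional[int]]  # 1 = present, 0 = absent, None = unknown


def support_interval(rows: List[Row], itemset: Sequence[str]) -> Tuple[float, float]:
    """Return (minimum, maximum) support of `itemset`.

    Minimum counts rows where every item is definitely 1.
    Maximum also counts rows where the items are 1 or unknown.
    """
    if not rows:
        return (0.0, 0.0)
    definite = sum(1 for r in rows if all(r[i] == 1 for i in itemset))
    possible = sum(1 for r in rows if all(r[i] != 0 for i in itemset))
    return (definite / len(rows), possible / len(rows))


def blank_until_uncertain(rows: List[Row], itemset: Sequence[str],
                          victim: str, min_sup: float) -> List[Row]:
    """Blank `victim` in supporting rows until minimum support < `min_sup`."""
    out = [dict(r) for r in rows]  # work on a copy
    for r in out:
        if support_interval(out, itemset)[0] < min_sup:
            break
        if all(r[i] == 1 for i in itemset):
            r[victim] = None  # the value becomes unknown, not false
    return out


table = [
    {"a": 1, "b": 1},
    {"a": 1, "b": 1},
    {"a": 1, "b": 0},
    {"a": 0, "b": 1},
]
released = blank_until_uncertain(table, ["a", "b"], victim="b", min_sup=0.5)
print(support_interval(table, ["a", "b"]), support_interval(released, ["a", "b"]))
```

A miner that sees the released table can no longer tell whether the itemset meets the support threshold, while no false value has been written into the data.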


international workshop on research issues in data engineering | 2002

Privacy preserving association rule mining

Yücel Saygin; Vassilios S. Verykios; Ahmed K. Elmagarmid

The current trend in the application space towards systems of loosely coupled and dynamically bound components that enable just-in-time integration jeopardizes the security of information that is shared between the broker, the requester, and the provider at runtime. In particular, new advances in data mining and knowledge discovery that allow for the extraction of hidden knowledge from an enormous amount of data impose new threats on the seamless integration of information. We consider the problem of building privacy preserving algorithms for one category of data mining techniques, association rule mining. We introduce new metrics in order to demonstrate how security issues can be taken into consideration in the general framework of association rule mining, and we show that the complexity of the new heuristics is similar to that of the original algorithms.


advances in geographic information systems | 2008

Towards trajectory anonymization: a generalization-based approach

Mehmet Ercan Nergiz; Maurizio Atzori; Yücel Saygin

Trajectory datasets are becoming more and more popular due to the massive usage of GPS and other location-based devices and services. In this paper, we address privacy issues regarding the identification of individuals in static trajectory datasets. We provide privacy protection by defining trajectory k-anonymity, meaning that every released piece of information refers to at least k users/trajectories. We propose a novel generalization-based approach that applies to trajectories and sequences in general. We also suggest the use of a simple random reconstruction of the original dataset from the anonymization, to overcome possible drawbacks of generalization approaches. We present a utility metric that maximizes the probability of a good representation and propose trajectory anonymization techniques to address time and space sensitive applications. The experimental results over synthetic trajectory datasets show the effectiveness of the proposed approach.
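
The k-anonymity condition on trajectories can be checked mechanically once a generalization is fixed. The sketch below assumes a deliberately simple generalization, snapping each (x, y, t) point to a grid cell and a time slot, which is not the generalization method of the paper; the names `generalize` and `is_k_anonymous` and the parameters `cell` and `slot` are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, timestamp)


def generalize(traj: Sequence[Point], cell: float, slot: float) -> tuple:
    """Snap every point to a grid cell of width `cell` and a time slot of length `slot`."""
    return tuple((int(x // cell), int(y // cell), int(t // slot)) for x, y, t in traj)


def is_k_anonymous(trajs: List[Sequence[Point]], k: int, cell: float, slot: float) -> bool:
    """True if every generalized trajectory is shared by at least k input trajectories."""
    counts = Counter(generalize(tr, cell, slot) for tr in trajs)
    return all(c >= k for c in counts.values())


trajs = [
    [(1.0, 1.0, 0.0), (12.0, 3.0, 60.0)],
    [(2.5, 4.0, 10.0), (15.0, 1.0, 70.0)],
    [(40.0, 40.0, 0.0), (41.0, 42.0, 60.0)],
]
print(is_k_anonymous(trajs, k=2, cell=10.0, slot=100.0))      # third trajectory is alone
print(is_k_anonymous(trajs[:2], k=2, cell=10.0, slot=100.0))  # first two coincide
```

Coarser cells and slots make the condition easier to satisfy but reduce the utility of the released data, which is the trade-off a utility metric is meant to capture.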


pacific-asia conference on knowledge discovery and data mining | 2004

Secure association rule sharing

Stanley Robson de Medeiros Oliveira; Osmar R. Zaïane; Yücel Saygin

The sharing of association rules is often beneficial in industry, but requires privacy safeguards. One may decide to disclose only part of the knowledge and conceal strategic patterns, which we call restrictive rules. These restrictive rules must be protected before sharing since they are paramount for strategic decisions and need to remain private. To address this challenging problem, we propose a unified framework for protecting sensitive knowledge before sharing. This framework encompasses: (a) an algorithm that sanitizes restrictive rules while blocking some inference channels, which we validate against real and synthetic datasets; and (b) a set of metrics to evaluate attacks against sensitive knowledge and the impact of the sanitization. We also introduce a taxonomy of sanitizing algorithms and a taxonomy of attacks against sensitive knowledge.


data and knowledge engineering | 2007

Privacy preserving clustering on horizontally partitioned data

Ali Inan; Selim Volkan Kaya; Yücel Saygin; Erkay Savas; Ayça Azgin Hintoglu; Albert Levi

Data mining has been a popular research area for more than a decade due to its vast spectrum of applications. However, the popularity and wide availability of data mining tools have also raised concerns about the privacy of individuals. The aim of privacy preserving data mining researchers is to develop data mining techniques that can be applied to databases without violating the privacy of individuals. Privacy preserving techniques for various data mining models have been proposed, initially for classification on centralized data, then for association rules in distributed environments. In this work, we propose methods for constructing the dissimilarity matrix of objects from different sites in a privacy preserving manner. The matrix can be used for privacy preserving clustering as well as database joins, record linkage and other operations that require pair-wise comparison of individual private data objects horizontally distributed across multiple sites. We evaluate the communication and computation costs of our protocol through experiments on synthetically generated and real datasets. Each experiment is also performed with a baseline protocol that has no privacy protection, so that comparing the two shows the overhead introduced by security and privacy.
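
For orientation, the sketch below builds the object that such a protocol computes, the pairwise dissimilarity matrix over records held at two sites, in the simplest possible way: by pooling the records in one place. This corresponds only to the idea of a baseline with no privacy protection and does not implement the privacy preserving protocol; the Euclidean distance, the two-site setup, and the name `dissimilarity_matrix` are assumptions made for the example.

```python
import math
from typing import List, Sequence


def dissimilarity_matrix(site_a: Sequence[Sequence[float]],
                         site_b: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise Euclidean distances over the union of both sites' records.

    Horizontal partitioning means both sites hold the same attributes
    for different objects, so records can be compared directly.
    NOTE: this pools raw records and therefore offers no privacy.
    """
    records = list(site_a) + list(site_b)
    n = len(records)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(records[i], records[j])  # Python 3.8+
            matrix[i][j] = matrix[j][i] = d        # the matrix is symmetric
    return matrix


site_a = [(0.0, 0.0), (3.0, 4.0)]
site_b = [(0.0, 8.0)]
for row in dissimilarity_matrix(site_a, site_b):
    print(row)
```

A distance-based clustering algorithm needs only this matrix, not the raw records, which is why computing it privately is sufficient for the clustering step.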


Archive | 2006

Computer and information sciences – ISCIS 2006

Albert Levi; Erkay Savas; Hüsnü Yenigün; Selim Balcisoy; Yücel Saygin

This book constitutes the refereed proceedings of the 21st International Symposium on Computer and Information Sciences, ISCIS 2006, held in Istanbul, Turkey in October 2006. The 106 revised full papers presented together with 5 invited lectures were carefully reviewed and selected from 606 submissions. The papers are organized in topical sections on algorithms and theory, bioinformatics, computational intelligence, computer architecture, computer graphics, computer networks, computer vision, data mining, databases, embedded systems, information retrieval, mobile computing, parallel and distributed computing, performance evaluation, security and cryptography, as well as software engineering.


IEEE Transactions on Knowledge and Data Engineering | 2002

Exploiting data mining techniques for broadcasting data in mobile computing environments

Yücel Saygin; Özgür Ulusoy

Mobile computers can be equipped with wireless communication devices that enable users to access data services from any location. In wireless communication, the server-to-client (downlink) communication bandwidth is much higher than the client-to-server (uplink) communication bandwidth. This asymmetry makes the dissemination of data to client machines a desirable approach. However, dissemination of data by broadcasting may induce high access latency in case the number of broadcast data items is large. We propose two methods aiming to reduce client access latency of broadcast data. Our methods are based on analyzing the broadcast history (i.e., the chronological sequence of items that have been requested by clients) using data mining techniques. With the first method, the data items in the broadcast disk are organized in such a way that the items requested subsequently are placed close to each other. The second method focuses on improving the cache hit ratio to be able to decrease the access latency. It enables clients to prefetch the data from the broadcast disk based on the rules extracted from previous data request patterns. The proposed methods are implemented on a Web log to estimate their effectiveness. It is shown through performance experiments that the proposed rule-based methods are effective in improving the system performance in terms of the average latency as well as the cache hit ratio of mobile clients.
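
The prefetching idea can be sketched with very simple rules of the form "item a is often followed by item b", mined from the request history. This is a minimal illustration under that assumption, using only consecutive pairs and a confidence threshold; it is not the rule mining procedure of the paper, and the names `mine_successor_rules` and `prefetch_candidates` are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence, Tuple


def mine_successor_rules(history: Sequence[str],
                         min_conf: float) -> Dict[str, List[Tuple[str, float]]]:
    """Mine rules a -> b from consecutive requests in `history`.

    Confidence of a -> b is (times b directly follows a) / (times a is
    followed by anything). Rules below `min_conf` are dropped.
    """
    pair_counts = Counter(zip(history, history[1:]))
    first_counts = Counter(history[:-1])
    rules: Dict[str, List[Tuple[str, float]]] = defaultdict(list)
    for (a, b), count in pair_counts.items():
        conf = count / first_counts[a]
        if conf >= min_conf:
            rules[a].append((b, conf))
    for a in rules:
        rules[a].sort(key=lambda pair: -pair[1])  # strongest rule first
    return dict(rules)


def prefetch_candidates(rules: Dict[str, List[Tuple[str, float]]],
                        item: str, limit: int = 1) -> List[str]:
    """Items a client could prefetch right after requesting `item`."""
    return [b for b, _ in rules.get(item, [])[:limit]]


history = ["a", "b", "a", "b", "a", "c", "d"]
rules = mine_successor_rules(history, min_conf=0.6)
print(prefetch_candidates(rules, "a"))
```

The same rules could also guide the first method described above, since items linked by a strong rule are natural candidates for being placed close to each other in the broadcast.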


international conference of the ieee engineering in medicine and biology society | 2012

Anonymization of Longitudinal Electronic Medical Records

Acar Tamersoy; Grigorios Loukides; Mehmet Ercan Nergiz; Yücel Saygin; Bradley Malin

Electronic medical record (EMR) systems have enabled healthcare providers to collect detailed patient information from the primary care domain. At the same time, longitudinal data from EMRs are increasingly combined with biorepositories to generate personalized clinical decision support protocols. Emerging policies encourage investigators to disseminate such data in a deidentified form for reuse and collaboration, but organizations are hesitant to do so because they fear such actions will jeopardize patient privacy. In particular, there are concerns that residual demographic and clinical features could be exploited for reidentification purposes. Various approaches have been developed to anonymize clinical data, but they neglect temporal information and are, thus, insufficient for emerging biomedical research paradigms. This paper proposes a novel approach to share patient-specific longitudinal data that offers robust privacy guarantees, while preserving data utility for many biomedical investigations. Our approach aggregates temporal and diagnostic information using heuristics inspired by sequence alignment and clustering methods. Using several patient cohorts derived from the EMR system of the Vanderbilt University Medical Center, we demonstrate that the proposed approach can generate anonymized data that permit effective biomedical analysis.
