Ali Şaman Tosun

University of Texas at San Antonio

Publication

Featured research published by Ali Şaman Tosun.

conference on information and knowledge management | 2003

High dimensional reverse nearest neighbor queries

Amit Singh; Hakan Ferhatosmanoglu; Ali Şaman Tosun

Reverse Nearest Neighbor (RNN) queries are of particular interest in a wide range of applications such as decision support systems, profile-based marketing, data streaming, document databases, and bioinformatics. Earlier approaches to this problem mostly deal with two-dimensional data. However, most of these applications inherently involve high dimensions, and the high-dimensional RNN problem is still unexplored. In this paper, we propose an approximate solution to answer RNN queries in high dimensions. Our approach is based on the strong correlation in practice between k-NN and RNN. It works in two phases. In the first phase, the k-NN of a query point are found, and in the next phase they are further analyzed using a novel type of query, the Boolean Range Query (BRQ). Experimental results show that BRQ is much more efficient than both NN and range queries, and can be effectively used to answer RNN queries. Performance is further improved by running multiple BRQs simultaneously. The proposed approach can also be used to answer other variants of RNN queries such as RNN of order k, bichromatic RNN, and the Matching Query, which has many applications of its own. Our technique can efficiently answer NN, RNN, and its variants with approximately the same number of I/Os as running an NN query.
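The two-phase idea in this abstract can be illustrated with a minimal brute-force sketch. It is not the paper's index-based method: the function names (`knn`, `boolean_range_query`, `approx_rnn`) are hypothetical, the data is scanned linearly instead of through an index, and the Boolean Range Query is assumed here to mean "does any other data point lie strictly inside this ball?".

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def knn(points: Sequence[Point], q: Point, k: int) -> List[int]:
    """Phase 1: indices of the k data points nearest to q (linear scan)."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], q))
    return order[:k]


def boolean_range_query(points: Sequence[Point], center: Point,
                        radius: float, exclude: int) -> bool:
    """Assumed BRQ semantics: True if some data point other than `exclude`
    lies strictly within `radius` of `center`."""
    return any(i != exclude and math.dist(p, center) < radius
               for i, p in enumerate(points))


def approx_rnn(points: Sequence[Point], q: Point, k: int) -> List[int]:
    """Phase 2: keep a k-NN candidate only if no data point is closer to it
    than q is, i.e. q would be its nearest neighbor."""
    result = []
    for i in knn(points, q, k):
        candidate = points[i]
        if not boolean_range_query(points, candidate,
                                   math.dist(candidate, q), exclude=i):
            result.append(i)
    return result


points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.5, 0.0)]
print(approx_rnn(points, (0.4, 0.0), k=3))
```

The result is approximate because a true reverse nearest neighbor that is not among the k nearest neighbors of the query is never examined; a larger k trades extra work for fewer misses.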

distributed computing in sensor systems | 2007

Data Salmon: a greedy mobile basestation protocol for efficient data collection in wireless sensor networks

Murat Demirbas; Onur Soysal; Ali Şaman Tosun

Our work addresses the spatiotemporally varying nature of data traffic in environmental monitoring and surveillance applications. By employing a network-controlled mobile basestation (MB), we present a simple energy-efficient data collection protocol for wireless sensor networks (WSNs). In contrast to the existing MB-based solutions where WSN nodes buffer data passively until visited by an MB, our protocol maintains an always-on multihop connectivity to the MB by means of an efficient distributed tracking mechanism. This allows the nodes to forward their data in a timely fashion, avoiding latencies due to long-term buffering. Our protocol progressively relocates the MB closer to the regions that produce higher data rates and reduces the average weighted multihop traffic, enabling energy savings. Using the convexity of the cost function, we prove that our local and greedy protocol is in fact optimal.
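The greedy relocation step can be sketched in a centralized form. This is only an illustration under stated assumptions, not the distributed protocol from the paper: the network is a connected undirected graph given as an adjacency dictionary, each node has a known data rate, the cost is the rate-weighted hop count to the basestation, and the names `hop_counts`, `weighted_cost`, and `greedy_relocate` are hypothetical.

```python
from collections import deque
from typing import Dict, List


def hop_counts(adj: Dict[int, List[int]], src: int) -> Dict[int, int]:
    """Breadth-first search: hop distance from src to every reachable node."""
    hops = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    return hops


def weighted_cost(adj: Dict[int, List[int]], rates: Dict[int, float],
                  pos: int) -> float:
    """Assumed cost model: sum over nodes of data rate times hops to pos."""
    return sum(rates[v] * h for v, h in hop_counts(adj, pos).items())


def greedy_relocate(adj: Dict[int, List[int]], rates: Dict[int, float],
                    start: int) -> int:
    """Move the basestation to the cheapest neighbor until no neighbor
    strictly lowers the weighted multihop cost."""
    pos = start
    while True:
        best = min(adj[pos], key=lambda v: weighted_cost(adj, rates, v))
        if weighted_cost(adj, rates, best) >= weighted_cost(adj, rates, pos):
            return pos
        pos = best


# A five-node line network where node 4 produces most of the traffic.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
rates = {0: 1, 1: 1, 2: 1, 3: 1, 4: 10}
print(greedy_relocate(line, rates, start=0))
```

On this line topology the cost decreases at every step toward node 4, so the local moves end at the global minimum; on arbitrary graphs this simplified version carries no such guarantee.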

symposium on principles of database systems | 2004

Replicated declustering of spatial data

Hakan Ferhatosmanoǧlu; Ali Şaman Tosun

The problem of disk declustering is to distribute data among multiple disks to reduce query response times through parallel I/O. A strictly optimal declustering technique is one that achieves optimal parallel I/O for all possible queries. In this paper, we focus on techniques that are optimized for spatial range queries. Current declustering techniques, which have single copies of the data, have been shown to be suboptimal for range queries. The lower bound on extra disk accesses is proved to be Ω(log N) for N disks even in the restricted case of an N-by-N grid, and all current approaches have been trying to achieve this bound. Replication is a well-known and effective solution for several problems in databases, especially for availability and load balancing. In this paper, we explore the idea of replication in the context of declustering and propose a framework where strictly optimal parallel I/O is achievable using a small amount of replication. We provide some theoretical foundations for replicated declustering, e.g., a bound on the number of copies for strict optimality on any number of disks, and propose a class of replicated declustering schemes, periodic allocations, which are shown to be strictly optimal. The results for optimal disk allocation are extended to a larger number of disks by increasing replication. Our techniques and results are valid for arbitrary a-by-b grids, and any declustering scheme can be further improved using our replication framework. Using the framework, we perform experiments to identify a strictly optimal disk access schedule for any given arbitrary range query. In addition to the theoretical bounds, we compare the proposed replication-based scheme to other existing techniques by performing experiments on real datasets.
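To make "strictly optimal" concrete, the following sketch measures the worst-case additive error of a single-copy allocation on an N-by-N grid by enumerating every range query. It is an illustration, not code from the paper: `worst_additive_error` is a hypothetical helper, the two allocation functions are example choices, and the optimum for a query of b buckets is taken to be ⌈b/N⌉ parallel accesses.

```python
from typing import Callable


def worst_additive_error(n: int, alloc: Callable[[int, int], int]) -> int:
    """Largest gap, over all rectangular range queries on an n-by-n grid,
    between the busiest disk's bucket count and the optimum ceil(b / n)."""
    worst = 0
    for r0 in range(n):
        for c0 in range(n):
            for r1 in range(r0, n):
                for c1 in range(c0, n):
                    counts = [0] * n
                    for i in range(r0, r1 + 1):
                        for j in range(c0, c1 + 1):
                            counts[alloc(i, j)] += 1
                    buckets = (r1 - r0 + 1) * (c1 - c0 + 1)
                    optimum = -(-buckets // n)  # integer ceiling
                    worst = max(worst, max(counts) - optimum)
    return worst


# Example single-copy allocations (assumed for illustration).
print(worst_additive_error(5, lambda i, j: (i + 2 * j) % 5))
print(worst_additive_error(4, lambda i, j: (i + j) % 4))
```

An error of zero means every range query is answered in the optimal number of parallel accesses, which is the property the abstract calls strict optimality; a positive value is the extra cost that replication aims to remove.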

acm symposium on applied computing | 2004

Replicated declustering for arbitrary queries

Ali Şaman Tosun

Declustering has attracted a lot of interest over the last couple of years. Recently, declustering using replication was proposed to reduce the additive overhead of declustering. Most of the work on declustering focuses on spatial range queries. However, in many scenarios including multi-user environments, query shapes can be arbitrary. In this paper, we explore replicated declustering for arbitrary queries. Replication reduces the cost of arbitrary queries to manageable levels. First, we investigate theoretically what is possible using replication for arbitrary queries. Then, we propose a 2-copy replication strategy that achieves the theoretical limit and therefore is the best possible scheme. Using the proposed scheme, an arbitrary query containing b buckets requires disk accesses bounded by [√b]. This is a significant improvement, especially for small queries, because with a single copy b buckets require min(b, N) disk accesses in the worst case. The proposed scheme works for nonuniform data as well as uniform data. Finally, we extend the proposed scheme to a partial replication scheme to achieve the best performance using limited replication.

Distributed and Parallel Databases | 2006

Efficient parallel processing of range queries through replicated declustering

Hakan Ferhatosmanoglu; Ali Şaman Tosun; Guadalupe Canahuate

A common technique used to minimize I/O in data-intensive applications is data declustering over parallel servers. This technique involves distributing data among several disks so as to parallelize query retrieval and thus improve performance. We focus on optimizing access to large spatial data, and the most common type of queries on such data, i.e., range queries. An optimal declustering scheme is one in which the processing for all range queries is balanced uniformly among the available disks. It has been shown that single-copy declustering schemes are non-optimal for range queries. In this paper, we integrate replication with parallel disk declustering for efficient processing of range queries. We note that replication is widely used in database applications for purposes such as load balancing, fault tolerance, and availability of data. We provide theoretical foundations for replicated declustering and propose a class of replicated declustering schemes, periodic allocations, which are shown to be strictly optimal for a number of disks. We propose a framework for replicated declustering that uses a limited amount of replication, and provide extensions to apply it to real data, which include arbitrary grids and a large number of disks. Our framework also provides an effective indexing scheme that enables fast identification of data of interest in parallel servers. In addition to optimal processing of single queries, we show that this framework is effective for parallel processing of multiple queries. We present experimental results comparing the proposed replication scheme to other techniques for both single queries and multiple queries, on synthetic and real data sets.

Information Sciences | 2007

Threshold-based declustering

Ali Şaman Tosun

Declustering techniques reduce query response time through parallel I/O by distributing data among multiple devices. Except for a few cases it is not possible to find declustering schemes that are optimal for all spatial range queries. As a result of this, most of the research on declustering has focused on finding schemes with low worst case additive error. However, additive error based schemes have many limitations including lack of progressive guarantees and existence of small non-optimal queries. In this paper, we take a different approach and propose threshold-based declustering. We investigate the threshold k such that all spatial range queries with =

database and expert systems applications | 2005

Threshold based declustering in high dimensions

Ali Şaman Tosun

Declustering techniques reduce query response times through parallel I/O by distributing data among multiple devices. Except for a few cases it is not possible to find declustering schemes that are optimal for all spatial range queries. As a result, most of the research on declustering has focused on finding schemes with low worst-case additive error. Recently, constrained declustering, which maximizes the threshold k such that all spatial range queries with ≤ k buckets are optimal, was proposed. In this paper, we extend constrained declustering to high dimensions. We investigate high-dimensional bound diagrams that are used to provide an upper bound on the threshold and propose a method to find good threshold-based declustering schemes in high dimensions. We show that using replicated declustering with threshold N, low worst-case additive error can be achieved for many values of N. In addition, we propose a framework to find thresholds in replicated declustering.

ACM Transactions on Storage | 2013

Generalized Optimal Response Time Retrieval of Replicated Data from Storage Arrays

Nihat Altiparmak; Ali Şaman Tosun

Declustering techniques reduce query response times through parallel I/O by distributing data among parallel disks. Recently, replication-based approaches were proposed to further reduce the response time. Efficient retrieval of replicated data from multiple disks is a challenging problem. Existing retrieval techniques are designed for storage arrays with identical disks, having no initial load or network delay. In this article, we consider the generalized retrieval problem of replicated data where the disks in the system might be heterogeneous, the disks may have initial load, and the storage arrays might be located on different sites. We first formulate the generalized retrieval problem using a Linear Programming (LP) model and solve it with mixed integer programming techniques. Next, the generalized retrieval problem is formulated as a more efficient maximum flow problem. We prove that the retrieval schedule returned by the maximum flow technique yields the optimal response time and that this result matches the LP solution. We also propose a low-complexity online algorithm for the generalized retrieval problem that does not guarantee the optimality of the result. The performance of the proposed and state-of-the-art retrieval strategies is investigated using various replication schemes, query types, query loads, disk specifications, network delays, and initial loads.
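The flow-based retrieval idea can be sketched as a feasibility test inside a search over candidate response times. This is a simplified stand-in, not the article's algorithm: it swaps in augmenting-path bipartite matching (a special case of max flow) for the general formulation, and it assumes a hypothetical cost model in which disk d finishes k buckets at time `offset[d] + k * cost[d]`, where `offset` lumps together initial load and network delay. All names are illustrative.

```python
from typing import List, Sequence


def feasible(replicas: Sequence[Sequence[int]], capacity: Sequence[int]) -> bool:
    """True if every bucket can be read from one of its replica disks without
    any disk exceeding its capacity (augmenting-path matching)."""
    assigned: List[List[int]] = [[] for _ in capacity]

    def place(bucket: int, seen: set) -> bool:
        for disk in replicas[bucket]:
            if disk in seen:
                continue
            seen.add(disk)
            if len(assigned[disk]) < capacity[disk]:
                assigned[disk].append(bucket)
                return True
            # Disk is full: try to reroute one of its buckets elsewhere.
            for idx, other in enumerate(assigned[disk]):
                if place(other, seen):
                    assigned[disk][idx] = bucket
                    return True
        return False

    return all(place(b, set()) for b in range(len(replicas)))


def min_response_time(replicas: Sequence[Sequence[int]],
                      cost: Sequence[int], offset: Sequence[int]) -> int:
    """Smallest time T at which all buckets can be retrieved, assuming disk d
    can serve floor((T - offset[d]) / cost[d]) buckets by time T."""
    b, disks = len(replicas), range(len(cost))
    if b == 0:
        return 0
    # Only completion times of the form offset + k * cost can be optimal.
    times = sorted({offset[d] + k * cost[d] for d in disks for k in range(1, b + 1)})
    lo, hi = 0, len(times) - 1
    while lo < hi:  # feasibility is monotone in T, so binary search applies
        mid = (lo + hi) // 2
        caps = [max(0, (times[mid] - offset[d]) // cost[d]) for d in disks]
        if feasible(replicas, caps):
            hi = mid
        else:
            lo = mid + 1
    return times[lo]


# Four buckets, each stored on two of three disks (made-up example data).
query = [[0, 1], [0, 1], [0, 2], [1, 2]]
print(min_response_time(query, cost=[1, 1, 1], offset=[0, 0, 0]))
print(min_response_time(query, cost=[1, 5, 5], offset=[0, 0, 0]))
```

With identical disks the search reduces to balancing bucket counts; with heterogeneous costs the fast disk absorbs more buckets, and the response time is set by the one bucket that has no replica on it.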

international performance computing and communications conference | 2009

Low cost indoor location management system using infrared leds and Wii Remote Controller

Baris Tas; Nihat Altiparmak; Ali Şaman Tosun

Many applications in wireless sensor networks can benefit from position information. However, existing accurate solutions for indoor environments are costly. RF-based approaches are not suitable for some indoor environments such as factory floors, where heavy machinery can cause interference. In this paper, we propose a low-cost and simple location management system using the Wii remote controller and infrared LEDs. The proposed solution is motivated by the need to find the location of a mobile robot used for data collection in a wireless sensor network. In the proposed scheme, the Wii remote controller is placed on the mobile robot pointing upward and several IR LEDs are placed on the ceiling. The proposed scheme uses resources efficiently and can cover a large area using a single Wii remote controller and multiple IR LEDs. It is easy to implement and requires minimal bandwidth for location management.

Distributed and Parallel Databases | 2006

Efficient retrieval of replicated data

Ali Şaman Tosun

Declustering is a common technique used to reduce query response times. Data is declustered over multiple disks and query retrieval can be parallelized. Most of the research on declustering is targeted at spatial range queries and investigates schemes with low additive error. Recently, declustering using replication has been proposed to reduce the additive overhead. Replication significantly reduces retrieval cost of arbitrary queries. In this paper, we propose a disk allocation and retrieval mechanism for arbitrary queries based on design theory. Using the proposed c-copy replicated declustering scheme,
