Ozgur D. Sahin

University of California, Santa Barbara

Publication

Featured research published by Ozgur D. Sahin.

acm ifip usenix international conference on middleware | 2004

Meghdoot: content-based publish/subscribe over P2P networks

Abhishek Gupta; Ozgur D. Sahin; Divyakant Agrawal; Amr El Abbadi

Publish/Subscribe systems have become a prevalent model for delivering data from producers (publishers) to consumers (subscribers) distributed across wide-area networks while decoupling the publishers and the subscribers from each other. In this paper we present Meghdoot, which adapts content-based publish/subscribe systems to Distributed Hash Table based P2P networks in order to provide scalable content delivery mechanisms while maintaining the decoupling between the publishers and the subscribers. Meghdoot is designed to adapt to highly skewed data sets, which are typical of real applications. The experimental results demonstrate that Meghdoot balances the load among the peers and that the design scales well with an increasing number of peers, subscriptions and events.

international conference on data engineering | 2004

A peer-to-peer framework for caching range queries

Ozgur D. Sahin; Abhishek Gupta; Divyakant Agrawal; A. El Abbadi

Peer-to-peer systems are mainly used for object sharing, although they can provide the infrastructure for many other applications. We extend the idea of object sharing to data sharing on a peer-to-peer system. We propose a method, based on the multidimensional CAN system, for efficiently evaluating range queries. The answers to the range queries are cached at the peers and are used to answer future range queries. The scalability and efficiency of our design are shown through simulation.

international conference on web services | 2004

A peer-to-peer framework for Web service discovery with ranking

Fatih Emekci; Ozgur D. Sahin; Divyakant Agrawal; Amr El Abbadi

Current Web service discovery methods are based on centralized approaches where Web services are identified based on service functionality. Examples of service functionality include car rental, hotel booking and book selling. Since higher level Web services are increasingly composed in terms of lower level Web services, it is important that service discovery not only be based on service functionality but also be based on process behavior, i.e., how a service functionality is served. Furthermore, centralized approaches to service discovery suffer from problems such as high operational and maintenance cost, single point of failure, and scalability. Another issue that has not been considered in current Web service discovery paradigms is the issue of trust and quality of service of the service provider. We, therefore, propose a structured peer-to-peer framework for Web service discovery in which Web services are located based on both service functionality and process behavior. In addition, we integrate a scalable reputation model in this distributed peer-to-peer framework to rank Web services based on both trust and service quality.

data and knowledge engineering | 2007

Privacy preserving decision tree learning over multiple parties

Fatih Emekci; Ozgur D. Sahin; Divyakant Agrawal; A. El Abbadi

Data mining over multiple data sources has emerged as an important practical problem with applications in different areas such as data streams, data warehouses, and bioinformatics. Although the data sources are willing to run data mining algorithms in these cases, they do not want to reveal any extra information about their data to other sources due to legal or competition concerns. One possible solution to this problem is to use cryptographic methods. However, the computation and communication complexity of such solutions renders them impractical when a large number of data sources are involved. In this paper, we consider a scenario where multiple data sources are willing to run data mining algorithms over the union of their data as long as each data source is guaranteed that its information that does not pertain to another data source will not be revealed. We focus on the classification problem in particular and present an efficient algorithm for building a decision tree over an arbitrary number of distributed sources in a privacy preserving manner using the ID3 algorithm.
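
As a rough illustration of why ID3 suits this setting: the information gain of a candidate split depends only on class counts, so counts gathered from several sources can be summed before the gain is computed. The sketch below shows only that arithmetic. The function names and the toy data are hypothetical, and the privacy-preserving aggregation that the abstract refers to is not shown; the plain summation in `merge_sources` exposes each source's counts and is a stand-in only.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def entropy(class_counts: Counter) -> float:
    """Shannon entropy (in bits) of a class-count histogram."""
    total = sum(class_counts.values())
    if total == 0:
        return 0.0
    return -sum(
        (c / total) * math.log2(c / total)
        for c in class_counts.values()
        if c > 0
    )


def merge_sources(sources: Iterable[Dict[str, Counter]]) -> Dict[str, Counter]:
    """Sum per-attribute-value class counts across sources.

    NOTE: plain summation, NOT privacy preserving. It stands in for the
    secure aggregation step, which this sketch does not implement.
    """
    merged: Dict[str, Counter] = {}
    for source in sources:
        for value, counts in source.items():
            merged.setdefault(value, Counter()).update(counts)
    return merged


def information_gain(split_counts: Dict[str, Counter]) -> float:
    """ID3 information gain for one attribute.

    split_counts maps each attribute value to the class counts of the
    records that take that value.
    """
    overall: Counter = Counter()
    for counts in split_counts.values():
        overall.update(counts)
    total = sum(overall.values())
    if total == 0:
        return 0.0
    remainder = sum(
        (sum(counts.values()) / total) * entropy(counts)
        for counts in split_counts.values()
    )
    return entropy(overall) - remainder


# Hypothetical toy data: two sources reporting counts for one attribute.
SOURCE_1 = {"a": Counter(yes=1), "b": Counter(no=1)}
SOURCE_2 = {"a": Counter(yes=1), "b": Counter(no=1)}
```

Calling `information_gain(merge_sources([SOURCE_1, SOURCE_2]))` evaluates the split over the union of both sources without either one shipping individual records.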

international conference on service oriented computing | 2005

SPiDeR: P2P-based web service discovery

Ozgur D. Sahin; Cagdas Evren Gerede; Divyakant Agrawal; Amr El Abbadi; Oscar H. Ibarra; Jianwen Su

In this paper, we describe SPiDeR, a peer-to-peer (P2P) based framework that supports a variety of Web service discovery operations. SPiDeR organizes the service providers into a structured P2P overlay and allows them to advertise and look up services in a completely decentralized and dynamic manner. It supports three different kinds of search operations: for advertising and locating services, service providers can use keywords extracted from service descriptions (keyword-based search), categories from a global ontology (ontology-based search), and/or paths from the service automaton (behavior-based search). The users can also rate the quality of the services they use. The ratings are accumulated within the system so that users can query for the quality ratings of the discovered services. Finally, we present the performance of SPiDeR in terms of routing using a simulator.

databases information systems and peer to peer computing | 2004

Content-based similarity search over peer-to-peer systems

Ozgur D. Sahin; Fatih Emekci; Divyakant Agrawal; Amr El Abbadi

Peer-to-peer applications are used to share large volumes of data. An important requirement of these systems is efficient methods for locating the data of interest in a large collection of data. Unfortunately, current peer-to-peer systems either offer exact keyword match functionality or provide inefficient text search methods through centralized indexing or flooding. In this paper we propose a method based on popular Information Retrieval techniques to facilitate content-based searches in peer-to-peer systems. A simulation of the proposed design was implemented and its performance was evaluated using some commonly used test collections, including Ohsumed, which was used for the TREC-9 Filtering Track. The experiments demonstrate that our approach is scalable as it achieves high recall by visiting only a small subset of the peers.

acm multimedia | 2005

PRISM: indexing multi-dimensional data in P2P networks using reference vectors

Ozgur D. Sahin; Aziz Gulbeden; Fatih Emekci; Divyakant Agrawal; A. El Abbadi

Peer-to-peer (P2P) systems research has gained considerable attention recently with the increasing popularity of file sharing applications. Since these applications are used for sharing huge amounts of data, it is very important to efficiently locate the data of interest in such systems. However, these systems usually do not provide efficient search techniques. Existing systems offer only keyword search functionality through a centralized index or by query flooding. In this paper, we propose a scheme based on reference vectors for sharing multi-dimensional data in P2P systems. This scheme effectively supports a larger set of query operations (such as k-NN queries and content-based similarity search) than current systems, which generally support only exact key lookups and keyword searches. The basic idea is to store multiple replicas of an object's index at different peers based on the distances between the object's feature vector and the reference vectors. Later, when a query is posed, the system identifies the peers that are likely to store the index information about relevant objects using reference vectors. Thus the system is able to return accurate results by contacting a small fraction of the participating peers.
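
The placement idea can be sketched in a few lines. The version below is a deliberately simplified toy, not the placement rule from the paper: it assumes one peer per reference vector, Euclidean distance, and replication to the `replicas` nearest reference vectors. The class name `ToyIndex` and all parameters are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Set

Vector = Sequence[float]


def nearest_refs(vec: Vector, refs: Sequence[Vector], k: int) -> List[int]:
    """Indices of the k reference vectors closest to vec (Euclidean distance)."""
    order = sorted(range(len(refs)), key=lambda i: math.dist(vec, refs[i]))
    return order[:k]


class ToyIndex:
    """Toy index placement: peer i is responsible for reference vector i.

    Assumption for illustration only; a real system would map reference
    information onto an overlay network rather than onto a local dict.
    """

    def __init__(self, refs: Sequence[Vector], replicas: int = 2) -> None:
        self.refs = list(refs)
        self.replicas = replicas
        self.store: Dict[int, Set[str]] = defaultdict(set)

    def publish(self, obj_id: str, feature: Vector) -> None:
        """Replicate the object's index entry at its nearest reference peers."""
        for ref in nearest_refs(feature, self.refs, self.replicas):
            self.store[ref].add(obj_id)

    def candidates(self, query: Vector, probes: int = 1) -> Set[str]:
        """Contact only the peers whose reference vectors are closest to the query."""
        found: Set[str] = set()
        for ref in nearest_refs(query, self.refs, probes):
            found |= self.store[ref]
        return found


# Hypothetical 2-D example with four reference vectors.
REFS = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
```

Because objects and queries are both ranked against the same reference vectors, a query near an object tends to probe a peer that holds that object's index entry, which is what lets the lookup touch only a fraction of the peers.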

web information systems engineering | 2005

PRoBe: multi-dimensional range queries in p2p networks

Ozgur D. Sahin; Shyam Antony; Divyakant Agrawal; A. El Abbadi

Structured P2P systems are effective for exact key searches in a distributed environment as they offer scalability, self-organization, and dynamicity. These valuable properties also make them a candidate for more complex queries, such as range queries. In this paper, we describe PRoBe, a system that supports range queries over multiple attributes in P2P networks. PRoBe uses a multi-dimensional logical space for this purpose and maps data items onto this space based on their attribute values. The logical space is divided into hyper-rectangles, each maintained by a peer in the system. The range queries correspond to hyper-rectangles which are answered by forwarding the query to the peers responsible for overlapping regions of the logical space. We also propose load balancing techniques and show how cached query answers can be utilized for the efficient evaluation of similar range queries. The performance of PRoBe and the effects of various parameters are analyzed through a simulation study.
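
The core geometric test, deciding which peers' hyper-rectangles overlap a range query, can be sketched as follows. This is a minimal illustration under stated assumptions: the `Zone` type, the closed-interval convention, and the linear scan over all zones are hypothetical simplifications. In a P2P deployment the query would be routed through the overlay rather than checked against a global zone list.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # closed interval [low, high] on one attribute


@dataclass(frozen=True)
class Zone:
    """A hyper-rectangle of the logical space maintained by one peer."""
    peer_id: str
    bounds: Tuple[Interval, ...]


def overlaps(a: Sequence[Interval], b: Sequence[Interval]) -> bool:
    """Two hyper-rectangles overlap iff their intervals overlap in every dimension."""
    return all(
        lo_a <= hi_b and lo_b <= hi_a
        for (lo_a, hi_a), (lo_b, hi_b) in zip(a, b)
    )


def peers_for_query(zones: List[Zone], query: Sequence[Interval]) -> List[str]:
    """Peers whose zones intersect the query hyper-rectangle."""
    return [zone.peer_id for zone in zones if overlaps(zone.bounds, query)]


# Hypothetical example: the unit square split into four equal zones.
ZONES = [
    Zone("A", ((0.0, 0.5), (0.0, 0.5))),
    Zone("B", ((0.5, 1.0), (0.0, 0.5))),
    Zone("C", ((0.0, 0.5), (0.5, 1.0))),
    Zone("D", ((0.5, 1.0), (0.5, 1.0))),
]
```

For example, `peers_for_query(ZONES, [(0.4, 0.6), (0.1, 0.2)])` selects the two lower zones, since the query straddles their shared boundary on the first attribute.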

international conference on peer-to-peer computing | 2005

Techniques for efficient routing and load balancing in content-addressable networks

Ozgur D. Sahin; Divyakant Agrawal; A. El Abbadi

As a distributed hash table (DHT), a content addressable network (CAN) provides efficient routing and object location in a decentralized manner while offering fault tolerance and dynamic peer operations. However, as opposed to other DHTs that use a flat ID space, CAN uses a multi-dimensional logical space. DHTs usually require $O(\log N)$ routing information per peer and provide routing in $O(\log N)$ hops, where $N$ is the number of peers in the system. In CAN, on the other hand, each peer keeps only a constant amount of routing information and routing takes $O(dN^{1/d})$ hops, where $d$ is the dimensionality of the logical space. Hence the routing performance of CAN is worse than that of other DHTs, especially when $d$ is small. In this paper, we describe and evaluate several schemes for efficient routing in CAN by keeping additional routing information at the peers. Furthermore, due to the underlying multidimensional ID space, CAN is used by applications that require content-based mapping of data objects onto the ID space. Since uniform hashing is not used, such mappings introduce skewed object distributions among the peers. Thus we also describe load balancing schemes for CAN and investigate their efficiency.
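
To get a feel for the gap between the two hop-count bounds, the short sketch below evaluates log N and d·N^(1/d) for a hypothetical network size. The base-2 logarithm, the constant factors of 1, and the function names are assumptions made for illustration; real constants depend on the particular DHT.

```python
import math


def dht_hops(n_peers: int) -> float:
    """Illustrative hop count for a flat-ID DHT: log2(N) (base 2 is an assumption)."""
    return math.log2(n_peers)


def can_hops(n_peers: int, d: int) -> float:
    """Illustrative hop count for CAN: d * N^(1/d), ignoring constant factors."""
    return d * n_peers ** (1.0 / d)


if __name__ == "__main__":
    n = 1_000_000  # hypothetical number of peers
    print(f"flat-ID DHT: ~{dht_hops(n):.1f} hops")
    for d in (2, 4, 10, 20):
        print(f"CAN with d={d}: ~{can_hops(n, d):.1f} hops")
```

Running the script for a few values of d shows how quickly the CAN bound shrinks as the dimensionality grows, and why small d is the unfavorable case.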

bioinformatics and bioengineering | 2004

Efficient filtration of sequence similarity search through singular value decomposition

S. Alireza Aghili; Ozgur D. Sahin; Divyakant Agrawal; Amr El Abbadi

Similarity search in textual databases and bioinformatics has received substantial attention in the past decade. Numerous filtration and indexing techniques have been proposed to reduce the curse of dimensionality. This paper proposes a novel approach to map the problem of whole-genome sequence similarity search into an approximate vector comparison in the well-established multidimensional vector space. We propose the application of the singular value decomposition (SVD) dimensionality reduction technique as a pre-processing filtration step to effectively reduce the search space and the running time of the search operation. Our empirical results on a prokaryote and a eukaryote DNA contig dataset demonstrate effective filtration to prune non-relevant portions of the database, with up to 2.3 times faster running time compared with the q-gram approach. SVD filtration may easily be integrated as a pre-processing step for any of the well-known sequence search heuristics such as BLAST, QUASAR and FastA. We analyze the precision of applying SVD filtration as a transformation-based dimensionality reduction technique, and finally discuss the imposed trade-offs.
