Shrideep Pallickara | Researchain

Publication

Featured research published by Shrideep Pallickara.

IEEE International Conference on eScience | 2008

MapReduce for Data Intensive Scientific Analyses

Jaliya Ekanayake; Shrideep Pallickara; Geoffrey C. Fox

Most scientific data analyses comprise analyzing voluminous data collected from various instruments. Efficient parallel/concurrent algorithms and frameworks are the key to meeting the scalability and performance requirements entailed in such scientific data analyses. The recently introduced MapReduce technique has gained a lot of attention from the scientific community for its applicability in large parallel data analyses. Although there are many evaluations of the MapReduce technique using large textual data collections, there have been only a few evaluations for scientific data analyses. The goals of this paper are twofold. First, we present our experience in applying the MapReduce technique for two scientific data analyses: (i) high energy physics data analyses; (ii) K-means clustering. Second, we present CGL-MapReduce, a streaming-based MapReduce implementation and compare its performance with Hadoop.
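As a rough illustration of how K-means clustering fits the MapReduce model mentioned above, the following sketch runs the map and reduce steps in a single process. It is not the CGL-MapReduce or Hadoop API; the function names (`kmeans_map`, `kmeans_reduce`, `kmeans_iteration`, `kmeans`) and the convergence tolerance are assumptions made for this example only.

```python
from collections import defaultdict
from math import dist


def kmeans_map(point, centroids):
    """Map step: emit (index of the nearest centroid, point)."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, point


def kmeans_reduce(points):
    """Reduce step: average all points assigned to one centroid."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_iteration(points, centroids):
    """One MapReduce round: map every point, group by key, reduce each group."""
    groups = defaultdict(list)
    for point in points:
        key, value = kmeans_map(point, centroids)
        groups[key].append(value)
    # A centroid with no assigned points keeps its previous position.
    return [
        kmeans_reduce(groups[i]) if i in groups else centroids[i]
        for i in range(len(centroids))
    ]


def kmeans(points, centroids, max_iterations=100, tolerance=1e-9):
    """Repeat rounds until no centroid moves more than `tolerance` (assumed value)."""
    for _ in range(max_iterations):
        updated = kmeans_iteration(points, centroids)
        if all(dist(a, b) <= tolerance for a, b in zip(centroids, updated)):
            return updated
        centroids = updated
    return centroids
```

Because each round feeds its output centroids into the next round, K-means is an iterative workload, so how a runtime hands data from one round to the next can matter for performance.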

ACM/IFIP/USENIX International Conference on Middleware | 2003

NaradaBrokering: a distributed middleware framework and architecture for enabling durable peer-to-peer grids

Shrideep Pallickara; Geoffrey C. Fox

A Peer-to-Peer (P2P) Grid would comprise services that include those of Grids and P2P networks and naturally support environments that have features of both limiting cases. Such a P2P grid integrates the evolving ideas of computational grids, distributed objects, web services, P2P networks and message oriented middleware. In this paper we investigate the architecture, comprising a distributed brokering system that will support such a hybrid environment. Access to services can then be mediated either by the middleware or alternatively by direct P2P interactions between machines.

International Conference on Cluster Computing | 2009

Granules: A lightweight, streaming runtime for cloud computing with support for Map-Reduce

Shrideep Pallickara; Jaliya Ekanayake; Geoffrey C. Fox

Cloud computing has gained significant traction in recent years. The Map-Reduce framework is currently the most dominant programming model in cloud computing settings. In this paper, we describe Granules, a lightweight, streaming-based runtime for cloud computing which incorporates support for the Map-Reduce framework. Granules provides rich lifecycle support for developing scientific applications with support for iterative, periodic and data driven semantics for individual computations and pipelines. We describe our support for variants of the Map-Reduce framework. The paper presents a survey of related work in this area. Finally, this paper describes our performance evaluation of various aspects of the system, including (where possible) comparisons with other comparable systems.

Proceedings of the 2002 joint ACM-ISCOPE conference on Java Grande | 2002

A scaleable event infrastructure for peer to peer grids

Geoffrey C. Fox; Shrideep Pallickara; Xi Rao

In this paper we propose a peer-to-peer (P2P) grid comprising resources such as relatively static clients, high-end resources and a dynamic collection of multiple P2P subsystems. We investigate the architecture of the messaging and event service that will support such a hybrid environment. We designed a distributed publish-subscribe system NaradaBrokering for XML specified messages. NaradaBrokering interpolates between centralized systems like JMS (Java Message Service) and P2P environments. Here we investigate and present our strategy for the integration of JXTA into NaradaBrokering. The resultant system naturally scales with multiple Peer Groups linked by NaradaBrokering.

Concurrency and Computation: Practice and Experience | 2002

An event service to support Grid computational environments

Geoffrey C. Fox; Shrideep Pallickara

We believe that it is interesting to study the system and software architecture of environments which integrate the evolving ideas of computational Grids, distributed objects, Web services, peer‐to‐peer (P2P) networks and message‐oriented middleware. Such P2P Grids should seamlessly integrate users to themselves and to resources which are also linked to each other. We can abstract such environments as a distributed system of ‘clients’ which consist either of ‘users’ or ‘resources’ or proxies thereto. These clients must be linked together in a flexible, fault‐tolerant, efficient, high‐performance fashion. In this paper, we study the messaging or event system, termed Grid Event Service (GES), that is appropriate to link the clients (both users and resources of course) together. For our purposes (registering, transporting and discovering information), events are just messages, typically with time stamps. The messaging system GES must scale over a wide variety of devices, from handheld computers at one extreme to high‐performance computers and sensors at the other. We have analyzed the requirements of several Grid services that could be built with this model, including computing and education and incorporated constraints of collaboration with a shared event model. We suggest that generalizing the well‐known publish–subscribe model is an attractive approach and here we study some of the issues to be addressed if this model is used in GES.
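The publish-subscribe model with time-stamped events described above can be sketched in a few lines. This is a minimal single-process illustration, not the GES or NaradaBrokering API; the `Event` and `Broker` classes, the topic strings, and the exact-match topic routing are all assumptions made for the example.

```python
import time
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Event:
    """An event is just a message that carries a time stamp."""
    topic: str
    payload: str
    timestamp: float = field(default_factory=time.time)


class Broker:
    """Routes each published event to the callbacks subscribed to its topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register `callback` to receive every event published on `topic`."""
        self._subscribers[topic].append(callback)

    def publish(self, event):
        """Deliver `event` to matching subscribers; return how many received it."""
        # Iterate over a copy so a callback may subscribe during delivery.
        callbacks = list(self._subscribers.get(event.topic, []))
        for callback in callbacks:
            callback(event)
        return len(callbacks)
```

Publishers and subscribers never refer to each other directly; they share only a topic name, and that decoupling is what lets the same model span very different kinds of clients.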

Conference on High Performance Computing (Supercomputing) | 2004

Toward Flexible Messaging for SOAP-Based Services

Geoffrey C. Fox; Shrideep Pallickara; Savas Parastatidis

NaradaBrokering provides a messaging abstraction that allows it to provide message-related capabilities in a transparent fashion. These capabilities include message-based security, time and causal ordering, compression, virtualization of transport protocol and addressing, and fault tolerance related functionalities. NaradaBrokering — combined with further extensions to its existing capabilities — can also take advantage of the maturing of Web Service specifications to build very powerful general mechanisms to deploy and integrate it with general Web services. In this paper we describe our strategy to interface NaradaBrokering with Web services. The strategy described in this paper will allow new, and existing, applications built around the Web Services Framework to leverage capabilities offered by the NaradaBrokering substrate without changes to the service implementations.

Future Generation Computer Systems | 2013

Performance implications of multi-tier application deployments on Infrastructure-as-a-Service clouds: Towards performance modeling

Wes Lloyd; Shrideep Pallickara; Olaf David; Jim Lyon; Mazdak Arabi; Ken Rojas

Hosting a multi-tier application using an Infrastructure-as-a-Service (IaaS) cloud requires deploying components of the application stack across virtual machines (VMs) to provide the application's infrastructure while considering factors such as scalability, fault tolerance, performance and deployment costs (number of VMs). This paper presents results from an empirical study which investigates implications for application performance and resource requirements (CPU, disk and network) resulting from how multi-tier applications are deployed to IaaS clouds. We investigate the implications of: (1) component placement across VMs, (2) VM memory size, (3) VM hypervisor type (KVM vs. Xen), and (4) VM placement across physical hosts (provisioning variation). All possible deployment configurations for two multi-tier application variants are tested. One application variant was computationally bound by the application middleware, the other bound by geospatial queries. The best performing deployments required as few as 2 VMs, half the number required for VM-level service isolation, demonstrating potential cost savings when components can be consolidated. Resource utilization (CPU time, disk I/O, and network I/O) varied with component deployment location, VM memory allocation, and the hypervisor used (Xen or KVM), demonstrating how application deployment decisions impact required resources. Isolating application components using separate VMs produced performance overhead of ~1%-2%. Provisioning variation of VMs across physical hosts produced overhead up to 3%. Relationships between resource utilization and performance were assessed using multiple linear regression to develop a model to predict application deployment performance. Our model explained over 84% of the variance and predicted application performance with mean absolute error of only ~0.3 s, with CPU time, disk sector reads, and disk sector writes serving as the most powerful predictors of application performance.

Future Generation Computer Systems | 2013

Exploiting geospatial and chronological characteristics in data streams to enable efficient storage and retrievals

Matthew Malensek; Sangmi Lee Pallickara; Shrideep Pallickara

We describe the design of a high-throughput storage system, Galileo, for data streams generated in observational settings. To cope with data volumes, the shared nothing architecture in Galileo supports incremental assimilation of nodes, while accounting for heterogeneity in their capabilities. To achieve efficient storage and retrievals of data, Galileo accounts for the geospatial and chronological characteristics of such time-series observational data streams. Our benchmarks demonstrate that Galileo supports high-throughput storage and efficient retrievals of specific portions of large datasets while supporting different types of queries.
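One way to picture how geospatial and chronological characteristics can guide storage and retrieval is to derive a partition key from an observation's location and time, so that a query for a region and period touches only one partition. The sketch below is a hypothetical illustration, not Galileo's actual partitioning scheme; the one-degree grid cell, the per-day (UTC) time bucket, and the names `partition_key` and `SpatioTemporalStore` are assumptions.

```python
import math
from collections import defaultdict
from datetime import datetime, timezone

CELL_DEGREES = 1.0  # assumed grid resolution, in degrees of latitude/longitude


def partition_key(lat, lon, timestamp):
    """Map an observation to a (grid cell, UTC day) partition key."""
    cell = (math.floor(lat / CELL_DEGREES), math.floor(lon / CELL_DEGREES))
    day = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%d")
    return cell, day


class SpatioTemporalStore:
    """In-memory store that groups observations by partition key."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def insert(self, lat, lon, timestamp, reading):
        """Append the observation to the partition for its cell and day."""
        key = partition_key(lat, lon, timestamp)
        self._partitions[key].append((lat, lon, timestamp, reading))

    def query(self, lat, lon, timestamp):
        """Return all observations sharing the cell and day of the given point."""
        return list(self._partitions.get(partition_key(lat, lon, timestamp), []))
```

In a distributed setting the same key could also decide which node holds a partition, although that mapping is outside the scope of this sketch.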

Future Generation Computer Systems | 2013

On the performance of high dimensional data clustering and classification algorithms

Kathleen Ericson; Shrideep Pallickara

There is often a need to perform machine learning tasks on voluminous amounts of data. These tasks have application in fields such as pattern recognition, data mining, bioinformatics, and recommendation systems. Here we evaluate the performance of 4 clustering algorithms and 2 classification algorithms supported by Mahout within two different cloud runtimes, Hadoop and Granules. Our benchmarks use the same Mahout backend code, ensuring a fair comparison. The differences between these implementations stem from how the Hadoop and Granules runtimes (1) support and manage the lifecycle of individual computations, and (2) how they orchestrate exchange of data between different stages of the computational pipeline during successive iterations of the clustering algorithm. We include an analysis of our results for each of these algorithms in a distributed setting, as well as a discussion on measures for failure recovery.

Grid Computing | 2006

A Framework for Secure End-to-End Delivery of Messages in Publish/Subscribe Systems

Shrideep Pallickara; Marlon E. Pierce; Harshawardhan Gadgil; Geoffrey C. Fox; Yan Yan; Yi Huang

In this paper, we present a framework for the secure end-to-end delivery of messages in distributed messaging infrastructures based on the publish/subscribe paradigm. The framework enables authorized publishing and consumption of messages. Brokers, which constitute individual nodes within the messaging infrastructure, also ensure that the dissemination of content is enabled only for authorized entities. The framework includes strategies to cope with attack scenarios such as denial of service attacks and replay attacks. Finally, we include experimental results from our implementation.