Publication


Featured research published by Parvathi Chundi.


International Conference on Software Maintenance | 2007

Discovering Dynamic Developer Relationships from Software Version Histories by Time Series Segmentation

Harvey P. Siy; Parvathi Chundi; Daniel J. Rosenkrantz; Mahadevan Subramaniam

Time series analysis is a promising approach to discovering temporal patterns from time stamped, numeric data. A novel approach to applying time series analysis to discern temporal information from software version repositories is proposed. Version logs containing numeric as well as non-numeric data are represented as an item-set time series. A dynamic programming based algorithm to optimally segment an item-set time series is presented. The algorithm automatically produces a compacted item-set time series that can be analyzed to discern temporal patterns. The effectiveness of the approach is illustrated by applying it to the Mozilla data set to study change frequency and developer activity profiles. The experimental results show that the segmentation algorithm produces segments that capture meaningful information, with information content superior to that obtained by arbitrarily segmenting the time period into regular intervals.
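
The core dynamic program is easy to sketch. Below is a minimal illustration in Python, not the paper's algorithm: the per-segment cost (items that appear somewhere in a segment but not at every time point) is an illustrative stand-in for the paper's information-content objective, and the recurrence is the standard O(n^2 k) optimal-partition dynamic program.

    def segment_cost(itemsets):
        # Illustrative homogeneity penalty: items that appear somewhere in
        # the segment but not at every time point count against it.
        union = set().union(*itemsets)
        common = set.intersection(*itemsets)
        return len(union - common)

    def optimal_segmentation(series, k):
        # Partition series (a list of item sets, one per time point) into
        # k contiguous segments minimizing total cost.
        n = len(series)
        cost = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i, n):
                cost[i][j] = segment_cost(series[i:j + 1])
        INF = float("inf")
        best = [[INF] * n for _ in range(k + 1)]  # best[m][j]: points 0..j in m segments
        back = [[-1] * n for _ in range(k + 1)]
        for j in range(n):
            best[1][j] = cost[0][j]
        for m in range(2, k + 1):
            for j in range(m - 1, n):
                for i in range(m - 2, j):
                    c = best[m - 1][i] + cost[i + 1][j]
                    if c < best[m][j]:
                        best[m][j], back[m][j] = c, i
        starts, j = [0], n - 1                    # recover segment start points
        for m in range(k, 1, -1):
            j = back[m][j]
            starts.append(j + 1)
        return sorted(starts), best[k][n - 1]

    # Toy example: weekly sets of files touched in a repository.
    weeks = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"x", "y"}, {"x", "y", "z"}]
    print(optimal_segmentation(weeks, 2))  # ([0, 3], 3): new segment at week 3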


Data and Knowledge Engineering | 2011

Extracting hot spots of topics from time-stamped documents

Wei Chen; Parvathi Chundi

Identifying time periods with a burst of activity related to a topic has been an important problem in analyzing time-stamped documents. In this paper, we propose an approach to extract a hot spot of a given topic in a time-stamped document set. Topics can be basic, containing a simple list of keywords, or complex. Logical relationships such as and, or, and not are used to build complex topics from basic topics. A concept of a presence measure of a topic, based on fuzzy set theory, is introduced to compute the amount of information related to the topic in the document set. Each interval in the time period of the document set is associated with a numeric value which we call the discrepancy score. A high discrepancy score indicates that the documents in the time interval are more focused on the topic than those outside of the time interval. A hot spot of a given topic is defined as a time interval with the highest discrepancy score. We first describe a naive implementation for extracting hot spots. We then construct an algorithm called EHE (Efficient Hot Spot Extraction) using several efficient strategies to improve performance. We also introduce the notion of a topic DAG to facilitate efficient computation of presence measures of complex topics. The proposed approach is illustrated by several experiments on a subset of the TDT-Pilot Corpus and the DBLP conference data set. The experiments show that the proposed EHE algorithm significantly outperforms the naive one, and that the extracted hot spots of given topics are meaningful.
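
The presence measure composes over the logical operators. A minimal sketch, assuming keyword presence is the fraction of documents mentioning the keyword and using the standard fuzzy operators (min for and, max for or, complement for not); the paper's actual measure may differ:

    def keyword_presence(keyword, docs):
        # Fraction of documents (here: plain strings) mentioning the keyword.
        return sum(keyword in d.lower() for d in docs) / len(docs) if docs else 0.0

    def presence(topic, docs):
        # Topics are nested tuples: ("kw", w), ("and", t1, t2, ...),
        # ("or", t1, t2, ...), or ("not", t).
        op = topic[0]
        if op == "kw":
            return keyword_presence(topic[1], docs)
        if op == "and":
            return min(presence(t, docs) for t in topic[1:])
        if op == "or":
            return max(presence(t, docs) for t in topic[1:])
        if op == "not":
            return 1.0 - presence(topic[1], docs)
        raise ValueError(op)

    docs = ["earthquake hits city", "city rebuilds after earthquake", "election results"]
    topic = ("and", ("kw", "earthquake"), ("not", ("kw", "election")))
    print(presence(topic, docs))  # min(2/3, 1 - 1/3) = 0.666...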


Data and Knowledge Engineering | 2009

An approach for temporal analysis of email data based on segmentation

Parvathi Chundi; Mahadevan Subramaniam; Dileep K. Vasireddy

Many kinds of information are hidden in email data, such as the information being exchanged, the time of exchange, and the user IDs participating in the exchange. Analyzing the email data can reveal valuable information about the social networks of a single user or multiple users, the topics being discussed, and so on. In this paper, we describe a novel approach for temporally analyzing the communication patterns embedded in email data based on time series segmentation. The approach computes egocentric communication patterns of a single user, as well as sociocentric communication patterns involving multiple users. Time series segmentation is used to uncover patterns that may span multiple time points and to study how these patterns change over time. To find egocentric patterns, the email communication of a user is represented as an item-set time series. An optimal segmentation of the item-set time series is constructed, from which patterns are extracted. To find sociocentric patterns, the email data is represented as an item-set group time series. Patterns involving multiple users are then extracted from an optimal segmentation of the item-set group time series. The proposed approach is applied to the Enron email data set, with very promising results.
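
As a preprocessing sketch, an egocentric item-set time series might be built along the following lines (the record fields are illustrative, not the paper's schema); the resulting series can then be segmented, e.g. with a dynamic program like the one sketched above:

    from collections import defaultdict

    def itemset_series(messages, user):
        # One item set per day: the correspondents the user exchanged
        # mail with on that day.
        by_day = defaultdict(set)
        for m in messages:
            if m["from"] == user:
                by_day[m["date"]].update(m["to"])
            elif user in m["to"]:
                by_day[m["date"]].add(m["from"])
        return [by_day[d] for d in sorted(by_day)]

    msgs = [
        {"date": "2001-05-01", "from": "alice", "to": ["bob"]},
        {"date": "2001-05-01", "from": "carol", "to": ["alice", "bob"]},
        {"date": "2001-05-02", "from": "alice", "to": ["dave", "bob"]},
    ]
    print(itemset_series(msgs, "alice"))  # [{'bob', 'carol'}, {'bob', 'dave'}]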


Conference on Information and Knowledge Management | 2004

On lossy time decompositions of time stamped documents

Parvathi Chundi; Daniel J. Rosenkrantz

Constructing time decompositions of time stamped documents is an important first step in extracting temporal information from a document set. Efficient algorithms are described for computing optimal lossy decompositions for a given document set, where the loss of information is constrained to be within a specified bound. A novel and efficient algorithm is proposed for computing the information loss values required to construct optimal lossy decompositions. Experimental results are reported comparing optimal lossy decompositions and equal-length decompositions in terms of a number of parameters, such as information loss. In particular, our results show that optimal lossy decompositions outperform equal-length decompositions by preserving more of the information content of the underlying document set. The results also demonstrate that permitting even small amounts of variability in the lengths of the subintervals of a decomposition captures more of the temporal information content of a document set than equal-length decompositions do. This paper builds upon our earlier work on time decompositions, where the problem of computing an optimal lossy decomposition of the time period associated with a document set was first formulated.
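
One way to read the optimization problem: find the fewest subintervals such that each subinterval's information loss stays within the bound. A minimal dynamic-programming sketch, with an illustrative L1-distance-to-mean loss over keyword-frequency vectors standing in for the paper's information-loss measure:

    def interval_loss(points):
        # Illustrative loss: total L1 distance of each time point's
        # keyword-frequency vector (a dict) from the interval's mean vector.
        keys = {k for p in points for k in p}
        mean = {k: sum(p.get(k, 0) for p in points) / len(points) for k in keys}
        return sum(abs(p.get(k, 0) - mean[k]) for p in points for k in keys)

    def min_lossy_decomposition(points, bound, loss=interval_loss):
        # Fewest subintervals covering all points, each within the loss
        # bound (simple O(n^2) dynamic program over cut positions).
        n = len(points)
        INF = float("inf")
        best = [0] + [INF] * n     # best[j]: min #intervals covering points[:j]
        cut = [0] * (n + 1)
        for j in range(1, n + 1):
            for i in range(j):
                if best[i] + 1 < best[j] and loss(points[i:j]) <= bound:
                    best[j], cut[j] = best[i] + 1, i
        out, j = [], n             # recover the subintervals
        while j > 0:
            out.append((cut[j], j))
            j = cut[j]
        return out[::-1]

    daily = [{"tax": 3}, {"tax": 3}, {"vote": 5}, {"vote": 4}]
    print(min_lossy_decomposition(daily, bound=1.5))  # [(0, 2), (2, 4)]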


Data Mining and Knowledge Discovery | 2006

Information Preserving Time Decompositions of Time Stamped Documents

Parvathi Chundi; Daniel J. Rosenkrantz

Extraction of sequences of events from news and other documents based on the publication times of these documents has been shown to be extremely effective in tracking past events. This paper addresses the issue of constructing an optimal information preserving decomposition of the time period associated with a given document set, i.e., a decomposition with the smallest number of subintervals, subject to no loss of information. We introduce the notion of the compressed interval decomposition, where each subinterval consists of consecutive time points having identical information content. We define optimality, and show that any optimal information preserving decomposition of the time period is a refinement of the compressed interval decomposition. We define several special classes of measure functions (functions that measure the prevalence of keywords in the document set and assign them numeric values), based on their effect on the information computed as document sets are combined. We give algorithms, appropriate for different classes of measure functions, for computing an optimal information preserving decomposition of a given document set. We studied the effectiveness of these algorithms by computing several compressed interval and information preserving decompositions for a subset of the Reuters-21578 document set. The experiments support the obvious conclusion that the temporal information gleaned from a document set is strongly dependent on the measure function used and on other user-defined parameters.
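
The compressed interval decomposition itself is straightforward to compute: consecutive time points with identical information content collapse into one subinterval. A sketch, where the per-point information content is whatever the chosen measure function yields:

    def compressed_decomposition(contents):
        # contents[t] is the information content at time point t,
        # e.g. a keyword -> measure-value mapping.
        intervals, start = [], 0
        for i in range(1, len(contents)):
            if contents[i] != contents[start]:
                intervals.append((start, i - 1))
                start = i
        intervals.append((start, len(contents) - 1))
        return intervals

    daily = [{"tax": 1}, {"tax": 1}, {"vote": 1}, {"vote": 1}, {"vote": 1}]
    print(compressed_decomposition(daily))  # [(0, 1), (2, 4)]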


International Conference on Formal Engineering Methods | 2004

An Approach to Preserve Protocol Consistency and Executability Across Updates

Mahadevan Subramaniam; Parvathi Chundi

An approach to systematically update finite state protocols while preserving application-independent properties such as consistency and executability is described. Protocols are modelled as a network of communicating finite state machines, with each machine denoting the behavior of a single protocol controller. Updates to protocols are specified as a finite set of rules that may add, delete, and/or replace one or more transitions in one or more controllers. Conditions on updates are identified under which a single transition in a single controller, or multiple transitions in one or more controllers, can be changed to produce an executable protocol with a consistent global transition relation. The effectiveness of the proposed approach is illustrated on a large class of cache coherence protocols. It is shown how several common design choices can be consistently incorporated into these protocols by specifying them as updates. Many changes to verified protocols are non-monotonic, in the sense that they do not preserve all of the verified protocol invariants. The proposed approach enables incremental verification of the application-independent properties that need to be preserved by any update and are a precursor to verification of the application-specific properties.
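
A toy rendering of the setting, far weaker than the paper's conditions: controllers as sets of transitions over send (+m) and receive (-m) messages, an update as transitions to add or delete per controller, and a simple necessary check in the spirit of executability, namely that every send has a matching receive in some other controller:

    def apply_update(controllers, update):
        # update maps controller name -> {"add": set, "delete": set} of
        # (state, message, next_state) transitions.
        return {c: (ts | update.get(c, {}).get("add", set()))
                   - update.get(c, {}).get("delete", set())
                for c, ts in controllers.items()}

    def sends_matched(controllers):
        # Necessary condition only: each send +m in one controller must
        # have a matching receive -m in some other controller.
        for c, ts in controllers.items():
            for (_, msg, _) in ts:
                if msg.startswith("+"):
                    want = "-" + msg[1:]
                    if not any(m == want for d, us in controllers.items()
                               if d != c for (_, m, _) in us):
                        return False
        return True

    cache = {"req": {("idle", "+get", "wait")}, "dir": {("ready", "-get", "busy")}}
    upd = {"req": {"add": {("wait", "+inval", "idle")}}}
    print(sends_matched(apply_update(cache, upd)))  # False: no -inval receive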


Computational Intelligence and Data Mining | 2009

Extracting hot spots of basic and complex topics from time stamped documents

Wei Chen; Parvathi Chundi

Identifying time periods with a burst of activity related to a topic has been an important problem in analyzing time stamped documents. In this paper, we discuss methods to compute a hot spot of a given topic from a time stamped document set. We consider basic topics that contain one or more keywords, as well as complex topics that combine topics with the logical operators and, or, and not. We use the temporal scan statistic to assign a discrepancy score to each of the intervals of the time period spanning the given document set. The hot spot of the given topic is the time interval with the highest discrepancy score. We describe efficient algorithms to compute the hot spots of both basic and complex topics. Our preliminary experiments, using the SIGMOD/VLDB paper titles data set and the CNN/Reuters news article titles data set collected from the TDT-Pilot Corpus, show that our methods to compute the measure and the hot spot of a topic work very well in practice.
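
The paper names the temporal scan statistic; a Kulldorff-style form is one concrete instantiation (the exact statistic and the naive O(n^2) interval scan below are illustrative, not the paper's algorithms):

    from math import log

    def kulldorff(m, b):
        # Discrepancy of measured proportion m against baseline proportion b;
        # zero when the interval is not elevated.
        if m <= b or b <= 0 or b >= 1:
            return 0.0
        score = m * log(m / b)
        if m < 1:
            score += (1 - m) * log((1 - m) / (1 - b))
        return score

    def hot_spot(topic_counts, doc_counts):
        # Naive scan over all intervals; returns the interval with the
        # highest discrepancy. topic_counts[t] / doc_counts[t] are the
        # topic and total document counts at time point t.
        T, D = sum(topic_counts), sum(doc_counts)
        best, best_iv, n = 0.0, None, len(doc_counts)
        for i in range(n):
            t_in = d_in = 0
            for j in range(i, n):
                t_in += topic_counts[j]
                d_in += doc_counts[j]
                score = kulldorff(t_in / T, d_in / D)
                if score > best:
                    best, best_iv = score, (i, j)
        return best_iv, best

    print(hot_spot([0, 1, 8, 9, 1, 0], [10, 10, 10, 10, 10, 10]))
    # ((2, 3), ...): the burst in the middle of the period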


Conference on Information and Knowledge Management | 2012

Simulating prosthetic vision with distortions for retinal prosthesis design

Mahadevan Subramaniam; Parvathi Chundi; Abhilash Muthuraj; Eyal Margalit; Sylvie Sim

Retinal prostheses are used to restore vision to individuals with vision impairments caused by damaged photoreceptors in the retina. Despite early successes, designing prostheses that can restore functional vision in general continues to be a challenging problem, due to the large number of design parameters that need to be customized for individual users. Gathering data using real patients in a timely and safe manner is also difficult. To address these problems, a virtual environment for realistically and safely simulating prosthetic vision is described. Besides supporting phosphenized rendering of images at different resolutions for normally sighted users, along with eye movement tracking, the environment also simulates the spatial distortions that are commonly perceived by prosthesis users. A procedure to automatically generate such spatial distortions is developed. User corrections, if any, are logged and compared with the original distortion values to evaluate distortion perception. Experimental results obtained by using this environment to perform various visual acuity tasks are described.
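
A rough sketch of phosphenized rendering with spatial distortion, under stated assumptions (grayscale input, a square phosphene grid, Gaussian blobs, and Gaussian positional jitter as the distortion model; the paper's renderer is surely richer):

    import numpy as np

    def phosphenize(img, grid=16, radius=3.0, jitter=2.0, seed=0):
        # img: 2-D uint8 grayscale array. Sample brightness on a coarse
        # grid, draw each sample as a Gaussian phosphene, and jitter each
        # phosphene's position to mimic perceived spatial distortion.
        rng = np.random.default_rng(seed)
        h, w = img.shape
        out = np.zeros((h, w))
        yy, xx = np.mgrid[0:h, 0:w]
        for y in np.linspace(radius, h - radius, grid):
            for x in np.linspace(radius, w - radius, grid):
                b = img[int(y), int(x)] / 255.0      # sampled brightness
                dy, dx = rng.normal(0.0, jitter, 2)  # the distortion offset
                d2 = (yy - y - dy) ** 2 + (xx - x - dx) ** 2
                out += b * np.exp(-d2 / (2 * radius ** 2))
        return np.clip(out * 255, 0, 255).astype(np.uint8)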


Mining Software Repositories | 2008

Summarizing developer work history using time series segmentation: challenge report

Harvey P. Siy; Parvathi Chundi; Mahadevan Subramaniam

Temporal segmentation partitions time series data with the intent of producing more homogeneous segments. It is a technique used to preprocess data so that subsequent time series analysis on individual segments can detect trends that may not be evident when analyzing the entire dataset. This technique allows data miners to partition a large dataset without assuming periodicity or any other a priori knowledge of the dataset's features. We investigate the insights that can be gained from applying time series segmentation to software version repositories. Software version repositories from large projects contain on the order of hundreds of thousands of timestamped entries or more. It is a continuing challenge to aggregate such data so that noise is reduced and important characteristics are brought out. In this paper, we present a way to summarize developer work history in terms of the files developers have modified over time by segmenting the CVS change data of individual Eclipse developers. We show that the files a developer modifies tend to change significantly over time, though most developers tend to work within the same directories.
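
The closing observation, that file sets churn while directories stay stable, suggests comparing consecutive segments at both granularities. A small sketch using Jaccard similarity (the segment layout is illustrative):

    import os

    def jaccard(a, b):
        return len(a & b) / len(a | b) if a | b else 1.0

    def segment_stability(segments):
        # For each pair of consecutive segments (each a set of file paths),
        # report file-level vs directory-level similarity.
        report = []
        for s1, s2 in zip(segments, segments[1:]):
            d1 = {os.path.dirname(f) for f in s1}
            d2 = {os.path.dirname(f) for f in s2}
            report.append((jaccard(s1, s2), jaccard(d1, d2)))
        return report

    segs = [{"ui/button.java", "ui/menu.java"}, {"ui/dialog.java", "ui/menu.java"}]
    print(segment_stability(segs))  # [(0.333..., 1.0)]: files churn, directory stable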


Database and Expert Systems Applications | 2005

Efficient algorithms for constructing time decompositions of time stamped documents

Parvathi Chundi; Rui Zhang; Daniel J. Rosenkrantz

Identifying temporal information of topics from a document set typically involves constructing a time decomposition of the time period associated with the document set. In earlier work, we formulated several metrics on a time decomposition, such as size, information loss, and variability, and gave dynamic programming based algorithms to construct time decompositions that are optimal with respect to these metrics. Computing information loss values for all subintervals of the time period is central to the computation of optimal time decompositions. This paper proposes several algorithms to assist in constructing an optimal time decomposition more efficiently. More efficient, parallelizable algorithms for computing loss values are described. An efficient top-down greedy heuristic for constructing time decompositions is also presented. Experiments to study the performance of this greedy heuristic were conducted. Although the lossy time decompositions constructed by the greedy heuristic are suboptimal, they appear to be better than the widely used uniform-length decompositions.
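
One plausible shape for a top-down greedy heuristic: start from the whole time period and repeatedly apply whichever single split reduces total loss the most, stopping once the loss bound is met. A minimal sketch, where loss is any per-interval loss function (e.g. the L1-to-mean stand-in sketched earlier); this is an illustration, not the paper's heuristic:

    def greedy_decomposition(points, bound, loss):
        cuts = [0, len(points)]                  # interval boundaries
        def total():
            return sum(loss(points[i:j]) for i, j in zip(cuts, cuts[1:]))
        while total() > bound:
            best_gain, best_cut = 0.0, None
            for i, j in zip(cuts, cuts[1:]):
                for c in range(i + 1, j):        # candidate split points
                    gain = loss(points[i:j]) - loss(points[i:c]) - loss(points[c:j])
                    if gain > best_gain:
                        best_gain, best_cut = gain, c
            if best_cut is None:                 # no split helps; give up
                break
            cuts.append(best_cut)
            cuts.sort()
        return list(zip(cuts, cuts[1:]))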

Collaboration


Dive into Parvathi Chundi's collaborations.

Top Co-Authors

Harvey P. Siy, University of Nebraska Omaha
Eyal Margalit, University of Nebraska–Lincoln
Wei Chen, University of Nebraska Omaha
Abhilash Muthuraj, University of Nebraska Omaha
Indrakshi Ray, Colorado State University
Muhammad Hassan, University of Nebraska Medical Center
Susannah Go, University of Nebraska Omaha
Tai Xin, Colorado State University
Abhinav Parakh, University of Nebraska Omaha