Publication


Featured research published by Colin A. Puri.


International Conference on Tools with Artificial Intelligence | 2010

An Efficient and Robust Approach for Discovering Data Quality Rules

Peter Z. Yeh; Colin A. Puri

Poor quality data is a growing problem that affects many enterprises across all aspects of their business ranging from operational efficiency to revenue protection. Moreover, this problem is costly to fix because significant effort and resources are required to identify a comprehensive set of rules that can detect (and correct) data defects along various data quality dimensions such as consistency, conformity, and more. Hence, many organizations employ only basic data quality rules that check for null values, format, etc. in efforts such as data profiling and data cleansing; and ignore rules that are needed to detect deeper problems such as inconsistent values across interdependent attributes. This oversight can lead to numerous problems such as inaccurate reporting of key metrics used to inform critical decisions or derive business insights. In this paper, we present an approach that efficiently and robustly discovers data quality rules -- in particular conditional functional dependencies -- for detecting inconsistencies in data and hence improves data quality along the critical dimension of consistency. We evaluate our approach empirically on several real-world data sets. We show that our approach performs well on these data sets for metrics such as precision and recall. We also compare our approach to an established solution and show that our approach outperforms this solution for the same metrics. Finally, we show that our approach scales efficiently with the number of records, the number of attributes, and the domain size.


Database and Expert Systems Applications | 2015

Analyzing and Predicting Security Event Anomalies: Lessons Learned from a Large Enterprise Big Data Streaming Analytics Deployment

Colin A. Puri; Carl Dukatz

This paper presents a novel and unique live operational and situational awareness implementation bringing big data architectures, graph analytics, streaming analytics, and interactive visualizations to a security use case with data from a large Global 500 company. We present the data acceleration patterns utilized, the employed analytics framework and its complexities, and finally demonstrate the creation of rich interactive visualizations that bring the story of the data acceleration pipeline and analytics to life. We deploy a novel solution to learn typical network agent behaviors and extract the degree to which a network event is anomalous for automatic anomaly rule learning to provide additional context to security alerts. We implement and evaluate the analytics over a data acceleration framework that performs the analysis and model creation at scale in a distributed parallel manner. Additionally, we talk about the acceleration architecture considerations and demonstrate how we complete the analytics story with rich interactive visualizations designed for the security and business analyst alike. This paper concludes with evaluations and lessons learned.


Data Warehousing and Knowledge Discovery | 2012

Implementing a data lineage tracker

Colin A. Puri; Doo Soon Kim; Peter Z. Yeh; Kunal Verma

Every day, business users face the task of tracking the origin of information used in calculations and business decisions. Knowing the origin and lineage of data can help in the decision-making process, provide a clear audit trail for regulation, and answer key questions such as who, what, where, when, why, and how. Tracking data lineage raises many issues and challenges, particularly in supporting a heterogeneous enterprise environment. This paper presents one method of tackling data lineage to answer the questions business users need answered, for both new and old applications in a heterogeneous infrastructure environment. Using trace logs from data sources, we show how our system performs by effectively tracking data lineage and determining data flows of information as it moves from one data source to another through the execution of applications. Utilizing SQL and NoSQL systems, we demonstrate the recall and precision of our proposed data lineage tracking system.


International Journal of Advanced Intelligence Paradigms | 2010

A knowledge based approach for capturing rich semantic representations from text for intelligent systems

Peter Z. Yeh; Colin A. Puri; Alex Kass

In this paper, we present a knowledge based approach to capture semantic representations from text for intelligent systems that know the representations of interest in advance. Our approach performs this task by generating phrases from these representations and then matching these phrases against text using a set of syntactic and semantic transformations. The representation that best matches a piece of text is selected as its meaning. We evaluate our approach in the context of a real-world intelligent system that tracks the maturity of wireless technologies, and show how our approach performs well on capturing semantic representations from text.


International Conference on Tools with Artificial Intelligence | 2009

Towards a Technology Platform for Building Corporate Radar Applications that Mine the Web for Business Insight

Peter Z. Yeh; Colin A. Puri; Alex Kass

In this paper, we give a progress report on an ongoing effort at Accenture to develop a technology platform for building a wide range of corporate radar applications, which can turn the Web into a systematic source of business insight. Our goal is to share the platform we have developed and the lessons we have learned, so others can leverage this knowledge when building similar applications. We give an overview of this platform, which integrates a combination of established AI technologies -- i.e., semantic models, natural language processing, and inference engines -- in a novel way. We then illustrate the kinds of corporate radars that can be built with our platform through two applications we developed at Accenture: the Technology Lifecycle Tracker, which assesses the maturity of technologies from the wireless industry, and the Technology Trend Tracker, which measures hype versus reality for emerging technology trends such as cloud computing, software-as-a-service, and more. Finally, we discuss our experiences in using this platform to build these applications and the lessons learned.


Archive | 2010

Method and System for Accelerated Data Quality Enhancement

Peter Z. Yeh; Colin A. Puri


ICBO | 2011

Multiple Ontologies in Healthcare Information Technology: Motivations and Recommendation for Ontology Mapping and Alignment

Colin A. Puri; Karthik Gomadam; Prateek Jain; Peter Z. Yeh; Kunal Verma


Archive | 2014

Contextual Graph Matching Based Anomaly Detection

Colin A. Puri; John K. Nguyen; Scott W. Kurth


Archive | 2012

Data lineage tracking

Colin A. Puri; Doo Soon Kim; Peter Z. Yeh; Kunal Verma


Archive | 2011

Evolutionary process system

Agata Opalach; Scott W. Kurth; Kimberly Sparkes Ostman; Colin A. Puri

Collaboration


Dive into Colin A. Puri's collaborations.

Top Co-Authors

Kunal Verma

Digital Enterprise Research Institute
