Publication


Featured research published by Rema Ananthanarayanan.


Analytics for Noisy Unstructured Text Data | 2008

Rule based synonyms for entity extraction from noisy text

Rema Ananthanarayanan; Vijil Chenthamarakshan; Prasad M. Deshpande; Raghuram Krishnapuram

Identification of named entities such as person, organization, and product names from text is an important task in information extraction. In many domains, the same entity may be referred to in multiple ways due to variations introduced by different user groups, variations of spelling across regions or cultures, use of abbreviations, typographical errors, and other reasons associated with conventional usage. Identifying a piece of text as a mention of an entity in such noisy data is difficult, even if we have a dictionary of possible entities. Previous approaches treat the synonym problem as part of entity disambiguation and rely on learning-based methods that use the context of the words to identify synonyms. In this paper, we show that existing domain knowledge, encoded as rules, can be used effectively to address the synonym problem to a considerable extent. This makes the disambiguation task simpler, without the need for much training data. We look at a subset of application scenarios in named entity extraction, categorize the possible variations in entity names, and define rules for each category. Using these rules, we generate synonyms for the canonical list and match these synonyms to the actual occurrences in the data sets. In particular, we describe the rule categories that we developed for several named entities and report the results of applying our synonym-generation technique in two different domains.
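The rule-driven generation step lends itself to a compact illustration. Below is a minimal sketch, assuming three invented rule categories (corporate-suffix removal, punctuation stripping, initialisms); the paper's actual rule sets are domain-specific and richer.

```python
# Sketch of rule-based synonym generation for entity matching.
# The rule categories here are illustrative stand-ins for the
# domain-specific rules developed in the paper.

import re

def generate_synonyms(canonical: str) -> set:
    """Expand one canonical entity name into likely surface variants."""
    variants = {canonical}
    # Rule: drop common corporate suffixes ("Acme Corp." -> "Acme").
    variants.add(re.sub(r"\s+(Corporation|Corp\.?|Inc\.?|Ltd\.?)$", "", canonical))
    # Rule: strip punctuation ("J.P. Morgan" -> "JP Morgan").
    variants.add(canonical.replace(".", ""))
    # Rule: initialism from multiword names ("International Business Machines" -> "IBM").
    words = canonical.split()
    if len(words) > 1:
        variants.add("".join(w[0] for w in words).upper())
    return {v.strip() for v in variants}

def match_entities(text: str, canonical_list: list) -> dict:
    """Map each canonical entity to the variants actually found in the text."""
    lowered = text.lower()
    hits = {}
    for entity in canonical_list:
        found = [s for s in generate_synonyms(entity) if s.lower() in lowered]
        if found:
            hits[entity] = found
    return hits

print(match_entities("Shares of IBM rose after the announcement.",
                     ["International Business Machines"]))
# {'International Business Machines': ['IBM']}
```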


International Middleware Conference | 2012

Rapid adjustment and adoption to MIaaS clouds

Balaji Viswanathan; Akshat Verma; Bharat Krishnamurthy; Praveen Jayachandran; Kamal Bhattacharya; Rema Ananthanarayanan

Emerging Managed Infrastructure as a Service (MIaaS) clouds allow enterprises to outsource their IT infrastructure as well as their IT management needs. One of the core tenets of an MIaaS cloud is a standardized service delivery model, allowing the cloud provider to deliver infrastructure management services at a lower cost. As opposed to pure IaaS clouds, where arbitrary customer virtual machines can be migrated to the cloud, migration to MIaaS clouds requires the customer servers to be adapted so that the cloud's steady-state management stack can manage these virtual machines using the standardized delivery model. In this work, we address the problem of migrating customer workloads to a standardized MIaaS cloud. We present the design and implementation of the Rapid Adjustment Engine (RAE). RAE captures the adjustment process for highly diverse customer servers in a unified rule framework. It uses rapid image adjustment to reduce the end-to-end migration time, and a flexible orchestrator framework to integrate diverse functionalities and associated tools into a single migration process. Our experimental evaluation establishes the ability of RAE to enable rapid, reliable, and reduced-cost migration to MIaaS clouds.
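As a rough illustration of the unified rule framework, the sketch below encodes adjustment rules as predicate/action pairs applied by a single orchestration pass. The rule names, server attributes, and actions are hypothetical, not taken from RAE.

```python
# Sketch of a unified adjustment-rule framework in the spirit of RAE.
# Rule names, server attributes, and actions are invented; the real
# engine operates on virtual-machine images and management tooling.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Server:
    os: str
    packages: set = field(default_factory=set)
    log: list = field(default_factory=list)

@dataclass
class AdjustmentRule:
    name: str
    applies: Callable[[Server], bool]   # predicate over the captured image
    apply: Callable[[Server], None]     # mutation bringing it to the standard

RULES = [
    AdjustmentRule(
        "install-monitoring-agent",
        lambda s: "monitor-agent" not in s.packages,
        lambda s: (s.packages.add("monitor-agent"),
                   s.log.append("installed monitor-agent")),
    ),
    AdjustmentRule(
        "remove-legacy-backup",
        lambda s: "legacy-backup" in s.packages,
        lambda s: (s.packages.discard("legacy-backup"),
                   s.log.append("removed legacy-backup")),
    ),
]

def orchestrate(server: Server) -> Server:
    """Apply every matching rule once, mimicking one adjustment pass."""
    for rule in RULES:
        if rule.applies(server):
            rule.apply(server)
    return server

s = orchestrate(Server(os="rhel6", packages={"legacy-backup"}))
print(s.packages, s.log)
```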


International Conference on Data Engineering | 2004

EShopMonitor: a Web content monitoring tool

Neeraj Agrawal; Rema Ananthanarayanan; Rahul Gupta; Sachindra Joshi; Raghu Krishnapuram; Sumit Negi

Data presented on commerce sites runs into thousands of pages, and is typically delivered from multiple back-end sources. This makes it difficult to identify incorrect, anomalous, or interesting data such as $9.99 air fares, missing links, drastic changes in prices, and the addition of new products or promotions. We describe a system that monitors Web sites automatically and generates various types of reports so that the content of the site can be monitored and its quality maintained. The solution consists of a site crawler that crawls dynamic pages, an information miner that learns to extract useful information from the pages based on examples provided by the user, and a reporter that can be configured by the user to answer specific queries. The tool can also be used for identifying price trends and new products or promotions at competitor sites. A pilot run of the tool has been successfully completed at the ibm.com site.
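The crawler-miner-reporter split can be sketched as three small functions. Everything below is illustrative: pages are stubbed strings and the "miner" is a fixed regex, whereas the real tool crawls dynamic pages and learns extraction patterns from user-supplied examples.

```python
# Sketch of the crawler -> miner -> reporter pipeline described above.

import re

def crawl(seed_pages: dict) -> dict:
    """Stand-in for the site crawler: returns page HTML keyed by URL."""
    return seed_pages

def mine_prices(pages: dict) -> dict:
    """Stand-in miner: pull the first $-prefixed price from each page."""
    prices = {}
    for url, html in pages.items():
        m = re.search(r"\$(\d+(?:\.\d\d)?)", html)
        if m:
            prices[url] = float(m.group(1))
    return prices

def report(prices: dict, floor: float = 50.0) -> list:
    """Stand-in reporter: flag suspiciously low prices (e.g. $9.99 air fares)."""
    return [f"ALERT {url}: price ${p:.2f} below ${floor:.2f}"
            for url, p in prices.items() if p < floor]

pages = crawl({"/flights/nyc-sfo": "<span>Fare: $9.99</span>",
               "/flights/nyc-lon": "<span>Fare: $420.00</span>"})
for line in report(mine_prices(pages)):
    print(line)
```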


IEEE International Conference on Services Computing | 2010

AHA: Asset Harvester Assistant

Debdoot Mukherjee; Senthil Mani; Vibha Singhal Sinha; Rema Ananthanarayanan; Biplav Srivastava; Pankaj Dhoolia; Prahlad Chowdhury

Information assets in service enterprises are typically available as unstructured documents. There is an increasing need for unraveling information from these documents into a structured and semantic format. Structured data can be more effectively queried, which increases information reuse from asset repositories. This paper addresses the problem of extracting XML models, which follow a given target schema, from enterprise documents. We discuss why existing approaches for information extraction do not suffice for the enterprise documents created during service delivery. To address this limitation, we present the Asset Harvester Assistant (AHA), a tool that automatically extracts structured models from MS-Word documents and supports manual refinement of the extracted models within an interactive environment. We present the results of empirical studies conducted using business-process documents from real service-delivery engagements. Our results indicate that the AHA approach can be effective in extracting accurate models from unstructured documents and improving user productivity.
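A minimal sketch of schema-guided extraction with a manual-refinement hook follows, assuming an invented target schema and heading-to-field map; AHA itself extracts XML models from MS-Word business-process documents.

```python
# Sketch of schema-guided extraction plus interactive refinement.
# The target schema and heading-to-field map are invented examples.

TARGET_SCHEMA = ["process_name", "owner", "inputs", "outputs"]

HEADING_MAP = {"Process": "process_name", "Responsible": "owner",
               "Inputs": "inputs", "Deliverables": "outputs"}

def extract_model(sections: dict) -> dict:
    """Auto-extract one model instance; unmatched fields stay None for review."""
    model = {field: None for field in TARGET_SCHEMA}
    for heading, body in sections.items():
        field = HEADING_MAP.get(heading)
        if field:
            model[field] = body.strip()
    return model

def refine(model: dict, corrections: dict) -> dict:
    """Manual-refinement step: user overrides win over extracted values."""
    return {**model, **corrections}

doc = {"Process": "Invoice Approval", "Responsible": "Finance Ops",
       "Deliverables": "Approved invoice record"}
print(refine(extract_model(doc), {"inputs": "Scanned invoice"}))
```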


Data and Knowledge Engineering | 2007

Editorial: Some issues in privacy data management

Mukesh K. Mohania; Rema Ananthanarayanan; Ajay Gupta

The management of privacy-sensitive information is a critical need for every enterprise, which must also ensure compliance with the various rules and regulations relating to data management. This poses several challenging problems, such as the need to translate high-level business goals into system-level privacy policies, administration of privacy-sensitive data, privacy data integration and engineering, privacy access control mechanisms, information-oriented security, and query execution on privacy-sensitive data for partial answers. Privacy may be defined as the claim of individuals, groups, and institutions to determine for themselves when, how, and to what extent information about them is communicated to others (this definition is attributed to Professor Alan Westin, Columbia University, 1967). However, with the growth of the web and associated technologies for accessing and processing large quantities of information at a faster rate, privacy issues gain additional significance in view of the increasing volumes of data that may contain sensitive information about business entities, and the increasing rules and regulations for safeguarding such information. Role-based access control [11] has been widely used for data protection, especially when the data resides in relational tables. Further modifications have been proposed to the database at different levels of granularity for additional protection [1,7]. There are several regulations, such as the Sarbanes-Oxley Act (SOX) and the Health Insurance Portability and Accountability Act (HIPAA), that aim to provide regulatory control by mandating disclosure of certain information, or by restricting or imposing conditions on sharing other kinds of information.
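The editorial's reference to role-based access control can be illustrated with a toy policy check; the roles, fields, and policy below are invented for the example.

```python
# Minimal sketch of role-based access control over privacy-sensitive
# fields. Roles, fields, and policies are illustrative only.

ROLE_POLICY = {
    "physician": {"name", "diagnosis", "medications"},
    "billing":   {"name", "insurance_id"},
}

def query(record: dict, role: str) -> dict:
    """Return only the fields the role's policy discloses (a partial answer)."""
    allowed = ROLE_POLICY.get(role, set())
    return {k: v for k, v in record.items() if k in allowed}

patient = {"name": "A. Smith", "diagnosis": "flu",
           "medications": "oseltamivir", "insurance_id": "X-123"}
print(query(patient, "billing"))   # {'name': 'A. Smith', 'insurance_id': 'X-123'}
```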


IBM Journal of Research and Development | 2004

The eShopmonitor: a comprehensive data extraction tool for monitoring web sites

Neeraj Agrawal; Rema Ananthanarayanan; Rahul Gupta; Sachindra Joshi; Raghu Krishnapuram; Sumit Negi

Typical commercial Web sites publish information from multiple back-end data sources; these data sources are also updated very frequently. Given the size of most commercial sites today, it becomes essential to have an automated means of checking for correctness and consistency of data. The eShopmonitor allows users to specify items of interest to be tracked, monitors these items on the Web pages, and reports on any changes observed. Our solution comprises a crawler, a miner, a reporter, and a user component that work together to achieve the above functionality. The miner learns to locate the items of interest on a class of pages based on just one sample supplied by the user, via the user interface (UI) provided. The learning algorithm is based on the XPaths of the Document Object Model (DOM) of the page.
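The one-example learning idea can be sketched as deriving a structural path from the user's single labeled node and replaying it on other pages of the same class; the markup and helper functions below are illustrative, and real pages require full XPaths rather than the first-match tag paths used here.

```python
# Sketch of one-example learning over the DOM: learn a tag path from a
# single labeled node, then reuse it on another page of the same class.

from xml.etree import ElementTree as ET

def learn_path(root, target_text):
    """Return the tag path from the root to the node holding target_text."""
    def walk(node, path):
        if (node.text or "").strip() == target_text:
            return path
        for child in node:
            found = walk(child, path + [child.tag])
            if found:
                return found
        return None
    return walk(root, [root.tag])

def apply_path(root, path):
    """Follow a learned tag path on a new page (first match per tag)."""
    node = root
    for tag in path[1:]:
        node = node.find(tag)
        if node is None:
            return None
    return (node.text or "").strip()

example = ET.fromstring("<html><body><div><span>$19.99</span></div></body></html>")
path = learn_path(example, "$19.99")   # ['html', 'body', 'div', 'span']
other = ET.fromstring("<html><body><div><span>$24.50</span></div></body></html>")
print(apply_path(other, path))          # $24.50
```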


IEEE International Conference on Services Computing | 2009

Dependency Analysis Framework for Software Service Delivery

Rema Ananthanarayanan; Vijil Chenthamarakshan; Heng Chu; Prasad M. Deshpande; Raghu Krishnapuram; Shajeer K. Mohammed

Various phases in the delivery of software services, such as solution design, application deployment, and maintenance, require analysis of the dependencies of the software products that form the solution. As software systems become more complex and involve a large number of software products from multiple vendors, the availability of correct and up-to-date system requirement information becomes critical to ensure proper functioning of managed and maintained software solutions. System requirement information is mostly made available in unstructured formats, from sources such as websites or product documents, and is not amenable to programmatic analysis. In this paper, we motivate the benefits of capturing this information in a structured format for software service delivery, and present a dependency analysis system that collects and integrates software dependency/interoperability information from multiple unstructured sources using text mining techniques. The information thus collected is used to support analytics useful in software service delivery. We report the results of our experiments on mining millions of web pages to collect dependency information for more than 700 software products.
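Once dependency facts are mined into structured form, simple graph analytics follow directly. The products and pairs below are invented; the paper's pipeline mines such facts from millions of web pages.

```python
# Sketch of analytics over mined dependency pairs: build a graph and
# answer a transitive-requirements query. Product names are invented.

from collections import defaultdict

# (product, required-product) pairs, as a text-mining pipeline might emit.
PAIRS = [("AppServer 7", "JDK 1.5"), ("JDK 1.5", "OS X"),
         ("AppServer 7", "DB2 9"), ("DB2 9", "OS X")]

graph = defaultdict(set)
for product, requirement in PAIRS:
    graph[product].add(requirement)

def transitive_requirements(product: str) -> set:
    """Everything a product directly or indirectly depends on."""
    seen, stack = set(), [product]
    while stack:
        for dep in graph[stack.pop()]:
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen

print(transitive_requirements("AppServer 7"))  # {'JDK 1.5', 'DB2 9', 'OS X'}
```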


Congress on Evolutionary Computation | 2005

Negotiation support in online markets, with competition and co-operation

Rema Ananthanarayanan; Manoj Kumar

We describe the design and implementation of middleware for online bilateral negotiations. The negotiation middleware can be used in a stand-alone mode for electronic commerce, and to provide support for other multilateral trading systems such as auctions and RFQs, where it is integrated with bidding and bargaining processes. The negotiation engine design uses the state-machine formalism. It supports multi-attribute bids and offers and a wide variety of evaluation schemes. Since various users of the system may need different negotiation protocols to achieve their differing objectives, such as rapidly concluding the deal or finding the mutually optimal solution, the design of the negotiation engine supports the choice of a negotiation protocol best suited to the objective. The negotiation engine has been prototyped on WebSphere Commerce Suite, IBM's e-commerce middleware.
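The two design points named above, a state-machine protocol and multi-attribute offer evaluation, can be sketched as follows; the states, events, attributes, and weights are illustrative, and the engine itself supports pluggable protocols and evaluation schemes.

```python
# Sketch of a state-machine negotiation protocol plus an additive
# multi-attribute utility. All names and numbers are invented.

TRANSITIONS = {                       # hypothetical bilateral protocol
    ("open", "offer"): "offered",
    ("offered", "counter"): "offered",
    ("offered", "accept"): "closed",
    ("offered", "reject"): "terminated",
}

def step(state: str, event: str) -> str:
    """Advance the negotiation; disallowed events raise an error."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event '{event}' not allowed in state '{state}'")

def score(offer: dict, weights: dict) -> float:
    """Additive multi-attribute utility, one common evaluation scheme."""
    return sum(weights[a] * v for a, v in offer.items())

state = step(step("open", "offer"), "counter")
print(state)                                           # offered
print(score({"price": 0.7, "delivery_days": 0.9},
            {"price": 0.6, "delivery_days": 0.4}))     # 0.78
```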


Very Large Data Bases | 2018

Tooling framework for instantiating natural language querying system

Manasa Jammi; Jaydeep Sen; Ashish R. Mittal; Sagar Verma; Vardaan Pahuja; Rema Ananthanarayanan; Pranay Lohia; Hima P. Karanam; Diptikalyan Saha; Karthik Sankaranarayanan

Recent times have seen a growing demand for natural language querying (NLQ) interfaces for retrieving information from structured data sources such as knowledge bases. Using such an interface, business users can directly interact with a database without knowledge of the query language or the data schema. Our earlier work describes a natural language query engine called ATHENA, which has several shortcomings around ease of use and compatibility with data stores, formats, and flows. In this demonstration paper, we present a tooling framework that addresses these challenges so that one can instantiate an NLQ system with utmost ease. Our framework makes it practical to cover NLIDB scenarios involving different sources of structured data, file formats, and ontologies, enabling natural language querying on top of them with minimal human configuration. We present the tool design, our solutions to the challenges in building such a system, and a demonstration of its applicability in the medical domain.
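A toy sketch of config-driven instantiation follows: a one-time schema ingestion step plus a question-answering entry point. The config keys, loader, and the keyword-matching "translation" are all hypothetical stand-ins for the framework's actual components.

```python
# Sketch of config-driven NLQ instantiation. Everything here is a
# stand-in: a real system parses the question and ranks interpretations.

CONFIG = {
    "source": "csv",                 # e.g. csv / relational / knowledge base
    "path": "patients.csv",
    "ontology": "medical.owl",       # optional domain ontology
}

def load_schema(config: dict) -> dict:
    """Stand-in for schema/ontology ingestion done once at setup time."""
    return {"patients": ["name", "age", "diagnosis"]}

def answer(question: str, schema: dict) -> str:
    """Toy NL-to-SQL step: match a schema column named in the question."""
    for table, columns in schema.items():
        for col in columns:
            if col in question.lower():
                return f"SELECT {col} FROM {table}"
    return "-- could not interpret question"

schema = load_schema(CONFIG)
print(answer("What is the diagnosis of each patient?", schema))
# SELECT diagnosis FROM patients
```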


International Workshop on the Web and Databases | 2018

DataVizard: Recommending Visual Presentations for Structured Data

Rema Ananthanarayanan; Pranay Lohia; Srikanta J. Bedathur

