Publication


Featured research published by Salil Joshi.


Data Mining and Knowledge Discovery | 2014

Detecting localized homogeneous anomalies over spatio-temporal data

Aditya Telang; Deepak P; Salil Joshi; Prasad M. Deshpande; Ranjana Rajendran

The last decade has witnessed unprecedented growth in the availability of data with spatio-temporal characteristics. Given the scale and richness of such data, finding spatio-temporal patterns that behave significantly differently from their neighbors could be of interest in application scenarios such as weather modeling, analyzing the spread of disease outbreaks, and monitoring traffic congestion. In this paper, we propose an automated approach for exploring and discovering such anomalous patterns irrespective of the underlying domain from which the data is recovered. Our approach differs significantly from traditional methods of spatial outlier detection and employs two phases: (i) discovering homogeneous regions, and (ii) evaluating these regions as anomalies based on their statistical difference from a generalized neighborhood. We evaluate the quality of our approach and distinguish it from existing techniques via an extensive experimental evaluation.


International World Wide Web Conferences | 2013

Offering language based services on social media by identifying user's preferred language(s) from romanized text

Mitesh M. Khapra; Salil Joshi; Ananthakrishnan Ramanathan; Karthik Visweswariah

With the increase of multilingual content and multilingual users on the web, it is prudent to offer personalized services and ads to users based on their language profile (i.e., the list of languages that a user is conversant with). Identifying the language profile of a user is often non-trivial because (i) users often do not specify all the languages known to them while signing up for an online service, and (ii) users of many languages (especially Indian languages) largely use Latin/Roman script to write content in their native language. This makes it non-trivial for a machine to distinguish the language of one comment from another. This situation presents an opportunity for offering the following language-based services for romanized content: (i) hide romanized comments that belong to a language not known to the user, (ii) translate romanized comments that belong to a language not known to the user, (iii) transliterate romanized comments that belong to a language known to the user, and (iv) show language-based ads by identifying the languages known to a user from the romanized comments that the user wrote, read, or liked. We first use a simple bootstrapping-based semi-supervised algorithm to identify the language of a romanized comment. We then apply this algorithm to all the comments written, read, or liked by a user to build a language profile of the user, and propose that this profile can be used to offer the services mentioned above.


Very Large Data Bases | 2017

Creation and interaction with large-scale domain-specific knowledge bases

Shreyas Bharadwaj; Laura Chiticariu; Marina Danilevsky; Samarth Dhingra; Samved Divekar; Arnaldo Carreno-Fuentes; Himanshu Gupta; Nitin Gupta; Sang-Don Han; Mauricio A. Hernández; Howard Ho; Parag Jain; Salil Joshi; Hima P. Karanam; Saravanan Krishnan; Rajasekar Krishnamurthy; Yunyao Li; Satishkumaar Manivannan; Ashish R. Mittal; Fatma Ozcan; Abdul Quamar; Poornima Raman; Diptikalyan Saha; Karthik Sankaranarayanan; Jaydeep Sen; Prithviraj Sen; Shivakumar Vaithyanathan; Mitesh Vasa; Hao Wang; Huaiyu Zhu

The ability to create and interact with large-scale domain-specific knowledge bases from unstructured/semi-structured data is the foundation for many industry-focused cognitive systems. We will demonstrate the Content Services system that provides cloud services for creating and querying high-quality domain-specific knowledge bases by analyzing and integrating multiple (un/semi)structured content sources. We will showcase an instantiation of the system for a financial domain. We will also demonstrate both cross-lingual natural language queries and programmatic API calls for interacting with this knowledge base.


Archive | 2017

Insights on Hindi WordNet Coming from the IndoWordNet

Laxmi Kashyap; Salil Joshi; Pushpak Bhattacharyya

In a multilingual country such as India, machine translation and crosslingual search are highly relevant problems. WordNets, as crucial linguistic resources, play a dominant role in text processing and in applications such as machine learning, machine translation, information extraction, information retrieval, and natural language understanding systems. Therefore, no meaningful research in these areas can be complete without their help. This paper reports the categorization work on synsets of the Hindi WordNet (version 1.2), the challenges that were faced while doing the work, and the solutions obtained for them. A number of concepts are common to most languages, and linking these concepts with each other can provide an indispensable resource for natural language processing and language technology. The WordNet for the Hindi language is created using the ab initio method, while all the other Indian language WordNets are being created from the Hindi WordNet through the expansion approach. The Hindi WordNet thus forms the foundation for the other Indian language WordNets, as they are based on it and are being linked to it.


Knowledge and Information Systems | 2015

The Mask of ZoRRo: preventing information leakage from documents

Prasad M. Deshpande; Salil Joshi; Prateek Dewan; Karin Murthy; Mukesh K. Mohania; Sheshnarayan Agrawal

In today’s enterprise world, information about business entities such as a customer’s or patient’s name, address, and social security number is often present in both relational databases and content repositories. Information about such business entities is generally well protected in databases by well-defined and fine-grained access control. However, current document retrieval systems do not provide user-specific, fine-grained redaction of documents to prevent leakage of information about business entities from documents. This leaves companies with only two choices: either providing complete access to a document, risking potential information leakage, or prohibiting access to the document altogether, accepting a potentially negative impact on business processes. In this paper, we present ZoRRo, an add-on for document retrieval systems to dynamically redact sensitive information of business entities referenced in a document based on access control defined for the entities. ZoRRo exploits database systems’ fine-grained, label-based access-control mechanism to identify and redact sensitive information from unstructured text, based on the access privileges of the user viewing it. To make on-the-fly redaction feasible, ZoRRo exploits the concept of k-safety in combination with Lucene-based indexing and scoring. We demonstrate the efficiency and effectiveness of ZoRRo through a detailed experimental study.


North American Chapter of the Association for Computational Linguistics | 2013

More than meets the eye: Study of Human Cognition in Sense Annotation

Salil Joshi; Diptesh Kanojia; Pushpak Bhattacharyya


Meeting of the Association for Computational Linguistics | 2011

Together We Can: Bilingual Bootstrapping for WSD

Mitesh M. Khapra; Salil Joshi; Arindam Chatterjee; Pushpak Bhattacharyya


International Joint Conference on Natural Language Processing | 2011

It Takes Two to Tango: A Bilingual Unsupervised Approach for Estimating Sense Distributions using Expectation Maximization

Mitesh M. Khapra; Salil Joshi; Pushpak Bhattacharyya


Archive | 2013

DISCOVERY OF RELATED ENTITIES IN A MASTER DATA MANAGEMENT SYSTEM

Prasad M. Deshpande; Salil Joshi; Mukesh K. Mohania; Karin Murthy; Scott Schumacher; Bruhathi H. Sundarmurthy


Archive | 2017

IDENTIFYING ENTITY MAPPINGS ACROSS DATA ASSETS

Prasad M. Deshpande; Atreyee Dey; Rajeev Gupta; Sanjeev Kumar Gupta; Salil Joshi; Sriram Padmanabhan

Collaboration


Top co-authors of Salil Joshi.

Pushpak Bhattacharyya
Indian Institute of Technology Bombay

Arindam Chatterjee
Indian Institute of Technology Bombay

Diptesh Kanojia
Indian Institute of Technology Bombay

Aditya Telang
University of Texas at Arlington