Publication


Featured research published by Vishal Gupta.


International Conference on Information Systems | 2011

Preprocessing Phase of Punjabi Language Text Summarization

Vishal Gupta; Gurpreet Singh Lehal

Punjabi text summarization is the process of condensing a source Punjabi text into a shorter version while preserving its information content and overall meaning. It comprises two phases: (1) preprocessing and (2) processing. Preprocessing produces a structured representation of the Punjabi text. This paper concentrates on the preprocessing phase of Punjabi text summarization. Its sub-phases are: Punjabi word boundary identification, Punjabi stop word elimination, Punjabi noun stemming, finding common English-Punjabi noun words, finding Punjabi proper nouns, Punjabi sentence boundary identification, and identification of Punjabi cue phrases in a sentence.
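Three of the sub-phases listed above (sentence boundary identification, word boundary identification and stop word elimination) can be sketched in a few lines. This is only a minimal illustration, not the paper's implementation: the stop word set is a tiny made-up sample, sentences are assumed to end at the danda (।), "?" or "!", and word boundaries are approximated by whitespace.

```python
import re

# Illustrative sample only; the paper's actual Punjabi stop word list is not reproduced here.
STOP_WORDS = {"ਹੈ", "ਦਾ", "ਦੇ", "ਦੀ", "ਅਤੇ", "ਨੂੰ"}

# Assumption: a sentence ends at the danda, a question mark or an exclamation mark.
SENTENCE_END = re.compile(r"[।?!]")


def split_sentences(text):
    """Sentence boundary identification (simplified)."""
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]


def split_words(sentence):
    """Word boundary identification, approximated by whitespace splitting."""
    return sentence.split()


def remove_stop_words(words):
    """Stop word elimination against the sample set above."""
    return [w for w in words if w not in STOP_WORDS]


def preprocess(text):
    """Return one list of content words per sentence."""
    return [remove_stop_words(split_words(s)) for s in split_sentences(text)]


sample = "ਇਹ ਕਿਤਾਬ ਹੈ। ਉਹ ਸਕੂਲ ਗਿਆ।"
for words in preprocess(sample):
    print(words)
```

The remaining sub-phases (noun stemming, proper noun and cue phrase identification) would need language resources that are not described in the abstract, so they are left out of the sketch.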


ACM Computing Surveys | 2016

Text Stemming: Approaches, Applications, and Challenges

Jasmeet Singh; Vishal Gupta

Stemming is a process in which the variant word forms are mapped to their base form. It is among the basic text pre-processing approaches used in Language Modeling, Natural Language Processing, and Information Retrieval applications. In this article, we present a comprehensive survey of text stemming techniques, evaluation mechanisms, and application domains. The main objective of this survey is to distill the main insights and present a detailed assessment of the current state of the art. The performance of some well-known rule-based and statistical stemming algorithms in different scenarios has been analyzed. In the end, we highlighted some open issues and challenges related to unsupervised statistical text stemming. This research work will help the researchers to select the most suitable text stemming technique in a specific application and will also serve as a guide to identify the areas that need attention from the research community.


SIRS | 2014

Automatic Stemming of Words for Punjabi Language

Vishal Gupta

The major task of a stemmer is to find the root of words that are not in their original form and are hence absent from the dictionary. After stemming, the stemmer looks up the word in the dictionary. If no match is found, the word may be incorrect or a name; otherwise the word is correct. For any language, a stemmer is a basic linguistic resource required to develop high-accuracy Natural Language Processing (NLP) applications such as machine translation, document classification, document clustering, text question answering, topic tracking, text summarization and keyword extraction. This paper concentrates on complete automatic stemming of Punjabi words, covering Punjabi nouns, verbs, adjectives, adverbs, pronouns and proper names. A suffix list of 18 suffixes for Punjabi nouns and proper names, a number of other suffixes for Punjabi verbs, adjectives and adverbs, and different stemming rules for Punjabi nouns, verbs, adjectives, adverbs, pronouns and proper names have been generated after analysis of a Punjabi corpus. This is the first time that a complete Punjabi stemmer covering Punjabi nouns, verbs, adjectives, adverbs, pronouns and proper names has been proposed, and it will be useful for developing other Punjabi NLP applications with high accuracy. A portion of the Punjabi stemmer for proper names and nouns has been implemented as part of a Punjabi text summarizer, with MS Access as the back end and ASP.NET as the front end, with 87.37% efficiency.
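The strip-then-look-up behaviour described above can be sketched as follows. The suffix list and dictionary here are hypothetical one- and two-entry samples chosen for illustration; the paper's 18 noun suffixes and its stemming rules are not given in the abstract and are not reproduced.

```python
# Hypothetical samples, not the paper's resources.
SUFFIXES = ["ਾਂ", "ਆਂ"]      # assumed plural-style suffixes
DICTIONARY = {"ਕਿਤਾਬ"}       # assumed root-word dictionary


def stem(word, suffixes=SUFFIXES, dictionary=DICTIONARY):
    """Return (stem, found_in_dictionary).

    The word is looked up as-is first. Otherwise each suffix is stripped,
    longest first, and the candidate root is looked up. If nothing matches,
    the word is returned unchanged with found=False, which per the abstract
    suggests an incorrect word or a name.
    """
    if word in dictionary:
        return word, True
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            candidate = word[: -len(suffix)]
            if candidate in dictionary:
                return candidate, True
    return word, False


print(stem("ਕਿਤਾਬਾਂ"))   # inflected form of the sample dictionary word
print(stem("ਵਿਸ਼ਾਲ"))     # not in the sample dictionary
```

Returning the lookup flag alongside the stem keeps the "possible name or misspelling" decision with the caller, which is one reasonable design among several.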


Cognitive Computation | 2016

A Novel Hybrid Text Summarization System for Punjabi Text

Vishal Gupta; Narvinder Kaur

Text summarization is the task of shortening text documents while retaining their overall meaning and information. A good summary should highlight the main concepts of a text document. Many statistical, location-based and linguistic techniques are available for text summarization. This paper describes a novel hybrid technique for automatic summarization of Punjabi text. Punjabi is an official language of Punjab State in India, and very few linguistic resources are available for it. The proposed summarization system is a hybrid of conceptual, statistical, location-based and linguistic features for Punjabi text. In this system, four new location-based features and two new statistical features (entropy measure and Z score) are used, and the results are very encouraging. A support vector machine-based classifier is also used to classify Punjabi sentences into summary and non-summary sentences and to handle imbalanced data. The synthetic minority over-sampling technique is applied to over-sample the minority class data. The results of the proposed system are compared with different baseline systems, and it is found that the F score, Precision, Recall and ROUGE-2 score of our system compare reasonably well with those of the baseline systems. Moreover, the summary quality of the proposed system is comparable to the gold summary.
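The abstract names an entropy measure and a Z score as statistical features but does not define them. The sketch below uses textbook definitions as a stand-in: Shannon entropy of the word distribution inside a sentence, and the standard score of a per-sentence value relative to the whole document. Both choices are assumptions and may differ from the paper's formulas.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def sentence_entropy(words):
    """Shannon entropy (in bits) of the word distribution within one sentence."""
    counts = Counter(words)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def z_scores(values):
    """Standard score of each value relative to the population of all values."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All sentences have the same value, so none stands out.
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


# Placeholder tokens standing in for tokenized Punjabi sentences.
sentences = [["a", "b", "c", "d"], ["a", "a", "b"], ["a", "b"]]
entropies = [sentence_entropy(s) for s in sentences]
lengths = [len(s) for s in sentences]
print(entropies)
print(z_scores(lengths))
```

Values like these would be two columns of a per-sentence feature vector; the classifier and over-sampling steps mentioned in the abstract are outside the scope of this sketch.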


Multi-disciplinary Trends in Artificial Intelligence | 2014

N-gram Based Approach for Opinion Mining of Punjabi Text

Amandeep Kaur; Vishal Gupta

Opinion mining is the process of analyzing the views, attitudes or opinions of a writer or a speaker. Research in this area involves detecting opinions in text of any language. A vast amount of work has been done for English. Despite the lack of resources for Indian languages, work has been done for Telugu, Bengali and Hindi. In this paper, we propose a hybrid approach for emotion/opinion mining of Punjabi text. The hybrid technique combines Naive Bayes and N-grams. As part of the presented research, we extracted N-gram features, which are used to train the Naive Bayes classifier. The trained model is then validated using the testing data. The results are also compared with existing approaches, and the accuracy obtained shows the better efficacy of the proposed method.
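The combination described above (N-gram features feeding a Naive Bayes classifier) can be sketched with a small multinomial Naive Bayes written from scratch. The feature choice (unigrams plus bigrams), the add-one smoothing and the English placeholder training data are all assumptions made for illustration; the paper's features, corpus and settings are not given in the abstract.

```python
import math
from collections import Counter, defaultdict


def ngrams(tokens, n_max=2):
    """All n-grams of length 1..n_max, joined with spaces."""
    feats = []
    for n in range(1, n_max + 1):
        feats.extend(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return feats


class NGramNaiveBayes:
    """Multinomial Naive Bayes over n-gram counts with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            feats = ngrams(tokens)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total = sum(self.feature_counts[label].values())
            for feat in ngrams(tokens):
                if feat not in self.vocab:
                    continue  # features never seen in training are ignored
                count = self.feature_counts[label][feat]
                score += math.log((count + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Placeholder English tokens standing in for tokenized Punjabi reviews.
train_docs = [["good", "film"], ["very", "good", "story"],
              ["bad", "film"], ["very", "bad", "story"]]
train_labels = ["pos", "pos", "neg", "neg"]
model = NGramNaiveBayes().fit(train_docs, train_labels)
print(model.predict(["good", "story"]))
```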


Artificial Cells Nanomedicine and Biotechnology | 2018

Enhanced acyclovir delivery using w/o type microemulsion: preclinical assessment of antiviral activity using murine model of zosteriform cutaneous HSV-1 infection

Amanpreet Kaur; Gajanand Sharma; Vishal Gupta; Radha Kanta Ratho; Shishu; Om Prakash Katare

The present study aimed to develop and evaluate a microemulsion-based dermal drug delivery system for the antiviral agent acyclovir. A water-in-oil microemulsion was prepared using isopropyl myristate, Tween 20, Span 20, water and dimethylsulphoxide. It was characterized for drug content, stability, globule size, pH, viscosity and ex vivo permeation through mouse skin. The in vivo antiviral efficacy of the optimized formulation was assessed in female Balb/c mice against herpes simplex virus-I (HSV-I)-induced infection. It was observed that the optimized formulation, when applied 24 h post-infection, could completely inhibit the development of cutaneous herpetic lesions vis-à-vis the marketed cream.


Cognitive Computation | 2017

An Efficient Corpus-Based Stemmer

Jasmeet Singh; Vishal Gupta

Word stemming is a linguistic process in which the various inflected word forms are matched to their base form. It is among the basic text pre-processing approaches used in Natural Language Processing and Information Retrieval. Stemming is employed at the text pre-processing stage to solve the issue of vocabulary mismatch or to reduce the size of the word vocabulary, and consequently also the dimensionality of training data for statistical models. In this article, we present a fully unsupervised corpus-based text stemming method which clusters morphologically related words based on lexical knowledge. The proposed method performs cognitive-inspired computing to discover morphologically related words from the corpus without any human intervention or language-specific knowledge. The performance of the proposed method is evaluated in inflection removal (approximating lemmas) and Information Retrieval tasks. The retrieval experiments in four different languages using standard Text Retrieval Conference, Cross-Language Evaluation Forum, and Forum for Information Retrieval Evaluation collections show that the proposed stemming method performs significantly better than no stemming. In the case of highly inflectional languages, Marathi and Hungarian, the improvement in Mean Average Precision is nearly 50% as compared to unstemmed words. Moreover, the proposed unsupervised stemming method outperforms state-of-the-art strong language-independent and rule-based stemming methods in all the languages. Besides Information Retrieval, the proposed stemming method also performs significantly better in inflection removal experiments. The proposed unsupervised language-independent stemming method can be used as a multipurpose tool for various tasks such as the approximation of lemmas, improving retrieval performance or other Natural Language Processing applications.
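The retrieval results above are reported as Mean Average Precision (MAP) and as a relative improvement over unstemmed retrieval. A minimal sketch of both calculations, using the standard definition of average precision, is shown below. The document IDs and the MAP values in the usage lines are made up for illustration and are not taken from the paper.

```python
def average_precision(ranked, relevant):
    """Average of precision@k over the ranks k at which a relevant document appears."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Mean of average precision over (ranked_list, relevant_set) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


def relative_improvement(baseline, improved):
    """Fractional gain of `improved` over `baseline` (0.5 means 50%)."""
    return (improved - baseline) / baseline


# Hypothetical rankings for two queries.
runs = [
    (["d1", "d2", "d3"], {"d1", "d3"}),
    (["d2", "d1"], {"d1"}),
]
print(mean_average_precision(runs))

# Hypothetical MAP values: unstemmed 0.20, stemmed 0.30.
print(relative_improvement(0.20, 0.30))
```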


SIRS | 2014

Hybrid Approach for Punjabi Question Answering System

Poonam Gupta; Vishal Gupta

In this paper, a hybrid algorithm for a Punjabi question answering system has been implemented. The hybrid system works on various question types, using pattern matching together with a mathematical expression to build a scoring system that helps differentiate the best answer among the multiple candidate answers found by the algorithm; it is also domain specific (for example, sports). The proposed system is designed and built to increase the accuracy of question answering in terms of recall and precision, and it works for factoid questions and answer text in Punjabi. The system constructs a novel mathematical scoring system to identify the most accurate probable answer out of the multiple answer patterns. Answers are extracted for various types of Punjabi questions. The experimental results are evaluated on the basis of Precision, Recall, F-score and Mean Reciprocal Rank (MRR). The average values of precision, recall, F-score and MRR are 85.66%, 65.28%, 74.06% and 0.43 (normalised value), respectively. The MRR values are optimal and act as discrimination factors between one relevant answer and another.
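The evaluation measures named above have standard definitions, sketched here. The F-score is taken as the harmonic mean of precision and recall, and MRR as the mean of 1/rank of the first correct answer per question (0 when no correct answer is returned). The rank list in the usage lines is hypothetical. Applying the harmonic mean directly to the reported average precision and recall gives roughly 74.1, close to the reported 74.06%; a small gap is expected if the paper averaged per-category F-scores, which is an assumption since the abstract does not say.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (same units in, same units out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean_reciprocal_rank(first_correct_ranks):
    """Mean of 1/rank over questions; None means no correct answer was returned."""
    total = sum(1 / r for r in first_correct_ranks if r)
    return total / len(first_correct_ranks)


# Reported averages from the abstract, in percent.
print(round(f_score(85.66, 65.28), 2))

# Hypothetical ranks of the first correct answer for four questions.
print(mean_reciprocal_rank([1, 2, None, 4]))
```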


International Conference on Electrical, Electronics, and Optimization Techniques | 2016

Efficiency comparison of various plagiarism detection techniques

Mansi Sahi; Vishal Gupta

Plagiarism is the use of someone else's ideas without acknowledgement or proper citation of the source. This paper discusses various plagiarism detection techniques, including semantic-based, improved ranking-based semantic, semantic- and syntactic-based, metrics-based and fuzzy-based techniques. The study demonstrates that current plagiarism detection techniques not only concentrate on exact copies but also, to some extent, catch intelligent plagiarism such as paraphrasing, restructuring of words in a sentence, technical tricks that exploit weaknesses of detection systems, and deliberate or inaccurate use of references.


International Conference on Recent Advances in Engineering and Computational Sciences | 2015

Performance analysis of recent Word Sense Disambiguation techniques

Harsimran Singh; Vishal Gupta

This paper presents recent advances in the area of Word Sense Disambiguation (WSD). Supervised machine learning techniques have proven to be the most efficient, although they face the problem of the availability of sense-tagged data. After describing a few important techniques, the paper presents a comparative analysis among them. There is very little commonality among the data sets that have been used, but it has been found that the Genetic Algorithm-based approach has the capability to beat other milestone techniques in the literature.

Collaboration


Dive into Vishal Gupta's collaboration.

Top Co-Authors

Saurabh Sharma

Indian Institute of Technology Roorkee
