Publication


Featured research published by Mayank Singh.


D-Lib Magazine | 2015

The Tenth Anniversary of Assigning DOI Names to Scientific Data and a Five Year History of DataCite

Mayank Singh; Soumajit Pramanik; Tanmoy Chakraborty

As part of a project initiated by the German Research Foundation (DFG), the German National Library of Science and Technology (TIB) assigned its first DOI names to scientific data in summer 2004. The goal was to use persistent identifiers as part of a broader effort to make scientific datasets citable research outputs. The effort begun by TIB led to the creation and funding of DataCite on 1 December 2009. During the past five years DataCite has grown into a global consortium that has assigned over four million DOI names to scientific datasets and other research artefacts. It is a successful cooperative effort led by scientists, librarians and researchers. This article highlights its development and gives an overview of DataCite's recent work.

This paper describes PubIndia, a framework for analyzing the growth and impact of research activities performed in India in the computer science domain, based on the evidence of scientific publications. We gathered and analyzed a massive publication dataset of more than 2.5 million papers in the computer science domain with rich metadata associated with each paper. Specifically, we analyzed the temporal evolution of the collaboration pattern and the shift in research work among different topics, and made a thorough comparison between Indian and Chinese research activities. A preliminary analysis on a subset of papers on Natural Language Processing extracted from the large dataset revealed that Indian researchers quite often collaborate with researchers outside India, whereas Chinese researchers tend to work among themselves. We also show the evolutionary landscape of different keywords, which indicates how the importance of individual keywords varies over the years.


Conference on Information and Knowledge Management | 2015

The Role Of Citation Context In Predicting Long-Term Citation Profiles: An Experimental Study Based On A Massive Bibliographic Text Dataset

Mayank Singh; Vikas Patidar; Suhansanu Kumar; Tanmoy Chakraborty; Animesh Mukherjee; Pawan Goyal

The impact and significance of a scientific publication are measured mostly by the number of citations it accumulates over the years. Early prediction of the citation profile of research articles is a significant as well as challenging problem. In this paper, we argue that features gathered from the citation contexts of research papers can be very relevant for citation prediction. Analyzing a massive dataset of nearly 1.5 million computer science articles and more than 26 million citation contexts, we show that average countX (number of times a paper is cited within the same article) and average citeWords (number of words within the citation context) discriminate between various citation ranges as well as citation categories. We use these features in a stratified learning framework for future citation prediction. Experimental results show that the proposed model significantly outperforms the existing citation prediction models by a margin of 8-10% on average under various experimental settings. Specifically, the features derived from the citation context help in predicting long-term citation behavior.
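The two context features named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `context_features`, the input layout of (citing paper, cited paper, context text) triples, the whitespace tokenization, and the choice to average citeWords over individual contexts are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def context_features(contexts):
    """Compute average countX and average citeWords per cited paper.

    `contexts` is an iterable of (citing_id, cited_id, context_text)
    triples, one triple per in-text citation (hypothetical input layout).
    """
    # Word counts of every citation context, grouped by (citing, cited) pair.
    per_pair = defaultdict(list)
    for citing_id, cited_id, text in contexts:
        # Assumption: words are whitespace-separated tokens.
        per_pair[(citing_id, cited_id)].append(len(text.split()))

    count_x = defaultdict(list)     # mentions per citing article
    cite_words = defaultdict(list)  # words per citation context
    for (_citing_id, cited_id), lengths in per_pair.items():
        count_x[cited_id].append(len(lengths))
        cite_words[cited_id].extend(lengths)

    return {
        cited_id: {
            "avg_countX": mean(count_x[cited_id]),
            # Assumption: averaged over contexts, not over citing articles.
            "avg_citeWords": mean(cite_words[cited_id]),
        }
        for cited_id in count_x
    }


example = [
    ("A", "P", "one two three"),
    ("A", "P", "four five"),
    ("B", "P", "six"),
]
features = context_features(example)
```

In this toy input, paper P is cited twice in article A and once in article B, so its average countX is the mean of 2 and 1, and its average citeWords is the mean of the three context lengths.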


ACM/IEEE Joint Conference on Digital Libraries | 2015

ConfAssist: A Conflict Resolution Framework for Assisting the Categorization of Computer Science Conferences

Mayank Singh; Tanmoy Chakraborty; Animesh Mukherjee; Pawan Goyal

Classifying publication venues into top-tier or non-top-tier is quite subjective and can be debatable at times. In this paper, we propose ConfAssist, a novel assisting framework for conference categorization that aims to address the limitations in the existing systems and portals for venue classification. We identify various features related to the stability of conferences that might help us separate a top-tier conference from the rest. While there are many clear cases where expert agreement on whether a conference is top-tier can be reached almost immediately, there are equally many cases that can result in a conflict even among the experts. ConfAssist tries to serve as an aid in such cases by increasing the confidence of the experts in their decision. A human judgment survey was conducted with 28 domain experts. The results were quite impressive, with 91.6% classification accuracy.


ACM/IEEE Joint Conference on Digital Libraries | 2017

Understanding the impact of early citers on long-term scientific impact

Mayank Singh; Ajay Jaiswal; Priya Shree; Arindam Pal; Animesh Mukherjee; Pawan Goyal

This paper explores an interesting new dimension to the challenging problem of predicting long-term scientific impact (LTSI), usually measured by the number of citations accumulated by a paper in the long term. It is well known that early citations (within 1-2 years after publication) acquired by a paper positively affect its LTSI. However, there is no work that investigates whether the set of authors who bring in these early citations to a paper also affect its LTSI. In this paper, we demonstrate for the first time the impact of these authors, whom we call early citers (EC), on the LTSI of a paper. Note that this study of the complex dynamics of EC introduces a brand new paradigm in citation behavior analysis. Using a massive computer science bibliographic dataset, we identify two distinct categories of EC: we call those authors who have a high overall publication/citation count in the dataset influential, and the rest of the authors non-influential. We investigate three characteristic properties of EC and present an extensive analysis of how each category correlates with LTSI in terms of these properties. In contrast to popular perception, we find that influential EC negatively affect LTSI, possibly owing to attention stealing. To motivate this, we present several representative examples from the dataset. A closer inspection of the collaboration network reveals that this stealing effect is more pronounced if an EC is nearer to the authors of the paper being investigated. As an intuitive use case, we show that incorporating EC properties in the state-of-the-art supervised citation prediction models leads to high performance margins. In closing, we present an online portal to visualize EC statistics along with the prediction results for a given query paper. The portal is accessible online at: http://www.cnergres.iitkgp.ac.in/earlyciters/. To facilitate reproducible research, we make all the code and the processed dataset available in the public domain.


Journal of Informetrics | 2016

Is this conference a top-tier? ConfAssist: An assistive conflict resolution framework for conference categorization

Mayank Singh; Tanmoy Chakraborty; Animesh Mukherjee; Pawan Goyal

Classifying publication venues into top-tier or non-top-tier is quite subjective and can be debatable at times. In this paper, we propose ConfAssist, a novel assisting framework for conference categorization that aims to address the limitations in the existing systems and portals for venue classification. We start with the hypothesis that top-tier conferences are much more stable than other conferences and that the inherent dynamics of these groups differ to a very large extent. We identify various features related to the stability of conferences that might help us separate a top-tier conference from the rest. While there are many clear cases where expert agreement on whether a conference is top-tier can be reached almost immediately, there are equally many cases that can result in a conflict even among the experts. ConfAssist tries to serve as an aid in such cases by increasing the confidence of the experts in their decision. An analysis of 110 conferences from 22 sub-fields of computer science clearly favors our hypothesis, as the top-tier conferences are found to exhibit far fewer fluctuations in the stability-related features than the non-top-tier ones. We evaluate our hypothesis using systems based on conference categorization. For the evaluation, we conducted a human judgment survey with 28 domain experts. The results are impressive, with 85.18% classification accuracy. We also compare the dynamics of newly started conferences with those of older conferences to identify the initial signals of popularity. The system is applicable to any conference with at least 5 years of publication history.


Knowledge Discovery and Data Mining | 2017

Relay-Linking Models for Prominence and Obsolescence in Evolving Networks

Mayank Singh; Rajdeep Sarkar; Pawan Goyal; Animesh Mukherjee; Soumen Chakrabarti

The rate at which nodes in evolving social networks acquire links (friends, citations) shows complex temporal dynamics. Preferential attachment and link copying models, while enabling elegant analysis, only capture rich-gets-richer effects, not aging and decline. Recent aging models are complex and heavily parameterized; most involve estimating 1-3 parameters per node. These parameters are intrinsic: they explain decline in terms of events in the past of the same node, and do not explain, using the network, where the linking attention might go instead. We argue that traditional characterizations of linking dynamics are insufficient to judge the faithfulness of models. We propose a new temporal sketch of an evolving graph, and introduce several new characterizations of a network's temporal dynamics. Then we propose a new family of frugal aging models with no per-node parameters and only two global parameters. Our model is based on a surprising inversion or undoing of triangle completion, where an old node relays a citation to a younger follower in its immediate vicinity. Despite having very few parameters, the new family of models shows a remarkably better fit with real data. Before concluding, we analyze temporal signatures for various research communities, yielding further insights into their comparative dynamics. To facilitate reproducible research, we shall soon make all the code and the processed dataset available in the public domain.


ACM/IEEE Joint Conference on Digital Libraries | 2017

Citation sentence reuse behavior of scientists: a case study on massive bibliographic text dataset of computer science

Mayank Singh; Abhishek Niranjan; Divyansh Gupta; Nikhil Angad Bakshi; Animesh Mukherjee; Pawan Goyal

Our current knowledge of scholarly plagiarism is largely based on the similarity between full-text research articles. In this paper, we propose a novel conceptualization of scholarly plagiarism in the form of reuse of explicit citation sentences in scientific research articles. Note that while full-text plagiarism is an indicator of gross-level behavior, copying of citation sentences is a more nuanced micro-scale phenomenon observed even for well-known researchers. The current work poses several interesting questions and attempts to answer them by empirically investigating a large bibliographic text dataset from computer science containing millions of lines of citation sentences. In particular, we report evidence of massive copying behavior. We also present several striking real examples throughout the paper to showcase the widespread adoption of this undesirable practice. In contrast to popular perception, we find that the copying tendency increases as an author matures. The copying behavior is reported to exist in all fields of computer science; however, the theoretical fields indicate more copying than the applied fields.


Knowledge Discovery and Data Mining | 2016

FeRoSA: A Faceted Recommendation System for Scientific Articles

Tanmoy Chakraborty; Amrith Krishna; Mayank Singh; Niloy Ganguly; Pawan Goyal; Animesh Mukherjee


International Conference on Computational Linguistics | 2016

OCR++: A Robust Framework For Information Extraction from Scholarly Articles.

Mayank Singh; Barnopriyo Barua; Priyank Palod; Manvi Garg; Sidhartha Satapathy; Samuel Bushi; Kumar Ayush; Krishna Sai Rohith; Tulasi Gamidi; Pawan Goyal; Animesh Mukherjee


arXiv: Computation and Language | 2016

Which techniques does your application use?: An information extraction framework for scientific articles.

Soham Dan; Sanyam Agarwal; Mayank Singh; Pawan Goyal; Animesh Mukherjee

Collaboration


Dive into Mayank Singh's collaborations.

Top Co-Authors

Animesh Mukherjee, Indian Institute of Technology Kharagpur

Pawan Goyal, Indian Institute of Technology Kharagpur

Tanmoy Chakraborty, Indian Institute of Technology Kharagpur

Rajdeep Sarkar, Indian Institute of Technology Kharagpur

Soumen Chakrabarti, Indian Institute of Technology Bombay

Abhishek Niranjan, Indian Institute of Technology Kharagpur

Ajay Jaiswal, Indian Institute of Technology Kharagpur

Amrith Krishna, Indian Institute of Technology Kharagpur

Arindam Pal, Tata Consultancy Services

Divyansh Gupta, Indian Institute of Technology Kharagpur