Muthu Kumar Chandrasekaran
National University of Singapore
Publication
Featured research published by Muthu Kumar Chandrasekaran.
ACM/IEEE Joint Conference on Digital Libraries | 2013
Huy Hoang Do; Muthu Kumar Chandrasekaran; Philip S. Cho; Min-Yen Kan
We introduce Enlil, an information extraction system that discovers the institutional affiliations of authors in scholarly papers. Enlil works in two steps: a conditional random field first identifies authors and affiliations, and a support vector machine then connects authors to their affiliations. We benchmark Enlil in three separate experiments drawn from three different sources: the ACL Anthology Corpus, the ACM Digital Library, and a set of cross-disciplinary scientific journal articles acquired by querying Google Scholar. Against a state-of-the-art production baseline, Enlil reports a statistically significant improvement in F_1 of nearly 10% (p << 0.01). In the case of multidisciplinary articles from Google Scholar, Enlil is benchmarked over both clean input (F_1 > 90%) and automatically acquired input (F_1 > 80%). We have deployed Enlil in a case study involving Asian genomics research publication patterns to understand how government-sponsored collaborative links evolve. Enlil has enabled our team to construct and validate new metrics to quantify the facilitation of research as opposed to direct publication.
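As a reading aid for the F_1 figures quoted above, the following minimal sketch shows how F_1 can be computed over extracted author-affiliation pairs. The function name, the exact-match criterion, and the sample pairs are illustrative assumptions; the abstract does not specify how Enlil's evaluation matches pairs.

```python
def f1_score(predicted, gold):
    """F_1 over two collections of (author, affiliation) pairs.

    Assumption: a predicted pair counts as correct only if it matches a
    gold pair exactly. The paper's actual matching rule may differ.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    # F_1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical data: one of three predicted pairs has the wrong affiliation.
gold = {("Author A", "Univ X"), ("Author B", "Univ Y"), ("Author C", "Univ X")}
predicted = {("Author A", "Univ X"), ("Author B", "Univ X"), ("Author C", "Univ X")}
score = f1_score(predicted, gold)
```

Here precision and recall are both 2/3, so `score` is also 2/3.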
ACM/IEEE Joint Conference on Digital Libraries | 2015
Nguyen Viet Cuong; Muthu Kumar Chandrasekaran; Min-Yen Kan; Wee Sun Lee
We address the tasks of recovering bibliographic and document structure metadata from scholarly documents. We leverage higher-order semi-Markov conditional random fields to model long-distance label sequences, improving upon the performance of the linear-chain conditional random field model. We introduce the notion of extensible features, which allows the expensive inference process to be simplified through memoization, resulting in lower computational complexity. Our method significantly improves on the state of the art on three related scholarly document extraction tasks.
International Journal on Digital Libraries | 2018
Kokil Jaidka; Muthu Kumar Chandrasekaran; Sajal Rustagi; Min-Yen Kan
We describe the participation and the official results of the 2nd Computational Linguistics Scientific Summarization Shared Task (CL-SciSumm), held as a part of the BIRNDL workshop at the Joint Conference on Digital Libraries 2016 in Newark, New Jersey. CL-SciSumm is the first medium-scale Shared Task on scientific document summarization in the computational linguistics (CL) domain. Participants were provided a training corpus of 30 topics, each comprising a reference paper (RP) and 10 or more citing papers, all of which cite the RP. For each citation, the text spans (i.e., citances) that pertain to the RP have been identified. Participants solved three sub-tasks in automatic research paper summarization using this text corpus. Fifteen teams from six countries registered for the Shared Task, of which ten teams ultimately submitted and presented their results. The annotated corpus comprised 30 target papers and is currently the largest available corpus of its kind. The corpus is available for free download and use at https://github.com/WING-NUS/scisumm-corpus.
ACM/IEEE Joint Conference on Digital Libraries | 2016
Guillaume Cabanac; Muthu Kumar Chandrasekaran; Ingo Frommholz; Kokil Jaidka; Min-Yen Kan; Philipp Mayr; Dietmar Wolfram
The large scale of scholarly publications poses a challenge for scholars in information-seeking and sensemaking. Bibliometric, information retrieval (IR), text mining and NLP techniques could help in these activities, but are not yet widely used in digital libraries. This workshop is intended to stimulate IR researchers and digital library professionals to elaborate on new approaches in natural language processing, information retrieval, scientometric and recommendation techniques which can advance the state-of-the-art in scholarly document understanding, analysis and retrieval at scale.
Journal of the Association for Information Science and Technology | 2017
Juyoung An; Nam-Hee Kim; Min-Yen Kan; Muthu Kumar Chandrasekaran; Min Song
Big Science and cross‐disciplinary collaborations have reshaped the intellectual structure of research areas. A number of works have tried to uncover this hidden intellectual structure by analyzing citation contexts. However, none of them analyzed citations by logical document structure, such as sections. The two major goals of this study are to find characteristics of authors who are highly cited section‐wise and to identify the differences in section‐wise author networks. This study uses 29,158 research articles culled from the ACL Anthology, which hosts articles on computational linguistics and natural language processing. We find that the distribution of citations across sections is skewed and that different sets of highly cited authors share distinct academic characteristics, according to their citation locations. Furthermore, the author networks based on citation context similarity reveal that the intellectual structure of a domain differs across sections.
Scientometrics | 2013
Philip S. Cho; Huy Hoang Do; Muthu Kumar Chandrasekaran; Min-Yen Kan
We introduce a novel set of metrics for triadic closure among individuals or groups to model how co-authorship networks become more integrated over time. We call this process of triadic, third-party mediated integration, research facilitation. We apply our research facilitation or RF-metrics to the development of the Pan-Asian SNP (PASNP) Consortium, the first inter-Asian genomics network. Our aim was to examine if the consortium catalyzed research facilitation or integration among the members and the wider region. The PASNP Consortium is an ideal case study of an emerging Asian Research Area because its members themselves asserted a regional Asian identity. To validate our model, we developed data mining software to extract and match full author and institutional information from the PDFs of scientific papers.
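To make the idea of triadic closure concrete, the sketch below counts triads that become closed between two snapshots of a co-authorship network. This is a generic illustration of triadic closure only, not the RF-metrics defined in the paper; the function names and the sample edges are hypothetical.

```python
from itertools import combinations


def closed_triads(edges):
    """Return the set of closed triads (triangles) in an undirected graph."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    triads = set()
    for node, nbrs in neighbors.items():
        # Any two neighbors of `node` that are themselves linked close a triad.
        for u, v in combinations(sorted(nbrs), 2):
            if v in neighbors[u]:
                triads.add(frozenset((node, u, v)))
    return triads


def newly_closed_triads(edges_before, edges_after):
    """Triads closed in the later snapshot that were not closed earlier."""
    return closed_triads(edges_after) - closed_triads(edges_before)


# Hypothetical snapshots: A has co-authored with both B and C; later B and C
# co-author with each other, closing the triad that A mediates.
before = [("A", "B"), ("A", "C")]
after = before + [("B", "C")]
new = newly_closed_triads(before, after)
```

In this toy example `new` contains the single triad {A, B, C}, the kind of third-party mediated link the abstract calls research facilitation.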
Meeting of the Association for Computational Linguistics | 2015
Tao Chen; Naijia Zheng; Yue Zhao; Muthu Kumar Chandrasekaran; Min-Yen Kan
We propose WordNews, a web browser extension that allows readers to learn second-language vocabulary while reading news online. Injected tooltips allow readers to look up selected vocabulary and take simple interactive tests. We discover that two key system components need improvement, both of which stem from the need to model context. These two issues are real-world word sense disambiguation (WSD) to aid translation quality and the construction of interactive tests. For the first, we start with Microsoft’s Bing translation API but employ additional dictionary-based heuristics that significantly improve translation in both coverage and accuracy. For the second, we propose techniques for generating appropriate distractors for multiple-choice word mastery tests. Our preliminary user survey confirms the need for and viability of such a language learning platform.
International ACM SIGIR Conference on Research and Development in Information Retrieval | 2017
Muthu Kumar Chandrasekaran; Kokil Jaidka; Philipp Mayr
The large scale of scholarly publications poses a challenge for scholars in information-seeking and sensemaking. Bibliometric, information retrieval (IR), text mining and NLP techniques could help in these activities, but are not yet widely used in digital libraries. This workshop is intended to stimulate IR researchers and digital library professionals to elaborate on new approaches in natural language processing, information retrieval, scientometric and recommendation techniques which can advance the state-of-the-art in scholarly document understanding, analysis and retrieval at scale.
Educational Data Mining | 2015
Muthu Kumar Chandrasekaran; Min-Yen Kan; Bernard C.Y. Tan; Kiruthika Ragupathi
BIRNDL@JCDL | 2016
Kokil Jaidka; Muthu Kumar Chandrasekaran; Sajal Rustagi; Min-Yen Kan