Shuting Wang
Pennsylvania State University
Publications
Featured research published by Shuting Wang.
document engineering | 2015
Shuting Wang; Chen Liang; Zhaohui Wu; Kyle Williams; Bart Pursel; Benjamin Brautigam; Sherwyn Saul; Hannah Williams; Kyle Bowen; C. Lee Giles
Concept hierarchies have been useful tools for presenting and organizing knowledge. With the rapid growth in the number of online knowledge resources, automatic concept hierarchy extraction is increasingly attractive. Here, we focus on concept extraction from textbooks based on the knowledge in Wikipedia. Given a book, we extract the important concepts in each chapter using Wikipedia as a resource, and from these we construct a concept hierarchy for that book. We define local and global features that capture both the local relatedness and the global coherence embedded in the textbook. To evaluate the proposed features and the extracted concept hierarchies, we manually construct concept hierarchies for three widely used textbooks by labeling the important concepts in each chapter. Experiments show that our proposed local and global features achieve better performance than using only keyphrases to construct the concept hierarchies. Moreover, we observe that incorporating global features improves concept ranking precision and reaffirms the global coherence of the book.
conference on information and knowledge management | 2016
Shuting Wang; Alexander G. Ororbia; Zhaohui Wu; Kyle Williams; Chen Liang; Bart Pursel; C. Lee Giles
We present a framework for constructing a specific type of knowledge graph, a concept map, from textbooks. Using Wikipedia, we derive prerequisite relations among these concepts. A traditional approach to concept map extraction consists of two sub-problems: key concept extraction and concept relationship identification. Previous work has, for the most part, considered these two sub-problems independently. We propose a framework that jointly optimizes these sub-problems and investigates methods that identify concept relationships. Experiments on concept maps that were manually extracted in six educational areas (computer networks, macroeconomics, precalculus, databases, physics, and geometry) show that our model outperforms supervised learning baselines that solve the two sub-problems separately. Moreover, we observe that incorporating textbook information helps with concept map extraction.
document engineering | 2015
Chen Liang; Shuting Wang; Zhaohui Wu; Kyle Williams; Bart Pursel; Benjamin Brautigam; Sherwyn Saul; Hannah Williams; Kyle Bowen; C. Lee Giles
As more educational resources become available online, it is possible to acquire more up-to-date knowledge and information. We propose BBookX, a novel computer-facilitated system that automatically and collaboratively builds free open online books using publicly available educational resources such as Wikipedia. BBookX has two separate components: one creates an open version of existing books by linking book chapters to Wikipedia articles, while the other supports real-time book creation through an interactive user interface, where users can modify a generated book by giving explicit feedback.
international conference on management of data | 2016
Sagnik Ray Choudhury; Shuting Wang; C. Lee Giles
Most scholarly papers contain one or more figures. Often these figures show experimental results, e.g., line graphs are used to compare various methods. Compared to the text of the paper, figures and their semantics have received relatively little attention. This has significantly limited semantic search capabilities in scholarly search engines. Here, we report scalable algorithms for generating semantic metadata for figures. Our system has four sequential modules: 1. Extraction of the figure, its caption, and its mentions; 2. Binary classification of figures as compound (containing sub-figures) or not; 3. Three-class classification of non-compound figures as line graph, bar graph, or other; and 4. Automatic processing of line graphs to generate a textual summary. Each step generates a metadata file containing richer information than the previous one. The algorithms are scalable, and each individual step has an accuracy greater than 80%.
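The four sequential modules described above can be pictured as a staged pipeline in which each stage returns a richer metadata record than it received. The following is only a minimal sketch under stated assumptions: every function name, dictionary key, and stub rule here (`subfigure_count`, `type_hint`, and so on) is hypothetical and stands in for the trained extractors and classifiers that the abstract mentions but does not specify.

```python
from typing import Callable, Dict, List

Metadata = Dict[str, object]
Stage = Callable[[Metadata], Metadata]


def extract_figure(meta: Metadata) -> Metadata:
    # Stage 1 (stub): attach the caption and in-text mentions of the figure.
    return {**meta,
            "caption": meta.get("caption", ""),
            "mentions": meta.get("mentions", [])}


def classify_compound(meta: Metadata) -> Metadata:
    # Stage 2 (stub): a real system would use a trained binary classifier;
    # the hypothetical "subfigure_count" field replaces it here.
    return {**meta, "is_compound": int(meta.get("subfigure_count", 1)) > 1}


def classify_type(meta: Metadata) -> Metadata:
    # Stage 3 (stub): only non-compound figures receive a type label.
    if meta["is_compound"]:
        return meta
    hint = str(meta.get("type_hint", "other"))
    label = hint if hint in {"line graph", "bar graph"} else "other"
    return {**meta, "figure_type": label}


def summarize_line_graph(meta: Metadata) -> Metadata:
    # Stage 4 (stub): only line graphs receive a textual summary.
    if meta.get("figure_type") != "line graph":
        return meta
    return {**meta, "summary": f"Line graph: {meta['caption']}"}


STAGES: List[Stage] = [extract_figure, classify_compound,
                       classify_type, summarize_line_graph]


def run_pipeline(meta: Metadata) -> List[Metadata]:
    # Keep one snapshot per stage, mirroring "a metadata file per step".
    snapshots: List[Metadata] = []
    for stage in STAGES:
        meta = stage(meta)
        snapshots.append(dict(meta))
    return snapshots
```

Calling `run_pipeline({"caption": "Accuracy vs. epochs", "subfigure_count": 1, "type_hint": "line graph"})` yields four snapshots, each with at least as many fields as the one before it; a compound figure stops gaining fields after the second stage.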
acm/ieee joint conference on digital libraries | 2016
Sagnik Ray Choudhury; Shuting Wang; C. Lee Giles
Line graphs are abundant in scholarly papers. They are usually generated from a data table, but that data cannot be accessed. One important step in an automated data extraction pipeline is the curve separation problem: segmenting the pixels into separate curves. Previous work in this domain has focused on raster graphics extracted from scholarly PDFs, whereas most scholarly plots are embedded as vector graphics. We report a system that extracts these plots as SVG images and show how this can improve both the accuracy (90%) and the scalability (5-8 seconds) of the curve separation problem.
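To illustrate why vector graphics can simplify curve separation, the sketch below groups the `<path>` elements of an SVG by their `stroke` attribute, so that each distinct stroke colour is treated as one curve. This is not the method from the paper: the abstract does not say how curves are separated, and grouping by stroke colour is an assumption made here purely for illustration. The function name `separate_curves` is hypothetical, and real plots would also need handling for CSS styles, markers, axes, and legends.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List

SVG_NS = "{http://www.w3.org/2000/svg}"


def separate_curves(svg_text: str) -> Dict[str, List[str]]:
    """Group SVG path data by stroke colour (illustrative assumption)."""
    root = ET.fromstring(svg_text)
    curves: Dict[str, List[str]] = defaultdict(list)
    for path in root.iter(f"{SVG_NS}path"):
        stroke = path.get("stroke")
        # Skip paths that are not drawn with a visible stroke.
        if stroke and stroke.lower() != "none":
            curves[stroke.lower()].append(path.get("d", ""))
    return dict(curves)


SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg">
  <path d="M0 0 L10 5" stroke="#FF0000" fill="none"/>
  <path d="M10 5 L20 9" stroke="#ff0000" fill="none"/>
  <path d="M0 9 L20 1" stroke="#0000ff" fill="none"/>
  <path d="M0 0 H20" stroke="none"/>
</svg>"""

curves = separate_curves(SAMPLE_SVG)
```

In this toy input the two red segments end up in one group and the blue segment in another, with no pixel-level segmentation needed, which is the intuition behind working from vector rather than raster graphics.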
conference on information and knowledge management | 2014
Shuting Wang; Zhen Lei; Wang-Chien Lee
Effective patent valuation is important for patent holders. Forward patent citations, widely used in assessing patent value, have been considered to reflect knowledge flows, just like paper citations. However, patent citations also carry legal implications, which are important for patent valuation. We argue that patent citations can be either technological citations that indicate knowledge transfer or legal citations that delimit the legal scope of citing patents. In this paper, we first develop citation-network-based methods to infer patent quality measures along either the legal or the technological dimension. We then propose a probabilistic mixture approach that incorporates both the legal and technological dimensions of patent citations, and an iterative learning process that integrates a temporal decay function on legal citations, a probabilistic citation-network-based algorithm, and a prediction model for patent valuation. We learn all the parameters together and use them for patent valuation. We demonstrate the effectiveness of our approach by using patent maintenance status as an indicator of patent value and discuss the insights gained from this study.
international conference data science | 2014
Shuting Wang; Wang-Chien Lee; Zhen Lei; Xianliang Zhang; Yu-Hsuan Kuo
Patents are very important intangible assets that protect firm technologies and maintain market competitiveness. Thus, patent evaluation is critical for firm business strategy and innovation management. Currently, patent evaluation mostly relies on meta information about patents, such as the number of forward/backward citations and the number of claims. In this paper, we propose to identify patent technological trends, which carry information about technology evolution and trajectories among patents, to enable more effective and precise patent evaluation. We explore features that capture both the value of trends and the quality of patents within a trend, and perform patent evaluation to validate the extracted trends and features using patents in the United States Patent and Trademark Office (USPTO) dataset. Experimental results demonstrate that the identified technological trends capture patent value precisely. With the proposed trend-related features extracted from our identified trends, we can improve patent evaluation performance significantly over a baseline that uses conventional features.
international world wide web conferences | 2016
Shuting Wang; Lei Liu
national conference on artificial intelligence | 2018
Chen Liang; Jianbo Ye; Shuting Wang; Bart Pursel; C. Lee Giles
national conference on artificial intelligence | 2016
Chen Liang; Shuting Wang; Zhaohui Wu; Kyle Williams; Bart Pursel; Benjamin Brautigam; Sherwyn Saul; Hannah Williams; Kyle Bowen; C. Lee Giles