Hongshu Chen
University of Technology, Sydney
Publications
Featured research published by Hongshu Chen.
Knowledge Based Systems | 2017
Yi Zhang; Hongshu Chen; Jie Lu; Guangquan Zhang
The journal Knowledge-Based Systems (KnoSys) has been published for over 25 years, during which time its main foci have extended to a broad range of studies in computer science and artificial intelligence. Answering the questions "What is the KnoSys community interested in?" and "How does such interest change over time?" is important to both the editorial board and the audience of KnoSys. This paper conducts a topic-based bibliometric study to detect and predict the topic changes of KnoSys from 1991 to 2016. A Latent Dirichlet Allocation model is used to profile the hotspots of KnoSys and predict possible future trends from a probabilistic perspective, and a model of scientific evolutionary pathways applies a learning-based process to detect the topic changes of KnoSys in sequential time slices. Six main research areas of KnoSys are identified: expert systems, machine learning, data mining, decision making, optimization, and fuzzy systems. The results also indicate that the KnoSys community's interest in computational intelligence has risen, and that the ability to construct practical systems through knowledge use and accurate prediction models is strongly emphasized. Such empirical insights can serve as a guide for KnoSys submissions.
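The paper above tracks how topic interest shifts across sequential time slices. A minimal sketch of that kind of analysis, assuming each paper has already been assigned a dominant LDA topic (the function name and toy data here are hypothetical, not the paper's implementation):

```python
from collections import Counter, defaultdict

def topic_shares_by_year(doc_topics):
    """Compute each topic's share of publications per year.

    doc_topics: iterable of (year, topic_label) pairs, e.g. the dominant
    LDA topic assigned to each paper.  Returns {year: {topic: share}}.
    """
    counts = defaultdict(Counter)
    for year, topic in doc_topics:
        counts[year][topic] += 1
    shares = {}
    for year, counter in counts.items():
        total = sum(counter.values())
        shares[year] = {t: n / total for t, n in counter.items()}
    return shares

# Toy example using two of the six areas named in the paper
docs = [(2015, "machine learning"), (2015, "fuzzy"),
        (2016, "machine learning"), (2016, "machine learning")]
print(topic_shares_by_year(docs))
```

Comparing the per-year share vectors across slices is one simple way to surface the rising or falling interest the study describes.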
Neural Computing and Applications | 2015
Hongshu Chen; Guangquan Zhang; Donghua Zhu; Jie Lu
Technology intelligence refers to the concepts and applications that transform data hidden in patents or the scientific literature into technical insight to support technology strategy making. Existing frameworks and applications of technology intelligence mainly focus on obtaining text-based knowledge with text mining components; however, the corresponding technological trend of that knowledge over time is seldom taken into consideration. To capture hidden trend turning points and improve existing technology intelligence frameworks, this paper proposes a patent time series processing component with trend identification functionality. We use a piecewise linear representation method to generate and quantify the trend of patent publication activities, then use the outcome to identify trend turning points and provide trend tags to the existing text mining component, making it possible to combine text-based and time-based knowledge to support technology strategy making more satisfactorily. A case study using Australian patents (1983–2012) in the Information and Communications Technology industry is presented to demonstrate the feasibility of the component on real-world tasks. The results show that the new component identifies trends reasonably well and, at the same time, uncovers valuable trend turning points in historical patent time series.
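The core idea above is to reduce a patent time series to linear segments and flag where the trend reverses. A minimal sketch, assuming the segmentation (breakpoint indices) has already been produced by a piecewise linear representation step; the function names and toy counts are illustrative only:

```python
def segment_slopes(series, breakpoints):
    """Slope of each linear segment between consecutive breakpoints."""
    slopes = []
    for a, b in zip(breakpoints, breakpoints[1:]):
        slopes.append((series[b] - series[a]) / (b - a))
    return slopes

def turning_points(series, breakpoints):
    """Breakpoints where the piecewise-linear trend changes direction."""
    slopes = segment_slopes(series, breakpoints)
    turns = []
    for i in range(1, len(slopes)):
        if slopes[i - 1] * slopes[i] < 0:  # sign change: up->down or down->up
            turns.append(breakpoints[i])
    return turns

# Toy yearly patent counts: rise, fall, rise again
counts = [3, 5, 9, 7, 4, 6, 10]
print(turning_points(counts, [0, 2, 4, 6]))  # -> [2, 4]
```

The detected indices would then serve as the trend tags handed to the text mining component.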
portland international conference on management of engineering and technology | 2015
Yi Zhang; Hongshu Chen; Guangquan Zhang; Donghua Zhu; Jie Lu
Since its first engagement with industry decades ago, Technology Roadmapping (TRM) has taken an increasingly important role in technical intelligence for current R&D planning and innovation tracking. Important topics for both science policy and engineering management researchers involve approaches that address real-world problems, explore value-added information in complex data sets, fuse analytic results and expert knowledge effectively and reasonably, and present results to decision makers visually and understandably. Moreover, the growing variety of science data sources in the Big Data age increases both these challenges and the accompanying opportunities. Addressing these concerns, this paper proposes a TRM composing method with a clustering-based topic identification model, a multiple science data sources integration model, and a semi-automated fuzzy set-based TRM composing model with expert aid. We focus on a case study of computer science-related R&D. Empirical data from the United States National Science Foundation Award data (innovative research ideas and proposals) and the Derwent Innovation Index data source (patents emphasizing technical products) provide vantage points at two stages of the R&D process. The understanding gained will assist in describing computer science macro-trends for R&D decision makers.
portland international conference on management of engineering and technology | 2015
Hongshu Chen; Yi Zhang; Guangquan Zhang; Donghua Zhu; Jie Lu
Patent claims usually embody the most essential terms and the core technological scope that define the protection of an invention, which makes them the ideal resource for patent content and topic change analysis. However, manually conducting content analysis on massive numbers of technical terms is very time-consuming and laborious. Even with the help of traditional text mining techniques, it is still difficult to model topic changes over time, because single keywords alone are usually too general or ambiguous to represent a concept. Moreover, the term frequencies used to define a topic cannot separate polysemous words that actually describe different themes. To address this issue, this research proposes a topic change identification approach based on Latent Dirichlet Allocation to model and analyze topic changes with minimal human intervention. After textual data cleaning, underlying semantic topics hidden in large archives of patent claims are revealed automatically. Concepts are defined by probability distributions over words instead of term frequencies, so that polysemy is accommodated. A case study using patents published by the United States Patent and Trademark Office (USPTO) from 2009 to 2013 with Australia as the assignee country is presented to demonstrate the validity of the proposed topic change identification approach. The experimental results show that the proposed approach can serve as an automatic tool providing machine-identified topic changes for more efficient and effective R&D management assistance.
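Since topics here are probability distributions over words, topic change between time slices can be quantified by a distributional distance. A sketch using Jensen-Shannon divergence (one common choice; the paper does not specify this measure, and the toy word distributions are hypothetical):

```python
from math import log2

def js_divergence(p, q):
    """Jensen-Shannon divergence between two word distributions,
    given as dicts mapping word -> probability."""
    words = set(p) | set(q)
    def kl(a, b):
        return sum(a.get(w, 0.0) * log2(a.get(w, 0.0) / b[w])
                   for w in words if a.get(w, 0.0) > 0)
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in words}
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# The "same" topic estimated in two time slices: a low divergence
# suggests a stable topic, a high one suggests a topic change.
topic_2009 = {"solar": 0.5, "cell": 0.3, "panel": 0.2}
topic_2013 = {"solar": 0.4, "cell": 0.3, "film": 0.3}
print(js_divergence(topic_2009, topic_2013))
```

With base-2 logarithms the divergence is bounded in [0, 1], which makes a fixed change threshold easy to set.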
ieee international conference on fuzzy systems | 2015
Hongshu Chen; Guangquan Zhang; Jie Lu; Donghua Zhu
Technological progress has brought very rapid growth in patent publications, which makes it increasingly difficult for domain experts to measure the development of various topics, handle the linguistic terms used in evaluation, and digest massive amounts of technological content. To overcome the limitations of the keyword-ranking style of text mining results in existing research, and at the same time deal with the vagueness of linguistic terms in thematic evaluation, this research proposes a fuzzy set-based topic development measurement (FTDM) approach that estimates and evaluates the topics hidden in a large volume of patent claims using Latent Dirichlet Allocation. Latent semantic topics are first discovered from the patent corpus and measured by a temporal-weight matrix that reveals the importance of each topic in different years. For each topic, we then calculate a temporal-weight coefficient from the matrix, which is associated with a set of linguistic terms describing the topic's development state over time. After choosing a suitable linguistic term set, fuzzy membership functions are created for each term. The temporal-weight coefficients are then transformed into membership vectors over the linguistic terms, which measure the development states of all topics directly and effectively. A case study using solar cell-related patents shows the effectiveness of the proposed FTDM approach and its applicability to estimating hidden topics and measuring their corresponding development states efficiently.
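The mapping from a temporal-weight coefficient to a linguistic membership vector can be illustrated with triangular membership functions. This is only a sketch: the term set, the [-1, 1] coefficient scale, and the triangle parameters below are assumptions for illustration, not the paper's calibrated values:

```python
def triangular(x, a, b, c):
    """Triangular membership function with peak at b and feet at a, c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Hypothetical linguistic term set over a coefficient scaled to [-1, 1]:
# negative coefficients read as "declining", near zero as "stable",
# positive as "growing".
TERMS = {
    "declining": (-1.5, -1.0, 0.0),
    "stable":    (-1.0,  0.0, 1.0),
    "growing":   ( 0.0,  1.0, 1.5),
}

def membership_vector(coefficient):
    """Degree of membership of one topic's coefficient in each term."""
    return {term: round(triangular(coefficient, *abc), 2)
            for term, abc in TERMS.items()}

print(membership_vector(0.4))  # mostly "stable", partly "growing"
```

A topic's development state is then read off directly from the term with the highest membership, which is the kind of direct thematic evaluation the FTDM approach aims for.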
IEEE Access | 2017
Ximeng Wang; Yun Liu; Guangquan Zhang; Yi Zhang; Hongshu Chen; Jie Lu
In recommender systems, collaborative filtering is an important method for evaluating user preference from user feedback data and has been widely used in industry. Diffusion-based recommendation algorithms, inspired by diffusion phenomena in physical dynamics, are a crucial branch of collaborative filtering that uses a bipartite network to represent collection behaviors between users and items. However, diffusion-based recommendation algorithms calculate the similarity between users and make recommendations by considering only implicit feedback, neglecting the benefits of explicit feedback data, which is a significant feature in recommender systems. This paper proposes a mixed similarity diffusion model that integrates both explicit and implicit feedback. First, cosine similarity between users is calculated from explicit feedback and integrated with a resource-allocation index calculated from implicit feedback. We further improve the performance of the mixed similarity diffusion model by considering the degrees of users and items simultaneously in the diffusion processes. Experiments are designed to evaluate the proposed method on three real-world data sets. The results indicate that recommendations given by the mixed similarity diffusion outperform those of most state-of-the-art algorithms in both accuracy and diversity.
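The two similarity signals being mixed can be sketched as follows: cosine similarity over explicit ratings, a resource-allocation index over implicit item sets, and a convex combination of the two. The mixing weight `lam` and the toy data are assumptions for illustration; the paper's actual combination and tuning may differ:

```python
from math import sqrt

def cosine(r_u, r_v):
    """Cosine similarity between two users' explicit rating dicts."""
    common = set(r_u) & set(r_v)
    num = sum(r_u[i] * r_v[i] for i in common)
    den = (sqrt(sum(x * x for x in r_u.values()))
           * sqrt(sum(x * x for x in r_v.values())))
    return num / den if den else 0.0

def resource_allocation(items_u, items_v, item_degree):
    """Resource-allocation index from implicit feedback: each common
    item contributes 1/degree, so popular items count for less."""
    return sum(1.0 / item_degree[i] for i in items_u & items_v)

def mixed_similarity(r_u, r_v, item_degree, lam=0.5):
    """Convex mix of explicit (cosine) and implicit (RA) similarity."""
    return lam * cosine(r_u, r_v) + (1 - lam) * resource_allocation(
        set(r_u), set(r_v), item_degree)

ratings_u = {"a": 5, "b": 3}
ratings_v = {"a": 4, "c": 2}
degrees = {"a": 2, "b": 1, "c": 1}  # how many users collected each item
print(mixed_similarity(ratings_u, ratings_v, degrees))
```

Down-weighting high-degree items in the implicit term is what lets the diffusion step favor less popular items, which helps the diversity side of the reported results.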
Archive | 2016
Yi Zhang; Hongshu Chen; Donghua Zhu
Since its first engagement with industry decades ago, technology roadmapping (TRM) has taken an increasingly important role in competitive technical intelligence (CTI) for current R&D planning and innovation tracking. Important topics for both science policy and engineering management researchers involve approaches that address real-world problems, explore value-added information in complex data sets, fuse analytic results and expert knowledge effectively and reasonably, and present results to decision makers visually and understandably. The growing variety of Science, Technology, and Innovation (ST&I) data sources in the Big Data age increases both these challenges and the accompanying opportunities. Addressing these concerns, this paper proposes a semi-automatic TRM composing method that incorporates multiple ST&I data sources: we design an extendable interface for engaging diverse ST&I data sources and apply fuzzy sets to transfer vague expert knowledge into defined numeric values for automatic TRM generation. We focus on a case study of computer science-related R&D. Empirical data from the United States (US) National Science Foundation (NSF) Award data (innovative research ideas and proposals) and the Derwent Innovation Index (DII) patent data source (technical and commercial information) afford vantage points at two stages of the R&D process and also provide further capability for incorporating more ST&I data sources. The understanding gained will also assist in describing computer science macro-trends for R&D decision makers.
Archive | 2014
Hongshu Chen; Guangquan Zhang; Jie Lu; Donghua Zhu
Patent data have time-dependent properties as well as semantic attributes. Technology clustering based on patent time-dependent data processed by trend analysis has been used to help identify technology relationships. However, raw patent data carry more features than processed data. This paper aims to develop a new methodology to cluster patent frequency data based on its time-related properties. To handle the time-dependent attributes of patent data, this study first compares them with typical time series data to propose a preferable similarity measurement approach. It then presents a two-step agglomerative hierarchical technology clustering method to cluster original patent time-dependent data directly. Finally, a case study using communication-related patents illustrates the clustering method.
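Agglomerative hierarchical clustering of patent time series can be sketched in a few lines. This is a generic single-linkage version with Euclidean distance standing in for the paper's preferred similarity measure, and the toy filing counts are hypothetical:

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length time series."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def agglomerative(series, n_clusters, dist=euclidean):
    """Single-linkage agglomerative clustering over time series.
    Returns a list of clusters, each a list of series indices."""
    clusters = [[i] for i in range(len(series))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist(series[i], series[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))  # merge the closest pair
    return clusters

# Yearly filing counts for four hypothetical patent classes
ts = [[1, 2, 3], [1, 2, 4], [9, 8, 7], [9, 9, 6]]
print(agglomerative(ts, 2))  # -> [[0, 1], [2, 3]]
```

Swapping `dist` for a time-series-specific measure is exactly the kind of choice the paper's similarity comparison step is about.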
Archive | 2016
Hongshu Chen; Yi Zhang; Donghua Zhu
Patent claims usually embody the core technological scope and the most essential terms that define the protection of an invention, which makes them the ideal resource for patent topic identification and theme change analysis. However, conducting content analysis manually on massive numbers of technical terms is very time-consuming and laborious. Even with the help of traditional text mining techniques, it is still difficult to model topic changes over time, because single keywords alone are usually too general or ambiguous to represent a concept. Moreover, the term frequencies used to rank keywords cannot separate polysemous words that actually describe different concepts. To address this issue, this research proposes a topic change identification approach based on latent Dirichlet allocation to model and analyze topic changes and topic-based trends with minimal human intervention. After textual data cleaning, underlying semantic topics hidden in large archives of patent claims are revealed automatically. Topics are defined by probability distributions over words instead of terms and their frequencies, so that polysemy is accommodated. A case study using patents published by the United States Patent and Trademark Office (USPTO) from 2009 to 2013 with Australia as the assignee country is presented to demonstrate the validity of the proposed topic change identification approach. The experimental results show that the proposed approach can serve as an automatic tool providing machine-identified topic changes for more efficient and effective R&D management assistance.
Journal of Informetrics | 2018
Yi Zhang; Jie Lu; Feng Liu; Qian Liu; Alan L. Porter; Hongshu Chen; Guangquan Zhang
Topic extraction presents challenges for the bibliometric community, and its performance still depends on human intervention and the application area. This paper proposes a novel kernel k-means clustering method incorporating a word embedding model to effectively extract topics from bibliometric data. Experimental comparisons of this method with four clustering baselines (k-means, fuzzy c-means, principal component analysis, and topic models) on two bibliometric datasets demonstrate its effectiveness across both a relatively broad range of disciplines and a given domain. An empirical study of bibliometric topic extraction from articles published in three top-tier bibliometric journals between 2000 and 2017, supported by expert knowledge-based evaluations, provides supplemental evidence of the method's topic extraction ability. Additionally, this empirical analysis reveals insights into both overlapping and diverse research interests among the three journals that would benefit journal publishers, editorial boards, and research communities.
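Kernel k-means, the clustering engine named above, assigns points using distances computed entirely from kernel values, so cluster centres never need explicit coordinates. A minimal sketch with an RBF kernel standing in for the paper's word-embedding-based kernel; the initialization scheme and toy "document embeddings" are assumptions for illustration:

```python
import math

def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) kernel; a stand-in for an embedding-based kernel."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def kernel_kmeans(points, k, kernel=rbf, iters=50):
    """Kernel k-means: each point's squared distance to a cluster centre
    is K(i,i) - 2*mean_j K(i,j) + mean_{p,q} K(p,q) over cluster members."""
    K = [[kernel(a, b) for b in points] for a in points]
    # Deterministic init: the first k-1 points seed their own clusters,
    # remaining points start in the last cluster.
    labels = [min(i, k - 1) for i in range(len(points))]
    for _ in range(iters):
        new = []
        for i in range(len(points)):
            best = None
            for c in range(k):
                members = [j for j, l in enumerate(labels) if l == c]
                if not members:
                    continue
                m = len(members)
                d = (K[i][i]
                     - 2 * sum(K[i][j] for j in members) / m
                     + sum(K[p][q] for p in members for q in members) / m ** 2)
                if best is None or d < best[0]:
                    best = (d, c)
            new.append(best[1])
        if new == labels:  # converged
            break
        labels = new
    return labels

# Toy 2-D "document embeddings" forming two separated groups
docs = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
print(kernel_kmeans(docs, 2))  # -> [0, 0, 1, 1]
```

In the paper's setting the inputs would be embedding-based document representations rather than raw 2-D points, and the kernel choice is part of what the comparison with the four baselines evaluates.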