Chifumi Nishioka
University of Kiel
Publication
Featured research published by Chifumi Nishioka.
international conference on knowledge capture | 2015
Gregor Große-Bölting; Chifumi Nishioka; Ansgar Scherp
We introduce a framework for automated semantic document annotation that is composed of four processes, namely concept extraction, concept activation, annotation selection, and evaluation. The framework is used to implement and compare different annotation strategies motivated by the literature. For concept extraction, we apply entity detection with semantic hierarchical knowledge bases, Tri-gram, RAKE, and LDA. For concept activation, we compare a set of statistical, hierarchy-based, and graph-based methods. For selecting annotations, we compare top-k as well as kNN. In total, we define 43 different strategies including novel combinations like using graph-based activation with kNN. We have evaluated the strategies using three different datasets of varying size from three scientific disciplines (economics, politics, and computer science) that contain 100,000 manually labeled documents in total. We obtain the best results on all three datasets by our novel combination of entity detection with graph-based activation (e.g., HITS and Degree) and kNN. For the economics and political science datasets, the best F-measures are .39 and .28, respectively. For the computer science dataset, the maximum F-measure of .33 can be reached. The experiments are by far the largest on scholarly content annotation; prior experiments typically use only up to a few hundred documents per dataset.
ieee international conference semantic computing | 2015
Gregor Große-Bölting; Chifumi Nishioka; Ansgar Scherp
We present the design and application of a generic approach for semantic extraction of professional interests from social media using a hierarchical knowledge base and spreading activation theory. With this approach, we can assess to what extent a user's social media life reflects his or her professional life. Named entities related to professional interests are detected using a taxonomy of terms in a particular domain. It can be assumed that one can freely obtain such a taxonomy for many professional fields including computer science, social sciences, economics, agriculture, medicine, and so on. In our experiments, we consider the domain of computer science and extract professional interests from a user's Twitter stream. We compare different spreading activation functions and metrics to assess the performance of the obtained results against evaluation data obtained from the professional publications of the Twitter users. Besides selected existing activation functions from the literature, we also introduce a new spreading activation function that normalizes the activation w.r.t. the outdegree of the concepts.
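The idea of normalizing activation by outdegree can be illustrated with a small sketch. This is not the authors' activation function: the function name `spread_activation`, the `decay` and `steps` parameters, and the toy taxonomy are all assumptions made for illustration. The only element taken from the abstract is that the activation a concept passes on is divided by its number of outgoing edges.

```python
from collections import defaultdict

def spread_activation(graph, seeds, decay=0.5, steps=2):
    """Spread activation from seed concepts along outgoing edges.

    graph: dict mapping a concept to the concepts it points to
           (for example, its broader concepts in a taxonomy).
    seeds: dict mapping detected concepts to their initial activation.
    Each concept passes decay * activation / outdegree to every neighbor,
    so a concept with many outgoing edges gives less to each of them.
    (decay and steps are illustrative assumptions, not from the abstract.)
    """
    total = defaultdict(float)
    total.update(seeds)
    frontier = dict(seeds)
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, activation in frontier.items():
            neighbors = graph.get(node, [])
            if not neighbors:
                continue
            share = decay * activation / len(neighbors)  # outdegree normalization
            for neighbor in neighbors:
                next_frontier[neighbor] += share
        for node, value in next_frontier.items():
            total[node] += value
        frontier = next_frontier
    return dict(total)

# Hypothetical toy taxonomy: each concept points to its broader concepts.
taxonomy = {
    "k-means": ["clustering"],
    "clustering": ["machine learning", "data mining"],
    "machine learning": ["computer science"],
    "data mining": ["computer science"],
}
scores = spread_activation(taxonomy, {"k-means": 1.0})
```

With two steps, "clustering" receives the full decayed activation of "k-means", while its two broader concepts each receive only half of what "clustering" passes on.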
international conference on knowledge capture | 2015
Chifumi Nishioka; Ansgar Scherp
We present initial results of finding temporal patterns of entity dynamics on the Linked Open Data (LOD) cloud. For the analysis, we use the dataset of the three-year observation of the Dynamic Linked Data Observatory. Using k-means++ clustering with Euclidean distance, we reveal the temporal patterns of entity dynamics. In addition, we conduct the first investigation of periodicity in entity dynamics on the LOD cloud. While a large portion of entities are static, a certain number of entities have a temporal pattern with substantial changes. We observe different periodicities with respect to temporal patterns of entity dynamics. Knowing about the temporal patterns and their periodicity is important for applications that depend on fresh data caches and indices of the distributed LOD cloud. They can concentrate on crawling and refreshing those parts of the LOD cloud that a) are known to have changed in the past and b) currently have their highest periodical change rate.
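A minimal sketch of k-means++ clustering with Euclidean distance follows, using only the Python standard library. The representation of an entity as a list of per-snapshot change amounts, the toy values, and all function names are assumptions for illustration; the abstract does not specify how entity dynamics are encoded.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_pp_init(points, k, rng):
    """k-means++ seeding: pick each further center with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [list(rng.choice(points))]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(weights) == 0:  # all points coincide with existing centers
            centers.append(list(rng.choice(points)))
        else:
            centers.append(list(rng.choices(points, weights=weights, k=1)[0]))
    return centers

def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm started from k-means++ seeds."""
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every series to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Move each center to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical data: amount of change per entity over four snapshots.
series = [
    [0.0, 0.0, 0.0, 0.0],   # static entity
    [0.0, 0.1, 0.0, 0.0],   # almost static entity
    [0.9, 0.8, 1.0, 0.9],   # frequently changing entity
    [1.0, 0.9, 0.8, 1.0],   # frequently changing entity
]
labels, centers = kmeans(series, k=2)
```

On this toy input the two static series end up in one cluster and the two changing series in the other; the cluster centers can then be read as temporal patterns.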
Proceedings of the 15th International Conference on Knowledge Technologies and Data-driven Business | 2015
Chifumi Nishioka; Gregor Große-Bölting; Ansgar Scherp
We conduct two experiments to compare different scoring functions for extracted user interests and measure the influence of using older data. We run our experiments in the domains of computer science and medicine. The first experiment assesses similarity scores between a user's social media profile and the corresponding user's publication profile, in order to evaluate to what extent a user's social media profile reflects his or her professional interests. The second experiment recommends related researchers, profiled by their publications, based on a user's social media profile. The results reveal that while the functions using spreading activation produce large similarity scores between a user profile and a publication profile, the scoring functions with statistical methods (e.g., an extension of BM25 with spreading activation) perform best for recommendation. In terms of the temporal influence, the older data have almost no influence on the performance in the medicine dataset. However, in the computer science dataset, while there is a positive influence in the first experiment, the second experiment shows a negative influence when data that are too old are added.
acm ieee joint conference on digital libraries | 2018
Chifumi Nishioka; Hiroaki Ogata
This paper presents a research paper recommender system for university students. The recommender system is embedded in an e-book system, which displays learning materials (e.g., slides) and is used in lectures. The recommender system suggests papers related to a given learning material. The experiment revealed that students do not access recommended papers during the lecture. Instead, they access research papers when reviewing the lecture and/or working on an assignment.
Proceedings of the International Conference on Web Intelligence | 2017
Chifumi Nishioka; Ansgar Scherp
Many Linked Open Data applications require fresh copies of RDF data at their local repositories. Since RDF documents constantly change and those changes are not automatically propagated to the LOD applications, it is important to regularly visit the RDF documents to refresh the local copies and keep them up-to-date. For this purpose, crawling strategies determine which RDF documents should be preferentially fetched. Traditional crawling strategies rely only on how an RDF document has been modified in the past. In contrast, we predict on the triple level whether a change will occur in the future. We use the weekly snapshots of the DyLDO dataset as well as the monthly snapshots of the Wikidata dataset. First, we conduct an in-depth analysis of the life span of triples in RDF documents. Through the analysis, we identify which triples are stable and which are ephemeral. We introduce different features based on the triples and apply a simple but effective linear regression model. Second, we propose a novel crawling strategy based on the linear regression model. We conduct two experimental setups where we vary the amount of available bandwidth as well as iteratively observe the quality of the local copies over time. The results demonstrate that the novel crawling strategy outperforms the state of the art in both setups.
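The general shape of a regression-driven crawling strategy under a bandwidth limit can be sketched as below. This is a strong simplification and not the strategy from the paper: it scores whole documents with a single hypothetical feature (`past_change_rate`) instead of predicting on the triple level, and the URLs, training values, and function names are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov / var_x
    return a, mean_y - a * mean_x

def plan_crawl(documents, model, budget):
    """Return the URLs of the `budget` documents with the highest
    predicted change score (budget stands in for available bandwidth)."""
    a, b = model
    ranked = sorted(
        documents,
        key=lambda d: a * d["past_change_rate"] + b,
        reverse=True,
    )
    return [d["url"] for d in ranked[:budget]]

# Hypothetical training data: feature value vs. observed change (0..1).
model = fit_line([0.0, 0.5, 1.0], [0.0, 0.5, 1.0])

# Hypothetical documents; the feature name and URLs are assumptions.
documents = [
    {"url": "http://example.org/a.rdf", "past_change_rate": 0.1},
    {"url": "http://example.org/b.rdf", "past_change_rate": 0.9},
    {"url": "http://example.org/c.rdf", "past_change_rate": 0.4},
]
to_fetch = plan_crawl(documents, model, budget=2)
```

Here the two documents with the highest predicted change are fetched first, which mirrors the idea of spending limited bandwidth where a change is most likely.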
acm/ieee joint conference on digital libraries | 2016
Chifumi Nishioka; Ansgar Scherp
ieee international conference semantic computing | 2018
Chifumi Nishioka; Ansgar Scherp
GI-Jahrestagung | 2017
Ahmed Saleh; Florian Mai; Chifumi Nishioka; Ansgar Scherp
extended semantic web conference | 2016
Ansgar Scherp; Daniela Pscheida; Chifumi Nishioka; Annalouise Maas; Vasileios Mezaris; Thomas Köhler; Chrysa Collyda; Michael Wiese