Takahiro Komamizu | Researchain



Publication

Featured research published by Takahiro Komamizu.

information integration and web-based applications & services | 2011

A framework of faceted navigation for XML data

Takahiro Komamizu; Toshiyuki Amagasa; Hiroyuki Kitagawa

In this paper, we propose a framework for faceted navigation over XML data. General faceted navigation schemes are used to browse objects (or records) that have multiple properties. However, because XML is semi-structured in nature, it is not straightforward to apply faceted navigation to XML data. Specifically, we need to cope with three major technical issues: 1) objects in XML data are not predetermined, 2) objects may have flexible and/or recursive structures, and 3) properties of an object need to be automatically detected and extracted. To address these problems, we formulate faceted navigation over XML data by defining class, property, object, and facet in XML data. We then formulate typical user interactions in faceted navigation as operations over these concepts (class, object, and facet). We also propose a framework based on these definitions and operations, and construct a prototype system based on the framework. Finally, we report experimental evaluations using the prototype system to show the effectiveness of the proposed scheme.

international database engineering and applications symposium | 2016

Visual Spatial-OLAP for Vehicle Recorder Data on Micro-sized Electric Vehicles

Takahiro Komamizu; Toshiyuki Amagasa; Hiroyuki Kitagawa

Analyzing vehicle recorder data of electric vehicles (EVs) reveals how the EVs are used. This paper proposes an OLAP framework to support the analysis of trajectories in vehicle recorder data and applies the framework to vehicle recorder data of EVs. The framework consists of an ETL (extract, transform, and load) process for trajectory data and a visualization for analyzing the data. The ETL process includes hierarchy definitions for the spatial and temporal dimensions, as well as aggregation functions for trajectory data. In the subsequent visualization phase, the framework displays the results of OLAP operations on a map interface. To confirm the applicability of the framework to real applications, we apply it to vehicle recorder data of micro-sized EVs (μEVs), which are small EVs that carry one or two people including the driver and can travel at most 100 km without recharging. The application shows that the framework successfully enables analyses of the trajectory data for real analytic requirements.
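
As an illustration of the kind of roll-up such a framework performs, the following minimal Python sketch aggregates trajectory points along a spatial hierarchy (grid cells of configurable size) and a temporal hierarchy (hour or day). It is not the paper's implementation: the field names (ts, lat, lon, km), the grid-based spatial hierarchy, the summed-distance measure, and the sample points are all assumptions made up for this example.

```python
import math
from collections import defaultdict
from datetime import datetime


def time_key(ts, level):
    """Map a timestamp to a member of the chosen temporal level."""
    if level == "hour":
        return ts.strftime("%Y-%m-%d %H:00")
    if level == "day":
        return ts.strftime("%Y-%m-%d")
    raise ValueError(f"unknown temporal level: {level}")


def roll_up(points, cell_deg, level):
    """Sum travelled distance per (grid cell, time member)."""
    cube = defaultdict(float)
    for p in points:
        # Spatial level: a square grid cell of cell_deg degrees (assumed hierarchy).
        cell = (math.floor(p["lat"] / cell_deg), math.floor(p["lon"] / cell_deg))
        ts = datetime.fromisoformat(p["ts"])
        cube[(cell, time_key(ts, level))] += p["km"]
    return dict(cube)


# Hypothetical sample points, made up for illustration only.
POINTS = [
    {"ts": "2016-07-11T08:15:00", "lat": 36.085, "lon": 140.105, "km": 0.4},
    {"ts": "2016-07-11T08:40:00", "lat": 36.086, "lon": 140.106, "km": 0.6},
    {"ts": "2016-07-11T09:05:00", "lat": 36.095, "lon": 140.105, "km": 1.0},
]

by_hour = roll_up(POINTS, cell_deg=0.01, level="hour")  # fine-grained view
by_day = roll_up(POINTS, cell_deg=0.1, level="day")     # rolled up on both dimensions
```

Calling roll_up with a coarser cell size and the "day" level corresponds to a roll-up on both dimensions; each resulting cell could then be drawn on a map.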

high performance computing and communications | 2016

Towards Real-Time Analysis of Smart City Data: A Case Study on City Facility Utilizations

Takahiro Komamizu; Toshiyuki Amagasa; Salman Ahmed Shaikh; Hiroaki Shiokawa; Hiroyuki Kitagawa

Analyzing smart city data can reveal various important facts about cities, and smart city initiatives have therefore attracted much attention from computer science researchers as well as data scientists. Analyzing smart city data in real time is an important scenario of smart city data analysis. Real-time analysis can raise alerts about emergencies (e.g., traffic accidents, squalls), recommend useful information (e.g., facilities usable at a given time, routes to destinations), and so forth. In this paper, we develop a real-time analytical system for smart city data based on StreamOLAP (an OLAP system for streaming data), which enables OLAP-style analytics over streaming data, and apply it to real-world smart city data (namely, city facility utilization data). The application shows that our system is feasible for analyzing smart city data in real time.

International Journal of Web Information Systems | 2016

H-SPOOL: A SPARQL-based ETL framework for OLAP over linked data with dimension hierarchy extraction

Takahiro Komamizu; Toshiyuki Amagasa; Hiroyuki Kitagawa

Purpose: Linked data (LD) has promoted the publishing of information and links published information. There is an increasing number of LD datasets containing numerical data such as statistics. For this reason, analyzing numerical facts in LD has attracted attention from diverse domains. This paper aims to support analytical processing of LD. Design/methodology/approach: This paper proposes a framework called H-SPOOL, which provides a series of SPARQL (SPARQL Protocol and RDF Query Language) queries that extract objects and attributes from LD datasets, converts them into star/snowflake schemas, and materializes relevant triples as fact and dimension tables for online analytical processing (OLAP). Findings: The applicability of H-SPOOL is evaluated using existing LD datasets on the Web, and H-SPOOL successfully performs ETL (extract, transform, and load) on the LD datasets for OLAP. In addition, experiments show that H-SPOOL reduces the number of downloaded triples compared with an existing approach. Originality/value: H-SPOOL is the first work on extracting OLAP-related information from SPARQL endpoints, and it drastically reduces the number of downloaded triples.

information integration and web-based applications & services | 2015

SPOOL: a SPARQL-based ETL framework for OLAP over linked data

Takahiro Komamizu; Toshiyuki Amagasa; Hiroyuki Kitagawa

Linked Data (LD) has promoted the publishing of information and links published information (e.g., vocabularies and facts) for reuse. There is an increasing number of LD datasets containing numerical data such as statistics. Analyses using such data require dedicated programs to extract, transform, and load (ETL) the data in preparation, which demands considerable developer effort. In addition, LD datasets tend to be large, and their dumps (or snapshots) easily become outdated because the datasets are updated frequently. Hence, downloading dumps of LD datasets to perform ETL for OLAP can miss the latest records. This paper proposes a framework called SPOOL, which attempts to reduce this effort and to perform ETL on the latest numerical records from LD datasets for OLAP through SPARQL endpoints, without downloading whole datasets. SPOOL provides a series of SPARQL queries that extract objects and attributes from LD datasets, converts them into star/snowflake schemas, and materializes relevant triples as fact and dimension tables for OLAP. The applicability of SPOOL is evaluated using existing LD datasets on the Web, and SPOOL successfully performs ETL on the LD datasets for OLAP.

international database engineering and applications symposium | 2014

A scheme of automated object and facet extraction for faceted search over XML data

Takahiro Komamizu; Toshiyuki Amagasa; Hiroyuki Kitagawa

Applying faceted search to XML data enables users to search XML data in an interactive manner. However, applying faceted search is challenging, because it requires target subtrees (objects) and facets to be defined beforehand. Existing works assume that such objects and/or facets are defined manually, but manual specification is infeasible, in particular when the XML data are huge and/or their structure is complicated. To address this problem, this paper proposes a scheme for automatically extracting objects and facets from XML data. We propose two approaches, namely a frequency-based approach and a semantic-based approach, as well as a hybrid of the two. The basic ideas behind these approaches are that frequently occurring XML elements are likely to be objects and facets, and that such XML elements tend to have semantically meaningful names. Although the proposed approaches are rather simple, experiments using real-world XML data show that they can automatically extract objects and facets from the XML data.
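
The frequency-based idea can be illustrated with a short Python sketch that counts how often each element name occurs and keeps the frequent ones as candidates. This is only a rough illustration under stated assumptions, not the scheme evaluated in the paper: the threshold min_count, the function name frequent_elements, and the sample document are hypothetical, and the abstract does not say how the paper separates objects from facets.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def frequent_elements(xml_text, min_count=2):
    """Return (tag, count) pairs for element names seen at least min_count times."""
    root = ET.fromstring(xml_text)
    # root.iter() walks the root element and every descendant element.
    counts = Counter(elem.tag for elem in root.iter())
    return [(tag, n) for tag, n in counts.most_common() if n >= min_count]


# Hypothetical sample document, made up for illustration only.
SAMPLE = """
<library>
  <book><title>A</title><year>2011</year></book>
  <book><title>B</title><year>2014</year></book>
  <book><title>C</title><year>2014</year></book>
  <journal><title>D</title></journal>
</library>
"""

candidates = frequent_elements(SAMPLE)
```

On this sample, book, title, and year pass the threshold, while library and journal (one occurrence each) do not. A further step would still have to decide which candidates act as objects and which as facets.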

asia information retrieval symposium | 2017

FORK: Feedback-Aware ObjectRank-Based Keyword Search over Linked Data

Takahiro Komamizu; Sayami Okumura; Toshiyuki Amagasa; Hiroyuki Kitagawa

Ranking quality for keyword search over Linked Data (LD) is crucial when users look for entities in LD, since LD datasets have complicated structures as well as a large amount of content. This paper proposes a keyword search method, FORK, which ranks entities in LD by ObjectRank, a well-known link-structure analysis algorithm that can deal with different types of nodes and edges. A first attempt at applying ObjectRank to LD search reveals that ObjectRank with inappropriate settings gives worse ranking results than PageRank, which is equivalent to ObjectRank with identical authority transfer weights on all edges. Therefore, deriving appropriate authority transfer weights is the most important issue in making ObjectRank effective for LD search. FORK incorporates a relevance feedback algorithm that modifies the authority transfer weights according to users’ relevance judgements on ranking results. An experimental evaluation of ranking quality using an entity search benchmark demonstrates the effectiveness of FORK and shows that ObjectRank is a more suitable ranking method for LD search than PageRank and other baselines, including information retrieval techniques and graph analytic methods.
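
To make the role of authority transfer weights concrete, here is a simplified Python sketch of an ObjectRank-style power iteration in which each edge type carries its own weight. It is an illustration under assumptions, not FORK itself: the tiny graph, the edge type names (cites, written_by), the weight values, and the uniform base set are made up, and the relevance feedback step that adjusts the weights is not shown.

```python
from collections import defaultdict


def object_rank(edges, weights, damping=0.85, iters=50):
    """Score nodes by power iteration with per-edge-type authority transfer weights.

    edges:   list of (source, target, edge_type)
    weights: dict mapping edge_type to its authority transfer weight
    """
    nodes = sorted({n for s, d, _ in edges for n in (s, d)})
    # Out-degree is counted per (source, edge type), so each type's weight
    # is split among the source's outgoing edges of that type.
    out_deg = defaultdict(int)
    for s, _, t in edges:
        out_deg[(s, t)] += 1

    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        nxt = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for s, d, t in edges:
            nxt[d] += damping * weights[t] / out_deg[(s, t)] * score[s]
        score = nxt
    return score


# Hypothetical graph: two papers and one author.
EDGES = [
    ("p1", "p2", "cites"),
    ("p1", "a1", "written_by"),
    ("p2", "a1", "written_by"),
]

favor_citations = object_rank(EDGES, {"cites": 0.7, "written_by": 0.0})
favor_authors = object_rank(EDGES, {"cites": 0.0, "written_by": 0.7})
```

With these two weight settings the relative order of p2 and a1 flips, which is the sensitivity the abstract describes: the ranking depends heavily on how the weights are chosen.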

International Journal of Web Information Systems | 2012

Faceted navigation framework for XML data

Takahiro Komamizu; Toshiyuki Amagasa; Hiroyuki Kitagawa

Purpose: XML has become a standard data format for many applications, and efficient retrieval methods are required. There are roughly two kinds of retrieval methods, namely path-based methods (e.g. XPath and XQuery) and keyword search, but these methods do not work well when users do not have a concrete information need. The purpose of this paper is to expand the feasibility of XML data retrieval. Design/methodology/approach: The paper's strategy is to apply faceted navigation to XML data. Faceted navigation is an exploratory search technique that enables the exploration of data by means of attributes, called facets. General faceted navigation methods are applied to attributed objects, but XML data provide no criteria for distinguishing objects from facets, because XML nodes serve as both. Thus, the paper's approach is to construct a framework that enables faceted navigation over XML data. It first extracts objects based on occurrence of nodes and facets. Then it constructs a faceted navigation interface for extrac...

management of emergent digital ecosystems | 2017

SOLA: Stream OLAP-based Analytical Framework for Roadway Maintenance

Takahiro Komamizu; Toshiyuki Amagasa; Salman Ahmed Shaikh; Hiroaki Shiokawa; Hiroyuki Kitagawa

Maintaining infrastructure (e.g., roadways) is a critical issue for local governments. Data from physical devices and reports from citizens through social networks are helpful for observing the condition of infrastructure. This paper proposes a framework called SOLA for integrating and analysing data from multiple sources, including streaming data and static data, for roadway management. The framework integrates data from multiple sources using a stream OLAP architecture and analyses the integrated data by means of OLAP operations. This paper applies the framework to support roadway management by local governments and develops an application called SOLAR. SOLAR aims to provide historical views of roadway patrols as well as roadway statuses to assist in determining roadway patrol schedules. A real-world use case in a city demonstrates the applicability of SOLAR, with positive feedback from city officers. SOLA is a promising framework for big data analysis and smart city applications, as the number of data sources and the volume and speed of generated data increase in the era of big data and smart cities.

international conference on software engineering | 2017

Exploring Identical Users on GitHub and Stack Overflow

Takahiro Komamizu; Yasuhiro Hayase; Toshiyuki Amagasa; Hiroyuki Kitagawa

Analyzing the behaviours of developers on different platforms (in particular, GitHub and Stack Overflow in this paper) can reveal interesting facts related to development activities. There are only a few datasets for analysing cross-platform user behaviours, especially across GitHub and Stack Overflow. Users on GitHub and Stack Overflow can be identified by matching email addresses. In order to increase the number of identifiable users in these datasets, this paper retrieves potentially identical users between GitHub and Stack Overflow without relying only on email addresses. This paper employs classification-based link prediction, which formulates the user identification problem as a link prediction problem on the bipartite graph consisting of GitHub users and Stack Overflow users. With this identification method, the paper generates a probabilistic dataset containing pairs of users with probabilities (or confidences). The paper also publishes the identification tool to enable further data generation from future datasets of GitHub, Stack Overflow, and other platforms. The generated dataset and tool are expected to help accelerate research on mining software repositories.
