Yasushi Sakurai | Researchain

Publication

Featured researches published by Yasushi Sakurai.

knowledge discovery and data mining | 2012

Rise and fall patterns of information diffusion: model and implications

Yasuko Matsubara; Yasushi Sakurai; B. Aditya Prakash; Lei Li; Christos Faloutsos

The recent explosion in the adoption of search engines and new media such as blogs and Twitter has facilitated faster propagation of news and rumors. How quickly does a piece of news spread over these media? How does its popularity diminish over time? Does the rising and falling pattern follow a simple universal law? In this paper, we propose SpikeM, a concise yet flexible analytical model for the rise and fall patterns of influence propagation. Our model has the following advantages: (a) unification power: it generalizes and explains earlier theoretical models and empirical observations; (b) practicality: it matches the observed behavior of diverse sets of real data; (c) parsimony: it requires only a handful of parameters; and (d) usefulness: it enables further analytics tasks such as forecasting, spotting anomalies, and interpretation by reverse-engineering the system parameters of interest (e.g., quality of news, count of interested bloggers, etc.). Using SpikeM, we analyzed 7.2GB of real data, most of which were collected from the public domain. We have shown that our SpikeM model accurately and succinctly describes all the patterns of the rise-and-fall spikes in these real datasets.

symposium on principles of database systems | 2005

FTW: fast similarity search under the time warping distance

Yasushi Sakurai; Masatoshi Yoshikawa; Christos Faloutsos

Time-series data naturally arise in countless domains, such as meteorology, astrophysics, geology, multimedia, and economics. Similarity search is very popular, and DTW (Dynamic Time Warping) is one of the two prevailing distance measures. Although DTW incurs a heavy computation cost, it provides scaling along the time axis. In this paper, we propose FTW (Fast search method for dynamic Time Warping), which guarantees no false dismissals in similarity query processing. FTW efficiently prunes a significant portion of the search cost. Experiments on real and synthetic sequence data sets reveal that FTW is significantly faster than the best existing method, by up to 222 times.
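For reference, the DTW distance that the abstract treats as the costly baseline can be sketched as a textbook dynamic program. This is a minimal illustration of plain DTW only, not of FTW or its pruning; the function name `dtw_distance` and the squared-difference local cost are assumptions made for the example.

```python
def dtw_distance(x, y):
    """Textbook DTW between two numeric sequences.

    Assumption: the local cost is the squared difference of the two values.
    Runs in O(len(x) * len(y)) time and keeps only two rows in memory.
    """
    inf = float("inf")
    n, m = len(x), len(y)
    # prev[j] holds the best cost of aligning x[:i-1] with y[:j].
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # two empty prefixes align at zero cost
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            # Extend the cheapest of the three allowed warping steps.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


if __name__ == "__main__":
    # The second sequence repeats one value, which warping absorbs.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

The quadratic cost of filling this table for every candidate sequence is the expense that a pruning method for similarity search aims to avoid.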

international conference on data engineering | 2007

Stream Monitoring under the Time Warping Distance

Yasushi Sakurai; Christos Faloutsos; Masashi Yamamuro

The goal of this paper is to monitor numerical streams, and to find subsequences that are similar to a given query sequence, under the DTW (dynamic time warping) distance. Applications include word spotting, sensor pattern matching, monitoring of bio-medical signals (e.g., EKG, ECG), and monitoring of environmental (seismic and volcanic) signals. DTW is a very popular distance measure, permitting accelerations and decelerations, and it has been studied for finite, stored sequence sets. However, in many applications such as network analysis and sensor monitoring, massive amounts of data arrive continuously and it is infeasible to save all the historical data. We propose SPRING, a novel algorithm that can solve the problem. We provide a theoretical analysis and prove that SPRING does not sacrifice accuracy, while it requires constant space and time per time-tick. These are dramatic improvements over the naive method. Our experiments on real and realistic data illustrate that SPRING does indeed detect the qualifying subsequences correctly and that it can offer dramatic improvements in speed over the naive implementation.
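The idea of matching a query against a stream without storing the stream can be illustrated with a simplified sketch. It is not the SPRING algorithm as published: it only tracks, at each tick, the best DTW match that ends at that tick, and it omits any logic for reporting non-overlapping matches. The class name `StreamingSubsequenceDTW`, the squared-difference cost, and the return format are assumptions made for the example.

```python
class StreamingSubsequenceDTW:
    """Best DTW match of a fixed query against subsequences ending 'now'.

    Keeps one column of the warping table plus the start tick of each
    partial match, so memory depends on the query length only, not on
    how much of the stream has been seen.
    """

    def __init__(self, query):
        self.query = list(query)
        m = len(self.query)
        self.dist = [float("inf")] * m  # cost of matching query[:i+1]
        self.start = [0] * m            # tick where that partial match began
        self.tick = 0

    def update(self, value):
        """Consume one stream value; return (distance, start_tick, end_tick)."""
        t = self.tick
        m = len(self.query)
        new_dist = [0.0] * m
        new_start = [t] * m
        for i in range(m):
            cost = (value - self.query[i]) ** 2
            if i == 0:
                # A match may begin at any tick, so the first query
                # element always starts a fresh candidate here.
                new_dist[0] = cost
                new_start[0] = t
                continue
            # Three warping steps: same tick, previous tick, diagonal.
            candidates = [
                (new_dist[i - 1], new_start[i - 1]),
                (self.dist[i], self.start[i]),
                (self.dist[i - 1], self.start[i - 1]),
            ]
            best, origin = min(candidates, key=lambda c: c[0])
            new_dist[i] = cost + best
            new_start[i] = origin
        self.dist, self.start = new_dist, new_start
        self.tick += 1
        return (new_dist[-1], new_start[-1], t)


if __name__ == "__main__":
    monitor = StreamingSubsequenceDTW([1, 2, 3])
    for v in [5, 5, 1, 2, 2, 3, 5]:
        print(monitor.update(v))
```

Each call to `update` does work proportional to the query length, so the per-tick cost stays fixed however long the stream runs.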

international conference on management of data | 2005

BRAID: stream mining through group lag correlations

Yasushi Sakurai; Spiros Papadimitriou; Christos Faloutsos

The goal is to monitor multiple numerical streams, and determine which pairs are correlated with lags, as well as the value of each such lag. Lag correlations (and anti-correlations) are frequent, and very interesting in practice: For example, a decrease in interest rates typically precedes an increase in house sales by a few months; higher amounts of fluoride in the drinking water may lead to fewer dental cavities, some years later. Additional settings include network analysis, sensor monitoring, financial data analysis, and moving object tracking. Such data streams are often correlated (or anti-correlated), but with an unknown lag. We propose BRAID, a method to detect lag correlations between data streams. BRAID can handle data streams of semi-infinite length, incrementally, quickly, and with small resource consumption. We also provide a theoretical analysis, which, based on Nyquist's sampling theorem, shows that BRAID can estimate lag correlations with little, and often with no error at all. Our experiments on real and realistic data show that BRAID detects the correct lag perfectly most of the time (the largest relative error was about 1%); while it is up to 40,000 times faster than the naive implementation.
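To make the notion of a lag correlation concrete, the brute-force approach can be sketched as follows: compute the Pearson correlation between one sequence and a delayed copy of the other for every candidate lag, then keep the strongest. This illustrates only a naive baseline on stored data, not BRAID; the function names `pearson` and `best_lag` and the `max_lag` parameter are assumptions made for the example.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant sequence has no defined correlation
    return cov / math.sqrt(var_a * var_b)


def best_lag(x, y, max_lag):
    """Return (lag, r) maximizing |r| between x[t] and y[t + lag].

    Brute force: every lag rescans the overlapping part of both
    sequences, so the cost grows with max_lag times the length.
    """
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        n = min(len(x), len(y) - lag)
        if n < 2:
            break  # too little overlap to correlate
        r = pearson(x[:n], y[lag:lag + n])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


if __name__ == "__main__":
    x = [1, 3, 2, 5, 4, 7, 6, 9, 8, 11]
    y = [0, 0] + x  # y repeats x two ticks later
    print(best_lag(x, y, max_lag=3))
```

Because this version needs the full history and rescans it for each lag, it does not carry over to streams of unbounded length.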

knowledge discovery and data mining | 2010

Online multiscale dynamic topic models

Tomoharu Iwata; Takeshi Yamada; Yasushi Sakurai; Naonori Ueda

We propose an online topic model for sequentially analyzing the time evolution of topics in document collections. Topics naturally evolve with multiple timescales. For example, some words may be used consistently over one hundred years, while other words emerge and disappear over periods of a few days. Thus, in the proposed model, current topic-specific distributions over words are assumed to be generated based on the multiscale word distributions of the previous epoch. Considering both the long-timescale dependency as well as the short-timescale dependency yields a more robust model. We derive efficient online inference procedures based on a stochastic EM algorithm, in which the model is sequentially updated using newly obtained data; this means that past data are not required to make the inference. We demonstrate the effectiveness of the proposed method in terms of predictive performance and computational efficiency by examining collections of real documents with timestamps.

international conference on pervasive computing | 2010

Object-based activity recognition with heterogeneous sensors on wrist

Takuya Maekawa; Yutaka Yanagisawa; Yasue Kishino; Katsuhiko Ishiguro; Koji Kamei; Yasushi Sakurai; Takeshi Okadome

This paper describes how we recognize activities of daily living (ADLs) with our designed sensor device, which is equipped with heterogeneous sensors such as a camera, a microphone, and an accelerometer and attached to a user's wrist. Specifically, capturing the space around the user's hand by employing the camera on the wrist-mounted device enables us to recognize ADLs that involve the manual use of objects, such as making tea or coffee and watering plants. Existing wearable sensor devices equipped only with a microphone and an accelerometer cannot recognize these ADLs without object-embedded sensors. We also propose an ADL recognition method that takes privacy issues into account because the camera and microphone can capture aspects of a user's private life. We confirmed experimentally that the incorporation of a camera could significantly improve the accuracy of ADL recognition.

knowledge discovery and data mining | 2012

Fast mining and forecasting of complex time-stamped events

Yasuko Matsubara; Yasushi Sakurai; Christos Faloutsos; Tomoharu Iwata; Masatoshi Yoshikawa

Given huge collections of time-evolving events such as web-click logs, which consist of multiple attributes (e.g., URL, userID, timestamp), how do we find patterns and trends? How do we go about capturing daily patterns and forecasting future events? We need two properties: (a) effectiveness, that is, the patterns should help us understand the data, discover groups, and enable forecasting, and (b) scalability, that is, the method should be linear with the data size. We introduce TriMine, which performs three-way mining for all three attributes, namely, URLs, users, and time. Specifically, TriMine discovers hidden topics, groups of URLs, and groups of users, simultaneously. Thanks to its concise but effective summarization, it makes it possible to accomplish the most challenging and important task, namely, to forecast future events. Extensive experiments on real datasets demonstrate that TriMine discovers meaningful topics and makes long-range forecasts, which are notoriously difficult to achieve. In fact, TriMine consistently outperforms the best state-of-the-art existing methods in terms of accuracy and execution speed (up to 74x faster).

international conference on management of data | 2014

AutoPlait: automatic mining of co-evolving time sequences

Yasuko Matsubara; Yasushi Sakurai; Christos Faloutsos

Given a large collection of co-evolving multiple time-series, which contains an unknown number of patterns of different durations, how can we efficiently and effectively find typical patterns and the points of variation? How can we statistically summarize all the sequences, and achieve a meaningful segmentation? In this paper we present AutoPlait, a fully automatic mining algorithm for co-evolving time sequences. Our method has the following properties: (a) effectiveness: it operates on large collections of time-series, and finds similar segment groups that agree with human intuition; (b) scalability: it is linear with the input size, and thus scales up very well; and (c) AutoPlait is parameter-free, and requires no user intervention, no prior training, and no parameter tuning. Extensive experiments on 67GB of real datasets demonstrate that AutoPlait does indeed detect meaningful patterns correctly, and it outperforms state-of-the-art competitors as regards accuracy and speed: AutoPlait achieves near-perfect, over 95% precision and recall, and it is up to 472 times faster than its competitors.

knowledge discovery and data mining | 2014

FUNNEL: automatic mining of spatially coevolving epidemics

Yasuko Matsubara; Yasushi Sakurai; Willem G. van Panhuis; Christos Faloutsos

Given a large collection of epidemiological data consisting of the count of d contagious diseases for l locations of duration n, how can we find patterns, rules and outliers? For example, Project Tycho provides open access to the count of infections for U.S. states from 1888 to 2013, for 56 contagious diseases (e.g., measles, influenza), which include missing values, possible recording errors, sudden spikes (or dives) of infections, etc. So how can we find a combined model, for all these diseases, locations, and time-ticks? In this paper, we present FUNNEL, a unifying analytical model for large scale epidemiological data, as well as a novel fitting algorithm, FUNNELFIT, which solves the above problem. Our method has the following properties: (a) Sense-making: it detects important patterns of epidemics, such as periodicities, the appearance of vaccines, external shock events, and more; (b) Parameter-free: our modeling framework frees the user from providing parameter values; (c) Scalable: FUNNELFIT is carefully designed to be linear on the input size; (d) General: our model is general and practical, and can be applied to various types of epidemics, including computer-virus propagation, as well as human diseases. Extensive experiments on real data demonstrate that FUNNELFIT does indeed discover important properties of epidemics: (P1) disease seasonality, e.g., influenza spikes in January, Lyme disease spikes in July and the absence of yearly periodicity for gonorrhea; (P2) disease reduction effect, e.g., the appearance of vaccines; (P3) local/state-level sensitivity, e.g., many measles cases in NY; (P4) external shock events, e.g., historical flu pandemics; (P5) incongruous values, i.e., data reporting errors.

IEEE Pervasive Computing | 2008

Object-Blog System for Environment-Generated Content

Takuya Maekawa; Yutaka Yanagisawa; Yasue Kishino; Koji Kamei; Yasushi Sakurai; Takeshi Okadome

The object-blog service application automatically converts raw sensor data to environment-generated content (EGC), including texts, graphs, and figures. This conversion facilitates data searching and browsing. Generated content can serve several purposes, including memory aids, security, and communication media. In object-blog, personified objects automatically post entries to a Weblog about sensor data obtained from sensors attached to the objects. Feedback thus far from participants working with object-blog in an experimental environment has been positive.
