Publication


Featured research published by Xiaoyue Wang.


Very Large Data Bases | 2008

Querying and mining of time series data: experimental comparison of representations and distance measures

Hui Ding; Goce Trajcevski; Peter Scheuermann; Xiaoyue Wang; Eamonn J. Keogh

The last decade has witnessed a tremendous growth of interest in applications that deal with querying and mining of time series data. Numerous representation methods for dimensionality reduction and similarity measures geared towards time series have been introduced. Each individual work introducing a particular method has made specific claims and, aside from the occasional theoretical justifications, provided quantitative experimental observations. However, for the most part, the comparative aspects of these experiments were too narrowly focused on demonstrating the benefits of the proposed methods over some of the previously introduced ones. In order to provide a comprehensive validation, we conducted an extensive set of time series experiments re-implementing 8 different representation methods and 9 similarity measures and their variants, and testing their effectiveness on 38 time series data sets from a wide variety of application domains. In this paper, we give an overview of these different techniques and present our comparative experimental findings regarding their effectiveness. Our experiments have provided both a unified validation of some of the existing achievements, and in some cases, suggested that certain claims in the literature may be unduly optimistic.
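
To make the idea of comparing similarity measures concrete, here is a minimal sketch. The abstract does not name the measures or the evaluation protocol, so the choices below are assumptions made for illustration: Euclidean distance and unconstrained dynamic time warping (DTW) as the two measures, and one-nearest-neighbor accuracy as the score. The function names are hypothetical and are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Unconstrained dynamic time warping distance (squared point cost)."""
    inf = float("inf")
    m = len(b)
    # prev[j] holds the best cumulative cost of aligning a[:i-1] with b[:j].
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, len(a) + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])


def one_nn_accuracy(train, test, dist):
    """Fraction of test series whose nearest training series shares its label.

    train and test are lists of (series, label) pairs; this scoring rule is
    an assumption of the sketch, not something stated in the abstract.
    """
    correct = 0
    for series, label in test:
        nearest = min(train, key=lambda item: dist(series, item[0]))
        correct += nearest[1] == label
    return correct / len(test)


# Toy data: a peak that is shifted in time between training and test.
train = [([0, 1, 0, 0, 0], "peak"), ([0, 0, 0, 0, 0], "flat")]
test = [([0, 0, 0, 1, 0], "peak")]
print(one_nn_accuracy(train, test, euclidean))
print(one_nn_accuracy(train, test, dtw))
```

On this toy data the shifted peak is closer to the flat series under Euclidean distance, while DTW can warp the time axis to align the two peaks. A single hand-built example like this says nothing about which measure is better in general, which is the question the study addresses across many data sets.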


Data Mining and Knowledge Discovery | 2013

Experimental comparison of representation methods and distance measures for time series data

Xiaoyue Wang; Abdullah Mueen; Hui Ding; Goce Trajcevski; Peter Scheuermann; Eamonn J. Keogh

The previous decade has brought a remarkable increase in interest in applications that deal with querying and mining of time series data. Many of the research efforts in this context have focused on introducing new representation methods for dimensionality reduction or novel similarity measures for the underlying data. In the vast majority of cases, each individual work introducing a particular method has made specific claims and, aside from the occasional theoretical justifications, provided quantitative experimental observations. However, for the most part, the comparative aspects of these experiments were too narrowly focused on demonstrating the benefits of the proposed methods over some of the previously introduced ones. In order to provide a comprehensive validation, we conducted an extensive experimental study re-implementing eight different time series representations and nine similarity measures and their variants, and testing their effectiveness on 38 time series data sets from a wide variety of application domains. In this article, we give an overview of these different techniques and present our comparative experimental findings regarding their effectiveness. In addition to providing a unified validation of some of the existing achievements, our experiments also indicate that, in some cases, certain claims in the literature may be unduly optimistic.


Data Mining and Knowledge Discovery | 2011

An efficient and effective similarity measure to enable data mining of petroglyphs

Qiang Zhu; Xiaoyue Wang; Eamonn J. Keogh; Sang-Hee Lee

Rock art is an archaeological term for human-made markings on stone, including carved markings, known as petroglyphs, and painted markings, known as pictographs. It is believed that there are millions of petroglyphs in North America alone, and the study of this valued cultural resource has implications even beyond anthropology and history. Surprisingly, although image processing, information retrieval and data mining have had a large impact on many human endeavors, they have had essentially zero impact on the study of rock art. In this work we identify the reasons for this, and introduce a novel distance measure and algorithms which allow efficient and effective data mining of large collections of rock art.


Knowledge Discovery and Data Mining | 2009

Augmenting the generalized Hough transform to enable the mining of petroglyphs

Qiang Zhu; Xiaoyue Wang; Eamonn J. Keogh; Sang-Hee Lee

Rock art is an archaeological term for human-made markings on stone. It is believed that there are millions of petroglyphs in North America alone, and the study of this valued cultural resource has implications even beyond anthropology and history. Surprisingly, although image processing, information retrieval and data mining have had large impacts on many human endeavors, they have had essentially zero impact on the study of rock art. In this work we identify the reasons for this, and introduce a novel distance measure and algorithms which allow efficient and effective data mining of large collections of rock art.


International Conference on Tools with Artificial Intelligence | 2008

Real-Time Classification of Streaming Sensor Data

Shashwati Kasetty; Candice A. Stafford; G. P. Walker; Xiaoyue Wang; Eamonn J. Keogh

The last decade has seen a huge interest in classification of time series. Most of this work assumes that the data resides in main memory and is processed offline. However, recent advances in sensor technologies require resource-efficient algorithms that can be implemented directly on the sensors as real-time algorithms. We show how a recently introduced framework for time series classification, time series bitmaps, can be implemented as efficient classifiers which can be updated in constant time and space in the face of very high data arrival rates. We describe results from a case study of an important entomological problem, and further demonstrate the generality of our ideas with an example from robotics.
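
The constant time and space update mentioned above can be sketched as follows. This is an illustrative reading, not the paper's implementation: it assumes incoming values are already normalized, that they are discretized with fixed breakpoints, and that the "bitmap" is a table of counts of short symbol words inside a sliding window. The class name, parameters and method names are hypothetical.

```python
from collections import deque


class StreamingBitmap:
    """Sliding-window counts of short symbol words from a numeric stream."""

    def __init__(self, breakpoints, word_len, window):
        self.breakpoints = breakpoints  # sorted thresholds between symbols
        self.word_len = word_len        # number of symbols per word
        self.window = window            # number of most recent words kept
        self.symbols = deque(maxlen=word_len)
        self.words = deque()
        self.counts = {}

    def _symbol(self, value):
        # Index of the interval the value falls into (0 .. len(breakpoints)).
        return sum(value > b for b in self.breakpoints)

    def update(self, value):
        """Add one value; work per call does not depend on stream length."""
        self.symbols.append(self._symbol(value))
        if len(self.symbols) < self.word_len:
            return
        word = tuple(self.symbols)
        self.words.append(word)
        self.counts[word] = self.counts.get(word, 0) + 1
        if len(self.words) > self.window:
            # Forget the oldest word so memory stays bounded by the window.
            old = self.words.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def frequencies(self):
        """Relative frequency of each word currently in the window."""
        total = len(self.words)
        return {word: count / total for word, count in self.counts.items()}


bitmap = StreamingBitmap(breakpoints=[0.0], word_len=2, window=3)
for value in [-1.0, 1.0, 1.0, -1.0, -1.0]:
    bitmap.update(value)
print(bitmap.frequencies())
```

Each call to update touches one new word and at most one expired word, so the cost per arriving value depends only on the fixed word length and number of breakpoints. A classifier built on this would compare the frequency table against reference tables for each class; that step is left out here because the abstract does not describe it.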


ACM/IEEE Joint Conference on Digital Libraries | 2008

Annotating historical archives of images

Xiaoyue Wang; Lexiang Ye; Eamonn J. Keogh; Christian R. Shelton

Recent initiatives like the Million Book Project and Google Print Library Project have already archived several million books in digital format, and within a few years a significant fraction of the world's books will be online. While the majority of the data will naturally be text, there will also be tens of millions of pages of images. Many of these images will defy automated annotation for the foreseeable future, but a considerable fraction of the images may be amenable to automatic annotation by algorithms that can link the historical image with a modern contemporary, with its attendant metatags. In order to perform this linking we must have a suitable distance measure which appropriately combines the relevant features of shape, color, texture and text. However, the best combination of these features will vary from application to application and even from one manuscript to another. In this work we propose a simple technique to learn the distance measure by perturbing the training set in a principled way. We show the utility of our ideas on archives of manuscripts containing images from natural history and cultural artifacts.
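
A distance measure that combines shape, color, texture and text can be pictured as a weighted sum of per-feature distances. The sketch below is only that picture: the weight search is a plain grid search over a margin score, swapped in for illustration, and is not the perturbation technique the abstract refers to. All names, the input format and the scoring rule are assumptions.

```python
from itertools import product

FEATURES = ("shape", "color", "texture", "text")


def combined_distance(dists, weights):
    """Weighted sum of per-feature distances (both dicts keyed by feature)."""
    return sum(weights[f] * dists[f] for f in FEATURES)


def pick_weights(pairs, steps=4):
    """Grid-search weights (summing to 1) that best separate the two groups.

    pairs is a list of (per_feature_distances, is_match) tuples. The score is
    the gap between the closest non-matching pair and the farthest matching
    pair, a simple stand-in chosen for this sketch.
    """
    grid = [i / steps for i in range(steps + 1)]
    best, best_score = None, float("-inf")
    for combo in product(grid, repeat=len(FEATURES)):
        if abs(sum(combo) - 1.0) > 1e-9:
            continue
        weights = dict(zip(FEATURES, combo))
        match = [combined_distance(d, weights) for d, m in pairs if m]
        other = [combined_distance(d, weights) for d, m in pairs if not m]
        if not match or not other:
            raise ValueError("need both matching and non-matching pairs")
        score = min(other) - max(match)
        if score > best_score:
            best, best_score = weights, score
    return best


# Toy collection in which color separates matches and shape is misleading.
pairs = [
    ({"shape": 0.9, "color": 0.1, "texture": 0.5, "text": 0.5}, True),
    ({"shape": 0.1, "color": 0.9, "texture": 0.5, "text": 0.5}, False),
]
weights = pick_weights(pairs)
print(weights)
```

The toy data is built so that color alone separates the matching pair from the non-matching one, which is why the search puts all the weight on color. A different collection would yield different weights, in line with the point above that the best combination varies from one manuscript to another.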


International Journal of Digital Library Systems | 2010

Annotating Historical Archives of Images

Eamonn J. Keogh; Christian R. Shelton; Lexiang Ye; Xiaoyue Wang

Recent programs like the Million Book Project and Google Print Library Project have archived several million books in digital format, and within a few years a significant fraction of the world's books will be online. While the majority of the data will naturally be text, there will also be tens of millions of pages of images. Many of these images will defy automated annotation for the foreseeable future, but a considerable fraction of the images may be amenable to automatic annotation by algorithms that can link the historical image with a modern contemporary, with its attendant metatags. To perform this linking, there must be a suitable distance measure that appropriately combines the relevant features of shape, color, texture and text. However, the best combination of these features will vary from application to application and even from one manuscript to another. In this work, the authors propose a simple technique to learn the distance measure by perturbing the training set in a principled way.


International Symposium on Multimedia | 2009

Augmenting Historical Manuscripts with Automatic Hyperlinks

Xiaoyue Wang; Eamonn J. Keogh

Hyperlinks are so useful for searching and browsing modern digital collections that researchers have long wondered if it is possible to retroactively add hyperlinks to digitized historical documents. There has already been significant research into this endeavor for historical text; however, in this work we consider the problem of adding hyperlinks among graphic elements. While such a system would not have the ubiquitous utility of text-based hyperlinks, as we will show, there are several domains where it can significantly augment textual information. While OCR of historical text is known to be a difficult problem, the actual words themselves are inherently discrete. Thus, two words are either identical or not. This means that off-the-shelf machine learning algorithms, including semi-supervised learning, can be easily used. However, as we shall demonstrate, semi-supervised learning does not work well with images, because we cannot expect binary matching decisions. Rather, we must deal with degrees of matching. In this work we make the novel observation that working with a "degree of matching" biases algorithms toward overly confident predictions about simple shapes. We show a simple technique for correcting this bias, and demonstrate through extensive experiments that our method significantly improves accuracy on diverse historical image collections.


ACM/IEEE Joint Conference on Digital Libraries | 2009

Finding centuries-old hyperlinks with a novel semi-supervised learning technique

Xiaoyue Wang; Eamonn J. Keogh

Hyperlinks are so useful for searching and browsing modern digital collections that researchers have long wondered if it is possible to retroactively add hyperlinks to digitized historical documents. There has already been significant research into this endeavor for historical text; however, in this work we consider the problem of adding hyperlinks among graphic elements. While such a system would not have the ubiquitous utility of text-based hyperlinks, there are several domains where it could significantly augment textual information.


SIAM International Conference on Data Mining | 2011

A Complexity-Invariant Distance Measure for Time Series

Gustavo E. A. P. A. Batista; Xiaoyue Wang; Eamonn J. Keogh

Collaboration


Top Co-Authors

Lexiang Ye

University of California

Hui Ding

Northwestern University

Qiang Zhu

University of California

Sang-Hee Lee

University of California

Abdullah Mueen

University of New Mexico