
Publication


Featured research published by Bo Kang.


Knowledge Discovery and Data Mining | 2016

Subjectively Interesting Component Analysis: Data Projections that Contrast with Prior Expectations

Bo Kang; Jefrey Lijffijt; Raul Santos-Rodriguez; Tijl De Bie

Methods that find insightful low-dimensional projections are essential to effectively explore high-dimensional data. Principal Component Analysis is used pervasively to find low-dimensional projections, not only because it is straightforward to use, but also because it is often effective: the variance in data is often dominated by relevant structure. However, even if the projections highlight real structure in the data, not all structure is interesting to every user. If a user is already aware of, or not interested in, the dominant structure, Principal Component Analysis is less effective for finding interesting components. We introduce a new method called Subjectively Interesting Component Analysis (SICA), designed to find data projections that are subjectively interesting, i.e., projections that truly surprise the end-user. It is rooted in information theory and employs an explicit model of a user's prior expectations about the data. The corresponding optimization problem is a simple eigenvalue problem, and the result is a trade-off between explained variance and novelty. We present five case studies on synthetic data, images, time-series, and spatial data, to illustrate how SICA enables users to find (subjectively) interesting projections.
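To make the idea of a projection that contrasts with prior expectations concrete, here is a minimal Python sketch. It is not the SICA objective, which the abstract does not spell out. It assumes, purely for illustration, that the user's expectation can be summarised as a prior covariance matrix, and it finds the direction in which the observed variance most exceeds the expected variance by solving a generalized eigenvalue problem. The function name `most_surprising_direction` and the `prior_cov` parameter are hypothetical.

```python
import numpy as np

def most_surprising_direction(X, prior_cov):
    """Return the unit direction w maximising (w' S w) / (w' P w).

    S is the sample covariance of X and P is an assumed prior covariance
    (an illustrative stand-in for a user's expectations, not the SICA model).
    """
    Xc = X - X.mean(axis=0)                 # centre the data
    S = Xc.T @ Xc / (len(Xc) - 1)           # sample covariance
    L = np.linalg.cholesky(prior_cov)       # P = L L^T
    L_inv = np.linalg.inv(L)
    M = L_inv @ S @ L_inv.T                 # covariance after whitening by the prior
    vals, vecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    u = vecs[:, -1]                         # eigenvector of the largest ratio
    w = L_inv.T @ u                         # map back to the original coordinates
    return w / np.linalg.norm(w), vals[-1]

# Synthetic data: the first axis dominates the variance, but the user already
# expects that. Only the third axis has more variance than the prior predicts.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3)) * np.array([10.0, 1.0, 1.0])
prior = np.diag([100.0, 1.0, 0.25])
w, ratio = most_surprising_direction(X, prior)
```

In this toy setting plain PCA would return the first axis, whereas the sketch returns a direction close to the third axis, the only one that deviates from the assumed prior.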


IEEE International Conference on Data Science and Advanced Analytics | 2015

P-N-RMiner: A generic framework for mining interesting structured relational patterns

Jefrey Lijffijt; Eirini Spyropoulou; Bo Kang; Tijl De Bie

Methods for local pattern mining are fragmented along two dimensions: the pattern syntax, and the data types on which they are applicable. Pattern syntaxes include subgroups, n-sets, itemsets, and many more; common data types include binary, categorical, and real-valued. Recent research on relational pattern mining has shown how the aforementioned pattern syntaxes can be unified in a single framework. However, a unified model to deal with various data types is lacking, certainly for more complexly structured types such as real numbers, time of day (which is circular), geographical location, terms from a taxonomy, etc. We introduce P-N-RMiner, a generic tool for mining interesting local patterns in (relational) data with structured attributes. We show how to handle the attribute structures in a generic manner, by modelling them as partial orders. We also derive an information-theoretic subjective interestingness measure for such patterns and present an algorithm to efficiently enumerate the patterns. We find that (1) P-N-RMiner finds patterns that are substantially more informative, (2) the new interestingness measure cannot be approximated using existing methods, and (3) we can leverage the partial orders to speed up enumeration.
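As an illustration of what modelling structured attributes as partial orders can look like, the following Python sketch orders real-valued intervals and circular time-of-day arcs by containment. It is an assumption-laden toy, not P-N-RMiner's data model: the class names `Interval` and `Arc`, and the choice of containment as the order relation, are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] on the real line (hypothetical helper)."""
    lo: float
    hi: float

    def __le__(self, other):
        # self <= other when self is contained in other
        return other.lo <= self.lo and self.hi <= other.hi

@dataclass(frozen=True)
class Arc:
    """Arc on a 24-hour clock: begins at `start` and runs `length` hours forward."""
    start: float   # hour of day in [0, 24)
    length: float  # duration in hours, in [0, 24]

    def __le__(self, other):
        # An arc covering the full day contains every arc. Full-day arcs with
        # different starts are treated as equivalent in this sketch.
        if other.length >= 24:
            return True
        # Offset of self's start measured clockwise from other's start
        offset = (self.start - other.start) % 24
        return offset + self.length <= other.length

night = Arc(start=22, length=4)          # 22:00 to 02:00, wraps past midnight
around_midnight = Arc(start=23, length=2)  # 23:00 to 01:00
early_morning = Arc(start=1, length=2)     # 01:00 to 03:00
```

Here `around_midnight <= night` holds although 23 > 1 numerically, which is the point of handling the circular structure explicitly. `early_morning` and `night` are incomparable: neither contains the other, so the order is only partial.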


European Conference on Machine Learning | 2016

Interactive Visual Data Exploration with Subjective Feedback

Kai Puolamäki; Bo Kang; Jefrey Lijffijt; Tijl De Bie

Data visualization and iterative/interactive data mining are growing rapidly in attention, both in research and in industry. However, integrated methods and tools that combine advanced visualization and data mining techniques are rare, and those that exist are often specialized to a single problem or domain. In this paper, we introduce a novel generic method for interactive visual exploration of high-dimensional data. In contrast to most visualization tools, it is not based on the traditional dogma of manually zooming and rotating data. Instead, the tool initially presents the user with an ‘interesting’ projection of the data and then employs data randomization with constraints to allow users to flexibly and intuitively express their interests or beliefs using visual interactions that correspond to exactly defined constraints. These constraints expressed by the user are then taken into account by a projection-finding algorithm to compute a new ‘interesting’ projection, a process that can be iterated until the user runs out of time or finds that the constraints explain everything she needs to find in the data. We present the tool by means of two case studies, one controlled study on synthetic data and another on real census data. The data and software related to this paper are available at http://www.interesting-patterns.net/forsied/interactive-visual-data-exploration-with-subjective-feedback/.


Data Mining and Knowledge Discovery | 2018

SICA: subjectively interesting component analysis

Bo Kang; Jefrey Lijffijt; Raul Santos-Rodriguez; Tijl De Bie

The information in high-dimensional datasets is often too complex for human users to perceive directly. Hence, it may be helpful to use dimensionality reduction methods to construct lower-dimensional representations that can be visualized. The natural question that arises is: how do we construct a most informative low-dimensional representation? We study this question from an information-theoretic perspective and introduce a new method for linear dimensionality reduction. The obtained model that quantifies the informativeness also allows us to flexibly account for prior knowledge a user may have about the data. This enables us to provide representations that are subjectively interesting. We title the method Subjectively Interesting Component Analysis (SICA) and expect it is mainly useful for iterative data mining. SICA is based on a model of a user’s belief state about the data. This belief state is used to search for surprising views. The initial state is chosen by the user (it may be empty up to the data format) and is updated automatically as the analysis progresses. We study several types of prior beliefs: if a user only knows the scale of the data, SICA yields the same cost function as Principal Component Analysis (PCA), while if a user expects the data to have outliers, we obtain a variant that we term t-PCA. Finally, scientifically more interesting variants are obtained when a user has more complicated beliefs, such as knowledge about similarities between data points. The experiments suggest that SICA enables users to find subjectively more interesting representations.
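For reference, the PCA baseline that the abstract says SICA coincides with under a scale-only prior can be written as an eigenvalue problem on the sample covariance matrix. The Python sketch below shows only that standard PCA computation. It does not implement SICA or t-PCA, and the function name `pca_directions` is a hypothetical choice for this example.

```python
import numpy as np

def pca_directions(X, k):
    """Return the top-k principal directions (as columns) and their variances."""
    Xc = X - X.mean(axis=0)                  # centre each feature
    cov = Xc.T @ Xc / (len(Xc) - 1)          # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # symmetric solver, ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest eigenvalues
    return eigvecs[:, order], eigvals[order]

# Synthetic data whose variance is dominated by the first axis
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) * np.array([10.0, 1.0, 0.1])
W, variances = pca_directions(X, k=2)
Z = (X - X.mean(axis=0)) @ W                 # 2D representation for plotting
```

The columns of `W` are orthonormal, and the first one aligns with the high-variance axis, which is the dominant structure a user with no further prior beliefs would be shown.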


European Conference on Machine Learning | 2016

A tool for subjective and interactive visual data exploration

Bo Kang; Kai Puolamäki; Jefrey Lijffijt; Tijl De Bie

We present SIDE, a tool for Subjective and Interactive Visual Data Exploration, which lets users explore high-dimensional data via subjectively informative 2D data visualizations. Many existing visual analytics tools are either restricted to specific problems and domains or aim to find visualizations that align with a user’s beliefs about the data. In contrast, our generic tool computes data visualizations that are surprising given a user’s current understanding of the data. The user’s belief state is represented as a set of projection tiles. Hence, this user-awareness offers users an efficient way to interactively explore yet-unknown features of complex high-dimensional datasets.


The European Symposium on Artificial Neural Networks | 2016

Informative data projections: a framework and two examples

Tijl De Bie; Jefrey Lijffijt; Raul Santos-Rodriguez; Bo Kang


International Conference on Data Engineering | 2018

Interactive Visual Data Exploration with Subjective Feedback: An Information-Theoretic Approach

Kai Puolamäki; Emilia Oikarinen; Bo Kang; Jefrey Lijffijt; Tijl De Bie


International Conference on Data Engineering | 2018

Subjectively Interesting Subgroup Discovery on Real-valued Targets

Jefrey Lijffijt; Bo Kang; Wouter Duivesteijn; Kai Puolamäki; Emilia Oikarinen; Tijl De Bie


arXiv: Machine Learning | 2018

Conditional Network Embeddings

Bo Kang; Jefrey Lijffijt; Tijl De Bie


Knowledge Discovery and Data Mining | 2017

Clipped projections for more informative visualizations [a work-in-progress report]

Bo Kang; Junning Deng; Jefrey Lijffijt; Tijl De Bie

Collaboration



Top Co-Authors


Emilia Oikarinen

Helsinki University of Technology
