Publication


Featured research published by Bo Thiesson.


European Conference on Computer Vision | 2004

Image and Video Segmentation by Anisotropic Kernel Mean Shift

Jue Wang; Bo Thiesson; Ying-Qing Xu; Michael F. Cohen

Mean shift is a nonparametric density estimator that has been applied to image and video segmentation. Traditional mean shift based segmentation uses a radially symmetric kernel to estimate local density, which is not optimal given the often structured nature of image and, more particularly, video data. In this paper we present an anisotropic kernel mean shift in which the shape, scale, and orientation of the kernels adapt to the local structure of the image or video. We decompose the anisotropic kernel to provide handles for modifying the segmentation based on simple heuristics. Experimental results show that the anisotropic kernel mean shift outperforms the original mean shift on image and video segmentation in the following aspects: 1) it produces smoother results on general images and video; 2) the segmented results are more consistent with human visual saliency; 3) the algorithm is robust to its initial parameters.
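
The adaptive kernel idea translates directly to a point-cloud setting. Below is a minimal sketch, assuming a Gaussian kernel whose bandwidth matrix is estimated from the covariance of each point's local neighbourhood; the function and parameter names (local_bandwidth, anisotropic_mean_shift, n_neighbors) are illustrative, not taken from the paper.

```python
# Minimal sketch of anisotropic-kernel mean shift on a point cloud.
# Assumption: the per-point bandwidth matrix is the (regularized) inverse
# of the local neighbourhood covariance, used as a Mahalanobis metric.
import numpy as np

def local_bandwidth(points, i, n_neighbors=30, reg=1e-3):
    """Estimate an anisotropic (inverse) bandwidth matrix from point i's neighbourhood."""
    d = np.linalg.norm(points - points[i], axis=1)
    nbrs = points[np.argsort(d)[:n_neighbors]]
    cov = np.cov(nbrs.T) + reg * np.eye(points.shape[1])
    return np.linalg.inv(cov)

def anisotropic_mean_shift(points, n_iter=20):
    """Shift every point towards the local density mode under its own kernel."""
    points = np.asarray(points, dtype=float)
    H_inv = [local_bandwidth(points, i) for i in range(len(points))]
    modes = points.copy()
    for _ in range(n_iter):
        for i in range(len(modes)):
            diff = points - modes[i]
            # Gaussian weights under point i's Mahalanobis distance
            w = np.exp(-0.5 * np.einsum('nd,de,ne->n', diff, H_inv[i], diff))
            modes[i] = (w[:, None] * points).sum(axis=0) / w.sum()
    return modes
```

Points whose trajectories converge to (approximately) the same mode would be grouped into one segment; for images and video the feature vectors would combine spatial, temporal, and color coordinates.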


Machine Learning | 2001

Accelerating EM for Large Databases

Bo Thiesson; Christopher Meek; David Heckerman

The EM algorithm is a popular method for parameter estimation in a variety of problems involving missing data. However, the EM algorithm often requires significant computational resources and has been dismissed as impractical for large databases. We present two approaches that significantly reduce the computational cost of applying the EM algorithm to databases with a large number of cases, including databases with large dimensionality. Both approaches are based on partial E-steps for which we can use the results of Neal and Hinton (In Jordan, M. (Ed.), Learning in Graphical Models, pp. 355–371. The Netherlands: Kluwer Academic Publishers) to obtain the standard convergence guarantees of EM. The first approach is a version of the incremental EM algorithm, described in Neal and Hinton (1998), which cycles through data cases in blocks. The number of cases in each block dramatically affects the efficiency of the algorithm. We provide a method for selecting a near optimal block size. The second approach, which we call lazy EM, will, at scheduled iterations, evaluate the significance of each data case and then proceed for several iterations actively using only the significant cases. We demonstrate that both methods can significantly reduce computational costs through their application to high-dimensional real-world and synthetic mixture modeling problems for large databases.
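
The block-wise partial E-step can be sketched compactly for a Gaussian mixture. The example below assumes spherical, unit-variance components and a fixed block schedule, which are simplifications for illustration; the lazy-EM variant (skipping insignificant cases) is not shown.

```python
# Minimal sketch of block-incremental EM for a Gaussian mixture with
# spherical unit-variance components: only one block's responsibilities
# are refreshed per step, and running sufficient statistics are corrected
# by subtracting the block's stale contribution and adding the new one.
import numpy as np

def incremental_em(X, K, block_size=256, n_sweeps=10, seed=0):
    rng = np.random.default_rng(seed)
    N, d = X.shape
    mu = X[rng.choice(N, K, replace=False)].copy()   # initial means
    pi = np.full(K, 1.0 / K)                         # mixing weights
    R = np.full((N, K), 1.0 / K)                     # cached responsibilities
    Nk = R.sum(axis=0)                               # running sufficient statistics
    Sk = R.T @ X
    blocks = [np.arange(s, min(s + block_size, N)) for s in range(0, N, block_size)]

    for _ in range(n_sweeps):
        for idx in blocks:
            # remove the block's stale contribution
            Nk -= R[idx].sum(axis=0)
            Sk -= R[idx].T @ X[idx]
            # partial E-step: refresh responsibilities for this block only
            logp = -0.5 * ((X[idx, None, :] - mu[None, :, :]) ** 2).sum(-1) + np.log(pi)
            logp -= logp.max(axis=1, keepdims=True)
            R[idx] = np.exp(logp) / np.exp(logp).sum(axis=1, keepdims=True)
            # add the refreshed contribution back, then take an M-step
            Nk += R[idx].sum(axis=0)
            Sk += R[idx].T @ X[idx]
            pi, mu = Nk / N, Sk / Nk[:, None]
    return pi, mu
```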


Journal of Machine Learning Research | 2002

The learning-curve sampling method applied to model-based clustering

Christopher Meek; Bo Thiesson; David Heckerman

We examine the learning-curve sampling method, an approach for applying machine-learning algorithms to large data sets. The approach is based on the observation that the computational cost of learning a model increases as a function of the sample size of the training data, whereas the accuracy of a model has diminishing improvements as a function of sample size. Thus, the learning-curve sampling method monitors the increasing costs and performance as larger and larger amounts of data are used for training, and terminates learning when future costs outweigh future benefits. In this paper, we formalize the learning-curve sampling method and its associated cost-benefit tradeoff in terms of decision theory. In addition, we describe the application of the learning-curve sampling method to the task of model-based clustering via the expectation-maximization (EM) algorithm. In experiments on three real data sets, we show that the learning-curve sampling method produces models that are nearly as accurate as those trained on complete data sets, but with dramatically reduced learning times. Finally, we describe an extension of the basic learning-curve approach for model-based clustering that results in an additional speedup. This extension is based on the observation that the shape of the learning curve for a given model and data set is roughly independent of the number of EM iterations used during training. Thus, we run EM for only a few iterations to decide how many cases to use for training, and then run EM to full convergence once the number of cases is selected.
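
A minimal sketch of the monitoring loop follows, assuming wall-clock training time as the cost, a user-supplied fit/evaluate pair (e.g., EM-based clustering scored by held-out log-likelihood), and a constant utility_per_accuracy that converts accuracy gains into cost units; the doubling schedule and the simple stopping test stand in for the paper's decision-theoretic rule.

```python
# Minimal sketch of learning-curve sampling: train on growing subsamples,
# track (cost, accuracy) pairs, and stop once the estimated marginal gain
# no longer justifies the estimated marginal cost.
import time

def learning_curve_sampling(data, fit, evaluate,
                            start=1000, growth=2.0, utility_per_accuracy=100.0):
    n, history = min(start, len(data)), []
    while n <= len(data):
        t0 = time.perf_counter()
        model = fit(data[:n])                       # e.g. EM-based clustering
        cost = time.perf_counter() - t0
        acc = evaluate(model)                       # e.g. held-out log-likelihood
        history.append((n, cost, acc))
        if len(history) >= 2:
            (_, c0, a0), (_, c1, a1) = history[-2], history[-1]
            marginal_benefit = utility_per_accuracy * (a1 - a0)
            marginal_cost = c1 - c0                 # proxy for the next step's extra cost
            if marginal_benefit <= marginal_cost:   # future costs outweigh benefits
                break
        n = int(n * growth)
    return model, history
```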


Computational Statistics & Data Analysis | 1995

BIFROST—block recursive models induced from relevant knowledge, observations, and statistical techniques

Søren Højsgaard; Bo Thiesson

Abstract The theoretical background for a program for establishing expert systems on the basis of observations and expert knowledge is presented. Block recursive models form the basis of the statistical modelling performed by the program. These models, together with various model selection methods for automatic model selection, are presented. Additionally, the connection between a block recursive model and expert systems based on causal probabilistic networks is treated. A medical example concerning diagnosis of coronary artery disease forms the basis for an evaluation of the expert systems established.


Meeting of the Association for Computational Linguistics | 2005

The Wild Thing

Kenneth Ward Church; Bo Thiesson

Suppose you are on a mobile device with no keyboard (e.g., a cell phone or PDA). How can you enter text quickly? T9? Graffiti? This demo will show how language modeling can be used to speed up data entry, both in the mobile context and on the desktop. The Wild Thing encourages users to use wildcards (*). A language model finds the k-best expansions. Users quickly figure out when they can get away with wildcards. General purpose trigram language models are effective for the general case (unrestricted text), but there are important special cases, such as searching over popular web queries, where more restricted language models are even more effective.
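
A minimal sketch of the wildcard expansion step, assuming a log of popular queries with counts and using each query's log-count as a stand-in for a full trigram language-model score; the log format and scoring are illustrative assumptions.

```python
# Minimal sketch of k-best wildcard expansion over a query log:
# '*' matches any span of characters, and candidates are ranked by log-count.
import math, re

def k_best_expansions(pattern, query_log, k=5):
    """query_log: dict mapping query string -> count."""
    regex = re.compile('^' + '.*'.join(map(re.escape, pattern.split('*'))) + '$')
    scored = [(math.log(cnt), q) for q, cnt in query_log.items() if regex.match(q)]
    return [q for _, q in sorted(scored, reverse=True)[:k]]

# Example: "brit* spe*" expands to the most popular matching queries first.
print(k_best_expansions("brit* spe*", {"britney spears": 50000,
                                       "british spelling": 1200,
                                       "britain speed limits": 300}))
```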


User Interface Software and Technology | 2008

Search Vox: leveraging multimodal refinement and partial knowledge for mobile voice search

Tim Paek; Bo Thiesson; Yun-Cheng Ju; Bongshin Lee

Internet usage on mobile devices continues to grow as users seek anytime, anywhere access to information. Because users frequently search for businesses, directory assistance has been the focus of many voice search applications utilizing speech as the primary input modality. Unfortunately, mobile settings often contain noise which degrades performance. As such, we present Search Vox, a mobile search interface that not only facilitates touch and text refinement whenever speech fails, but also allows users to assist the recognizer via text hints. Search Vox can also take advantage of any partial knowledge users may have about the business listing by letting them express their uncertainty in an intuitive way using verbal wildcards. In simulation experiments conducted on real voice search data, leveraging multimodal refinement resulted in a 28% relative reduction in error rate. Providing text hints along with the spoken utterance resulted in even greater relative reduction, with dramatic gains in recovery for each additional character.
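
One refinement step can be sketched as filtering and re-ranking the recognizer's n-best list with a typed text hint. The sketch below assumes the hint is a prefix of some word in the intended listing and that the recognizer exposes (listing, score) pairs; both are illustrative assumptions rather than the system's actual interface.

```python
# Minimal sketch of a Search Vox-style refinement step: keep only n-best
# hypotheses consistent with the typed hint, then re-rank by recognizer score.
def refine_with_hint(nbest, hint):
    """nbest: list of (listing, recognizer_score); hint: characters typed so far."""
    hint = hint.lower()
    consistent = [(listing, score) for listing, score in nbest
                  if any(word.startswith(hint) for word in listing.lower().split())]
    return sorted(consistent, key=lambda x: x[1], reverse=True)

nbest = [("home depot", 0.61), ("hope depot", 0.22), ("homewood suites", 0.17)]
print(refine_with_hint(nbest, "home"))   # drops the misrecognized "hope depot"
```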


European Conference on Principles of Data Mining and Knowledge Discovery | 2014

Towards flexibility detection in device-level energy consumption

Bijay Neupane; Torben Bach Pedersen; Bo Thiesson

The increasing drive towards green energy has boosted the installation of Renewable Energy Sources (RES). Increasing the share of RES in the power grid requires demand management through flexibility in consumption. In this paper, we analyze the flexibility and operation patterns of the devices in a set of real households. We propose a number of specific pre-processing steps, such as operation stage segmentation and the removal of aberrant operation durations, to clean device-level data. Further, we demonstrate various device operation properties, such as hourly and daily regularities and patterns, and the correlation between the operation of different devices. Subsequently, we show the existence of detectable time and energy flexibility in device operations. Finally, we present results that provide a foundation for load- and flexibility-detection and -prediction at the device level.
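
Two of the pre-processing steps mentioned above, operation segmentation and aberrant-duration removal, can be sketched on a single device's power readings; the on-threshold and duration bounds below are illustrative assumptions.

```python
# Minimal sketch: segment a device's power series into on/off operations,
# then drop operations with implausibly short or long durations.
import numpy as np

def segment_operations(power, on_threshold=5.0):
    """power: 1-D array of readings; returns half-open (start, end) index pairs."""
    on = np.asarray(power) > on_threshold
    padded = np.concatenate(([False], on, [False]))
    edges = np.flatnonzero(np.diff(padded.astype(int)))
    return [(int(s), int(e)) for s, e in zip(edges[::2], edges[1::2])]

def remove_aberrant(ops, min_len=3, max_len=4 * 60):
    """Drop operations shorter than min_len or longer than max_len samples."""
    return [(s, e) for s, e in ops if min_len <= e - s <= max_len]

power = np.array([0, 0, 40, 42, 41, 0, 0, 1, 0, 55, 60, 58, 57, 0])
print(remove_aberrant(segment_operations(power)))   # [(2, 5), (9, 13)]
```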


Human Computer Interaction with Mobile Devices and Services | 2009

Designing phrase builder: a mobile real-time query expansion interface

Tim Paek; Bongshin Lee; Bo Thiesson

As users enter web queries, real-time query expansion (RTQE) interfaces offer suggestions based on an index garnered from query logs. In selecting a suggestion, users can potentially reduce keystrokes, which can be very beneficial on mobile devices with limited input capabilities. Unfortunately, RTQE interfaces typically provide little assistance when only parts of an intended query appear among the suggestion choices. In this paper, we introduce Phrase Builder, an RTQE interface that reduces keystrokes by facilitating the selection of individual query words and by leveraging back-off query techniques to offer completions for out-of-index queries. We describe how we implemented a small-memory-footprint index and retrieval algorithm, and discuss lessons learned from three versions of the user interface, which was iteratively designed through user studies. Compared to standard auto-completion and typing, the last version of Phrase Builder reduced keystrokes-per-character further, was perceived to be faster, and was preferred overall by users.
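
Word-level completion with a back-off for out-of-index queries can be sketched as follows, assuming a query log of (query, count) pairs; the index layout and scoring are illustrative and much simpler than the paper's small-memory-footprint index.

```python
# Minimal sketch: try to complete against whole logged queries; if nothing
# matches, back off to completing only the word being typed from a unigram index.
def suggest(prefix, query_log, k=5):
    """query_log: dict query -> count; prefix: what the user has typed so far."""
    hits = sorted(((c, q) for q, c in query_log.items() if q.startswith(prefix)),
                  reverse=True)[:k]
    if hits:
        return [q for _, q in hits]
    # back-off: complete only the last (partial) word from a unigram index
    *context, last = prefix.split() or ['']
    unigrams = {}
    for q, c in query_log.items():
        for w in q.split():
            unigrams[w] = unigrams.get(w, 0) + c
    completions = sorted(((c, w) for w, c in unigrams.items() if w.startswith(last)),
                         reverse=True)[:k]
    return [' '.join(context + [w]) for _, w in completions]

log = {"pizza hut": 900, "pizza delivery": 400, "hutch repair": 50}
print(suggest("cheap pizz", log))   # back-off yields ['cheap pizza']
```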


Springer US | 1994

Diagnostic systems by model selection: a case study

Steffen L. Lauritzen; Bo Thiesson; David J. Spiegelhalter

Probabilistic systems for diagnosing blue babies are constructed by model selection methods applied to a database of cases. Their performance is compared with that of a system built primarily from expert knowledge. Results indicate that purely automatic methods do not quite perform at the level of expert-based systems, but when expert knowledge is incorporated properly, the methods look very promising.


North American Chapter of the Association for Computational Linguistics | 2007

K-Best Suffix Arrays

Kenneth Ward Church; Bo Thiesson; Robert J. Ragno

Suppose we have a large dictionary of strings. Each entry starts with a figure of merit (popularity). We wish to find the k-best matches for a substring, s, in a dictionary, dict. That is, grep s dict | sort -n | head -k, but we would like to do this in sublinear time. Example applications: (1) web queries with popularities, (2) products with prices, and (3) ads with click-through rates. This paper proposes a novel index, k-best suffix arrays, based on ideas borrowed from suffix arrays and kd-trees. A standard suffix array sorts the suffixes by a single order (lexicographic), whereas k-best suffix arrays are sorted by two orders (lexicographic and popularity). Lookup time is between log N and sqrt N.
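
The lookup problem can be sketched with a plain suffix array over the concatenated dictionary: binary search locates the range of suffixes starting with the substring, and that range is then scanned for the k most popular entries. The paper's k-best suffix array avoids this scan by interleaving the lexicographic and popularity orders kd-tree style; that interleaving is omitted here, so this is the baseline rather than the sublinear index. (The bisect key argument requires Python 3.10+.)

```python
# Baseline sketch: suffix-array substring lookup, then a linear scan of the
# matching range for the k most popular entries.
import bisect, heapq

def build_index(entries):
    """entries: list of (popularity, string). Returns (text, owner, suffix array)."""
    text, owner = "", []
    for i, (_, s) in enumerate(entries):
        owner += [i] * (len(s) + 1)
        text += s + "\x00"                       # separator not occurring in the strings
    sa = sorted(range(len(text)), key=lambda p: text[p:])   # toy O(N^2 log N) build
    return text, owner, sa

def k_best(substring, entries, index, k=3):
    text, owner, sa = index
    key = lambda p: text[p:p + len(substring)]
    lo = bisect.bisect_left(sa, substring, key=key)
    hi = bisect.bisect_right(sa, substring, key=key)
    hits = {owner[p] for p in sa[lo:hi]}         # entries containing the substring
    return heapq.nlargest(k, (entries[i] for i in hits))

entries = [(5000, "new york"), (300, "newark"), (90, "york university")]
idx = build_index(entries)
print(k_best("new", entries, idx, k=2))   # [(5000, 'new york'), (300, 'newark')]
```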
