
Christian A. Lang

University of California, Santa Barbara



Publication

Featured research published by Christian A. Lang.

International Conference on Management of Data | 2001

Modeling high-dimensional index structures using sampling

Christian A. Lang; Ambuj K. Singh

A large number of index structures for high-dimensional data have been proposed previously. In order to tune and compare such index structures, it is vital to have efficient cost prediction techniques for these structures. Previous techniques either assume uniformity of the data or are not applicable to high-dimensional data. We propose the use of sampling to predict the number of index pages accessed during query execution. Sampling is independent of the dimensionality and preserves clusters, which is important for representing skewed data. We present a general model for estimating the index page layout using sampling and show how to compensate for errors. We then give an implementation of our model under restricted memory assumptions and show that it performs well even under these constraints. Errors are minimal and the overall prediction time is up to two orders of magnitude below the time for building and probing the full index without sampling.
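The sampling idea in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the page layout (sorting on the first coordinate and packing fixed-size pages), the function names, and the choice to scale the page capacity by the sampling fraction are all assumptions made for illustration, and the error compensation the abstract mentions is omitted.

```python
import random

def build_pages(points, capacity):
    """Pack points into pages (sorted on the first coordinate) and
    return one bounding box (lo, hi) per page. Hypothetical layout."""
    pts = sorted(points)
    boxes = []
    for i in range(0, len(pts), capacity):
        chunk = pts[i:i + capacity]
        dims = range(len(chunk[0]))
        lo = tuple(min(p[d] for p in chunk) for d in dims)
        hi = tuple(max(p[d] for p in chunk) for d in dims)
        boxes.append((lo, hi))
    return boxes

def pages_accessed(boxes, center, radius):
    """Count pages whose bounding box intersects the query ball."""
    count = 0
    for lo, hi in boxes:
        # Squared minimum distance from the query center to the box.
        d2 = sum(max(l - c, 0.0, c - h) ** 2 for l, h, c in zip(lo, hi, center))
        if d2 <= radius ** 2:
            count += 1
    return count

def estimate_pages_accessed(points, capacity, center, radius, fraction, seed=0):
    """Estimate the page-access count from a random sample.
    Assumption: scaling the page capacity by the sampling fraction keeps
    the number of pages roughly equal to that of the full index."""
    rng = random.Random(seed)
    sample_size = max(1, int(len(points) * fraction))
    sample = rng.sample(points, sample_size)
    sample_capacity = max(1, int(capacity * fraction))
    return pages_accessed(build_pages(sample, sample_capacity), center, radius)
```

Bounding boxes built from a sample tend to be smaller than those of the full index, so an uncorrected estimate of this kind would be expected to undercount; that is the sort of error the abstract says the model compensates for.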

Bioorganic & Medicinal Chemistry Letters | 2015

Discovery of LRRK2 inhibitors using sequential in silico joint pharmacophore space (JPS) and ensemble docking

Christian A. Lang; Soumya S. Ray; Min Liu; Ambuj K. Singh; Gregory D. Cuny

Joint pharmacophore space (JPS), ensemble docking and sequential JPS-ensemble docking were used to select three panels of compounds (10 per panel) for evaluation as LRRK2 inhibitors. These computational methods identified four LRRK2 inhibitors with IC50 values <12 μM. The sequential JPS-ensemble docking predicted the majority of active hits. One of the inhibitors (Z-8205) identified using this method was also found to inhibit the G2019S mutant of LRRK2 25-fold better than the wild-type enzyme. This bias for the G2019S mutant is proposed to arise from an interaction with S2019 in this form of the enzyme. In addition, Z-8205 was found to inhibit only one other kinase when profiled against a panel of 97 kinases at 10 μM.

Statistical and Scientific Database Management | 2002

Accelerating high-dimensional nearest neighbor queries

Christian A. Lang; Ambuj K. Singh

The performance of nearest neighbor (NN) queries degrades noticeably with increasing dimensionality of the data due to reduced selectivity of high-dimensional data and an increased number of seek operations during NN-query execution. If the NN-radii were known in advance, the disk accesses could be reordered such that seek operations are minimized. We therefore propose a new way of estimating the NN-radius based on the fractal dimensionality and sampling. It is applicable to any page-based index structure. We show that the estimation error is considerably lower than for previous approaches. In the second part of the paper, we present two applications of this technique. We show how the radius estimates can be used to transform k-NN queries into at most two range queries, and how they can be used to reduce the number of page reads during all-NN queries. In both cases, we observe significant speedups over traditional techniques for synthetic and real-world data.
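The transformation of a k-NN query into at most two range queries can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the linear-scan `range_query` stands in for a page-based index, and `fallback_radius` is a hypothetical caller-supplied radius assumed to be large enough to contain at least k points. How the estimated radius is derived (fractal dimensionality and sampling, per the abstract) is not shown.

```python
import math

def range_query(points, query, radius):
    """Stand-in for an index range query: all points within radius."""
    return [p for p in points if math.dist(p, query) <= radius]

def knn_via_range_queries(points, query, k, estimated_radius, fallback_radius):
    """Answer a k-NN query with at most two range queries.
    Returns the k nearest points and the number of range queries issued."""
    candidates = range_query(points, query, estimated_radius)
    queries_used = 1
    if len(candidates) < k:
        # The estimate was too small; retry once with the assumed safe radius.
        candidates = range_query(points, query, fallback_radius)
        queries_used = 2
    candidates.sort(key=lambda p: math.dist(p, query))
    return candidates[:k], queries_used
```

When the estimate is accurate, only the first query runs and it retrieves few extra candidates; an underestimate costs one additional, larger query.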

International Conference on Data Engineering | 2003

Joining massive high-dimensional datasets

Tamer Kahveci; Christian A. Lang; Ambuj K. Singh

We consider the problem of joining massive datasets. We propose two techniques for minimizing the disk I/O cost of join operations for both spatial and sequence data. Our techniques optimize the use of available buffer space using a global view of the datasets. We build a Boolean matrix on the pages of the given datasets using a lower-bounding distance predictor. The marked entries of this matrix represent candidate page pairs to be joined. Our first technique joins the marked pages iteratively. Our second technique clusters the marked entries using rectangular dense regions that have minimal perimeter and fit into the buffer. These clusters are then ordered so that the total number of common pages between consecutive clusters is maximal. The clusters are then read from disk and joined. Our experimental results on various real datasets show that our techniques are 2 to 86 times faster than the competing techniques for spatial datasets, and 13 to 133 times faster than the competing techniques for sequence datasets.
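A minimal sketch of the Boolean page matrix and the first (iterative) join technique, for spatial data only. The choice of the minimum distance between page bounding boxes as the lower-bounding predictor, the in-memory page lists, and all function names are assumptions for illustration; the clustering and ordering of the second technique are not shown.

```python
import math

def bounding_box(page):
    """Axis-aligned bounding box (lo, hi) of the points on a page."""
    dims = range(len(page[0]))
    lo = tuple(min(p[d] for p in page) for d in dims)
    hi = tuple(max(p[d] for p in page) for d in dims)
    return lo, hi

def box_min_dist(a, b):
    """Minimum distance between two boxes: a lower bound on the
    distance between any point in a and any point in b."""
    (alo, ahi), (blo, bhi) = a, b
    return math.sqrt(sum(max(bl - ah, al - bh, 0.0) ** 2
                         for al, ah, bl, bh in zip(alo, ahi, blo, bhi)))

def candidate_matrix(pages_r, pages_s, eps):
    """matrix[i][j] is True when page i of R and page j of S may
    contain a pair of points within distance eps."""
    boxes_r = [bounding_box(p) for p in pages_r]
    boxes_s = [bounding_box(p) for p in pages_s]
    return [[box_min_dist(br, bs) <= eps for bs in boxes_s] for br in boxes_r]

def join_marked_pages(pages_r, pages_s, eps):
    """Join only the page pairs marked in the matrix, one pair at a time."""
    matrix = candidate_matrix(pages_r, pages_s, eps)
    result = []
    for i, row in enumerate(matrix):
        for j, marked in enumerate(row):
            if marked:
                result.extend((p, q) for p in pages_r[i] for q in pages_s[j]
                              if math.dist(p, q) <= eps)
    return result
```

Because the predictor is a lower bound, an unmarked entry can never hide a qualifying pair, so skipping it does not change the join result.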

International Journal of Image and Graphics | 2003

Faster similarity search for multimedia data via query transformations

Christian A. Lang; Ambuj K. Singh

The performance of nearest neighbor (NN) queries degrades noticeably with increasing dimensionality of the data due to reduced selectivity of high-dimensional data and an increased number of seek operations during NN-query execution. If the NN-radii were known in advance, the disk accesses could be reordered such that seek operations are minimized. We therefore propose a new way of estimating the NN-radius based on the fractal dimensionality and sampling. It is applicable to any page-based index structure. We show that the estimation error is considerably lower than for previous approaches. In the second part of the paper, we present two applications of this technique. We show how the radius estimates can be used to transform k-NN queries into at most two range queries, and how they can be used to reduce the number of page reads during all-NN queries. In both cases, we observe significant speedups over traditional techniques for synthetic and real-world data.

Archive | 2017

Scalable image informatics

Dmitry Fedorov; B. S. Manjunath; Christian A. Lang; Kristian Kvilekval

Images and video play a major role in scientific discoveries. Significant new advances in imaging science over the past two decades have resulted in new devices and technologies that are able to probe the world from nanoscales to planetary scales. These instruments generate massive amounts of multimodal imaging data. In addition to the raw imaging data, these instruments capture additional critical information, the metadata, which includes the imaging context. Further, experimental conditions that are not implicit in the instrumentation metadata are often added to it manually. Despite these technological advances in imaging sciences, resources for curation, distribution, sharing, and analysis of such data at scale are still lacking. Robust image analysis workflows have the potential to transform image-based sciences such as biology, ecology, remote sensing, materials science, and medical imaging. In this context, this chapter presents BisQue, a novel ecosystem where scientific image analysis methods can be discovered, tested, verified, refined, and shared among users on a shared, cloud-based infrastructure. The vision of BisQue is to enable large-scale, data-driven scientific explorations. The following sections discuss the core requirements of such an architecture and the challenges in developing and deploying the methods, and conclude with an application to image recognition using deep learning.

Archive | 2005