Yingyu Liang
Princeton University
Publications
Featured research published by Yingyu Liang.
International Colloquium on Automata, Languages and Programming | 2012
Maria-Florina Balcan; Yingyu Liang
Motivated by the fact that distances between data points in many real-world clustering instances are often based on heuristic measures, Bilu and Linial [6] proposed analyzing objective-based clustering problems under the assumption that the optimum clustering to the objective is preserved under small multiplicative perturbations to distances between points. In this paper, we provide several results within this framework. For separable center-based objectives, we present an algorithm that can optimally cluster instances resilient to $(1 + \sqrt{2})$-factor perturbations, solving an open problem of Awasthi et al. [2]. For the k-median objective, we additionally give algorithms for a weaker, relaxed, and more realistic assumption in which we allow the optimal solution to change in a small fraction of the points after perturbation. We also provide positive results for min-sum clustering, which is a generally much harder objective than k-median (and also non-center-based). Our algorithms are based on new linkage criteria that may be of independent interest.
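For context, the perturbation-resilience assumption referenced in this abstract is usually formalized along the following lines; this is a generic statement of the Bilu-Linial condition, with notation chosen here for illustration rather than taken from the paper. A clustering instance $(S, d)$ with objective $\Phi$ is called $\alpha$-perturbation resilient (for some $\alpha \ge 1$) if for every perturbed distance function $d'$ satisfying

$$ d(x, y) \le d'(x, y) \le \alpha \cdot d(x, y) \quad \text{for all } x, y \in S, $$

the optimal clustering of $(S, d')$ under $\Phi$ is the same partition of $S$ as the optimal clustering of $(S, d)$. The result above concerns recovering this optimal clustering whenever $\alpha \ge 1 + \sqrt{2}$ for separable center-based objectives.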
SIAM International Conference on Data Mining | 2015
Aurélien Bellet; Yingyu Liang; Alireza Bagheri Garakani; Maria-Florina Balcan; Fei Sha
Learning sparse combinations is a frequent theme in machine learning. In this paper, we study its associated optimization problem in the distributed setting where the elements to be combined are not centrally located but spread over a network. We address the key challenges of balancing communication costs and optimization errors. To this end, we propose a distributed Frank-Wolfe (dFW) algorithm. We obtain theoretical guarantees on the optimization error $\epsilon$ and communication cost that do not depend on the total number of combining elements. We further show that the communication cost of dFW is optimal by deriving a lower bound on the communication cost required to construct an $\epsilon$-approximate solution. We validate our theoretical analysis with empirical studies on synthetic and real-world data, which demonstrate that dFW outperforms both baselines and competing methods. We also study the performance of dFW when the conditions of our analysis are relaxed, and show that dFW is fairly robust.
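As background for the Frank-Wolfe-based approach described in this abstract, the following is a minimal sketch of the classical, centralized Frank-Wolfe iteration for a sparsity-inducing constraint set (here an l1 ball). It only illustrates why each iteration adds at most one atom to the solution, which is the kind of sparsity a distributed variant can exploit to limit communication; it is not the authors' dFW algorithm, and the function name, step-size rule, and quadratic example are illustrative assumptions.

import numpy as np

def frank_wolfe_l1(grad_f, x0, radius=1.0, n_iters=100):
    """Classical Frank-Wolfe over the l1 ball {x : ||x||_1 <= radius}.

    grad_f: callable returning the gradient of a convex objective at x.
    Each iteration moves toward a single signed vertex of the l1 ball, so the
    iterate after t steps has at most t nonzero coordinates beyond those of x0.
    """
    x = x0.copy()
    for t in range(n_iters):
        g = grad_f(x)
        # Linear minimization oracle for the l1 ball: the minimizing vertex is
        # -radius * sign(g_i) * e_i for the coordinate i with largest |g_i|.
        i = np.argmax(np.abs(g))
        s = np.zeros_like(x)
        s[i] = -radius * np.sign(g[i])
        # Standard diminishing step size; a line search could be used instead.
        gamma = 2.0 / (t + 2.0)
        x = (1.0 - gamma) * x + gamma * s
    return x

# Illustrative usage: sparse least squares, min ||Ax - b||^2 subject to ||x||_1 <= 1.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((50, 200)), rng.standard_normal(50)
x_hat = frank_wolfe_l1(lambda x: 2 * A.T @ (A @ x - b), np.zeros(200))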
NeuroImage | 2017
Kiran Vodrahalli; Po-Hsuan Chen; Yingyu Liang; Christopher Baldassano; Janice Chen; Esther Yong; Christopher J. Honey; Uri Hasson; Peter J. Ramadge; Kenneth A. Norman; Sanjeev Arora
Several research groups have shown how to map fMRI responses to the meanings of presented stimuli. This paper presents new methods for doing so when only a natural language annotation is available as the description of the stimulus. We study fMRI data gathered from subjects watching an episode of BBC's Sherlock (Chen et al., 2017), and learn bidirectional mappings between fMRI responses and natural language representations. By leveraging data from multiple subjects watching the same movie, we were able to perform scene classification with 72% accuracy (random guessing would give 4%) and scene ranking with average rank in the top 4% (random guessing would give 50%). The key ingredients underlying this high level of performance are (a) the use of the Shared Response Model (SRM) and its variant SRM-ICA (Chen et al., 2015; Zhang et al., 2016) to aggregate fMRI data from multiple subjects, both of which are shown to be superior to standard PCA in producing low-dimensional representations for the tasks in this paper; (b) a sentence embedding technique adapted from the natural language processing (NLP) literature (Arora et al., 2017) that produces semantic vector representations of the annotations; (c) using previous timestep information in the featurization of the predictor data. These optimizations in how we featurize the fMRI data and text annotations provide a substantial improvement in classification performance, relative to standard approaches.
Highlights: We learn maps between fMRI data and fine-grained text annotations. The Shared Response Model highlights movie-related variance in the fMRI response. Semantic annotations can be featurized with weighted sums of word embeddings. Using previous timepoints helps with fMRI-to-text but hurts text-to-fMRI. Our methods attain high performance on scene classification and ranking tasks.
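Ingredient (b) above is, per the cited Arora et al. (2017) work, a weighted average of word vectors with a shared common component removed. The sketch below shows that general recipe under simplifying assumptions: pretrained word vectors and corpus unigram probabilities are assumed to be available as dictionaries, the weighting constant a = 1e-3 follows the cited paper, and the function name is ours; the paper's actual featurization pipeline may differ in its details.

import numpy as np

def sif_sentence_embeddings(sentences, word_vec, word_prob, a=1e-3):
    """Smooth-inverse-frequency sentence embeddings (in the style of Arora et al., 2017).

    sentences: list of token lists.
    word_vec:  dict mapping token -> word vector (np.ndarray, all of the same dimension).
    word_prob: dict mapping token -> unigram probability estimated from a large corpus.
    """
    dim = len(next(iter(word_vec.values())))
    X = np.zeros((len(sentences), dim))
    for i, sent in enumerate(sentences):
        tokens = [w for w in sent if w in word_vec]
        if not tokens:
            continue
        # Downweight frequent words: the weight a / (a + p(w)) is close to 1 for rare words.
        weights = np.array([a / (a + word_prob.get(w, 0.0)) for w in tokens])
        vecs = np.stack([word_vec[w] for w in tokens])
        X[i] = weights @ vecs / len(tokens)
    # Remove the projection onto the first singular vector (the shared "common component").
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    u = vt[0]
    return X - np.outer(X @ u, u)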
SIAM Journal on Computing | 2016
Maria-Florina Balcan; Yingyu Liang
Motivated by the fact that distances between data points in many real-world clustering instances are often based on heuristic measures, Bilu and Linial [Proceedings of the Symposium on Innovations in Computer Science, 2010] proposed analyzing objective-based clustering problems under the assumption that the optimum clustering to the objective is preserved under small multiplicative perturbations to distances between points. The hope is that by exploiting the structure in such instances, one can overcome worst-case hardness results. In this paper, we provide several results within this framework. For center-based objectives, we present an algorithm that can optimally cluster instances resilient to perturbations of factor
SIMBAD'13 Proceedings of the Second International Conference on Similarity-Based Pattern Recognition | 2013
Maria-Florina Balcan; Yingyu Liang
ACM Multimedia | 2009
Yingyu Liang; Jianmin Li; Bo Zhang
Conference on Innovations in Theoretical Computer Science | 2018
Maria-Florina Balcan; Yingyu Liang; David P. Woodruff; Hongyang Zhang
Knowledge Discovery and Data Mining | 2016
Maria-Florina Balcan; Yingyu Liang; Le Song; David P. Woodruff; Bo Xie
Conference on Multimedia Modeling | 2010
Yingyu Liang; Jianmin Li; Bo Zhang
Neural Information Processing Systems | 2014
Bo Dai; Bo Xie; Niao He; Yingyu Liang; Anant Raj; Maria-Florina Balcan; Le Song