Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Jinliang Wei is active.

Publication


Featured research published by Jinliang Wei.


International World Wide Web Conference | 2015

LightLDA: Big Topic Models on Modest Computer Clusters

Jinhui Yuan; Fei Gao; Qirong Ho; Wei Dai; Jinliang Wei; Xun Zheng; Eric P. Xing; Tie-Yan Liu; Wei-Ying Ma

When building large-scale machine learning (ML) programs, such as massive topic models or deep neural networks with up to trillions of parameters and training examples, one usually assumes that such massive tasks can only be attempted with industrial-sized clusters with thousands of nodes, which are out of reach for most practitioners and academic researchers. We consider this challenge in the context of topic modeling on web-scale corpora, and show that with a modest cluster of as few as 8 machines, we can train a topic model with 1 million topics and a 1-million-word vocabulary (for a total of 1 trillion parameters), on a document collection with 200 billion tokens --- a scale not yet reported even with thousands of machines. Our major contributions include: 1) a new, highly-efficient O(1) Metropolis-Hastings sampling algorithm, whose running cost is (surprisingly) agnostic of model size, and empirically converges nearly an order of magnitude more quickly than current state-of-the-art Gibbs samplers; 2) a model-scheduling scheme to handle the big model challenge, where each worker machine schedules the fetch/use of sub-models as needed, resulting in a frugal use of limited memory capacity and network bandwidth; 3) a differential data-structure for model storage, which uses separate data structures for high- and low-frequency words to allow extremely large models to fit in memory, while maintaining high inference speed. These contributions are built on top of the Petuum open-source distributed ML framework, and we provide experimental evidence showing how this development puts massive data and models within reach on a small cluster, while still enjoying proportional time cost reductions with increasing cluster size.
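The O(1) sampler itself is not reproduced in this listing, but its central trick, drawing cheap proposals from a precomputed alias table and correcting them with a Metropolis-Hastings accept/reject test, can be illustrated with a short Python sketch. The `AliasTable` class, the `mh_step` function, and the toy distributions below are illustrative assumptions, not LightLDA's implementation, which alternates between word- and document-side proposals.

```python
import random

class AliasTable:
    """Walker's alias method: O(n) build, O(1) draws from a fixed
    discrete distribution (the proposal). Sketch only."""
    def __init__(self, weights):
        n = len(weights)
        total = float(sum(weights))
        self.n = n
        self.prob = [w * n / total for w in weights]
        self.alias = list(range(n))
        small = [i for i, p in enumerate(self.prob) if p < 1.0]
        large = [i for i, p in enumerate(self.prob) if p >= 1.0]
        while small and large:
            s, l = small.pop(), large.pop()
            self.alias[s] = l
            self.prob[l] -= 1.0 - self.prob[s]
            (small if self.prob[l] < 1.0 else large).append(l)

    def sample(self):
        i = random.randrange(self.n)
        return i if random.random() < self.prob[i] else self.alias[i]


def mh_step(current, target, proposal):
    """One Metropolis-Hastings step with an independence proposal: draw a
    candidate topic from the (possibly stale) alias table and accept it
    with probability min(1, p(new)q(old) / p(old)q(new)), so the chain
    still targets `target` even though each draw costs O(1)."""
    candidate = proposal["table"].sample()
    q = proposal["weights"]
    accept = min(1.0, (target[candidate] * q[current]) /
                      (target[current] * q[candidate]))
    return candidate if random.random() < accept else current


# Hypothetical usage: stale proposal weights, fresher target weights.
proposal_weights = [3.0, 1.0, 1.0, 5.0]
proposal = {"table": AliasTable(proposal_weights), "weights": proposal_weights}
target = [2.0, 1.0, 4.0, 3.0]   # unnormalized "true" topic weights
topic = 0
for _ in range(1000):
    topic = mh_step(topic, target, proposal)
```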


Symposium on Cloud Computing | 2015

Managed communication and consistency for fast data-parallel iterative analytics

Jinliang Wei; Wei Dai; Aurick Qiao; Qirong Ho; Henggang Cui; Gregory R. Ganger; Phillip B. Gibbons; Garth A. Gibson; Eric P. Xing

At the core of Machine Learning (ML) analytics is often an expert-suggested model, whose parameters are refined by iteratively processing a training dataset until convergence. The completion time (i.e., convergence time) and the quality of the learned model depend not only on the rate at which refinements are generated but also on the quality of each refinement. While data-parallel ML applications often employ a loose consistency model when updating shared model parameters to maximize parallelism, the accumulated error may seriously impact the quality of refinements and thus delay completion, a problem that usually gets worse with scale. Although more immediate propagation of updates reduces the accumulated error, this strategy is limited by physical network bandwidth. Additionally, the performance of the widely used stochastic gradient descent (SGD) algorithm is sensitive to step size; simply increasing communication often fails to bring improvement without tuning the step size accordingly, and tedious hand tuning is usually needed to achieve optimal performance. This paper presents Bösen, a system that maximizes network communication efficiency under a given inter-machine bandwidth budget to minimize parallel error, while ensuring theoretical convergence guarantees for large-scale data-parallel ML applications. Furthermore, Bösen prioritizes the messages most significant to algorithm convergence, further speeding convergence. Finally, Bösen is the first distributed implementation of the recently proposed adaptive revision algorithm, which provides orders-of-magnitude improvement over a carefully tuned fixed schedule of step-size refinements for some SGD algorithms. Experiments on two clusters with up to 1024 cores show that our mechanism significantly improves upon static communication schedules.
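Bösen's communication manager is not shown here, but the abstract's core idea, spending a fixed per-round network budget on the updates that matter most, can be sketched as follows. The function name, the byte accounting, and the purely magnitude-based priority are assumptions made for illustration, not Bösen's actual policy or API.

```python
def flush_under_budget(pending, budget_bytes, send, bytes_per_update=16):
    """Sketch of bandwidth-budgeted, prioritized communication: parameter
    deltas accumulate locally between flushes; each flush sends the
    largest-magnitude deltas first until the per-round byte budget runs
    out, and leaves the rest buffered to merge with later updates."""
    ranked = sorted(pending.items(), key=lambda kv: -abs(kv[1]))
    spent = 0
    for key, delta in ranked:
        if spent + bytes_per_update > budget_bytes:
            break                      # budget exhausted for this round
        send(key, delta)               # push the delta to the parameter server
        del pending[key]               # flushed; the rest keeps accumulating
        spent += bytes_per_update
    return spent


# Hypothetical usage: a 32-byte budget only lets the two largest deltas out.
pending = {"w[3]": 0.9, "b[0]": 0.4, "w[7]": -0.02}
flush_under_budget(pending, budget_bytes=32, send=lambda k, d: print(k, d))
# pending now holds only {"w[7]": -0.02}, carried over to the next round.
```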


Symposium on Cloud Computing | 2016

Addressing the straggler problem for iterative convergent parallel ML

Aaron Harlap; Henggang Cui; Wei Dai; Jinliang Wei; Gregory R. Ganger; Phillip B. Gibbons; Garth A. Gibson; Eric P. Xing

FlexRR provides a scalable, efficient solution to the straggler problem for iterative machine learning (ML). The frequent (e.g., per-iteration) barriers used in traditional BSP-based distributed ML implementations cause every transient slowdown of any worker thread to delay all others. FlexRR combines a more flexible synchronization model with dynamic peer-to-peer re-assignment of work among workers to address straggler threads. Experiments with real straggler behavior observed on Amazon EC2 and Microsoft Azure, as well as injected straggler behavior stress tests, confirm the significance of the problem and the effectiveness of FlexRR's solution. Using FlexRR, we consistently observe near-ideal run times (relative to no performance jitter) across all real and injected straggler behaviors tested.
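FlexRR's actual protocol is not reproduced here; the sketch below, with made-up names such as `progress` and `offload`, only illustrates the combination the abstract describes: a bounded amount of synchronization slack plus peer-to-peer hand-off of leftover work.

```python
def run_iteration(worker_id, my_work, progress, slack, process_item, offload):
    """Sketch of flexible synchronization with peer-to-peer work
    re-assignment, in the spirit of the idea above rather than FlexRR's
    actual protocol. `progress` maps worker id -> items finished this
    iteration. A worker that falls more than `slack` items behind the
    fastest peer hands the tail of its remaining work to a helper via
    `offload` instead of making every worker wait at a strict barrier."""
    remaining = list(my_work)
    done = 0
    while remaining:
        process_item(remaining.pop(0))
        done += 1
        progress[worker_id] = done
        if max(progress.values()) - done > slack and len(remaining) > 1:
            # This worker is the straggler: give away half of what is left
            # so the iteration still finishes close to on time.
            half = len(remaining) // 2
            offload(remaining[half:])
            remaining = remaining[:half]
    return done
```

In the real system, detecting stragglers and choosing helpers is a negotiated peer-to-peer exchange; the sketch leaves all of that to the `offload` callback.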


4th Symposium on Configuration Analytics and Automation (SAFECONFIG) | 2011

A software toolkit for visualizing enterprise routing design

Xin Sun; Jinliang Wei; Sanjay G. Rao; Geoffrey G. Xie

Routing design is widely considered one of the most challenging parts of enterprise network design. The challenges come from the typically large scale of such networks, the diverse objectives a design must meet, and the wide variety of protocols and mechanisms to choose from. As a result, network operators often find it difficult to understand and troubleshoot the routing design of their networks. Furthermore, today's common practice of focusing on one router or one protocol at a time makes it an onerous task to reason about network-wide routing behavior.


IEEE Transactions on Big Data | 2015

Petuum: A New Platform for Distributed Machine Learning on Big Data

Eric P. Xing; Qirong Ho; Wei Dai; Jin Kyu Kim; Jinliang Wei; Seunghak Lee; Xun Zheng; Pengtao Xie; Abhimanu Kumar; Yaoliang Yu


USENIX Annual Technical Conference | 2014

Exploiting bounded staleness to speed up big data analytics

Henggang Cui; James Cipar; Qirong Ho; Jin Kyu Kim; Seunghak Lee; Abhimanu Kumar; Jinliang Wei; Wei Dai; Gregory R. Ganger; Phillip B. Gibbons; Garth A. Gibson; Eric P. Xing


National Conference on Artificial Intelligence (AAAI) | 2015

High-performance distributed ML at scale through parameter server consistency models

Wei Dai; Abhimanu Kumar; Jinliang Wei; Qirong Ho; Garth A. Gibson; Eric P. Xing


arXiv: Learning | 2015

Poseidon: A System Architecture for Efficient GPU-based Deep Learning on Multiple Machines

Hao Zhang; Zhiting Hu; Jinliang Wei; Pengtao Xie; Gunhee Kim; Qirong Ho; Eric P. Xing


Archive | 2013

Petuum: A Framework for Iterative-Convergent Distributed ML

Wei Dai; Jinliang Wei; Xun Zheng; Jin Kyu Kim; Seunghak Lee; Junming Yin; Qirong Ho; Eric P. Xing


USENIX Annual Technical Conference | 2017

Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters

Hao Zhang; Zeyu Zheng; Shizhen Xu; Wei Dai; Qirong Ho; Xiaodan Liang; Zhiting Hu; Jinliang Wei; Pengtao Xie; Eric P. Xing

Collaboration


Dive into Jinliang Wei's collaborations.

Top Co-Authors

Eric P. Xing (Carnegie Mellon University)

Qirong Ho (University of Arizona)

Wei Dai (Carnegie Mellon University)

Abhimanu Kumar (Carnegie Mellon University)

Garth A. Gibson (Carnegie Mellon University)

Xun Zheng (Carnegie Mellon University)

Gregory R. Ganger (Carnegie Mellon University)

Henggang Cui (Carnegie Mellon University)

Jin Kyu Kim (Carnegie Mellon University)