Markus Weimer
Microsoft
Publication
Featured research published by Markus Weimer.
international conference on management of data | 2013
Tyson Condie; Paul Mineiro; Neoklis Polyzotis; Markus Weimer
Statistical Machine Learning has undergone a phase transition from a pure academic endeavor to being one of the main drivers of modern commerce and science. Even more so, recent results such as those on tera-scale learning [1] and on very large neural networks [2] suggest that scale is an important ingredient in quality modeling. This tutorial introduces current applications, techniques and systems with the aim of cross-fertilizing research between the database and machine learning communities. The tutorial covers current large scale applications of Machine Learning, their computational model and the workflow behind building those. Based on this foundation, we present the current state-of-the-art in systems support in the bulk of the tutorial. We also identify critical gaps in the state-of-the-art. This leads to the closing of the seminar, where we introduce two sets of open research questions: Better systems support for the already established use cases of Machine Learning and support for recent advances in Machine Learning research.
international conference on management of data | 2015
Markus Weimer; Yingda Chen; Byung-Gon Chun; Tyson Condie; Carlo Curino; Chris Douglas; Yunseong Lee; Tony Majestro; Dahlia Malkhi; Sergiy Matusevych; Brandon Myers; Shravan M. Narayanamurthy; Raghu Ramakrishnan; Sriram Rao; Russell Sears; Beysim Sezgin; Julia Wang
Resource Managers like Apache YARN have emerged as a critical layer in the cloud computing system stack, but the developer abstractions for leasing cluster resources and instantiating application logic are very low-level. This flexibility comes at a high cost in terms of developer effort, as each application must repeatedly tackle the same challenges (e.g., fault-tolerance, task scheduling and coordination) and re-implement common mechanisms (e.g., caching, bulk-data transfers). This paper presents REEF, a development framework that provides a control-plane for scheduling and coordinating task-level (data-plane) work on cluster resources obtained from a Resource Manager. REEF provides mechanisms that facilitate resource re-use for data caching, and state management abstractions that greatly ease the development of elastic data processing work-flows on cloud platforms that support a Resource Manager service. REEF is being used to develop several commercial offerings such as the Azure Stream Analytics service. Furthermore, we demonstrate REEF development of a distributed shell application, a machine learning algorithm, and a port of the CORFU [4] system. REEF is also currently an Apache Incubator project that has attracted contributors from several institutions (http://reef.incubator.apache.org).
international conference on data engineering | 2013
Tyson Condie; Paul Mineiro; Neoklis Polyzotis; Markus Weimer
Statistical Machine Learning has undergone a phase transition from a pure academic endeavor to being one of the main drivers of modern commerce and science. Even more so, recent results such as those on tera-scale learning [1] and on very large neural networks [2] suggest that scale is an important ingredient in quality modeling. This tutorial introduces current applications, techniques and systems with the aim of cross-fertilizing research between the database and machine learning communities. The tutorial covers current large scale applications of Machine Learning, their computational model and the workflow behind building those. Based on this foundation, we present the current state-of-the-art in systems support in the bulk of the tutorial. We also identify critical gaps in the state-of-the-art. This leads to the closing of the seminar, where we introduce two sets of open research questions: Better systems support for the already established use cases of Machine Learning and support for recent advances in Machine Learning research.
ACM Transactions on Computer Systems | 2017
Byung-Gon Chun; Tyson Condie; Yingda Chen; Brian Cho; Andrew Chung; Carlo Curino; Chris Douglas; Matteo Interlandi; Beomyeol Jeon; Joo Seong Jeong; Gyewon Lee; Yunseong Lee; Tony Majestro; Dahlia Malkhi; Sergiy Matusevych; Brandon Myers; Mariia Mykhailova; Shravan M. Narayanamurthy; Joseph Noor; Raghu Ramakrishnan; Sriram Rao; Russell Sears; Beysim Sezgin; Taegeon Um; Julia Wang; Markus Weimer; Youngseok Yang
Resource Managers like YARN and Mesos have emerged as a critical layer in the cloud computing system stack, but the developer abstractions for leasing cluster resources and instantiating application logic are very low level. This flexibility comes at a high cost in terms of developer effort, as each application must repeatedly tackle the same challenges (e.g., fault tolerance, task scheduling and coordination) and reimplement common mechanisms (e.g., caching, bulk-data transfers). This article presents REEF, a development framework that provides a control plane for scheduling and coordinating task-level (data-plane) work on cluster resources obtained from a Resource Manager. REEF provides mechanisms that facilitate resource reuse for data caching and state management abstractions that greatly ease the development of elastic data processing pipelines on cloud platforms that support a Resource Manager service. We illustrate the power of REEF by showing applications built atop: a distributed shell application, a machine-learning framework, a distributed in-memory caching system, and a port of the CORFU system. REEF is currently an Apache top-level project that has attracted contributors from several institutions and it is being used to develop several commercial offerings such as the Azure Stream Analytics service.
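The control-plane/data-plane split described above can be illustrated with a minimal toy sketch. This is not REEF's actual API; the class and method names below are invented for illustration only. A "driver" plays the control-plane role, scheduling task-level work on leased slots, while a shared cache models REEF-style resource reuse across tasks.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration (not the real REEF API): the driver is the control
// plane; each submitTask call stands in for data-plane work scheduled
// on a leased container, and the cache models resource reuse so that
// later tasks avoid reloading data an earlier task already loaded.
public class ToyDriver {
    private final Map<String, String> cache = new HashMap<>(); // state retained across tasks
    private final List<String> completed = new ArrayList<>();

    // Schedule one unit of data-plane work on a leased slot.
    void submitTask(String taskId, String input) {
        // Reuse cached data if an earlier task already loaded this input.
        String data = cache.computeIfAbsent(input, k -> "loaded:" + k);
        completed.add(taskId + "->" + data);
    }

    List<String> completedTasks() {
        return completed;
    }

    public static void main(String[] args) {
        ToyDriver driver = new ToyDriver();
        driver.submitTask("t1", "partition-0");
        driver.submitTask("t2", "partition-0"); // second task hits the cache, no reload
        System.out.println(driver.completedTasks());
    }
}
```

In the real system, the driver reacts to events (container allocation, task completion, failure) rather than calling tasks synchronously; the point of the sketch is only the separation of scheduling logic from per-task work and the reuse of retained state.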
Archive | 2018
Sergio Escalera; Markus Weimer; Mikhail Burtsev; Valentin Malykh; Varvara Logacheva; Ryan Lowe; Iulian Vlad Serban; Yoshua Bengio; Alexander I. Rudnicky; Alan W. Black; Shrimai Prabhumoye; Łukasz Kidziński; Sharada Prasanna Mohanty; Carmichael F. Ong; Jennifer L. Hicks; Sergey Levine; Marcel Salathé; Scott L. Delp; Iker Huerga; Alexander Grigorenko; Leifur Thorbergsson; Anasuya Das; Kyla Nemitz; Jenna Sandker; Stephen King; Alexander S. Ecker; Leon A. Gatys; Matthias Bethge; Jordan L. Boyd-Graber; Shi Feng
Competitions have become a popular tool in the data science community to solve hard problems, assess the state of the art and spur new research directions. Companies like Kaggle and open source platforms like Codalab connect people with data and a data science problem to those with the skills and means to solve it. Hence, the question arises: What, if anything, could NIPS add to this rich ecosystem?
international conference on computer design | 2017
Alberto Scolari; Yunseong Lee; Markus Weimer; Matteo Interlandi
Machine Learning models are often composed of sequences of transformations. While this design makes it easy to decompose and accelerate individual model components at training time, prediction requires low latency and high performance predictability, so end-to-end runtime optimization and acceleration are needed to meet these goals. This paper sheds light on the problem by using a production-like model and showing how, by redesigning model pipelines for efficient execution over CPUs and FPGAs, several-fold performance improvements can be achieved.
arXiv: Databases | 2012
Yingyi Bu; Vinayak R. Borkar; Michael J. Carey; Joshua Rosen; Neoklis Polyzotis; Tyson Condie; Markus Weimer; Raghu Ramakrishnan
IEEE Data(base) Engineering Bulletin | 2012
Vinayak R. Borkar; Yingyi Bu; Michael J. Carey; Joshua Rosen; Neoklis Polyzotis; Tyson Condie; Markus Weimer; Raghu Ramakrishnan
very large data bases | 2013
Byung-Gon Chun; Tyson Condie; Carlo Curino; Chris Douglas; Sergiy Matusevych; Brandon Myers; Shravan M. Narayanamurthy; Raghu Ramakrishnan; Sriram Rao; Josh Rosen; Russell Sears; Markus Weimer
arXiv: Distributed, Parallel, and Cluster Computing | 2013
Joshua Rosen; Neoklis Polyzotis; Vinayak R. Borkar; Yingyi Bu; Michael J. Carey; Markus Weimer; Tyson Condie; Raghu Ramakrishnan