Petko Georgiev
University of Cambridge
Publication
Featured research published by Petko Georgiev.
international workshop on mobile computing systems and applications | 2015
Nicholas D. Lane; Petko Georgiev
Sensor-equipped smartphones and wearables are transforming a variety of mobile apps ranging from health monitoring to digital assistants. However, reliably inferring user behavior and context from noisy and complex sensor data collected under mobile device constraints remains an open problem, and a key bottleneck to sensor app development. In recent years, advances in the field of deep learning have resulted in nearly unprecedented gains in related inference tasks such as speech and object recognition. However, although mobile sensing shares many of the same data modeling challenges, we have yet to see deep learning be systematically studied within the sensing domain. If deep learning could lead to significantly more robust and efficient mobile sensor inference it would revolutionize the field by rapidly expanding the number of sensor apps ready for mainstream usage. In this paper, we provide preliminary answers to this potentially game-changing question by prototyping a low-power Deep Neural Network (DNN) inference engine that exploits both the CPU and DSP of a mobile device SoC. We use this engine to study typical mobile sensing tasks (e.g., activity recognition) using DNNs, and compare results to learning techniques in more common usage. Our early findings provide illustrative examples of DNN usage that do not overburden modern mobile hardware, while also indicating how they can improve inference accuracy. Moreover, we show DNNs can gracefully scale to larger numbers of inference classes and can be flexibly partitioned across mobile and remote resources. Collectively, these results highlight the critical need for further exploration as to how the field of mobile sensing can best make use of advances in deep learning towards robust and efficient sensor inference.
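The abstract notes that DNNs "can be flexibly partitioned across mobile and remote resources." A minimal sketch of how such a split point might be chosen is below; the layer profiles and the `best_split` helper are illustrative assumptions, not the paper's actual engine:

```python
def best_split(layers, bandwidth_kbps):
    """Pick the layer index at which inference moves from the device to a
    remote resource, minimizing total latency.

    Each layer is a hypothetical profile (device_ms, remote_ms, input_kb),
    where input_kb is the size of that layer's input activations.
    split == 0 offloads everything; split == len(layers) stays on-device.
    """
    best_total, best_split_idx = float("inf"), None
    for split in range(len(layers) + 1):
        device = sum(l[0] for l in layers[:split])
        remote = sum(l[1] for l in layers[split:])
        # Activations crossing the split must be shipped over the network;
        # a fully on-device plan transfers nothing.
        transfer = layers[split][2] / bandwidth_kbps * 1000.0 if split < len(layers) else 0.0
        total = device + remote + transfer
        if total < best_total:
            best_total, best_split_idx = total, split
    return best_split_idx, best_total
```

With a cheap first layer that shrinks the data, the sketch naturally keeps early layers local and offloads the rest, which mirrors the partitioning behavior the abstract describes.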
information processing in sensor networks | 2016
Nicholas D. Lane; Sourav Bhattacharya; Petko Georgiev; Claudio Forlivesi; Lei Jiao; Lorena Qendro; Fahim Kawsar
Breakthroughs from the field of deep learning are radically changing how sensor data are interpreted to extract the high-level information needed by mobile apps. It is critical that the gains in inference accuracy that deep models afford become embedded in future generations of mobile apps. In this work, we present the design and implementation of DeepX, a software accelerator for deep learning execution. DeepX significantly lowers the device resources (viz. memory, computation, energy) required by deep learning that currently act as a severe bottleneck to mobile adoption. The foundation of DeepX is a pair of resource control algorithms, designed for the inference stage of deep learning, that: (1) decompose monolithic deep model network architectures into unit-blocks of various types, which are then more efficiently executed by heterogeneous local device processors (e.g., GPUs, CPUs); and (2) perform principled resource scaling that adjusts the architecture of deep models to shape the overhead each unit-block introduces. Experiments show that DeepX can allow even large-scale deep learning models to execute efficiently on modern mobile processors and significantly outperform existing solutions, such as cloud-based offloading.
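The core idea of executing decomposed unit-blocks on heterogeneous local processors can be sketched as greedy list scheduling. Everything here (block names, per-processor cost estimates, the `assign_blocks` helper) is a hypothetical illustration of the concept, not DeepX's actual algorithm:

```python
def assign_blocks(block_costs, processors):
    """Greedily place each unit-block on the processor that would finish
    it earliest, given estimated per-processor runtimes (ms).

    block_costs: {block_name: {processor: cost_ms}}
    Returns the placement plan and the resulting makespan.
    """
    finish = {p: 0.0 for p in processors}  # running finish time per processor
    plan = {}
    for block, costs in block_costs.items():
        chosen = min(processors, key=lambda p: finish[p] + costs[p])
        plan[block] = chosen
        finish[chosen] += costs[chosen]
    return plan, max(finish.values())
```

For example, two GPU-friendly convolution blocks and one CPU-friendly fully-connected block end up split across both processors, shrinking the makespan versus any single-processor plan.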
the internet of things | 2015
Nicholas D. Lane; Sourav Bhattacharya; Petko Georgiev; Claudio Forlivesi; Fahim Kawsar
Detecting and reacting to user behavior and ambient context are core elements of many emerging mobile sensing and Internet-of-Things (IoT) applications. However, extracting accurate inferences from raw sensor data is challenging within the noisy and complex environments where these systems are deployed. Deep learning is one of the most promising approaches for overcoming this challenge and achieving more robust and reliable inference. Techniques developed within this rapidly evolving area of machine learning are now state-of-the-art for many inference tasks (such as audio sensing and computer vision) commonly needed by IoT and wearable applications. However, deep learning algorithms are currently seldom used on mobile/IoT-class hardware because they often impose debilitating levels of system overhead (e.g., memory, computation and energy). Efforts to address this barrier to deep learning adoption are slowed by our lack of a systematic understanding of how these algorithms behave at inference time on resource-constrained hardware. In this paper, we present the first -- albeit preliminary -- measurement study of common deep learning models (such as Convolutional Neural Networks and Deep Neural Networks) on representative mobile and embedded platforms. The aim of this investigation is to begin to build knowledge of the performance characteristics, resource requirements and the execution bottlenecks for deep learning models when being used to recognize categories of behavior and context. The results and insights of this study lay an empirical foundation for the development of optimization methods and execution environments that enable deep learning to be more readily integrated into next-generation IoT, smartphone and wearable systems.
international conference on embedded networked sensor systems | 2014
Petko Georgiev; Nicholas D. Lane; Kiran K. Rachuri; Cecilia Mascolo
The rapidly growing adoption of sensor-enabled smartphones has greatly fueled the proliferation of applications that use phone sensors to monitor user behavior. A central sensor among these is the microphone which enables, for instance, the detection of valence in speech, or the identification of speakers. Deploying several of these applications on a mobile device to continuously monitor the audio environment allows for the acquisition of a diverse range of sound-related contextual inferences. However, the cumulative processing burden critically impacts the phone battery. To address this problem, we propose DSP.Ear -- an integrated sensing system that takes advantage of the latest low-power DSP co-processor technology in commodity mobile devices to enable the continuous and simultaneous operation of multiple established algorithms that perform complex audio inferences. The system extracts emotions from voice, estimates the number of people in a room, identifies the speakers, and detects commonly found ambient sounds, while critically incurring little overhead to the device battery. This is achieved through a series of pipeline optimizations that allow the computation to remain largely on the DSP. Through detailed evaluation of our prototype implementation we show that, by exploiting a smartphone's co-processor, DSP.Ear achieves a 3 to 7 times increase in the battery lifetime compared to a solution that uses only the phone's main processor. In addition, DSP.Ear is 2 to 3 times more power efficient than a naïve DSP solution without optimizations. We further analyze a large-scale dataset from 1320 Android users to show that in about 80-90% of the daily usage instances DSP.Ear is able to sustain a full day of operation (even in the presence of other smartphone workloads) with a single battery charge.
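The battery-lifetime claim rests on simple energy arithmetic: lifetime is battery capacity divided by average draw, so lowering continuous sensing power by moving work to the DSP stretches lifetime proportionally. The numbers below are purely illustrative (not measurements from the paper) and the helper is a hypothetical sketch:

```python
def lifetime_hours(capacity_mwh, avg_power_mw):
    """Idealized battery lifetime: capacity (mWh) / average draw (mW)."""
    return capacity_mwh / avg_power_mw

# Illustrative figures: a 2300 mAh battery at 3.8 V holds ~8740 mWh;
# assume continuous audio sensing draws 300 mW on the CPU vs 60 mW
# when the pipelines stay on a low-power DSP.
capacity = 2300 * 3.8                         # mWh
cpu_only = lifetime_hours(capacity, 300.0)    # ~29 h
dsp_based = lifetime_hours(capacity, 60.0)    # ~146 h
```

A 5x reduction in average power yields a 5x lifetime gain, consistent in spirit with the 3-7x improvement the abstract reports.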
international conference on mobile systems, applications, and services | 2015
Nicholas D. Lane; Petko Georgiev; Cecilia Mascolo; Ying Gao
The wearable revolution, as a mass-market phenomenon, has finally arrived. As a result, the question of how wearables should evolve over the next 5 to 10 years is assuming an increasing level of societal and commercial importance. A range of open design and system questions are emerging, for instance: How can wearables shift from being largely health and fitness focused to tracking a wider range of life events? What will become the dominant methods through which users interact with wearables and consume the data collected? Are wearables destined to be cloud and/or smartphone dependent for their operation? Towards building the critical mass of understanding and experience necessary to tackle such questions, we have designed and implemented ZOE - a match-box sized (49g) collar- or lapel-worn sensor that pushes the boundary of wearables in an important set of new directions. First, ZOE aims to perform multiple deep sensor inferences that span key aspects of everyday life (viz. personal, social and place information) on continuously sensed data; while also offering this data not only within conventional analytics but also through a speech dialog system that is able to answer impromptu casual questions from users ("Am I more stressed this week than normal?"). Crucially, and unlike other rich-sensing or dialog supporting wearables, ZOE achieves this without cloud or smartphone support - this has important side-effects for privacy since all user information can remain on the device. Second, ZOE incorporates the latest innovations in system-on-a-chip technology together with a custom daughter-board to realize a three-tier low-power processor hierarchy. We pair this hardware design with software techniques that manage system latency while still allowing ZOE to remain energy efficient (with a typical lifespan of 30 hours), despite its high sensing workload, small form-factor, and need to remain responsive to user dialog requests.
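A common pattern behind tiered low-power processor hierarchies like the one described here is escalation: a cheap always-on tier filters the sensor stream and only wakes a more capable (and power-hungry) tier when the data looks promising. The sketch below is a generic illustration of that pattern; the tier names and detectors are hypothetical, not ZOE's actual pipeline:

```python
def tiered_pipeline(window, tiers):
    """Run one sensor window through an escalating processor hierarchy.

    tiers: ordered list of (tier_name, detector) pairs, cheapest first.
    Stops at the first tier whose detector rejects the window, so the
    expensive tiers only run on data the cheap tiers found interesting.
    Returns (tiers_visited, accepted).
    """
    visited = []
    for name, detector in tiers:
        visited.append(name)
        if not detector(window):
            return visited, False
    return visited, True
```

With, say, an amplitude gate on a microcontroller tier and an energy check on a DSP tier, quiet windows never reach the application processor at all.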
acm/ieee international conference on mobile computing and networking | 2016
Petko Georgiev; Nicholas D. Lane; Kiran K. Rachuri; Cecilia Mascolo
Mobile apps that use sensors to monitor user behavior often employ resource heavy inference algorithms that make computational offloading a common practice. However, existing schedulers/offloaders typically emphasize one primary offloading aspect without fully exploring complementary goals (e.g., heterogeneous resource management with only partial visibility into underlying algorithms, or concurrent sensor app execution on a single resource) and as a result, may overlook performance benefits pertinent to sensor processing. We bring together key ideas scattered in existing offloading solutions to build LEO -- a scheduler designed to maximize the performance for the unique workload of continuous and intermittent mobile sensor apps without changing their inference accuracy. LEO makes use of domain specific signal processing knowledge to smartly distribute the sensor processing tasks across the broader range of heterogeneous computational resources of high-end phones (CPU, co-processor, GPU and the cloud). To exploit short-lived, but substantial optimization opportunities, and remain responsive to the needs of near real-time apps such as voice-based natural user interfaces, LEO runs as a service on a low-power co-processor unit (LPU) to perform both frequent and joint schedule optimization for concurrent pipelines. Depending on the workload and network conditions, LEO is between 1.6 and 3 times more energy efficient than conventional cloud offloading with CPU-bound sensor sampling. In addition, even if a general-purpose scheduler is optimized directly to leverage an LPU, we find LEO still uses only a fraction (< 1/7) of the energy overhead for scheduling and is up to 19% more energy efficient for medium to heavy workloads.
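At its simplest, the local-versus-cloud placement decision LEO-style schedulers make weighs local compute energy against the radio energy of shipping the task's data. The model and numbers below are a hypothetical sketch of that trade-off, not LEO's actual optimizer:

```python
def place_tasks(tasks, local_mj_per_unit, radio_mw, mbps):
    """Pick the cheaper of local execution and cloud offloading per task.

    tasks: {name: (work_units, data_mb)} -- both values are hypothetical
    profiles. Offload energy uses a simple radio model:
    radio power (mW) x transfer time (s) = energy (mJ).
    """
    plan = {}
    for name, (work_units, data_mb) in tasks.items():
        local_mj = work_units * local_mj_per_unit
        offload_mj = radio_mw * (data_mb * 8.0 / mbps)
        plan[name] = "local" if local_mj <= offload_mj else "cloud"
    return plan
```

Light pipelines (e.g., keyword spotting) stay on the low-power unit while heavy ones (e.g., speaker identification) are shipped out, which is the kind of mixed placement the abstract argues a sensing-aware scheduler should find.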
international conference on mobile systems, applications, and services | 2017
Petko Georgiev; Nicholas D. Lane; Cecilia Mascolo; David Chu
GPUs have recently enjoyed increased popularity as general purpose software accelerators in multiple application domains including computer vision and natural language processing. However, there has been little exploration into the performance and energy trade-offs mobile GPUs can deliver for the increasingly popular workload of deep-inference audio sensing tasks, such as spoken keyword spotting in energy-constrained smartphones and wearables. In this paper, we study these trade-offs and introduce an optimization engine that leverages a series of structural and memory access optimization techniques that allow audio algorithm performance to be automatically tuned as a function of GPU device specifications and model semantics. We find that parameter optimized audio routines obtain inferences an order of magnitude faster than sequential CPU implementations, and up to 6.5x faster than cloud offloading with good connectivity, while critically consuming 3-4x less energy than the CPU. With our optimized GPU, conventional wisdom about how to use the cloud and low-power chips is broken. Unless the network has a throughput of at least 20 Mbps (and an RTT of 25 ms or less), with only about 10 to 20 seconds of buffered audio data for batched execution, the optimized GPU audio sensing apps begin to consume less energy than cloud offloading. Under such conditions we find the optimized GPU can provide energy benefits comparable to low-power reference DSP implementations with some preliminary level of optimization, while always winning on latency.
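The GPU-versus-cloud break-even the abstract describes can be framed with two simple per-batch energy estimates: the radio stays active for the transfer time plus a round trip, while the GPU burns power only for the (accelerated) compute time. The models and every number below are illustrative assumptions, not the paper's measurements:

```python
def cloud_energy_mj(payload_mb, mbps, rtt_s, radio_mw):
    """Radio energy to ship one buffered audio batch to the cloud:
    power (mW) x (transfer time + one round trip) (s) = energy (mJ)."""
    return radio_mw * (payload_mb * 8.0 / mbps + rtt_s)

def gpu_energy_mj(batch_s, gpu_mw, gpu_speedup):
    """Energy to run the same batch locally: the GPU draws gpu_mw but
    processes batch_s seconds of audio gpu_speedup times faster than
    real time."""
    return gpu_mw * (batch_s / gpu_speedup)
```

Sweeping the throughput parameter shows how the comparison flips: with a fixed batch, the cloud side's energy grows as bandwidth drops, so below some threshold the locally executed batch becomes the cheaper option, qualitatively matching the 20 Mbps / 25 ms break-even reported above.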
international conference on mobile systems applications and services | 2016
Nicholas D. Lane; Sourav Bhattacharya; Petko Georgiev; Claudio Forlivesi; Fahim Kawsar
Recent breakthroughs in deep learning are enabling new ways of interpreting and analyzing sensor measurements to extract high-level information needed by mobile and IoT apps. It is therefore essential that deep models become embedded in next-generation mobile and IoT apps, where inference tasks are often challenging due to high measurement noise. However, deep learning-based models are yet to become mainstream on embedded platforms, where device resources, e.g., memory, computation and energy, are limited. In this demonstration, we present DeepX, a software accelerator that allows running deep neural networks (DNNs) and deep convolutional neural networks (CNNs) efficiently on resource-constrained mobile platforms. DeepX significantly lowers device resource requirements during deep model-based inference, which currently act as a severe bottleneck to wide-scale mobile adoption.
information processing in sensor networks | 2016
Nicholas D. Lane; Sourav Bhattacharya; Petko Georgiev; Claudio Forlivesi; Fahim Kawsar
Deep learning has revolutionized the way sensor measurements are interpreted, and its application has produced great leaps in inference accuracy in a number of fields. However, the significant memory and computational requirements have hindered the wide-scale adoption of these novel computational techniques on resource-constrained wearable and mobile platforms. In this demonstration we present DeepX, a software accelerator for efficiently running deep neural networks and convolutional neural networks on resource-constrained embedded platforms, e.g., the Nvidia Tegra K1 and Qualcomm Snapdragon 400.
Proceedings of the 2015 on MobiSys PhD Forum | 2015
Petko Georgiev
The widespread use of sensor-equipped smartphones and wearables has enabled the rapid growth of personal sensing applications that monitor user behavior. Examples include continuously sampling the microphone for the detection of stressed speech or processing the accelerometer sensor stream for transportation mode detection. Deploying several of these applications on a mobile device is a rich source of behavioral insights but poses a significant strain on the battery life. The aim of the dissertation work is to make full use of the heterogeneous resources available in off-the-shelf smartphones and wearables (viz. CPU, low-power co-processors and wireless) to efficiently and fairly orchestrate the execution of multiple interacting sensing applications. This requires building an abstraction scheduling layer that transparently distributes sensor processing tasks and optimizes the utilization of shared computing resources to meet individual app requirements.