George Candea
École Polytechnique Fédérale de Lausanne
Publications
Featured research published by George Candea.
Architectural Support for Programming Languages and Operating Systems (ASPLOS) | 2011
Vitaly Chipounov; Volodymyr Kuznetsov; George Candea
This paper presents S2E, a platform for analyzing the properties and behavior of software systems. We demonstrate S2E's use in developing practical tools for comprehensive performance profiling, reverse engineering of proprietary software, and bug finding for both kernel-mode and user-mode binaries. Building these tools on top of S2E took less than 770 LOC and 40 person-hours each. S2E's novelty consists of its ability to scale to large real systems, such as a full Windows stack. S2E is based on two new ideas: selective symbolic execution, a way to automatically minimize the amount of code that has to be executed symbolically given a target analysis, and relaxed execution consistency models, a way to make principled performance/accuracy trade-offs in complex analyses. These techniques give S2E three key abilities: to simultaneously analyze entire families of execution paths, instead of just one execution at a time; to perform the analyses in-vivo within a real software stack--user programs, libraries, kernel, drivers, etc.--instead of using abstract models of these layers; and to operate directly on binaries, thus being able to analyze even proprietary software. Conceptually, S2E is an automated path explorer with modular path analyzers: the explorer drives the target system down all execution paths of interest, while analyzers check properties of each such path (e.g., to look for bugs) or simply collect information (e.g., count page faults). Desired paths can be specified in multiple ways, and S2E users can either combine existing analyzers to build a custom analysis tool, or write new analyzers using the S2E API.
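The core mechanism behind the abstract, forking execution at branches on symbolic input while the rest of the system runs concretely, can be illustrated with a toy symbolic executor (a hypothetical mini-interpreter, not the S2E engine or its API):

```python
# A minimal sketch of symbolic path exploration (toy model, not S2E):
# code outside the analysis target runs concretely; inside the target,
# the input is treated as an unconstrained symbol and the interpreter
# forks at every branch, collecting one path condition per path.

def symexec(program, pc, env, path_cond, paths):
    """Explore every path through `program` (a list of toy instructions),
    forking the state at each conditional branch on a symbolic variable."""
    while pc < len(program):
        op = program[pc]
        if op[0] == "set":                 # ("set", var, constant)
            env = dict(env, **{op[1]: op[2]})
            pc += 1
        elif op[0] == "branch":            # ("branch", symvar, k, t_pc, f_pc)
            _, var, k, t_pc, f_pc = op
            # Fork: one successor assumes var > k, the other var <= k.
            symexec(program, t_pc, dict(env), path_cond + [f"{var} > {k}"], paths)
            symexec(program, f_pc, dict(env), path_cond + [f"{var} <= {k}"], paths)
            return
        elif op[0] == "ret":               # ("ret", var)
            paths.append((path_cond, env[op[1]]))
            return

# Target function: if x > 10 return "big" else return "small".
target = [
    ("branch", "x", 10, 1, 3),
    ("set", "out", "big"),
    ("ret", "out"),
    ("set", "out", "small"),
    ("ret", "out"),
]
paths = []
symexec(target, 0, {}, [], paths)
print(paths)   # both feasible paths, each paired with its path condition
```

A real engine would hand each path condition to a constraint solver to generate concrete test inputs; the "selective" part of S2E is deciding which code is interpreted this way and which runs natively.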
Operating Systems Design and Implementation (OSDI) | 2014
Volodymyr Kuznetsov; Laszlo Szekeres; Mathias Payer; George Candea; R. Sekar; Dawn Song
Systems code is often written in low-level languages like C/C++, which offer many benefits but also delegate memory management to programmers. This invites memory safety bugs that attackers can exploit to divert control flow and compromise the system. Deployed defense mechanisms (e.g., ASLR, DEP) are incomplete, and stronger defense mechanisms (e.g., CFI) often have high overhead and limited guarantees [19, 15, 9]. We introduce code-pointer integrity (CPI), a new design point that guarantees the integrity of all code pointers in a program (e.g., function pointers, saved return addresses) and thereby prevents all control-flow hijack attacks, including return-oriented programming. We also introduce code-pointer separation (CPS), a relaxation of CPI with better performance properties. CPI and CPS offer substantially better security-to-overhead ratios than the state of the art. They are practical (we protect a complete FreeBSD system and over 100 packages like apache and postgresql), effective (they prevent all attacks in the RIPE benchmark), and efficient: on SPEC CPU2006, CPS averages 1.2% overhead for C and 1.9% for C/C++, while CPI's overhead is 2.9% for C and 8.4% for C/C++. A prototype implementation of CPI and CPS can be obtained from http://levee.epfl.ch.
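The central idea, isolating code pointers in a safe region that ordinary memory writes cannot reach, can be modeled in a few lines (an illustrative toy with hypothetical names, not the actual CPI instrumentation, which operates on compiled code and hardware memory):

```python
# Toy model of code-pointer integrity: code pointers live in a separate
# safe region accessed only through instrumented operations, so an
# out-of-bounds data write can never redirect an indirect call.

class CpiMemory:
    def __init__(self):
        self.data = {}   # ordinary data memory
        self.safe = {}   # safe region: code pointers only, via this API

    def store_data(self, addr, value):
        # A buggy or attacker-controlled write lands here; it can
        # never alias the safe region.
        self.data[addr] = value

    def store_code_ptr(self, addr, func):
        # Instrumented stores of code pointers go to the safe region.
        self.safe[addr] = func

    def indirect_call(self, addr):
        # Indirect calls load the pointer from the safe region, so
        # corrupted data memory cannot hijack control flow.
        if addr not in self.safe:
            raise RuntimeError("CPI violation: not a protected code pointer")
        return self.safe[addr]()

mem = CpiMemory()
mem.store_code_ptr(0x100, lambda: "legit handler")
mem.store_data(0x100, lambda: "attacker gadget")  # overflow hits data only
print(mem.indirect_call(0x100))                   # still the legit handler
```

CPS relaxes this by protecting the pointers themselves but not every object reachable through them, trading some guarantees for the lower overheads quoted above.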
Operating Systems Review | 2010
Liviu Ciortea; Cristian Zamfir; Stefan Bucur; Vitaly Chipounov; George Candea
Cloud9 aims to reduce the resource-intensive and labor-intensive nature of high-quality software testing. First, Cloud9 parallelizes symbolic execution (an effective, but still poorly scalable test automation technique) to large shared-nothing clusters. To our knowledge, Cloud9 is the first symbolic execution engine that scales to large clusters of machines, thus enabling thorough automated testing of real software in conveniently short amounts of time. Preliminary results indicate one to two orders of magnitude speedup over a state-of-the-art symbolic execution engine. Second, Cloud9 is an on-demand software testing service: it runs on compute clouds, like Amazon EC2, and scales its use of resources over a wide dynamic range, proportionally with the testing task at hand.
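The shared-nothing parallelization works because a symbolic execution tree can be split by path prefix into disjoint subtrees. A toy sketch of that partitioning (hypothetical, not Cloud9's actual coordination protocol):

```python
# Sketch of parallelizing symbolic execution by tree partitioning: the
# coordinator hands each worker a disjoint branch-prefix, and each worker
# independently exhausts the subtree of program paths under its prefix.

def explore(prefix, depth):
    """Enumerate all branch-outcome strings of length `depth` under
    `prefix` -- a stand-in for one worker exhausting its subtree."""
    if len(prefix) == depth:
        return [prefix]
    paths = []
    for branch in ("T", "F"):          # fork at each conditional
        paths += explore(prefix + branch, depth)
    return paths

DEPTH = 4
# One disjoint prefix per "worker"; no coordination needed during exploration.
workers = {w: explore(w, DEPTH) for w in ("TT", "TF", "FT", "FF")}

all_paths = sorted(p for paths in workers.values() for p in paths)
print(len(all_paths))   # the full tree is covered with no path explored twice
```

The real system additionally rebalances work between machines, since subtrees of real programs are wildly unequal in size.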
Autonomic Computing Workshop | 2003
George Candea; Emre Kiciman; Steve Zhang; Pedram Keyani; Armando Fox
This paper demonstrates that the dependability of generic, evolving J2EE applications can be enhanced through a combination of a few recovery-oriented techniques. Our goal is to reduce downtime by automatically and efficiently recovering from a broad class of transient software failures without having to modify applications. We describe here the integration of three new techniques into JBoss, an open-source J2EE application server. The resulting system is JAGR (JBoss with Application-Generic Recovery), a self-recovering execution platform. JAGR combines application-generic failure-path inference (AFPI), path-based failure detection, and micro-reboots. AFPI uses controlled fault injection and observation to infer paths that faults follow through a J2EE application. Path-based failure detection uses tagging of client requests and statistical analysis to identify anomalous component behavior. Micro-reboots are fast reboots we perform at the sub-application level to recover components from transient failures; by selectively rebooting only those components that are necessary to repair the failure, we reduce recovery time. These techniques are designed to be autonomous and application-generic, making them well suited to the rapidly changing software of Internet services.
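The micro-reboot policy, restarting only the failed component plus whatever its failure propagates to, can be sketched as a graph reachability computation (a toy with hypothetical component names, not the JAGR implementation):

```python
# Toy sketch of micro-reboots: given a failure-propagation graph, reboot
# only the failed component and the components its failure reaches,
# instead of restarting the whole application server.

def microreboot(failed, propagates):
    """Return the minimal set of components to reboot: `failed` plus
    everything reachable from it in the failure-propagation graph."""
    to_reboot, stack = set(), [failed]
    while stack:
        c = stack.pop()
        if c not in to_reboot:
            to_reboot.add(c)
            stack.extend(propagates.get(c, []))
    return to_reboot

# Hypothetical J2EE-style components and failure-propagation edges
# (the kind of graph AFPI would infer via fault injection).
propagates = {
    "Cart": ["Checkout"],
    "Checkout": ["Payment"],
    "Catalog": [],
    "Payment": [],
}
print(sorted(microreboot("Cart", propagates)))   # Catalog stays up
```

Because only three of the four components restart, recovery time shrinks and unaffected requests keep being served.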
Programming Language Design and Implementation (PLDI) | 2012
Volodymyr Kuznetsov; Johannes Kinder; Stefan Bucur; George Candea
Symbolic execution has proven to be a practical technique for building automated test case generation and bug finding tools. Nevertheless, due to state explosion, these tools still struggle to achieve scalability. Given a program, one way to reduce the number of states that the tools need to explore is to merge states obtained on different paths. Alas, doing so increases the size of symbolic path conditions (thereby stressing the underlying constraint solver) and interferes with optimizations of the exploration process (also referred to as search strategies). The net effect is that state merging may actually lower performance rather than increase it. We present a way to automatically choose when and how to merge states such that the performance of symbolic execution is significantly increased. First, we present query count estimation, a method for statically estimating the impact that each symbolic variable has on solver queries that follow a potential merge point; states are then merged only when doing so promises to be advantageous. Second, we present dynamic state merging, a technique for merging states that interacts favorably with search strategies in automated test case generation and bug finding tools. Experiments on the 96 GNU Coreutils show that our approach consistently achieves several orders of magnitude speedup over previously published results. Our code and experimental data are publicly available at http://cloud9.epfl.ch.
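The merge itself encodes differing variables as if-then-else expressions over the branch condition, and the estimation step gates the merge when too many differing variables would later hit the solver. A toy sketch of both ideas together (hypothetical names and a deliberately crude cost estimate, not the paper's algorithm):

```python
# Toy sketch of gated state merging: two symbolic states reaching the
# same program point merge into one, with differing variables encoded as
# ite(branch_cond, v1, v2). A crude "query count estimate" refuses the
# merge when too many differing variables feed later solver queries,
# since that would bloat path conditions and slow the solver down.

def merge(cond, s1, s2, hot_vars, budget=1):
    """Merge states s1/s2 (dicts of var -> value) if advantageous."""
    differing = [v for v in s1 if s1[v] != s2[v]]
    # Estimate: how many differing vars appear in later solver queries?
    cost = sum(1 for v in differing if v in hot_vars)
    if cost > budget:
        return None          # not worth it: keep the states separate
    merged = dict(s1)
    for v in differing:
        merged[v] = ("ite", cond, s1[v], s2[v])
    return merged

s_true  = {"x": 1, "y": 7, "z": 0}   # state from the then-branch
s_false = {"x": 2, "y": 7, "z": 0}   # state from the else-branch
m = merge("c0", s_true, s_false, hot_vars={"z"})
print(m["x"])   # x differs, so it becomes an ite; y and z carry over as-is
```

In the real system the estimate comes from a static pre-pass over the program, and "dynamic" merging refers to delaying states slightly so merge candidates actually meet during the search.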
International Conference on Management of Data (SIGMOD) | 2008
Emmanuel Cecchet; George Candea; Anastassia Ailamaki
The need for high availability and performance in data management systems has been fueling a long running interest in database replication from both academia and industry. However, academic groups often attack replication problems in isolation, overlooking the need for completeness in their solutions, while commercial teams take a holistic approach that often misses opportunities for fundamental innovation. This has created over time a gap between academic research and industrial practice. This paper aims to characterize the gap along three axes: performance, availability, and administration. We build on our own experience developing and deploying replication systems in commercial and academic settings, as well as on a large body of prior related work. We sift through representative examples from the last decade of open-source, academic, and commercial database replication systems and combine this material with case studies from real systems deployed at Fortune 500 customers. We propose two agendas, one for academic research and one for industrial R&D, which we believe can bridge the gap within 5-10 years. This way, we hope to both motivate and help researchers in making the theory and practice of middleware-based database replication more relevant to each other.
Proceedings of the Third IEEE Workshop on Internet Applications (WIAPP 2003) | 2003
George Candea; Mauricio R. Delgado; Michael Chen; Armando Fox
Automatic failure-path inference (AFPI) is an application-generic, automatic technique for dynamically discovering the failure dependency graphs of componentized Internet applications. AFPI's first phase is invasive, and relies on controlled fault injection to determine failure propagation; this phase requires no a priori knowledge of the application and takes on the order of hours to run. Once the system is deployed in production, the second, noninvasive phase of AFPI passively monitors the system, and updates the dependency graph as new failures are observed. This process is a good match for the perpetually-evolving software found in Internet systems; since no performance overhead is introduced, AFPI is feasible for live systems. We applied AFPI to J2EE and tested it by injecting Java exceptions into an e-commerce application and an online auction service. The resulting graphs of exception propagation are more detailed and accurate than what could be derived by time-consuming manual inspection or analysis of readily-available static application descriptions.
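The invasive first phase boils down to: fault each component in turn, observe who else fails, record the edges. A toy sketch against a hypothetical system model (not the real AFPI harness, which injects Java exceptions into a live J2EE server):

```python
# Toy sketch of AFPI's fault-injection phase: inject a fault into each
# component, observe which other components subsequently fail, and
# record the observations as a failure-propagation graph.

def infer_failure_graph(components, inject_and_observe):
    """For each component, inject a fault and record who else fails."""
    graph = {}
    for c in components:
        observed = inject_and_observe(c)
        graph[c] = sorted(observed - {c})   # edges: c's failure reaches these
    return graph

# Hypothetical stand-in for a live application: faulting a component
# takes down every component that calls it, transitively.
CALLERS = {"DB": {"Catalog", "Cart"}, "Catalog": {"Web"},
           "Cart": {"Web"}, "Web": set()}

def inject_and_observe(faulted):
    failed, frontier = {faulted}, {faulted}
    while frontier:
        frontier = {caller for f in frontier for caller in CALLERS[f]} - failed
        failed |= frontier
    return failed

graph = infer_failure_graph(CALLERS, inject_and_observe)
print(graph["DB"])   # a DB fault propagates to Cart, Catalog and Web
```

The noninvasive second phase would keep refining this graph from failures observed in production, with no injection at all.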
Dependable Systems and Networks (DSN) | 2002
George Candea; James W. Cutler; Armando Fox; Rushabh Doshi; Priyank Garg; Rakesh Gowda
We present ideas on how to structure software systems for high availability by considering MTTR/MTTF characteristics of components in addition to the traditional criteria, such as functionality or state sharing. Recursive restartability (RR), a recently proposed technique for achieving high availability, exploits partial restarts at various levels within complex software infrastructures to recover from transient failures and rejuvenate software components. Here we refine the original proposal and apply the RR philosophy to Mercury, a COTS-based satellite ground station that has been in operation for over 2 years. We develop three techniques for transforming component group boundaries such that time-to-recover is reduced, hence increasing system availability. We also extend RR by defining the notions of an oracle, restart group, and restart policy, while showing how to reason about system properties in terms of restart groups. From our experience with applying RR to Mercury, we draw design guidelines and lessons for the systematic application of recursive restartability to other software systems amenable to RR.
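Restart groups and a simple restart policy can be sketched as a tree of components (a toy with hypothetical ground-station component names, not the Mercury code; the paper's policy can also escalate to a parent group, which this sketch omits):

```python
# Toy sketch of restart groups: components form a restart tree, and the
# policy here reboots the innermost group containing the faulty
# component, leaving sibling groups untouched.

RESTART_TREE = {
    "station": ["radio-grp", "tracking-grp"],
    "radio-grp": ["receiver", "transmitter"],
    "tracking-grp": ["antenna-ctl", "predictor"],
}

def smallest_group(component, tree):
    """Find the innermost restart group containing `component`."""
    for group, members in tree.items():
        if component in members:
            return group
    return component        # already a top-level group

def members_of(group, tree):
    """All leaf components rebooted when `group` restarts."""
    out = []
    for m in tree.get(group, [group]):
        out += members_of(m, tree) if m in tree else [m]
    return out

grp = smallest_group("receiver", RESTART_TREE)
print(grp, members_of(grp, RESTART_TREE))
# restarting 'radio-grp' touches only the radio pair, not the tracking group
```

Shrinking group boundaries so that `members_of` returns fewer components is exactly the time-to-recover transformation the paper develops.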
Architectural Support for Programming Languages and Operating Systems (ASPLOS) | 2012
Baris Kasikci; Cristian Zamfir; George Candea
Even though most data races are harmless, the harmful ones are at the heart of some of the worst concurrency bugs. Alas, spotting just the harmful data races in programs is like finding a needle in a haystack: 76%-90% of the true data races reported by state-of-the-art race detectors turn out to be harmless [45]. We present Portend, a tool that not only detects races but also automatically classifies them based on their potential consequences: Could they lead to crashes or hangs? Could their effects be visible outside the program? Are they harmless? Our proposed technique achieves high accuracy by efficiently analyzing multiple paths and multiple thread schedules in combination, and by performing symbolic comparison between program outputs. We ran Portend on 7 real-world applications: it detected 93 true data races and correctly classified 92 of them, with no human effort. Six of them are harmful races. Portend's classification accuracy is up to 88% higher than that of existing tools, and it produces easy-to-understand evidence of the consequences of harmful races, thus both proving their harmfulness and making debugging easier. We envision Portend being used for testing and debugging, as well as for automatically triaging bug reports.
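The classification idea, judging a race by whether its different interleavings produce different externally visible outcomes, can be sketched concretely (a toy with hypothetical programs and concrete output comparison in place of Portend's symbolic comparison and schedule exploration):

```python
# Toy sketch of consequence-based race classification: run the racy
# program under different orderings of the racing accesses and compare
# the externally visible output; a race whose schedules crash or produce
# diverging output is flagged as potentially harmful.

def classify_race(program, schedules):
    """Run `program` under each schedule and classify by the outcomes."""
    outcomes = []
    for sched in schedules:
        try:
            outcomes.append(("ok", program(sched)))
        except Exception as e:
            outcomes.append(("crash", type(e).__name__))
    if any(kind == "crash" for kind, _ in outcomes):
        return "harmful: crash"
    if len({out for _, out in outcomes}) > 1:
        return "harmful: output differs"
    return "likely harmless"

def racy_counter(schedule):
    # Two threads increment a shared counter; the "lost" schedule models
    # the interleaving where one increment is lost to the race.
    return 1 if schedule == "lost" else 2

def benign_log(schedule):
    # The race only reorders internal writes; the output is stable.
    return "done"

print(classify_race(racy_counter, ["normal", "lost"]))
print(classify_race(benign_log, ["normal", "lost"]))
```

Portend's actual analysis compares outputs symbolically across multiple paths, so one run can cover whole families of inputs rather than a single concrete execution per schedule.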
ACM Transactions on Computer Systems | 2012
Vitaly Chipounov; Volodymyr Kuznetsov; George Candea
This article presents S2E, a platform for analyzing the properties and behavior of software systems, along with its use in developing tools for comprehensive performance profiling, reverse engineering of proprietary software, and automated testing of kernel-mode and user-mode binaries. Conceptually, S2E is an automated path explorer with modular path analyzers: the explorer uses a symbolic execution engine to drive the target system down all execution paths of interest, while analyzers measure and/or check properties of each such path. S2E users can either combine existing analyzers to build custom analysis tools, or they can directly use S2E’s APIs. S2E’s strength is the ability to scale to large systems, such as a full Windows stack, using two new ideas: selective symbolic execution, a way to automatically minimize the amount of code that has to be executed symbolically given a target analysis, and execution consistency models, a way to make principled performance/accuracy trade-offs during analysis. These techniques give S2E three key abilities: to simultaneously analyze entire families of execution paths instead of just one execution at a time; to perform the analyses in-vivo within a real software stack---user programs, libraries, kernel, drivers, etc.---instead of using abstract models of these layers; and to operate directly on binaries, thus being able to analyze even proprietary software.