Christian Murphy
University of Pennsylvania
Publications
Featured research published by Christian Murphy.
Software Engineering and Knowledge Engineering | 2008
Christian Murphy; Gail E. Kaiser; Lifeng Hu
It is challenging to test machine learning (ML) applications, which are intended to learn properties of data sets where the correct answers are not already known. In the absence of a test oracle, one approach to testing these applications is to use metamorphic testing, in which properties of the application are exploited to define transformation functions on the input, such that the new output will be unchanged or can easily be predicted based on the original output; if the output is not as expected, then a defect must exist in the application. Here, we seek to enumerate and classify the metamorphic properties of some machine learning algorithms, and demonstrate how these can be applied to reveal defects in the applications of interest. In addition to the results of our testing, we present a set of properties that can be used to define these metamorphic relationships so that metamorphic testing can be used as a general approach to testing machine learning applications.
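To make the idea concrete, the following is a minimal, hypothetical sketch (our own illustration, not code from the paper) of two metamorphic properties for a simple mean() computation; if either property is violated, the implementation must contain a defect even though the "correct" mean of the data is never stated.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative metamorphic test for a mean() function (run with: java -ea MeanMetamorphicTest).
public class MeanMetamorphicTest {

    // Implementation under test (stand-in for a real ML component).
    static double mean(List<Double> xs) {
        double sum = 0.0;
        for (double x : xs) sum += x;
        return sum / xs.size();
    }

    public static void main(String[] args) {
        List<Double> input = Arrays.asList(3.0, 1.0, 4.0, 1.0, 5.0);
        double original = mean(input);

        // Property 1: permuting the input must not change the output.
        List<Double> shuffled = new java.util.ArrayList<>(input);
        Collections.shuffle(shuffled);
        assert Math.abs(mean(shuffled) - original) < 1e-9 : "permutation property violated";

        // Property 2: adding a constant c to every element must shift the mean by exactly c.
        double c = 10.0;
        List<Double> shifted = input.stream().map(x -> x + c).collect(Collectors.toList());
        assert Math.abs(mean(shifted) - (original + c)) < 1e-9 : "additive-shift property violated";

        System.out.println("Metamorphic properties hold for this input.");
    }
}
```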
International Conference on Software Testing, Verification, and Validation | 2009
Christian Murphy; Gail E. Kaiser; Ian Vo; Matt Chu
Software products released into the field typically have some number of residual defects that either were not detected or could not have been detected during testing. This may be the result of flaws in the test cases themselves, incorrect assumptions made during the creation of test cases, or the infeasibility of testing the sheer number of possible configurations for a complex system; these defects may also be due to application states that were not considered during lab testing, or corrupted states that could arise due to a security violation. One approach to this problem is to continue to test these applications even after deployment, in hopes of finding any remaining flaws. In this paper, we present a testing methodology we call in vivo testing, in which tests are continuously executed in the deployment environment. We also describe a type of test, called an in vivo test, that is specifically designed for use with such an approach: these tests execute within the current state of the program (rather than by creating a clean slate) without affecting or altering that state from the perspective of the end-user. We discuss the approach and Invite, our prototype testing framework for Java applications. We also provide the results of case studies that demonstrate Invite’s effectiveness and efficiency.
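As a rough illustration of the in vivo idea (hypothetical class and method names, not the actual Invite API), the sketch below runs a test against a copy of the application's current state at an instrumentation point, so the end-user's state is never disturbed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: an in vivo test that starts from whatever state the
// deployed application is in right now, executed against a snapshot so the
// user-visible state is untouched.
public class CartInVivoSketch {

    static class Cart {
        private final List<Double> prices = new ArrayList<>();
        void add(double price) { prices.add(price); }
        double total() { double t = 0; for (double p : prices) t += p; return t; }
        Cart snapshot() {                    // copy of current state for testing
            Cart copy = new Cart();
            copy.prices.addAll(this.prices);
            return copy;
        }
    }

    // In vivo test: runs within the program's current state, not a clean slate.
    static boolean testAddIncreasesTotal(Cart live) {
        Cart sandbox = live.snapshot();      // never touch the user's cart
        double before = sandbox.total();
        sandbox.add(9.99);
        return sandbox.total() > before;
    }

    public static void main(String[] args) {
        Cart cart = new Cart();
        cart.add(3.50);                      // state accumulated during real use
        // Instrumentation point: run the test silently alongside normal execution.
        if (!testAddIncreasesTotal(cart)) {
            System.err.println("in vivo test failed in deployment state");
        }
        System.out.println("user-visible total unchanged: " + cart.total());
    }
}
```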
Software Engineering in Health Care | 2011
Christian Murphy; Mohammad S. Raunak; Andrew L. King; Sanjian Chen; Christopher Imbriano; Gail E. Kaiser; Insup Lee; Oleg Sokolsky; Lori A. Clarke; Leon J. Osterweil
Health care professionals rely on software to simulate anatomical and physiological elements of the human body for purposes of training, prototyping, and decision making. Software can also be used to simulate medical processes and protocols to measure cost effectiveness and resource utilization. Whereas much of the software engineering research into simulation software focuses on validation (determining that the simulation accurately models real-world activity), to date there has been little investigation into the testing of simulation software itself, that is, the ability to effectively search for errors in the implementation. This is particularly challenging because often there is no test oracle to indicate whether the results of the simulation are correct. In this paper, we present an approach to systematically testing simulation software in the absence of test oracles, and evaluate the effectiveness of the technique.
Automated Software Engineering | 2008
Christian Murphy; Swapneel Sheth; Gail E. Kaiser; Lauren Wilcox
Many collaborative applications, especially in scientific research, focus only on the sharing of tools or the sharing of data. We seek to introduce an approach to scientific collaboration that is based on the sharing of knowledge. We do this by automatically building organizational memory and enabling knowledge sharing by observing what users do with a particular tool or set of tools in the domain, through the addition of activity and usage monitoring facilities to standalone applications. Once this knowledge has been gathered, we apply social networking models to provide collaborative features to users, such as suggestions on tools to use, and automatically generated sequences of actions based on past usage amongst the members of a social network or the entire community. In this work, we investigate social networking models as an approach to scientific knowledge sharing, and present an implementation called genSpace, which is built as an extension to the geWorkbench platform for computational biologists. Finally, we discuss the approach from the viewpoint of social software engineering.
Technical Symposium on Computer Science Education | 2008
Christian Murphy; Eunhee Kim; Gail E. Kaiser; Adam Cannon
The errors that Java programmers are likely to encounter can roughly be categorized into three groups: compile-time (semantic and syntactic), logical, and runtime (exceptions). While much work has focused on the first two, there are very few tools that exist for interpreting the sometimes cryptic messages that result from runtime errors. Novice programmers in particular have difficulty dealing with uncaught exceptions in their code and the resulting stack traces, which are by no means easy to understand. We present Backstop, a tool for debugging runtime errors in Java applications. This tool provides more user-friendly error messages when an uncaught exception occurs, and also provides debugging support by allowing users to watch the execution of the program and the changes to the values of variables. We also present the results of two preliminary studies conducted on introductory-level programmers using the two different features of the tool.
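The abstract does not show Backstop's internals; as a hedged sketch of the general mechanism such a tool can build on, Java allows a program to intercept uncaught exceptions and replace the raw stack trace with a message aimed at novices. The messages and class name below are our own illustration, not Backstop's actual implementation.

```java
// Illustrative only: intercept uncaught exceptions and print a friendlier message.
public class FriendlyErrors {
    public static void main(String[] args) {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            if (error instanceof ArrayIndexOutOfBoundsException) {
                System.err.println("You tried to use an array position that does not exist. "
                        + "Remember that an array of length n has positions 0 through n-1.");
            } else if (error instanceof NullPointerException) {
                System.err.println("You tried to use an object that has not been created yet "
                        + "(it is still null). Check that it was assigned with 'new' first.");
            } else {
                error.printStackTrace();     // fall back to the normal behavior
            }
        });

        int[] grades = new int[3];
        System.out.println(grades[3]);       // uncaught exception triggers the friendly message
    }
}
```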
Software Engineering and Knowledge Engineering | 2007
Christian Murphy; Gail E. Kaiser; Marta Arias
Some machine learning applications are intended to learn properties of data sets where the correct answers are not already known to human users. It is challenging to test such ML software, because there is no reliable test oracle. We describe a software testing approach aimed at addressing this problem. We present our findings from testing implementations of two different ML ranking algorithms: Support Vector Machines and MartiRank.
Archive | 2008
Christian Murphy; Gail E. Kaiser
As machine learning (ML) applications become prevalent in various aspects of everyday life, their dependability takes on increasing importance. It is challenging to test such applications, however, because they are intended to learn properties of data sets where the correct answers are not already known. Our work is not concerned with testing how well an ML algorithm learns, but rather seeks to ensure that an application using the algorithm implements the specification correctly and fulfills the users’ expectations. These are critical to ensuring the application’s dependability. This paper presents three approaches to testing these types of applications. In the first, we create a set of limited test cases for which it is, in fact, possible to predict what the correct output should be. In the second approach, we use random testing to generate large data sets according to parameterization based on the application’s equivalence classes. Our third approach is based on metamorphic testing, in which properties of the application are exploited to define transformation functions on the input, such that the new output can easily be predicted based on the original output. Here we discuss these approaches, and our findings from testing the dependability of three real-world ML applications.
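As a hypothetical sketch of the second approach (our own illustration, with invented equivalence classes and names), random data sets can be generated per equivalence class so that large inputs still cover the categories of data the application is specified to handle.

```java
import java.util.Random;

// Illustrative equivalence-class-based random test data generation.
public class EquivalenceClassGenerator {
    private static final Random RNG = new Random(42);   // fixed seed for repeatability

    // Example equivalence classes for a numeric-feature data set (assumed, not from the paper).
    enum InputClass { ALL_ZERO, ALL_NEGATIVE, MIXED_WITH_DUPLICATES, LARGE_MAGNITUDE }

    static double[] generate(InputClass cls, int n) {
        double[] data = new double[n];
        for (int i = 0; i < n; i++) {
            switch (cls) {
                case ALL_ZERO:              data[i] = 0.0; break;
                case ALL_NEGATIVE:          data[i] = -1 - RNG.nextDouble() * 100; break;
                case MIXED_WITH_DUPLICATES: data[i] = RNG.nextInt(5) - 2; break;   // few distinct values
                case LARGE_MAGNITUDE:       data[i] = (RNG.nextDouble() - 0.5) * 1e12; break;
            }
        }
        return data;
    }

    public static void main(String[] args) {
        for (InputClass cls : InputClass.values()) {
            double[] dataSet = generate(cls, 1_000);
            // Feed dataSet to the application under test and check whatever
            // properties (e.g., metamorphic relations) apply to that class.
            System.out.println(cls + ": generated " + dataSet.length + " values");
        }
    }
}
```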
Archive | 2009
Christian Murphy; Gail E. Kaiser
Challenges arise in assuring the quality of applications that do not have test oracles, i.e., for which it is difficult or impossible to know what the correct output should be for arbitrary input. Recently, metamorphic testing [7] has been shown to be a simple yet effective technique in addressing the quality assurance of these so-called “non-testable programs” [51]. In metamorphic testing, existing test case input is modified to produce new test cases in such a manner that, when given the new input, the function should produce an output that can easily be computed based on the original output. That is, if input x produces output f(x), then we create input x' such that we can predict f(x') based on f(x); if the application does not produce the expected output, then a defect must exist, and either f(x) or f(x') (or both) is wrong. Previously we presented an approach called “Automated Metamorphic System Testing” [37], in which metamorphic testing is conducted automatically as the program executes. In that approach, metamorphic properties of the entire application are specified, and then checked after execution is complete. Here, we improve upon that work by presenting a technique in which the metamorphic properties of individual functions are used, allowing for the specification of more complex properties and enabling finer-grained runtime checking. Our goal is to demonstrate that such an approach is more effective than one based on specifying metamorphic properties at the system level, and is also feasible for use in the deployment environment. This technique, called Metamorphic Runtime Checking, is a system testing approach in which the metamorphic properties of individual functions are automatically checked during the program’s execution. The tester is able to easily specify the functions’ properties so that metamorphic testing can be conducted in a running application, allowing the tests to execute using real input data and in the context of real system states, without affecting those states. We also describe an implementation framework called Columbus, and present the results of empirical studies that demonstrate that checking the metamorphic properties of individual functions increases the effectiveness of the approach in detecting defects, with minimal performance impact.
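The following is a speculative sketch of function-level metamorphic runtime checking (hypothetical names, not the Columbus framework itself): a wrapper runs the function under test on a transformed copy of its real arguments and checks the expected relation between the two outputs, leaving the original result and program state untouched.

```java
import java.util.Arrays;

// Illustrative runtime check of a per-function metamorphic property.
public class RuntimeMetamorphicCheck {

    // Function under test: returns scores later used for ranking.
    static double[] score(double[] features) {
        double[] out = new double[features.length];
        for (int i = 0; i < features.length; i++) out[i] = 2.0 * features[i] + 1.0;
        return out;
    }

    // Wrapper invoked in place of score() while the application runs.
    static double[] scoreWithCheck(double[] features) {
        double[] primary = score(features);

        // Assumed metamorphic property for this example: scaling every input by
        // a positive constant must not change the relative order of the scores.
        double[] scaled = Arrays.stream(features).map(x -> x * 3.0).toArray();
        double[] secondary = score(scaled);
        for (int i = 1; i < primary.length; i++) {
            boolean origOrder = primary[i - 1] <= primary[i];
            boolean newOrder = secondary[i - 1] <= secondary[i];
            if (origOrder != newOrder) {
                System.err.println("metamorphic property violated at runtime, index " + i);
            }
        }
        return primary;   // the application continues with the untouched result
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(scoreWithCheck(new double[]{0.2, 1.5, -0.7})));
    }
}
```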
Technical Symposium on Computer Science Education | 2011
Christian Murphy; Rita Manco Powell; Kristen Parton; Adam Cannon
Software testing of applications in fields such as scientific computing, simulation, and machine learning is particularly challenging because many applications in these domains have no reliable “test oracle” to indicate whether the program’s output is correct when given arbitrary input. A common approach to testing such applications has been to use a “pseudo-oracle”, in which multiple independently developed implementations of an algorithm process an input and the results are compared: if the results are not the same, then at least one of the implementations contains a defect. Other approaches include the use of program invariants, formal specification languages, trace and log file analysis, and metamorphic testing. In this paper, we present the results of two empirical studies in which we compare the effectiveness of some of these approaches, including metamorphic testing and runtime assertion checking. These results demonstrate that metamorphic testing is generally more effective at revealing defects in applications without test oracles across various application domains, including non-deterministic programs. We also analyze the results in terms of the software development process, and discuss suggestions for both practitioners and researchers who need to test software without the help of a test oracle.
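As a minimal illustration of the pseudo-oracle idea (a hypothetical example, not taken from the studies), two independently written implementations of the same computation are run on the same input and compared; a disagreement means at least one of them is defective, even though neither is a true oracle.

```java
import java.util.Arrays;

// Illustrative pseudo-oracle: compare two independent implementations of variance.
public class PseudoOracleDemo {

    // Implementation A: two-pass sum of squared deviations.
    static double varianceA(double[] xs) {
        double mean = Arrays.stream(xs).average().orElse(0.0);
        double ss = 0.0;
        for (double x : xs) ss += (x - mean) * (x - mean);
        return ss / xs.length;
    }

    // Implementation B: written independently as E[X^2] - (E[X])^2.
    static double varianceB(double[] xs) {
        double sum = 0.0, sumSq = 0.0;
        for (double x : xs) { sum += x; sumSq += x * x; }
        double mean = sum / xs.length;
        return sumSq / xs.length - mean * mean;
    }

    public static void main(String[] args) {
        double[] input = {2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0};
        double a = varianceA(input);
        double b = varianceB(input);
        if (Math.abs(a - b) > 1e-9) {
            System.err.println("pseudo-oracle disagreement: " + a + " vs " + b);
        } else {
            System.out.println("implementations agree: variance = " + a);
        }
    }
}
```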