Sarah Heckman
North Carolina State University
Publications
Featured research published by Sarah Heckman.
Information & Software Technology | 2011
Sarah Heckman; Laurie Williams
Context: Automated static analysis (ASA) identifies potential source code anomalies early in the software development lifecycle that could lead to field failures. Excessive alert generation and a large proportion of unimportant or incorrect alerts (unactionable alerts) may cause developers to reject the use of ASA. Techniques that identify anomalies important enough for developers to fix (actionable alerts) may increase the usefulness of ASA in practice. Objective: The goal of this work is to synthesize available research results to inform evidence-based selection of actionable alert identification techniques (AAITs). Method: Relevant studies about AAITs were gathered via a systematic literature review. Results: We selected 21 peer-reviewed studies of AAITs. The techniques use alert type selection; contextual information; data fusion; graph theory; machine learning; mathematical and statistical models; or dynamic detection to classify and prioritize actionable alerts. All of the AAITs are evaluated via an example with a variety of evaluation metrics. Conclusion: The selected studies support, with varying strength, the premise that the effective use of ASA is improved by supplementing ASA with an AAIT. Seven of the 21 selected studies reported the precision of the proposed AAITs. The two studies with the highest precision built models using the subject program's history. Precision measures how well a technique identifies true actionable alerts out of all predicted actionable alerts. Precision does not measure the number of actionable alerts missed by an AAIT or how well an AAIT identifies unactionable alerts. Inconsistent use of evaluation metrics, subject programs, and ASAs in the selected studies precludes meta-analysis and prevents the current results from informing evidence-based selection of an AAIT. We propose building an actionable alert identification benchmark for comparing and evaluating AAITs from the literature on a standard set of subjects with a common set of evaluation metrics.
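The precision metric discussed in this abstract, and the recall it deliberately contrasts with, can be made concrete with a short sketch (illustrative Python, not taken from the paper):

def precision(predicted_actionable, truly_actionable):
    """Fraction of predicted-actionable alerts that are truly actionable."""
    predicted = set(predicted_actionable)
    if not predicted:
        return 0.0
    return len(predicted & set(truly_actionable)) / len(predicted)

def recall(predicted_actionable, truly_actionable):
    """Fraction of truly actionable alerts the AAIT actually flagged; this
    captures what precision alone misses: the actionable alerts an AAIT
    fails to report."""
    actual = set(truly_actionable)
    if not actual:
        return 0.0
    return len(set(predicted_actionable) & actual) / len(actual)

# An AAIT flags alerts 1 and 2, while alerts 1 and 3 are truly actionable:
print(precision([1, 2], [1, 3]))  # 0.5
print(recall([1, 2], [1, 3]))     # 0.5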
Empirical Software Engineering and Measurement | 2008
Sarah Heckman; Laurie Williams
Benchmarks provide an experimental basis for evaluating software engineering processes or techniques in an objective and repeatable manner. We present the FAULTBENCH v0.1 benchmark, as a contribution to current benchmark materials, for evaluation and comparison of techniques that prioritize and classify alerts generated by static analysis tools. Static analysis tools may generate an overwhelming number of alerts, the majority of which are likely to be false positives (FP). Two FP mitigation techniques, alert prioritization and classification, provide an ordering or classification of alerts, identifying those likely to be anomalies. We evaluate FAULTBENCH using three versions of a FP mitigation technique within the AWARE adaptive prioritization model. Individual FAULTBENCH subjects vary in their optimal FP mitigation techniques. Together, FAULTBENCH subjects provide a precise and general evaluation of FP mitigation techniques.
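A minimal sketch of what a benchmark harness in the spirit of FAULTBENCH might look like (the Alert and Subject structures and the evaluate function are hypothetical illustrations, not FAULTBENCH's actual interface):

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Alert:
    alert_id: int
    alert_type: str
    is_actionable: bool  # oracle label supplied by the benchmark

@dataclass
class Subject:
    name: str
    alerts: List[Alert]

def evaluate(technique: Callable[[List[Alert]], List[Alert]],
             subjects: List[Subject]) -> dict:
    """Run an FP mitigation technique over each benchmark subject and
    report per-subject precision; `technique` returns the alerts it
    predicts to be actionable."""
    results = {}
    for subject in subjects:
        predicted = technique(subject.alerts)
        true_pos = sum(1 for a in predicted if a.is_actionable)
        results[subject.name] = true_pos / len(predicted) if predicted else 0.0
    return results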
International Conference on Software Testing, Verification, and Validation | 2009
Sarah Heckman; Laurie Williams
Automated static analysis can identify potential source code anomalies early in the software process that could lead to field failures. However, only a small portion of static analysis alerts may be important to the developer (actionable). The remainder are false positives (unactionable). We propose a process for building false positive mitigation models to classify static analysis alerts as actionable or unactionable using machine learning techniques. For two open source projects, we identify sets of alert characteristics predictive of actionable and unactionable alerts out of 51 candidate characteristics. From these selected characteristics, we evaluate 15 machine learning algorithms, which build models to classify alerts. We were able to obtain 88-97% average accuracy for both projects in classifying alerts using three to 14 alert characteristics. Additionally, the set of selected alert characteristics and best models differed between the two projects, suggesting that false positive mitigation models should be project-specific.
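As a rough illustration of the process described above, the sketch below trains a single classifier on a handful of made-up alert characteristics (the feature names, toy data, and choice of random forest are assumptions for illustration; the paper selects from 51 characteristics and evaluates 15 algorithms):

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical alert characteristics with oracle labels from project history.
alerts = pd.DataFrame({
    "alert_lifetime_revisions": [12, 1, 30, 2],
    "file_churn": [5, 40, 3, 22],
    "alert_type_code": [0, 3, 0, 1],
    "actionable": [1, 0, 1, 0],
})

X = alerts.drop(columns="actionable")
y = alerts["actionable"]

model = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(model, X, y, cv=2)  # small cv for the toy data
print(f"mean accuracy: {scores.mean():.2f}")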
ACM Crossroads Student Magazine | 2007
Sarah Heckman
Static analysis tools are useful for finding common programming mistakes that often lead to field failures. However, static analysis tools regularly generate a high number of false positive alerts, requiring manual inspection by the developer to determine if an alert is an indication of a fault. The adaptive ranking model presented in this paper utilizes feedback from developers about inspected alerts in order to rank the remaining alerts by the likelihood that an alert is an indication of a fault. Alerts are ranked based on the homogeneity of populations of generated alerts, historical developer feedback in the form of suppressing false positives and fixing true positive alerts, and historical, application-specific data about the alert ranking factors. The ordering of alerts generated by the adaptive ranking model is compared to a baseline of randomly-, optimally-, and static analysis tool-ordered alerts in a small role-based health care application. The adaptive ranking model provides developers with 81% of true positive alerts after investigating only 20% of the alerts whereas an average of 50 random orderings of the same alerts found only 22% of true positive alerts after investigating 20% of the generated alerts.
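The feedback loop at the heart of such a model can be sketched as follows (the update rule and data shapes are assumptions for illustration, not the paper's exact ranking factors): fixing an alert raises the weight of its type, suppressing one lowers it, and remaining alerts are re-ranked.

from collections import defaultdict

class AdaptiveRanker:
    def __init__(self):
        # Start each alert type at a neutral likelihood estimate.
        self.type_score = defaultdict(lambda: 0.5)

    def record_feedback(self, alert_type: str, fixed: bool, rate: float = 0.1):
        """Nudge the type's score toward 1 on a fix, toward 0 on a suppression."""
        target = 1.0 if fixed else 0.0
        self.type_score[alert_type] += rate * (target - self.type_score[alert_type])

    def rank(self, alerts):
        """Order alerts by the current likelihood estimate for their type."""
        return sorted(alerts, key=lambda a: self.type_score[a["type"]], reverse=True)

ranker = AdaptiveRanker()
ranker.record_feedback("NULL_DEREF", fixed=True)
ranker.record_feedback("UNUSED_VAR", fixed=False)
alerts = [{"id": 1, "type": "UNUSED_VAR"}, {"id": 2, "type": "NULL_DEREF"}]
print(ranker.rank(alerts))  # the NULL_DEREF alert is ranked first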
International Computing Education Research Workshop | 2015
Sarah Heckman
Active learning increases student learning through collaborative engagement with materials during class time. A CS1.5 course at NC State, CSC216, uses active learning lectures involving short, simplified think-pair-share in-class exercises to engage students with course materials. However, students still struggle with the course materials, and several do not successfully complete the course on their first attempt. To increase student learning and engagement, we conducted a quasi-experimental study incorporating in-class labs into two sections of CSC216 during the linear data structures unit in the Fall 2014 semester. Both sections completed in-class labs on the Java Collections Framework and iterators. One section completed in-class labs on array-based lists; the other section completed in-class labs on linked lists, in a counter-balanced study design. The active learning lecture delivery was used for the control section, and an exam was administered between the array-based list and linked list topics. Overall, we found no significant difference in student learning on array-based and linked lists as measured by the final exam. Students displayed half as much disengaged behavior during in-class labs and were five times more likely to ask for help from the teaching staff.
Conference on Software Engineering Education and Training | 2011
Sarah Heckman; Thomas B. Horton; Mark Sherriff
Over the past two years, second-year Java and software engineering courses have been taught at the University of Virginia and North Carolina State University utilizing the Android OS platform. Instructors taught a variety of traditional second-year topics, including abstraction, design, requirements, and testing, using a variety of Android-based mobile devices. Anecdotal responses from student surveys and evaluations across five course sessions indicate that teaching lower-level courses with more advanced, current technology, even with a steeper learning curve, is beneficial. In this tutorial proposal, we outline our plan for a session that would help educators incorporate the Android OS into their curriculum and show how to use the system even when mobile devices are not available.
Foundations of Software Engineering | 2007
Mark Sherriff; Sarah Heckman; J. Michael Lake; Laurie Williams
In this paper, we propose a technique for leveraging historical field failure records in conjunction with automated static analysis alerts to determine which alerts or sets of alerts are predictive of a field failure. Our technique uses singular value decomposition to generate groupings of static analysis alert types, which we call alert signatures, that have been historically linked to field failure-prone files in previous releases of a software system. The signatures can be applied to sets of alerts from a current build of a software system. Files that have a matching alert signature are identified as having similar static analysis alert characteristics to files with known field failures in a previous release of the system. We performed a case study involving an industrial software system at IBM and found three distinct alert signatures that could be applied to the system. We found that 50% of the field failures reported since the last static analysis run could be discovered by examining the 10% of the files and static analysis alerts indicated by these three alert signatures. The remaining failures were either not detected by a signature, which could indicate a new type of error in the field, or occurred in areas of the code where no static analysis alerts were detected.
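The core idea, grouping co-occurring alert types via singular value decomposition, can be illustrated with a toy matrix (the data and interpretation below are illustrative only, not the IBM system's):

import numpy as np

# Rows: files with known field failures; columns: counts of each alert type.
alert_matrix = np.array([
    [4, 0, 3, 0],   # file A
    [5, 1, 2, 0],   # file B
    [0, 3, 0, 4],   # file C
], dtype=float)

U, s, Vt = np.linalg.svd(alert_matrix, full_matrices=False)

# Each row of Vt is a candidate "signature": a weighted grouping of alert
# types that tend to co-occur in failure-prone files.
for i, signature in enumerate(Vt[:2]):
    dominant = np.argsort(-np.abs(signature))[:2]
    print(f"signature {i}: dominant alert types {dominant.tolist()}")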
Technical Symposium on Computer Science Education | 2017
Aaron J. Smith; Kristy Elizabeth Boyer; Jeffrey M. Forbes; Sarah Heckman; Ketan Mayer-Patel
Increased enrollment in computer science programs presents a new challenge: quickly accommodating more students in introductory computer science courses. Because peer teaching scales with enrollment size, it is a promising solution for supporting computer science students in this setting. However, pedagogical and logistical challenges can arise when implementing a large peer teaching program. To study these challenges, we developed a transparent online tool, My Digital Hand, for tracking one-to-one peer teaching interactions. We deployed the tool in large CS2 computer science courses across three universities. The data gathered confirms the pedagogical and logistical challenges that exist at scale and gives insight into ways we might address them. Using this information, we developed the second iteration of My Digital Hand to better support peer teaching. This paper presents the modified tool for use by the computer science education community.
Conference of the Centre for Advanced Studies on Collaborative Research | 2007
Mark Sherriff; Sarah Heckman; Mike Lake; Laurie Williams
Static analysis tools tend to generate more alerts than a development team can reasonably examine without some form of guidance. In this paper, we propose a technique for leveraging field failures and historical change records to determine which sets of alerts are often associated with a field failure using singular value decomposition. We performed a case study on six major components of an industrial software system at IBM over six builds spanning eighteen months of development. Our technique identified fourteen alert types that comprised sets of alerts that could identify, on average, 45% of future fault-prone files and up to 65% in some instances.
International Conference on Software Engineering | 2007
Sarah Heckman
Software engineers tend to repeat mistakes when developing software. Automated static analysis tools can detect some of these mistakes early in the software process. However, these tools tend to generate a significant number of false positive alerts. Due to the need for manual inspection of alerts, the high number of false positives may make an automated static analysis tool too costly to use. In this research, we propose to rank alerts generated from automated static analysis tools via an adaptive model that predicts the probability an alert is a true fault in a system. The model adapts based upon a history of the actions the software engineer has taken to either filter false positive alerts or fix true faults. We hypothesize that by providing this adaptive ranking, software engineers will be more likely to act upon highly ranked alerts until the probability that remaining alerts are true positives falls below a subjective threshold.
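The subjective-threshold inspection policy described in the last sentence could look like this in practice (a sketch with hypothetical probabilities; the abstract does not specify how the model's outputs are consumed):

def alerts_to_inspect(ranked_alerts, threshold=0.3):
    """ranked_alerts: (alert_id, predicted_probability) pairs, highest first.
    Inspect alerts in rank order and stop once the predicted probability of
    being a true fault falls below the developer's threshold."""
    selected = []
    for alert_id, prob in ranked_alerts:
        if prob < threshold:
            break
        selected.append(alert_id)
    return selected

ranked = [("a1", 0.92), ("a2", 0.74), ("a3", 0.41), ("a4", 0.12)]
print(alerts_to_inspect(ranked))  # ['a1', 'a2', 'a3']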