Eddie Antonio Santos
University of Alberta
Publications
Featured research published by Eddie Antonio Santos.
Mining Software Repositories | 2016
Joshua Charles Campbell; Eddie Antonio Santos; Abram Hindle
Organizations like Mozilla, Microsoft, and Apple are flooded with thousands of automated crash reports per day. Although crash reports contain valuable information for debugging, there are often too many for developers to examine individually. Therefore, in industry, crash reports are often automatically grouped together in buckets. Ubuntu’s repository contains crashes from hundreds of software systems available with Ubuntu. A variety of crash report bucketing methods are evaluated using data collected by Ubuntu’s Apport automated crash reporting system. The trade-off between precision and recall of numerous scalable crash deduplication techniques is explored. A set of criteria that a crash deduplication method must meet is presented, and several methods that meet these criteria are evaluated on a new dataset. The evaluations presented in this paper show that off-the-shelf information retrieval techniques, which were not designed to be used with crash reports, outperform other techniques specifically designed for the task of crash bucketing at realistic industrial scales. This research indicates that automated crash bucketing still has a lot of room for improvement, especially in terms of identifier tokenization.
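As a rough illustration of the off-the-shelf information retrieval approach the paper finds effective, the sketch below buckets crash reports by TF-IDF cosine similarity over their stack-trace text. The greedy bucketing loop, the identifier-style token pattern, and the 0.5 threshold are illustrative assumptions, not the paper’s exact configuration.

```python
# Sketch: bucketing crash reports with off-the-shelf IR machinery.
# Each crash report is assumed to be reduced to its stack-trace text;
# the token pattern and the 0.5 similarity threshold are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def bucket_crashes(stack_traces, threshold=0.5):
    """Greedily assign each crash to the most similar existing bucket,
    or open a new bucket when no existing one is similar enough."""
    vectorizer = TfidfVectorizer(token_pattern=r"[A-Za-z_][A-Za-z0-9_]*")
    vectors = vectorizer.fit_transform(stack_traces)
    buckets = []          # list of lists of crash indices
    representatives = []  # index of the first crash in each bucket
    for i in range(vectors.shape[0]):
        best, best_sim = None, threshold
        for b, rep in enumerate(representatives):
            sim = cosine_similarity(vectors[i], vectors[rep])[0, 0]
            if sim >= best_sim:
                best, best_sim = b, sim
        if best is None:
            buckets.append([i])
            representatives.append(i)
        else:
            buckets[best].append(i)
    return buckets
```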
PeerJ | 2016
Eddie Antonio Santos; Abram Hindle
Developers summarize their changes to code in commit messages. When a message seems “unusual”, however, this casts doubt on the quality of the code contained in the commit. We trained n-gram language models on over 120,000 commits from open source projects and used cross-entropy as an indicator of commit message “unusualness”. Build statuses collected from Travis-CI were used as a proxy for code quality. We then compared the distributions of failed and successful commits with regard to the “unusualness” of their commit messages. Our analysis yielded significant results when correlating cross-entropy with build status.
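A minimal sketch of the core measurement, assuming a simple bigram model with add-one smoothing and whitespace tokenization stands in for the paper’s n-gram language models; higher cross-entropy marks a more “unusual” commit message.

```python
# Sketch: commit-message "unusualness" as cross-entropy under a
# bigram model. Add-one (Laplace) smoothing and whitespace
# tokenization are illustrative simplifications.
import math
from collections import Counter

def train_bigram(corpus_messages):
    unigrams, bigrams = Counter(), Counter()
    for msg in corpus_messages:
        tokens = ["<s>"] + msg.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def cross_entropy(message, unigrams, bigrams):
    """Average negative log2 probability per token; a higher value
    means the message is more 'unusual' under the trained model."""
    tokens = ["<s>"] + message.lower().split() + ["</s>"]
    vocab = len(unigrams)
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Smoothed conditional probability P(cur | prev).
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab)
        total -= math.log2(p)
    return total / (len(tokens) - 1)
```

One could then compare the cross-entropy distributions of messages from failed and successful builds, as the study does with Travis-CI build statuses.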
Journal of Systems and Software | 2018
Eddie Antonio Santos; Carson McLean; Christopher Solinas; Abram Hindle
Context: Virtual machines provide isolation of services at the cost of hypervisors and more resource usage. This spurred the growth of systems like Docker that enable single hosts to isolate several applications, similar to VMs, within a low-overhead abstraction called containers. Motivation: Although containers tout low-overhead performance, do they still have low energy consumption? Methodology: This work statistically compares (t-test, Wilcoxon) the energy consumption of three application workloads in Docker and on bare-metal Linux. Results: In all cases, there was a statistically significant (t-test and Wilcoxon, p < 0.05) increase in energy consumption when running tests in Docker, mostly due to the performance of I/O system calls.
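The statistical comparison could look roughly like the sketch below, which applies Student’s t-test and the Wilcoxon rank-sum test (via SciPy) to two samples of energy measurements. The numbers are hypothetical stand-ins, not the study’s data, and whether the paper used paired or independent test variants is not stated in this abstract.

```python
# Sketch: comparing energy consumption of a workload in Docker vs.
# bare-metal Linux. The measurements below are hypothetical; the
# real study measured three application workloads.
from scipy import stats

docker_j = [118.2, 121.5, 119.7, 122.0, 120.4, 118.9, 121.1, 119.3]
bare_j   = [104.6, 106.1, 105.3, 107.0, 104.9, 106.4, 105.7, 105.0]

t_stat, t_p = stats.ttest_ind(docker_j, bare_j)  # Student's t-test
w_stat, w_p = stats.ranksums(docker_j, bare_j)   # Wilcoxon rank-sum

print(f"t-test p = {t_p:.4g}, Wilcoxon p = {w_p:.4g}")
if t_p < 0.05 and w_p < 0.05:
    print("Statistically significant difference (p < 0.05)")
```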
PeerJ | 2017
Eddie Antonio Santos; Joshua Charles Campbell; Abram Hindle; José Nelson Amaral
Software Visualization | 2016
Michael D. Feist; Eddie Antonio Santos; Ian Watts; Abram Hindle
PeerJ | 2017
Eddie Antonio Santos; Karim Ali
PeerJ | 2016
Joshua Charles Campbell; Eddie Antonio Santos; Abram Hindle
PeerJ | 2015
Eddie Antonio Santos; Abram Hindle
Mining Software Repositories | 2016
Eddie Antonio Santos; Abram Hindle
IEEE International Conference on Software Analysis, Evolution and Reengineering | 2018
Eddie Antonio Santos; Joshua Charles Campbell; Dhvani Patel; Abram Hindle; José Nelson Amaral
Minor syntax errors are made by novice and experienced programmers alike; however, novice programmers lack the years of intuition that help them resolve these tiny errors. Standard LR parsers typically do a poor job of locating syntax errors precisely. We propose a methodology that not only helps locate where syntax errors occur, but also suggests possible changes to the token stream that can fix the identified error. This methodology finds syntax errors by checking whether two language models “agree” on each token. If the models disagree, this indicates a possible syntax error; the methodology then tries to suggest a fix by finding an alternative token sequence obtained from the models. We trained two LSTM (long short-term memory) language models on a large corpus of JavaScript code collected from GitHub. The dual LSTM neural network model predicts the correct location of the syntax error 54.74% of the time in its top 4 suggestions and produces an exact fix up to 35.50% of the time. The results show that this tool and methodology can locate and suggest corrections for syntax errors. Our methodology is of practical use to all programmers, but will be especially useful to novices frustrated with incomprehensible syntax errors.
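A sketch of the agreement check this methodology describes, assuming the two trained LSTMs are exposed as hypothetical callables forward_probs and backward_probs that map a context to a token-probability table; the scoring and thresholding details here are illustrative, not the paper’s implementation.

```python
# Sketch of the dual-model agreement check. forward_probs and
# backward_probs stand in for the two trained LSTM language models:
# each takes a context (a token list) and returns a dict mapping
# candidate tokens to probabilities. This interface is hypothetical.

def find_syntax_error(tokens, forward_probs, backward_probs,
                      agreement_threshold=0.1):
    """Flag the position where the two models most strongly disagree
    with the written token, and suggest the token both models
    jointly prefer there."""
    worst_pos, worst_score, suggestion = None, agreement_threshold, None
    for i, tok in enumerate(tokens):
        fwd = forward_probs(tokens[:i])       # P(token | left context)
        bwd = backward_probs(tokens[i + 1:])  # P(token | right context)
        # Both models should assign the written token high probability;
        # a low joint score marks a likely syntax error.
        score = fwd.get(tok, 0.0) * bwd.get(tok, 0.0)
        if score < worst_score:
            worst_pos, worst_score = i, score
            # Candidate fix: the token the two models jointly prefer.
            suggestion = max(fwd, key=lambda t: fwd[t] * bwd.get(t, 0.0))
    return worst_pos, suggestion
```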