Philip J. Guo
University of California, San Diego
Publications
Featured research published by Philip J. Guo.
Science of Computer Programming | 2007
Michael D. Ernst; Jeff H. Perkins; Philip J. Guo; Stephen McCamant; Carlos Pacheco; Matthew S. Tschantz; Chen Xiao
Daikon is an implementation of dynamic detection of likely invariants; that is, the Daikon invariant detector reports likely program invariants. An invariant is a property that holds at a certain point or points in a program; these are often used in assert statements, documentation, and formal specifications. Examples include being constant (x = a), non-zero (x ≠ 0), being in a range (a ≤ x ≤ b), linear relationships (y = ax + b), ordering (x ≤ y), functions from a library (x = fn(y)), containment (x ∈ y), sortedness (x is sorted), and many more. Users can extend Daikon to check for additional invariants. Dynamic invariant detection runs a program, observes the values that the program computes, and then reports properties that were true over the observed executions. Dynamic invariant detection is a machine learning technique that can be applied to arbitrary data. Daikon can detect invariants in C, C++, Java, and Perl programs, and in record-structured data sources; it is easy to extend Daikon to other applications. Invariants can be useful in program understanding and a host of other applications. Daikon's output has been used for generating test cases, predicting incompatibilities in component integration, automating theorem proving, repairing inconsistent data structures, and checking the validity of data streams, among other tasks. Daikon is freely available in source and binary form, along with extensive documentation, at http://pag.csail.mit.edu/daikon/.
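The idea of reporting properties that held over every observed execution can be illustrated with a small sketch. This is not Daikon's code or API: the function name `detect_invariants`, the input format (one dictionary of variable values per observation), and the handful of invariant templates checked here are all assumptions made for illustration.

```python
from itertools import combinations


def detect_invariants(samples):
    """Report properties that held in every observed sample.

    `samples` is a hypothetical trace format: a list of dicts mapping
    variable names to numeric values, one dict per observation at a
    single program point.
    """
    names = sorted(samples[0])
    invariants = []

    # Unary templates: constant, non-zero, and observed range.
    for name in names:
        values = [sample[name] for sample in samples]
        if len(set(values)) == 1:
            invariants.append(f"{name} == {values[0]}")
            continue
        if all(value != 0 for value in values):
            invariants.append(f"{name} != 0")
        invariants.append(f"{min(values)} <= {name} <= {max(values)}")

    # Binary template: ordering between each pair of variables.
    for a, b in combinations(names, 2):
        if all(sample[a] <= sample[b] for sample in samples):
            invariants.append(f"{a} <= {b}")
        elif all(sample[b] <= sample[a] for sample in samples):
            invariants.append(f"{b} <= {a}")

    return invariants
```

Every reported property is only "likely": it held on the observed samples, and a later execution could still falsify it.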
learning at scale | 2014
Philip J. Guo; Juho Kim; Rob Rubin
Videos are a widely-used kind of resource for online learning. This paper presents an empirical study of how video production decisions affect student engagement in online educational videos. To our knowledge, ours is the largest-scale study of video engagement to date, using data from 6.9 million video watching sessions across four courses on the edX MOOC platform. We measure engagement by how long students are watching each video, and whether they attempt to answer post-video assessment problems. Our main findings are that shorter videos are much more engaging, that informal talking-head videos are more engaging, that Khan-style tablet drawings are more engaging, that even high-quality pre-recorded classroom lectures might not make for engaging online videos, and that students engage differently with lecture and tutorial videos. Based upon these quantitative findings and qualitative insights from interviews with edX staff, we developed a set of recommendations to help instructors and video producers take better advantage of the online video format. Finally, to enable researchers to reproduce and build upon our findings, we have made our anonymized video watching data set and analysis scripts public. To our knowledge, ours is one of the first public data sets on MOOC resource usage.
international conference on software engineering | 2009
Adam Kiezun; Philip J. Guo; Karthick Jayaraman; Michael D. Ernst
We present a technique for finding security vulnerabilities in Web applications. SQL Injection (SQLI) and cross-site scripting (XSS) attacks are widespread forms of attack in which the attacker crafts the input to the application to access or modify user data and execute malicious code. In the most serious attacks (called second-order, or persistent, XSS), an attacker can corrupt a database so as to cause subsequent users to execute malicious code.
international symposium on software testing and analysis | 2009
Adam Kiezun; Vijay Ganesh; Philip J. Guo; Pieter Hooimeijer; Michael D. Ernst
Many automatic testing, analysis, and verification techniques for programs can be effectively reduced to a constraint generation phase followed by a constraint-solving phase. This separation of concerns often leads to more effective and maintainable tools. The increasing efficiency of off-the-shelf constraint solvers makes this approach even more compelling. However, there are few effective and sufficiently expressive off-the-shelf solvers for string constraints generated by analysis techniques for string-manipulating programs. We designed and implemented Hampi, a solver for string constraints over fixed-size string variables. Hampi constraints express membership in regular languages and fixed-size context-free languages. Hampi constraints may contain context-free-language definitions, regular language definitions and operations, and the membership predicate. Given a set of constraints, Hampi outputs a string that satisfies all the constraints, or reports that the constraints are unsatisfiable. Hampi is expressive and efficient, and can be successfully applied to testing and analysis of real programs. Our experiments use Hampi in static and dynamic analyses for finding SQL injection vulnerabilities in Web applications and in automated bug finding in C programs using systematic testing; we also compare Hampi with another string solver. Hampi's source code, documentation, and the experimental data are available at http://people.csail.mit.edu/akiezun/hampi.
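The input/output contract described above (a fixed-size string variable, membership constraints, and either a satisfying string or "unsatisfiable") can be sketched in a few lines. This is not Hampi's algorithm or input language: the sketch substitutes brute-force enumeration, covers only regular-language membership via Python's `re` module, and the function name `solve_fixed_size` and its parameters are hypothetical.

```python
import re
from itertools import product


def solve_fixed_size(length, alphabet, constraints):
    """Find a string of exactly `length` characters over `alphabet`.

    `constraints` is a list of (regex, must_be_member) pairs; the whole
    string must match the regex when must_be_member is True and must not
    match it when False. Returns a satisfying string, or None when the
    constraints are unsatisfiable for this length and alphabet.
    """
    compiled = [(re.compile(pattern), member) for pattern, member in constraints]
    # Enumerate every candidate string of the fixed size.
    for chars in product(alphabet, repeat=length):
        candidate = "".join(chars)
        if all(
            (regex.fullmatch(candidate) is not None) == member
            for regex, member in compiled
        ):
            return candidate
    return None
```

Fixing the string size is what makes the search space finite; enumeration is exponential in `length`, so this only conveys the problem statement, not a practical solving strategy.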
international conference on software engineering | 2010
Philip J. Guo; Thomas Zimmermann; Nachiappan Nagappan; Brendan Murphy
We performed an empirical study to characterize factors that affect which bugs get fixed in Windows Vista and Windows 7, focusing on factors related to bug report edits and relationships between people involved in handling the bug. We found that bugs reported by people with better reputations were more likely to get fixed, as were bugs handled by people on the same team and working in geographical proximity. We reinforce these quantitative results with survey feedback from 358 Microsoft employees who were involved in Windows bugs. Survey respondents also mentioned additional qualitative influences on bug fixing, such as the importance of seniority and interpersonal skills of the bug reporter. Informed by these findings, we built a statistical model to predict the probability that a new bug will be fixed (the first known one, to the best of our knowledge). We trained it on Windows Vista bugs and got a precision of 68% and recall of 64% when predicting Windows 7 bug fixes. Engineers could use such a model to prioritize bugs during triage, to estimate developer workloads, and to decide which bugs should be closed or migrated to future product versions.
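For readers unfamiliar with the two metrics quoted above, the following sketch shows how precision and recall are computed for a binary "will this bug be fixed?" prediction. The function name and the boolean-list input format are assumptions for illustration; this is not the authors' model or data.

```python
def precision_recall(predicted_fixed, actually_fixed):
    """Compute precision and recall for binary 'bug gets fixed' predictions.

    Both arguments are equal-length lists of booleans, one entry per bug.
    """
    pairs = list(zip(predicted_fixed, actually_fixed))
    true_pos = sum(1 for pred, actual in pairs if pred and actual)
    false_pos = sum(1 for pred, actual in pairs if pred and not actual)
    false_neg = sum(1 for pred, actual in pairs if not pred and actual)

    # Precision: of the bugs predicted to be fixed, the fraction that were.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    # Recall: of the bugs that were fixed, the fraction that was predicted.
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall
```

Under these definitions, a precision of 68% means that about two thirds of the bugs the model flagged as "will be fixed" were in fact fixed, and a recall of 64% means the model flagged about two thirds of all bugs that ended up fixed.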
learning at scale | 2014
Juho Kim; Philip J. Guo; Daniel T. Seaton; Piotr Mitros; Krzysztof Z. Gajos; Robert C. Miller
With thousands of learners watching the same online lecture videos, analyzing video watching patterns provides a unique opportunity to understand how students learn with videos. This paper reports a large-scale analysis of in-video dropout and peaks in viewership and student activity, using second-by-second user interaction data from 862 videos in four Massive Open Online Courses (MOOCs) on edX. We find higher dropout rates in longer videos, re-watching sessions (vs first-time), and tutorials (vs lectures). Peaks in re-watching sessions and play events indicate points of interest and confusion. Results show that tutorials (vs lectures) and re-watching sessions (vs first-time) lead to more frequent and sharper peaks. In attempting to reason why peaks occur by sampling 80 videos, we observe that 61% of the peaks accompany visual transitions in the video, e.g., a slide view to a classroom view. Based on this observation, we identify five student activity patterns that can explain peaks: starting from the beginning of a new material, returning to missed content, following a tutorial step, replaying a brief segment, and repeating a non-visual explanation. Our analysis has design implications for video authoring, editing, and interface design, providing a richer understanding of video learning on MOOCs.
learning at scale | 2014
Philip J. Guo; Katharina Reinecke
The current generation of Massive Open Online Courses (MOOCs) attracts a diverse student audience from all age groups and over 196 countries around the world. Researchers, educators, and the general public have recently become interested in how the learning experience in MOOCs differs from that in traditional courses. A major component of the learning experience is how students navigate through course content. This paper presents an empirical study of how students navigate through MOOCs, and is, to our knowledge, the first to investigate how navigation strategies differ by demographics such as age and country of origin. We performed data analysis on the activities of 140,546 students in four edX MOOCs and found that certificate earners skip on average 22% of the course content, that they frequently employ non-linear navigation by jumping backward to earlier lecture sequences, and that older students and those from countries with lower student-teacher ratios are more comprehensive and non-linear when navigating through the course. From these findings, we suggest design recommendations, such as having MOOC platforms develop more detailed forms of certification that incentivize students to deeply engage with the content rather than just doing the minimum necessary to earn a passing grade. Finally, to enable other researchers to reproduce and build upon our findings, we have made our data set and analysis scripts publicly available.
international conference on software engineering | 2012
Thomas Zimmermann; Nachiappan Nagappan; Philip J. Guo; Brendan Murphy
Fixing bugs is an important part of the software development process. An underlying aspect is the effectiveness of fixes: if a fair number of fixed bugs are reopened, it could indicate instability in the software system. To the best of our knowledge there has been little prior work on understanding the dynamics of bug reopens. Towards that end, in this paper, we characterize when bug reports are reopened by using the Microsoft Windows operating system project as an empirical case study. Our analysis is based on a mixed-methods approach. First, we categorize the primary reasons for reopens based on a survey of 358 Microsoft employees. We then reinforce these results with a large-scale quantitative study of Windows bug reports, focusing on factors related to bug report edits and relationships between people involved in handling the bug. Finally, we build statistical models to describe the impact of various metrics on reopening bugs, ranging from the reputation of the opener to how the bug was found.
conference on computer supported cooperative work | 2011
Philip J. Guo; Thomas Zimmermann; Nachiappan Nagappan; Brendan Murphy
Bug reporting/fixing is an important social part of the software development process. The bug-fixing process inherently has strong inter-personal dynamics at play, especially in how to find the optimal person to handle a bug report. Bug report reassignments, which are a common part of the bug-fixing process, have rarely been studied. In this paper, we present a large-scale quantitative and qualitative analysis of the bug reassignment process in the Microsoft Windows Vista operating system project. We quantify social interactions in terms of both useful and harmful reassignments. For instance, we found that reassignments are useful to determine the best person to fix a bug, contrary to the popular opinion that reassignments are always harmful. We categorized five primary reasons for reassignments: finding the root cause, determining ownership, poor bug report quality, hard to determine proper fix, and workload balancing. We then use these findings to make recommendations for the design of more socially-aware bug tracking systems that can overcome some of the inefficiencies we observed in our study.
international symposium on software testing and analysis | 2006
Philip J. Guo; Jeff H. Perkins; Stephen McCamant; Michael D. Ernst
An abstract type groups variables that are used for related purposes in a program. We describe a dynamic unification-based analysis for inferring abstract types. Initially, each run-time value gets a unique abstract type. A run-time interaction among values indicates that they have the same abstract type, so their abstract types are unified. Also at run time, abstract types for variables are accumulated from abstract types for values. The notion of interaction may be customized, permitting the analysis to compute finer or coarser abstract types; these different notions of abstract type are useful for different tasks. We have implemented the analysis for compiled x86 binaries and for Java bytecodes. Our experiments indicate that the inferred abstract types are useful for program comprehension, improve both the results and the run time of a follow-on program analysis, and are more precise than the output of a comparable static analysis, without suffering from overfitting.
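The unification step described above maps naturally onto a union-find structure. The sketch below is a simplified illustration, not the authors' implementation for x86 binaries or Java bytecode: the class name, the method names, and the use of string tags to stand for run-time values are all assumptions.

```python
from collections import defaultdict


class AbstractTypeInference:
    """Union-find over value tags; variables inherit the types of their values."""

    def __init__(self):
        self._parent = {}   # value tag -> parent tag in the union-find forest
        self._var_tag = {}  # variable name -> a tag for some value it held

    def new_value(self, tag):
        # Initially, each run-time value gets its own abstract type.
        self._parent[tag] = tag
        return tag

    def _find(self, tag):
        # Follow parent links to the representative, halving paths as we go.
        while self._parent[tag] != tag:
            self._parent[tag] = self._parent[self._parent[tag]]
            tag = self._parent[tag]
        return tag

    def interact(self, a, b):
        # An interaction (for example, a + b or a < b) unifies the two types.
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def observe(self, variable, tag):
        # A variable accumulates the abstract types of every value it holds.
        if variable in self._var_tag:
            self.interact(self._var_tag[variable], tag)
        else:
            self._var_tag[variable] = tag

    def variable_groups(self):
        # Variables whose values share a representative share an abstract type.
        groups = defaultdict(list)
        for variable, tag in self._var_tag.items():
            groups[self._find(tag)].append(variable)
        return sorted(sorted(group) for group in groups.values())
```

Which operations call `interact` is the customization point the abstract mentions: counting more operations as interactions yields coarser abstract types, and counting fewer yields finer ones.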