Is this you? Create Your Porfile

Henry Feild

University of Massachusetts Amherst

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Henry Feild is active.

Explore More

Publication

Featured researches published by Henry Feild.

international acm sigir conference on research and development in information retrieval | 2010

Predicting searcher frustration

Henry Feild; James Allan; Rosie Jones

When search engine users have trouble finding information, they may become frustrated, possibly resulting in a bad experience (even if they are ultimately successful). In a user study in which participants were given difficult information seeking tasks, half of all queries submitted resulted in some degree of self-reported frustration. A third of all successful tasks involved at least one instance of frustration. By modeling searcher frustration, search engines can predict the current state of user frustration and decide when to intervene with alternative search strategies to prevent the user from becoming more frustrated, giving up, or switching to another search engine. We present several models to predict frustration using features extracted from query logs and physical sensors. We are able to predict frustration with a mean average precision of 65% from the physical sensors, and 87% from the query log features.

Innovations in Systems and Software Engineering | 2007

Effective identifier names for comprehension and memory

Dawn J. Lawrie; Christopher H. Morrell; Henry Feild; David W. Binkley

Readers of programs have two main sources of domain information: identifier names and comments. When functions are uncommented, as many are, comprehension is almost exclusively dependent on the identifier names. Assuming that writers of programs want to create quality identifiers (e.g., identifiers that include relevant domain knowledge), one must ask how should they go about it. For example, do the initials of a concept name provide enough information to represent the concept? If not, and a longer identifier is needed, is an abbreviation satisfactory or does the concept need to be captured in an identifier that includes full words? What is the effect of longer identifiers on limited short term memory capacity? Results from a study designed to investigate these questions are reported. The study involved over 100 programmers who were asked to describe 12 different functions and then recall identifiers that appeared in each function. The functions used three different levels of identifiers: single letters, abbreviations, and full words. Responses allow the extent of comprehension associated with the different levels to be studied along with their impact on memory. The functions used in the study include standard computer science textbook algorithms and functions extracted from production code. The results show that full-word identifiers lead to the best comprehension; however, in many cases, there is no statistical difference between using full words and abbreviations. When considered in the light of limited human short-term memory, well-chosen abbreviations may be preferable in some situations since identifiers with fewer syllables are easier to remember.

source code analysis and manipulation | 2007

Extracting Meaning from Abbreviated Identifiers

Dawn J. Lawrie; Henry Feild; David W. Binkley

Informative identifiers are made up of full (natural language) words and (meaningful) abbreviations. Readers of programs typically have little trouble understanding the purpose of identifiers composed of full words. In addition, those familiar with the code can (most often) determine the meaning of abbreviations used in identifiers. However, when faced with unfamiliar code, abbreviations often carry little useful information. Furthermore, tools that focus on the natural language used in the code have a hard time in the presence of abbreviations. One approach to providing meaning to programmers and tools is to translate (expand) abbreviations into full words. This paper presents a methodology for expanding identifiers and evaluates the process on a code based of just over 35 million lines of code. For example, using phrase extraction, fs_exists is expanded to file_status_exists illustrating how the expansion process can facilitate comprehension. On average, 16 percent of the identifiers in a program are expanded. Finally, as an example application, the approach is used to improve the syntactic identification of violations to Deissenbock and Pizkas rules for concise and consistent identifier construction.

international conference on program comprehension | 2006

Leveraged Quality Assessment using Information Retrieval Techniques

Dawn J. Lawrie; Henry Feild; David W. Binkley

The goal of this research is to apply language processing techniques to extend human judgment into situations where obtaining direct human judgment is impractical due to the volume of information that must be considered. On aspect of this is leveraged quality assessments, which can be used to evaluate third-party coded subsystems, to track quality across the versions of a program, to assess the compression effort (and subsequent cost) required to make a change, and to identify parts of a program in need of preventative maintenance. A description of the QALP tool, its output from just under two million lines of code, and an experiment aimed at evaluating the tools use in leveraged quality assessment are presented. Statistically significant results from this experiment validate the use of the QALP tool in human leverage quality assessment

Empirical Software Engineering | 2007

Quantifying identifier quality: an analysis of trends

Dawn J. Lawrie; Henry Feild; David W. Binkley

Identifiers, which represent the defined concepts in a program, account for, by some measures, almost three quarters of source code. The makeup of identifiers plays a key role in how well they communicate these defined concepts. An empirical study of identifier quality based on almost 50 million lines of code, covering thirty years, four programming languages, and both open and proprietary source is presented. For the purposes of the study, identifier quality is conservatively defined as the possibility of constructing the identifier out of dictionary words or known abbreviations. Four hypotheses related to identifier quality are considered using linear mixed effect regression models. For example, the first hypothesis is that modern programs include higher quality identifiers than older ones. In this case, the results show that better programming practices are producing higher quality identifies. Results also confirm some commonly held beliefs, such as proprietary code having more acronyms than open source code.

international acm sigir conference on research and development in information retrieval | 2013

Task-aware query recommendation

Henry Feild; James Allan

When generating query recommendations for a user, a natural approach is to try and leverage not only the users most recently submitted query, or reference query, but also information about the current search context, such as the users recent search interactions. We focus on two important classes of queries that make up search contexts: those that address the same information need as the reference query (on-task queries), and those that do not (off-task queries). We analyze the effects on query recommendation performance of using contexts consisting of only on-task queries, only off-task queries, and a mix of the two. Using TREC Session Track data for simulations, we demonstrate that on-task context is helpful on average but can be easily overwhelmed when off-task queries are interleaved---a common situation according to several analyses of commercial search logs. To minimize the impact of off-task queries on recommendation performance, we consider automatic methods of identifying such queries using a state of the art search task identification technique. Our experimental results show that automatic search task identification can eliminate the effect of off-task queries in a mixed context. We also introduce a novel generalized model for generating recommendations over a search context. While we only consider query text in this study, the model can handle integration over arbitrary user search behavior, such as page visits, dwell times, and query abandonment. In addition, it can be used for other types of recommendation, including personalized web search.

Testing: Academic and Industrial Conference Practice and Research Techniques - MUTATION (TAICPART-MUTATION 2007) | 2007

Software Fault Prediction using Language Processing

David W. Binkley; Henry Feild; Dawn J. Lawrie; Maurizio Pighin

Accurate prediction of faulty modules reduces the cost of software development and evolution. Two case studies with a language-processing based fault prediction measure are presented. The measure, refereed to as a QALP score, makes use of techniques from information retrieval to judge software quality. The QALP score has been shown to correlate with human judgements of software quality. The two case studies consider the measures application to fault prediction using two programs (one open source, one proprietary). Linear mixed-effects regression models are used to identify relationships between defects and QALP score. Results, while complex, show that little correlation exists in the first case study, while statistically significant correlations exists in the second. In this second study the QALP score is helpful in predicting faults in modules (files) with its usefulness growing as module size increases.

Journal of Software Maintenance and Evolution: Research and Practice | 2007

An empirical study of rules for well-formed identifiers

Dawn J. Lawrie; Henry Feild; David W. Binkley

Readers of programs have two main sources of domain information: identifier names and comments. In order to efficiently maintain source code, it is important that the identifier names (as well as comments) communicate clearly the concepts they represent. Deisenbock and Pizka recently introduced two rules for creating well-formed identifiers: one considers the consistency of identifiers and the other their conciseness. These rules require a mapping from identifiers to the concepts they represent, which may be costly to develop after the initial release of a system. An approach for verifying whether identifiers are well formed without any additional information (e.g., a concept mapping) is developed. Using a pool of 48 million lines of code, experiments with the resulting syntactic rules for well-formed identifiers illustrate that violations of the syntactic pattern exist. Two case studies show that three-quarters of these violations are ‘real’. That is, they could be identified using a concept mapping. Three related studies show that programmers tend to use a rather limited vocabulary, that, contrary to many other aspects of system evolution, maintenance does not introduce additional rule violations, and that open and proprietary sources differ in their percentage of violations. Copyright

Journal of Systems and Software | 2009

Increasing diversity: Natural language measures for software fault prediction

David W. Binkley; Henry Feild; Dawn J. Lawrie; Maurizio Pighin

While challenging, the ability to predict faulty modules of a program is valuable to a software project because it can reduce the cost of software development, as well as software maintenance and evolution. Three language-processing based measures are introduced and applied to the problem of fault prediction. The first measure is based on the usage of natural language in a programs identifiers. The second measure concerns the conciseness and consistency of identifiers. The third measure, referred to as the QALP score, makes use of techniques from information retrieval to judge software quality. The QALP score has been shown to correlate with human judgments of software quality. Two case studies consider the language processing measures applicability to fault prediction using two programs (one open source, one proprietary). Linear mixed-effects regression models are used to identify relationships between defects and the measures. Results, while complex, show that language processing measures improve fault prediction, especially when used in combination. Overall, the models explain one-third and two-thirds of the faults in the two case studies. Consistent with other uses of language processing, the value of the three measures increases with the size of the program module considered.

international acm sigir conference on research and development in information retrieval | 2011

CrowdLogging: distributed, private, and anonymous search logging

Henry Feild; James Allan; Joshua Glatt

We describe CrowdLogging, an approach for distributed search log collection, storage, and mining, with the dual goals of preserving privacy and making the mined information broadly available. Most search log mining approaches and most privacy enhancing schemes have focused on centralized search logs and methods for disseminating them to third parties. In our approach, a users search log is encrypted and shared in such a way that (a) the source of a search behavior artifact, such as a query, is unknown and (b) extremely rare artifacts---that is, artifacts more likely to contain private information---are not revealed. The approach works with any search behavior artifact that can be extracted from a search log, including queries, query reformulations, and query-click pairs. In this work, we: (1) present a distributed search log collection, storage, and mining framework; (2) compare several privacy policies, including differential privacy, showing the trade-offs between strong guarantees and the utility of the released data; (3) demonstrate the impact of our approach using two existing research query logs; and (4) describe a pilot study for which we implemented a version of the framework.

Explore More