Terence John Parr | Researchain

Publication

Featured research published by Terence John Parr.

Software - Practice and Experience | 1995

ANTLR: a predicated-LL(k) parser generator

Terence John Parr; Russell W. Quong

Despite the parsing power of LR/LALR algorithms, e.g. YACC, programmers often choose to write recursive‐descent parsers by hand to obtain increased flexibility, better error handling, and ease of debugging. We introduce ANTLR, a public‐domain parser generator that combines the flexibility of hand‐coded parsing with the convenience of a parser generator, which is a component of PCCTS. ANTLR has many features that make it easier to use than other language tools. Most important, ANTLR provides predicates which let the programmer systematically direct the parse via arbitrary expressions using semantic and syntactic context; in practice, the use of predicates eliminates the need to hand‐tweak the ANTLR output, even for difficult parsing problems. ANTLR also integrates the description of lexical and syntactic analysis, accepts LL(k) grammars for k > 1 with extended BNF notation, and can automatically generate abstract syntax trees.

International World Wide Web Conferences | 2004

Enforcing strict model-view separation in template engines

Terence John Parr

The mantra of every experienced web application developer is the same: thou shalt separate business logic from display. Ironically, almost all template engines allow violation of this separation principle, which is the very impetus for HTML template engine development. This situation is due mostly to a lack of formal definition of separation and fear that enforcing separation emasculates a template's power. I show that not only is strict separation a worthy design principle, but that we can enforce separation while providing a potent template engine. I demonstrate my StringTemplate engine, used to build jGuru.com and other commercial sites, at work solving some nontrivial generational tasks. My goal is to formalize the study of template engines, thus providing a common nomenclature, a means of classifying template generational power, and a way to leverage interesting results from formal language theory. I classify three types of restricted templates analogous to Chomsky's type 1..3 grammar classes and formally define separation including the rules that embody separation. Because this paper provides a clear definition of model-view separation, template engine designers may no longer blindly claim enforcement of separation. Moreover, given theoretical arguments and empirical evidence, programmers no longer have an excuse to entangle model and view.

Programming Language Design and Implementation | 2011

LL(*): the foundation of the ANTLR parser generator

Terence John Parr; Kathleen Fisher

Despite the power of Parser Expression Grammars (PEGs) and GLR, parsing is not a solved problem. Adding nondeterminism (parser speculation) to traditional LL and LR parsers can lead to unexpected parse-time behavior and introduces practical issues with error handling, single-step debugging, and side-effecting embedded grammar actions. This paper introduces the LL(*) parsing strategy and an associated grammar analysis algorithm that constructs LL(*) parsing decisions from ANTLR grammars. At parse-time, decisions gracefully throttle up from conventional fixed k>=1 lookahead to arbitrary lookahead and, finally, fail over to backtracking depending on the complexity of the parsing decision and the input symbols. LL(*) parsing strength reaches into the context-sensitive languages, in some cases beyond what GLR and PEGs can express. By statically removing as much speculation as possible, LL(*) provides the expressivity of PEGs while retaining LL's good error handling and unrestricted grammar actions. Widespread use of ANTLR (over 70,000 downloads/year) shows that it is effective for a wide variety of applications.

Compiler Construction | 1994

Adding Semantic and Syntactic Predicates To LL(k): pred-LL(k)

Terence John Parr; Russell W. Quong

Most language translation problems can be solved with existing LALR(1) or LL(k) language tools; e.g., YACC [Joh78] or ANTLR [PDC92]. However, there are language constructs that defy almost all parsing strategies commonly in use. Some of these constructs cannot be parsed without semantics, such as symbol table information, and some cannot be properly recognized without first examining the entire construct; that is, we need “infinite lookahead.”
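
The two kinds of predicate named in this abstract can be illustrated with a small hand-written recursive-descent parser. The sketch below is not ANTLR output and does not use ANTLR's API; the `Parser` class, its toy grammar, and all method names are hypothetical and chosen only for illustration. A semantic predicate consults a symbol table to decide whether `T ( x ) ;` is a declaration or a call, and a syntactic predicate speculatively scans an arbitrarily long parenthesized list to see whether `=` follows it.

```python
class Parser:
    """Toy recursive-descent parser (hypothetical grammar, for illustration only)."""

    def __init__(self, tokens, type_names=()):
        self.tokens = list(tokens)
        self.pos = 0
        self.type_names = set(type_names)  # stands in for a symbol table

    def la(self):
        """Return the next token without consuming it."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else "<EOF>"

    def match(self, expected):
        if self.la() != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.la()!r}")
        self.pos += 1

    def match_id(self):
        tok = self.la()
        if not tok.isidentifier():
            raise SyntaxError(f"expected identifier, found {tok!r}")
        self.pos += 1
        return tok

    def is_type_name(self):
        """Semantic predicate: ask the symbol table about the next token."""
        return self.la() in self.type_names

    def speculate(self, rule):
        """Syntactic predicate: try a rule, then rewind regardless of outcome."""
        saved = self.pos
        try:
            rule()
            return True
        except SyntaxError:
            return False
        finally:
            self.pos = saved

    def id_list(self):
        """'(' ID (',' ID)* ')'"""
        self.match("(")
        names = [self.match_id()]
        while self.la() == ",":
            self.match(",")
            names.append(self.match_id())
        self.match(")")
        return names

    def list_then_assign(self):
        """Used only for speculation: an id list followed by '='."""
        self.id_list()
        self.match("=")

    def statement(self):
        if self.la() == "(":
            # The list can be arbitrarily long, so no fixed k suffices here.
            if self.speculate(self.list_then_assign):
                targets = self.id_list()
                self.match("=")
                value = self.match_id()
                self.match(";")
                return ("assign", targets, value)
            names = self.id_list()
            self.match(";")
            return ("tuple", names)
        if self.is_type_name():
            # Same token sequence as a call; only semantics tells them apart.
            type_name = self.match_id()
            names = self.id_list()
            self.match(";")
            return ("decl", type_name, names)
        func = self.match_id()
        args = self.id_list()
        self.match(";")
        return ("call", func, args)
```

With `T` registered as a type name, `Parser("T ( x ) ;".split(), {"T"}).statement()` takes the declaration branch, while the same parser given `f ( x ) ;` takes the call branch.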

SIGPLAN Notices | 1996

LL and LR translators need k > 1 lookahead

Terence John Parr; Russell W. Quong

Language translation is a harder and more important problem than language recognition. In particular, programmers implement translators, not recognizers. Yet too often, translation is equated with the simpler task of syntactic parsing. This misconception, coupled with computing limitations of past computers, has led to the almost exclusive use of LR(1) and LL(1) in parser generators. We claim that use of k > 1 lookahead can and should be available in practice, as it simplifies the translator development. We give several examples justifying our arguments.
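
As a rough illustration of why k > 1 lookahead helps a translator, consider two statement forms that both begin with an identifier: `x = y` and a bare `x`. The sketch below is an assumption made for illustration, not an example taken from the paper; the function names and the `load`/`store` output are hypothetical. With two tokens of lookahead the translator can pick the alternative up front and emit output for each alternative in one place, without first left-factoring the grammar.

```python
def lookahead(tokens, k):
    """Return the k-th upcoming token (1-based), or '<EOF>' past the end."""
    return tokens[k - 1] if k <= len(tokens) else "<EOF>"


def translate_statement(tokens):
    """Translate 'ID = ID' or 'ID' into a list of hypothetical instructions."""
    first, second = lookahead(tokens, 1), lookahead(tokens, 2)
    if not first.isidentifier():
        raise SyntaxError(f"expected identifier, found {first!r}")
    if second == "=":
        # Alternative 1: ID '=' ID, chosen by inspecting the second token.
        source = lookahead(tokens, 3)
        if not source.isidentifier() or len(tokens) != 3:
            raise SyntaxError("malformed assignment")
        return [f"load {source}", f"store {first}"]
    # Alternative 2: a bare ID used as an expression statement.
    if len(tokens) != 1:
        raise SyntaxError(f"unexpected token {second!r}")
    return [f"load {first}"]
```

With a single token of lookahead, both alternatives look identical at the decision point, so the grammar would have to be rewritten and the translation actions moved accordingly.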

International Conference on Web Engineering | 2006

Web application internationalization and localization in action

Terence John Parr

A template engine that strictly enforces model-view separation has been shown to be at least as expressive as a context free grammar allowing the engine to, for example, easily generate any file describable by an XML DTD. When faced with supporting internationalized web applications, however, template engine designers have backed off from enforcing strict separation, allowing unrestricted embedded code segments because it was unclear how localization could otherwise occur. The consequence, unfortunately, is that each reference to a localized data value, such as a date or monetary value, replicates essentially the same snippet of code thousands of times across hundreds of templates for a large site. The potential for cut-and-paste induced bugs and the duplication of code proves a maintenance nightmare. Moreover, page designers are ill-equipped to deal with code fragments. But the difficult question remains: How can localization be done without allowing unrestricted embedded code segments that open the door to model-view entanglement? The answer is simply to automate the localization of data values, thus avoiding code duplication, making it easier on the developer and designer, and reducing opportunities for the introduction of bugs, all the while maintaining the sanctity of strict model-view separation. This paper describes how the StringTemplate template engine strictly enforces model-view separation while handily supporting internationalized web application architectures. Demonstrations of page text localization, locale-specific site designs, and automatic data localization are provided.
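
The idea of automating the localization of data values can be sketched as a table of per-locale renderers keyed by value type, so that templates contain only attribute references and no formatting code. This is a minimal sketch under stated assumptions and is not StringTemplate's API: the `RENDERERS` table, the `render` function, the `<name>` placeholder syntax, and the specific date and currency formats are all hypothetical.

```python
import datetime
import re


def _german_money(amount):
    """Format with '.' for thousands and ',' for decimals (illustrative only)."""
    text = f"{amount:,.2f}"  # e.g. '1,234.50'
    return text.replace(",", "X").replace(".", ",").replace("X", ".") + " EUR"


# Hypothetical renderer tables: one formatting function per value type, per locale.
RENDERERS = {
    "en_US": {
        datetime.date: lambda d: d.strftime("%m/%d/%Y"),
        float: lambda x: f"${x:,.2f}",
    },
    "de_DE": {
        datetime.date: lambda d: d.strftime("%d.%m.%Y"),
        float: _german_money,
    },
}


def render(template, attributes, locale):
    """Replace each <name> with its attribute, formatted by the locale's renderer."""
    renderers = RENDERERS[locale]

    def substitute(match):
        value = attributes[match.group(1)]
        # Fall back to str() for types that have no registered renderer.
        return renderers.get(type(value), str)(value)

    return re.sub(r"<(\w+)>", substitute, template)
```

The same template, `"Due <due>: <total>"`, then yields locale-specific output depending only on the `locale` argument, and the formatting logic lives in one place rather than being repeated in every template.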

International Conference on Program Comprehension | 2008

The Reuse of Grammars with Embedded Semantic Actions

Terence John Parr

Reusing syntax specifications without embedded arbitrary semantic actions is straightforward because the semantic analysis phases of new applications can feed off trees or other intermediate structures constructed by the pre-existing parser. The presence of arbitrary embedded semantic actions, however, makes reuse difficult with existing mechanisms such as grammar inheritance and modules. This short paper proposes a mechanism based upon prototype grammars that automatically pushes changes from prototypes to derived grammars even in the presence of semantic actions. The prototype mechanism alone would be unsuitable for creating a new grammar from multiple preexisting grammars. When combined with grammar composition, however, the prototype mechanism would improve grammar reuse because imported pre-existing grammars could be altered to suit each new application.

Scientific Programming | 1995

The Fortran-P Translator: Towards Automatic Translation of Fortran 77 Programs for Massively Parallel Processors

Matthew T. O'Keefe; Terence John Parr; Kevin Edgar; Steve Anderson; Paul R. Woodward; Hank Dietz

Massively parallel processors (MPPs) hold the promise of extremely high performance that, if realized, could be used to study problems of unprecedented size and complexity. One of the primary stumbling blocks to this promise has been the lack of tools to translate application codes to MPP form. In this article we show how applications codes written in a subset of Fortran 77, called Fortran-P, can be translated to achieve good performance on several massively parallel machines. This subset can express codes that are self-similar, where the algorithm applied to the global data domain is also applied to each subdomain. We have found many codes that match the Fortran-P programming style and have converted them using our tools. We believe a self-similar coding style will accomplish what a vectorizable style has accomplished for vector machines by allowing the construction of robust, user-friendly, automatic translation systems that increase programmer productivity and generate fast, efficient code for MPPs.

Software Language Engineering | 2016

Towards a universal code formatter through machine learning

Terence John Parr; Jurgen J. Vinju

There are many declarative frameworks that allow us to implement code formatters relatively easily for any specific language, but constructing them is cumbersome. The first problem is that “everybody” wants to format their code differently, leading to either many formatter variants or a ridiculous number of configuration options. Second, the size of each implementation scales with a language’s grammar size, leading to hundreds of rules. In this paper, we solve the formatter construction problem using a novel approach, one that automatically derives formatters for any given language without intervention from a language expert. We introduce a code formatter called CodeBuff that uses machine learning to abstract formatting rules from a representative corpus, using a carefully designed feature set. Our experiments on Java, SQL, and ANTLR grammars show that CodeBuff is efficient, has excellent accuracy, and is grammar invariant for a given language. It also generalizes to a 4th language tested during manuscript preparation.

Archive | 2013