Vadim Zaytsev | Researchain

Publication

Featured research published by Vadim Zaytsev.

integrated formal methods | 2009

An Introduction to Grammar Convergence

Ralf Lämmel; Vadim Zaytsev

Grammar convergence is a lightweight verification method for establishing and maintaining the correspondence between grammar knowledge ingrained in all kinds of software artifacts, e.g., object models, XML schemas, parser descriptions, or language documents. The central idea is to extract grammars from diverse software artifacts, and to transform the grammars until they become syntactically identical. The present paper introduces and illustrates the basics of grammar convergence.
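As an illustration of the central idea in the abstract above, here is a minimal sketch in Python. It is not the operator suite or tooling from the paper: the dictionary encoding of grammars, the two operators `rename` and `inline`, and both toy grammars are assumptions made only for this example.

```python
# Assumed encoding (illustration only): a grammar maps each nonterminal
# to a frozenset of alternatives; each alternative is a tuple of symbols.

def rename(g, old, new):
    """Rename nonterminal `old` to `new` on both sides of every rule.
    Assumes `new` is not already defined in `g`."""
    def swap(s):
        return new if s == old else s
    return {
        swap(lhs): frozenset(tuple(swap(s) for s in alt) for alt in alts)
        for lhs, alts in g.items()
    }

def inline(g, name):
    """Replace every use of `name` by its single alternative, then drop its rule."""
    (body,) = g[name]  # raises ValueError unless `name` has exactly one alternative
    def expand(alt):
        out = []
        for s in alt:
            out.extend(body if s == name else (s,))
        return tuple(out)
    return {
        lhs: frozenset(expand(alt) for alt in alts)
        for lhs, alts in g.items()
        if lhs != name
    }

# Two hypothetical grammars of the same toy language, as they might be
# extracted from a parser description and from a schema.
from_parser = {
    "expr": frozenset({("term", "+", "expr"), ("term",)}),
    "term": frozenset({("NUM",)}),
}
from_schema = {
    "Expression": frozenset({("Literal", "+", "Expression"), ("Literal",)}),
    "Literal": frozenset({("Value",)}),
    "Value": frozenset({("NUM",)}),
}

# A transformation chain that makes the second grammar identical to the first.
step1 = inline(from_schema, "Value")
step2 = rename(step1, "Expression", "expr")
converged = rename(step2, "Literal", "term")
assert converged == from_parser
```

The chain itself (one `inline`, two `rename` steps) is the record of how the two grammars relate, which is the kind of information the method is meant to surface.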

Software Quality Journal | 2011

Recovering grammar relationships for the Java Language Specification

Ralf Lämmel; Vadim Zaytsev

Grammar convergence is a method that helps in discovering relationships between different grammars of the same language or different language versions. The key element of the method is the operational, transformation-based representation of those relationships. Given input grammars for convergence, they are transformed until they are structurally equal. The transformations are composed from primitive operators; properties of these operators and the composed chains provide quantitative and qualitative insight into the relationships between the grammars at hand. We describe a refined method for grammar convergence, and we use it in a major study, where we recover the relationships between all the grammars that occur in the different versions of the Java Language Specification (JLS). The relationships are represented as grammar transformation chains that capture all accidental or intended differences between the JLS grammars. This method is mechanized and driven by nominal and structural differences between pairs of grammars that are subject to asymmetric, binary convergence steps. We present the underlying operator suite for grammar transformation in detail, and we illustrate the suite with many examples of transformations on the JLS grammars. We also describe the extraction effort, which was needed to make the JLS grammars amenable to automated processing. We include substantial metadata about the convergence process for the JLS so that the effort becomes reproducible and transparent.

model driven engineering languages and systems | 2014

Parsing in a Broad Sense

Vadim Zaytsev; Anya Helene Bagge

Having multiple representations of the same instance is common in software language engineering: models can be visualised as graphs, edited as text, serialised as XML. When mappings between such representations are considered, the terms “parsing” and “unparsing” are often used with incompatible meanings and varying sets of underlying assumptions. We investigate 12 classes of artefacts found in software language processing, present a case study demonstrating their implementations and state-of-the-art mappings among them, and systematically explore the technical research space of bidirectional mappings to build on top of the existing body of work and discover as of yet unused relationships.

Electronic Communication of The European Association of Software Science and Technology | 2012

Language Evolution, Metasyntactically

Vadim Zaytsev

Currently existing syntactic definitions employ many different notations (usually dialects of EBNF) with slight deviations among them, which prevent efficient automated processing. When changes in such notation are required either due to maintenance activities such as correction or evolution, or because a grammar collection is written in a different notation than the one required by the grammarware toolkit, we speak of metalanguage evolution: i.e., a special language evolution scenario when the language itself does not necessarily evolve, but the notation in which it is written does. Notational changes need to be propagated to different levels, such as to parsers that used to work with the old notation, to grammars of those notations that served as explanation material, and to the existing grammarbase. The solution proposed in this paper relies on composing a notation specification and expressing notation changes as transformations of that specification. These transformation steps are coupled to changes in the notation grammar (i.e., grammar for grammars) and to changes in other grammars written in the original notation. This paper explains the general setup of such an infrastructure, with links to the prototypical implementation of the solution.

software language engineering | 2010

A unified format for language documents

Vadim Zaytsev; Ralf Lämmel

We have analyzed a substantial number of language documentation artifacts, including language standards, language specifications, language reference manuals, as well as internal documents of standardization bodies. We have reverse-engineered their intended internal structure, and compared the results. The Language Document Format (LDF) was developed to specifically support the documentation domain. We have also integrated LDF into an engineering discipline for language documents including tool support, for example, for rendering language documents, extracting grammars and samples, and migrating existing documents into LDF. The definition of LDF, tool support for LDF, and LDF applications are freely available through SourceForge.

conference on software maintenance and reengineering | 2014

Formal foundations for semi-parsing

Vadim Zaytsev

There exist many techniques for imprecise manipulation of source code (robust parsing, error repair, lexical analysis, etc.), mostly relying on heuristic-based tolerance. Such techniques are rarely fully formalised and quite often idiosyncratic, which makes them very hard to compare with respect to their applicability, tolerance level and general usefulness. With a combination of recently developed formal methods such as Boolean grammars and parsing schemata, we can model different tolerant methods of modelling software and formally argue about relationships between them.

software language engineering | 2011

Comparison of context-free grammars based on parsing generated test data

Bernd Fischer; Ralf Lämmel; Vadim Zaytsev

There exist a number of software engineering scenarios that essentially involve equivalence or correspondence assertions for some of the context-free grammars in the scenarios. For instance, when applying grammar transformations during parser development, be it for the sake of disambiguation or grammar-class compliance, one would like to preserve the generated language. Even though equivalence is generally undecidable for context-free grammars, we have developed an automated approach that is practically useful in revealing evidence of nonequivalence of grammars and discovering correspondence mappings for grammar nonterminals. Our approach is based on systematic test data generation and parsing. We discuss two studies that show how the approach is used in comparing grammars of open source Java parsers as well as grammars from the course work for a compiler construction class.
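The general idea of comparing grammars by generating test data from one and parsing it with the other can be sketched as follows. This is a simplified stand-in, not the approach from the paper: it uses bounded exhaustive generation and a naive backtracking recognizer, and the grammars `g1`, `g2`, `g3` as well as all function names are hypothetical. An empty result only means that no evidence of nonequivalence was found within the bound.

```python
from itertools import product

# Assumed encoding: nonterminal -> list of alternatives (tuples of symbols).
# Any symbol without a rule is treated as a terminal.

def generate(g, symbol, depth):
    """Yield every token tuple derivable from `symbol` within `depth` rule expansions."""
    if symbol not in g:              # terminal: yields itself
        yield (symbol,)
        return
    if depth == 0:                   # expansion budget exhausted
        return
    for alt in g[symbol]:
        parts = [list(generate(g, s, depth - 1)) for s in alt]
        for combo in product(*parts):
            yield tuple(tok for part in combo for tok in part)

def accepts(g, symbols, tokens):
    """Naive backtracking recognizer; assumes the grammar has no left recursion."""
    if not symbols:
        return not tokens
    head, rest = symbols[0], symbols[1:]
    if head not in g:                # terminal must match the next token
        return bool(tokens) and tokens[0] == head and accepts(g, rest, tokens[1:])
    return any(accepts(g, alt + rest, tokens) for alt in g[head])

def counterexamples(a, b, start, depth):
    """Sentences generated from grammar `a` that grammar `b` fails to parse."""
    sentences = set(generate(a, start, depth))
    return sorted(s for s in sentences if not accepts(b, (start,), s))

g1 = {"expr": [("term", "+", "expr"), ("term",)], "term": [("NUM",)]}
g2 = {"expr": [("NUM", "tail")], "tail": [("+", "NUM", "tail"), ()]}
g3 = {"expr": [("NUM",), ("NUM", "+", "NUM")]}  # allows at most one "+"

agree = counterexamples(g1, g2, "expr", 4)    # no disagreement within the bound
missing = counterexamples(g1, g3, "expr", 4)  # sentences g3 rejects
```

Running the check in both directions (`a` against `b`, then `b` against `a`) is needed, because a grammar that accepts a strict superset of the other language produces no counterexamples in one direction.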

language descriptions tools and applications | 2012

Notation-parametric grammar recovery

Vadim Zaytsev

Automation of grammar recovery is an important research area that has received attention over the last decade and a half. Given the abundance of available documentation for software languages that is only going to keep increasing in the future, there is a need for reliable extraction techniques that allow grammar engineers to derive useful information from it. This information can be further used to build grammarware, like parsers or test generators, or to perform grammar investigation. Grammars obtained systematically from existing sources always have preference over manually constructed ones due to traceability of their issues, including errors and design weaknesses. This paper focuses on automated grammar recovery from sources that utilise a family of metasyntaxes known as EBNF: many language specifications extend the well-studied Backus Naur Form in different directions, resulting in unnecessary diversity of syntactic notations. To enable manipulation of EBNF families, we use EDD, the EBNF Dialect Definition, a recently published DSL for notation specification, and base our approach on human-specified indications that guide the subsequent automated heuristic-based recovery process. Two separate scenarios are considered in the paper: a reliable syntactic notation and an unreliable one, with the latter being remarkably more difficult to handle, but also substantially more useful since it is so often encountered in practice. The proposed approach has been verified by two prototypes that were capable of extracting dozens of grammars written in 42 different syntactic notations.

Proceedings of the 6th International Workshop on Multi-Paradigm Modeling | 2012

Renarrating linguistic architecture: a case study

Vadim Zaytsev

We study the use of megamodels (models of linguistic architecture) for presenting software language engineering scenarios. Megamodels and techniques similar to them are frequently found in situations when a linguistic architecture needs to be understood without the implicit knowledge that was originally present, and in situations when such knowledge needs to be propagated. In this paper we specifically address the possibility of using one megamodel to tell several related stories, that is, to renarrate it. Various renarrations can address different aspects of the megamodel without cluttering the reader's view with irrelevant details. The renarration method is presented with the case study of a software language engineering technique of guided grammar convergence, and MegaL as a metamegamodel.

GTTSE'09 Proceedings of the 3rd international summer school conference on Generative and transformational techniques in software engineering III | 2009

Language convergence infrastructure

Vadim Zaytsev

The process of grammar convergence involves grammar extraction and transformation for structural equivalence and contains a range of technical challenges. These need to be addressed in order for the method to deliver useful results. The paper describes a DSL and the infrastructure behind it that automates the convergence process, hides negligible back-end details, aids development/debugging and enables application of grammar convergence technology to large scale projects. The necessity of having a strong framework is explained by listing case studies. Domain elements such as extractors and transformation operators are described to illustrate the issues that were successfully addressed.
