Jan Kurs
University of Bern
Publications
Featured research published by Jan Kurs.
Computer Languages, Systems and Structures archive | 2015
Jan Kurs; Mircea Lungu; Rathesan Iyadurai; Oscar Nierstrasz
Imprecise manipulation of source code (semi-parsing) is useful for tasks such as robust parsing, error recovery, lexical analysis, and rapid development of parsers for data extraction. An island grammar precisely defines only a subset of a language syntax (islands), while the rest of the syntax (water) is defined imprecisely. Usually water is defined as the negation of islands. Albeit simple, such a definition of water is naive and impedes composition of islands. When developing an island grammar, sooner or later a language engineer has to create water tailored to each individual island. Such an approach is fragile, because water can change with any change of a grammar. It is time-consuming, because water is defined manually by an engineer and not automatically. Finally, an island surrounded by water cannot be reused because water has to be defined for every grammar individually. In this paper we propose a new technique of island parsing - bounded seas. Bounded seas are composable, robust, reusable and easy to use because island-specific water is created automatically. Our work focuses on applications of island parsing to data extraction from source code. We have integrated bounded seas into a parser combinator framework as a demonstration of their composability and reusability.
Highlights: Traditional island grammars are difficult to define and are not flexible enough. Bounded seas - a new technique of island parsing - are composable, robust, reusable and easy to define. Bounded seas are specified using our extension of parsing expression grammars. Parsers utilizing bounded seas require less effort to define and provide both good precision and performance in the two performed case studies.
TOOLS'12 Proceedings of the 50th international conference on Objects, Models, Components, Patterns | 2012
Jan Vraný; Jan Kurs; Claus Gittinger
Programming languages are still evolving, and programming languages and language features are being designed and implemented every year. Since it is not a trivial task to provide a runtime system for a new language, existing runtime systems such as the Java Virtual Machine or the Common Language Runtime are used to host the new language. However, most of the high-performance runtime systems were designed for a specific language with a specific semantics. Therefore, if the new language semantics differs from the semantics hard-coded in a runtime system, it has to be emulated on top of features supported by the runtime. The emulation causes performance overhead. To overcome the limitations of an emulation, a runtime system may provide a meta-object protocol to alter the runtime semantics. The protocol should fulfill opposing goals: it should be flexible, easy to use, fast and easy to implement at the same time. We propose a simple meta-object protocol for customization of a method lookup in Smalltalk. A programmer may define a custom method lookup routine in Smalltalk and let the runtime system call it when needed. Therefore there is no need to modify the runtime system itself. Our solution provides reasonable performance thanks to low-level support in a runtime system; nevertheless, the changes to the runtime system are small and local. At the same time, it provides the flexibility to implement a wide range of features present in modern programming languages. The presented approach has been implemented and validated on a Smalltalk virtual machine.
Proceedings of the International Workshop on Smalltalk Technologies | 2012
Marcel Hlopko; Jan Kurs; Jan Vraný; Claus Gittinger
After decades of development in programming languages and programming environments, Smalltalk is still one of few environments that provide advanced features and is still widely used in the industry. However, as Java became prevalent, the ability to call Java code from Smalltalk and vice versa becomes important. Traditional approaches to integrate the Java and Smalltalk languages are through low-level communication between separate Java and Smalltalk virtual machines. We are not aware of any attempt to execute and integrate the Java language directly in the Smalltalk environment. A direct integration allows for very tight and almost seamless integration of the languages and their objects within a single environment. Yet integration and language interoperability impose challenging issues related to method naming conventions, method overloading, exception handling and thread-locking mechanisms. In this paper we describe ways to overcome these challenges and to integrate Java into the Smalltalk environment. Using techniques described in this paper, the programmer can call Java code from Smalltalk using standard Smalltalk idioms while the semantics of each language remains preserved. We present STX:LIBJAVA --- an implementation of Java virtual machine within Smalltalk/X --- as a validation of our approach.
7th International Conference, SLE 2014 | 2014
Jan Kurs; Mircea Lungu; Oscar Nierstrasz
Imprecise manipulation of source code (semi-parsing) is useful for tasks such as robust parsing, error recovery, lexical analysis, and rapid development of parsers for data extraction. An island grammar precisely defines only a subset of a language syntax (islands), while the rest of the syntax (water) is defined imprecisely. Usually, water is defined as the negation of islands. Albeit simple, such a definition of water is naive and impedes composition of islands. When developing an island grammar, sooner or later a programmer has to create water tailored to each individual island. Such an approach is fragile, however, because water can change with any change of a grammar. It is time-consuming, because water is defined manually by a programmer and not automatically. Finally, an island surrounded by water cannot be reused because water has to be defined for every grammar individually. In this paper we propose a new technique of island parsing - bounded seas. Bounded seas are composable, robust, reusable and easy to use because island-specific water is created automatically. We integrated bounded seas into a parser combinator framework as a demonstration of their composability and reusability.
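The naive definition of water as the negation of an island can be sketched in a few lines of Python. This is a minimal illustration, not the bounded seas technique itself: the island pattern (a rough `class Name` declaration) and the function name `find_islands` are assumptions made for the example.

```python
import re

# Hypothetical island: a very rough "class Name" declaration.
ISLAND = re.compile(r"class\s+([A-Za-z_]\w*)")


def find_islands(source):
    """Return the names of all islands; everything else is water."""
    names = []
    pos = 0
    while pos < len(source):
        match = ISLAND.match(source, pos)
        if match:
            names.append(match.group(1))
            pos = match.end()
        else:
            # Water: consume one character that does not start an island.
            pos += 1
    return names
```

Because the water here is hard-wired to one island pattern, adding a second kind of island means revisiting the skipping logic, which is the fragility the abstract describes.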
Proceedings of the 11th edition of the International Workshop on Smalltalk Technologies | 2016
Jan Kurs; Jan Vraný; Mohammad Ghafari; Mircea Lungu; Oscar Nierstrasz
Parser combinators are a popular approach to parsing. Parser combinators follow the structure of an underlying grammar, are modular, well-structured, easy to maintain, and can recognize a large variety of languages including context-sensitive ones. However, their universality and flexibility introduce a noticeable performance overhead. Time-wise, parser combinators cannot compete with parsers generated by well-performing parser generators or optimized handwritten code. Techniques exist to achieve a linear asymptotic performance of parser combinators, yet there is still a significant constant multiplier. This can be further lowered using meta-programming techniques. In this work we present a more traditional approach to optimization --- a compiler --- applied to the domain of parser combinators. A parser combinator compiler (pc-compiler) analyzes a parser combinator, applies parser combinator-specific optimizations, and generates an equivalent high-performance top-down parser. Such a compiler preserves the advantages of parser combinators while complementing them with better performance.
Science of Computer Programming | 2014
Marcel Hlopko; Jan Kurs; Jan Vraný; Claus Gittinger
After decades of development in programming languages and programming environments, Smalltalk is still one of few environments that provide advanced features and is used in the industry. However, as Java became prevalent, the ability to call Java code from Smalltalk became important. A traditional approach to integrate the Java and Smalltalk languages is through low-level communication between separate Java and Smalltalk virtual machines. To our best knowledge there is no other project attempting to execute and integrate the Java language directly in the Smalltalk environment. A direct integration allows for a very tight integration of the languages and their objects within a single environment. Yet integration and language interoperability impose challenging issues related to method naming conventions, method overloading, exception handling and thread-locking mechanisms. In this paper we describe ways to overcome these challenges and to integrate Java into the Smalltalk environment. We focus on the possibility of calling Java code from Smalltalk using standard Smalltalk idioms while the semantics of both languages remains preserved. We present stx:libjava - an implementation of a Java virtual machine within Smalltalk/X - as a validation of our approach.
We identify and solve problems imposed by Java and Smalltalk integration. STX:LIBJAVA, a JVM implementation built into Smalltalk/X, is presented. Interoperability features of STX:LIBJAVA (e.g. dynamic proxy methods) are shown. We validated our implementation on various Java projects, e.g. Groovy or Tomcat. A performance comparison of Oracle JVM and STX:LIBJAVA is provided.
2013 2nd International Workshop on User Evaluations for Software Engineering Researchers (USER) | 2013
Mircea Lungu; Jan Kurs
One of the long-running debates between programmers is whether camelCaseIdentifiers are better than underscore_identifiers. This is ultimately a matter of programming language culture and personal taste, and to our best knowledge none of the camps has won the argument yet. It is our intuition that a solution exists which is superior to both the previous ones from the point of view of usability: the solution we name sentence case identifiers allows phrases as names for program entities such as classes or methods. In this paper we propose a study in which to evaluate the impact of sentence case identifiers in practice.
Science of Computer Programming | 2017
Jan Kurs; Jan Vraný; Mohammad Ghafari; Mircea Lungu; Oscar Nierstrasz
Parser combinators offer a universal and flexible approach to parsing. They follow the structure of an underlying grammar, are modular, well-structured, easy to maintain, and can recognize a large variety of languages including context-sensitive ones. However, these advantages introduce a noticeable performance overhead mainly because the same powerful parsing algorithm is used to recognize even simple languages. Time-wise, parser combinators cannot compete with parsers generated by well-performing parser generators or optimized hand-written code. Techniques exist to achieve a linear asymptotic performance of parser combinators, yet there is a significant constant multiplier. The multiplier can be lowered to some degree, but this requires advanced meta-programming techniques, such as staging or macros, that depend heavily on the underlying language technology. In this work we present a language-agnostic solution. We optimize the performance of parser combinators with specializations of parsing strategies. For each combinator, we analyze the language parsed by the combinator and choose the most efficient parsing strategy. By adapting a parsing strategy for different parser combinators we achieve performance comparable to that of hand-written or optimized parsers while preserving the advantages of parser combinators.
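Choosing a cheaper parsing strategy for a combinator that recognizes a simple language can be sketched as follows. This is a toy example under stated assumptions, not the optimizer described in the paper: the combinators `char` and `choice`, the `chars` and `alternatives` metadata attributes, and the `specialize` function are all hypothetical names introduced for illustration. Each parser takes a text and a position and returns the new position, or `None` on failure.

```python
def char(c):
    """Parser for a single literal character."""
    def parse(text, pos):
        if pos < len(text) and text[pos] == c:
            return pos + 1
        return None
    parse.chars = frozenset(c)  # metadata: the language is one character
    return parse


def choice(*parsers):
    """Ordered choice: try each alternative in turn."""
    def parse(text, pos):
        for parser in parsers:
            end = parser(text, pos)
            if end is not None:
                return end
        return None
    parse.alternatives = parsers  # metadata inspected by specialize()
    return parse


def specialize(parser):
    """Replace a choice of single characters with one set-membership test."""
    alternatives = getattr(parser, "alternatives", None)
    if alternatives and all(hasattr(p, "chars") for p in alternatives):
        allowed = frozenset().union(*(p.chars for p in alternatives))

        def fast(text, pos):
            if pos < len(text) and text[pos] in allowed:
                return pos + 1
            return None

        fast.chars = allowed
        return fast
    return parser  # no cheaper strategy known: keep the general parser


digit = choice(*[char(d) for d in "0123456789"])
fast_digit = specialize(digit)
```

Here `digit` tries up to ten alternatives per character, while `fast_digit` accepts the same inputs with a single lookup; combinators whose language is not this simple are returned unchanged.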
Archive | 2016
Jan Kurs; Mircea Lungu; Oscar Nierstrasz; Thomas Steinmann
Polite Smalltalk is a programming language that allows programmers to use sentence case identifiers — a notation for embedding spaces in identifiers. We hope that a syntax like that of Polite will encourage developers to write more readable code. Even the smallest increase in code readability is to be desired since software developers spend the largest part of their time reading code rather than writing it.
Science of Computer Programming | 2015
Oscar Nierstrasz; Jan Kurs