Alexander Serebrenik
Eindhoven University of Technology
Publications
Featured research published by Alexander Serebrenik.
Symposium on Principles of Database Systems | 1999
Sara Cohen; Alexander Serebrenik
We investigate the problem of rewriting queries with aggregate operators using views that may or may not contain aggregate operators. A rewriting of a query is a second query that uses view predicates such that evaluating first the views and then the rewriting yields the same result as evaluating the original query. In this sense, the original query and the rewriting are equivalent modulo the view definitions. The queries and views we consider correspond to unnested SQL queries, possibly with union, that employ the operators min, max, count, and sum. Our approach is based on syntactic characterizations of the equivalence of aggregate queries. One contribution of this paper is a set of characterizations of the equivalence of disjunctive aggregate queries, which generalize our previous results for the conjunctive case. For each operator a, we introduce several types of queries using views as candidates for rewritings. We unfold such a candidate by replacing each occurrence of a view predicate with its definition, thus obtaining a regular aggregate query. The candidates have a different, usually more complex operator than a. We prove, however, that unfolding the candidate results in a regular aggregate query that is equivalent to the candidate modulo the view definitions. This property justifies considering these types of queries as natural candidates for rewritings. In this way, we reduce the problem of whether there exist rewritings of a particular type to a problem involving equivalence. We distinguish between partial rewritings, which contain at least one view predicate, and complete rewritings, which contain only view predicates. In contrast to previous work on this topic, we give not only sufficient but also necessary conditions for a rewriting to exist. More precisely, we show for each type of candidate that the existence of both partial and complete rewritings is decidable, and we provide upper and lower complexity bounds.
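As a small illustration of what "rewriting an aggregate query using a view" means, the sketch below compares a sum query over a base table with a candidate complete rewriting that reads only from an aggregate view. The table `sales`, the view `sales_by_product`, and the data are hypothetical and not taken from the paper. Agreement on one database instance only illustrates the idea; it does not establish equivalence modulo the view definitions, which is what the paper characterizes.

```python
import sqlite3

# Hypothetical schema and data, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales(store TEXT, product TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('north', 'pen', 3), ('north', 'pen', 4), ('north', 'ink', 5),
        ('south', 'pen', 2), ('south', 'ink', 7), ('south', 'ink', 1);
    -- A view that itself contains an aggregate operator (sum).
    CREATE VIEW sales_by_product AS
        SELECT store, product, SUM(amount) AS total
        FROM sales
        GROUP BY store, product;
""")

# Original query: total amount per store, evaluated over the base table.
original = """
    SELECT store, SUM(amount) FROM sales
    GROUP BY store ORDER BY store
"""

# Candidate complete rewriting: it mentions only the view predicate.
rewriting = """
    SELECT store, SUM(total) FROM sales_by_product
    GROUP BY store ORDER BY store
"""

original_rows = conn.execute(original).fetchall()
rewriting_rows = conn.execute(rewriting).fetchall()

# On this instance both queries return the same per-store totals.
print(original_rows)
print(rewriting_rows)
```

Had the view stored a count per group instead, the analogous rewriting of a count query would sum those counts, which is one way to see why a candidate may use a different operator than the original query.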
Applications and Theory of Petri Nets | 2008
Jan Martijn E. M. van der Werf; Boudewijn F. van Dongen; Cor A. J. Hurkens; Alexander Serebrenik
The research domain of process discovery aims at constructing a process model (e.g., a Petri net) that is an abstract representation of an execution log. Such a model should (1) be able to reproduce the log under consideration and (2) be independent of the number of cases in the log. In this paper, we present a process discovery algorithm that uses concepts taken from the language-based theory of regions, a well-known Petri net research area. We identify a number of shortcomings of this theory from the process discovery perspective, and we provide solutions based on integer linear programming.
Conference on Computer Supported Cooperative Work | 2014
Bogdan Vasilescu; Alexander Serebrenik; Premkumar T. Devanbu; Vladimir Filkov
Historically, mailing lists have been the preferred means for coordinating development and user support activities. With the emergence and popularity growth of social Q&A sites such as the StackExchange network (e.g., StackOverflow), this is beginning to change. Such sites offer different socio-technical incentives to their participants than mailing lists do, e.g., rich web environments to store and manage content collaboratively, or a place to showcase their knowledge and expertise more vividly to peers or potential recruiters. A key difference between StackExchange and mailing lists is gamification, i.e., StackExchange participants compete to obtain reputation points and badges. In this paper, we use a case study of R (a widely-used tool for data analysis) to investigate how mailing list participation has evolved since the launch of StackExchange. Our main contribution is the assembly of a joint data set from the two sources, in which participants in both the r-help mailing list and StackExchange are identifiable. This permits their activities to be linked across the two resources and also over time. With this data set we found that user support activities show a strong shift away from r-help. In particular, mailing list experts are migrating to StackExchange, where their behaviour is different. First, participants active both on r-help and on StackExchange are more active than those who focus exclusively on only one of the two. Second, they provide faster answers on StackExchange than on r-help, suggesting they are motivated by the gamified environment. To our knowledge, our study is the first to directly chart the changes in behaviour of specific contributors as they migrate into gamified environments, and has important implications for knowledge management in software engineering.
Applicable Algebra in Engineering, Communication and Computing | 2001
Nachum Dershowitz; Naomi Lindenstrauss; Yehoshua Sagiv; Alexander Serebrenik
This paper describes a general framework for automatic termination analysis of logic programs, where we understand by “termination” the finiteness of the LD-tree constructed for the program and a given query. A general property of mappings from a certain subset of the branches of an infinite LD-tree into a finite set is proved. From this result several termination theorems are derived, by using different finite sets. The first two are formulated for the predicate dependency and atom dependency graphs. Then a general result for the case of the query-mapping pairs relevant to a program is proved (cf. [29, 21]). The correctness of the TermiLog system described in [22] follows from it. In this system it is not possible to prove termination for programs involving arithmetic predicates, since the usual order for the integers is not well-founded. A new method, which can be easily incorporated in TermiLog or similar systems, is presented, which makes it possible to prove termination for programs involving arithmetic predicates. It is based on combining a finite abstraction of the integers with the technique of the query-mapping pairs, and is essentially capable of dividing a termination proof into several cases, such that a simple termination function suffices for each case. Finally several possible extensions are outlined.
Conference on Software Maintenance and Reengineering | 2011
Wouter Poncin; Alexander Serebrenik; Mark van den Brand
Software developers’ activities are in general recorded in software repositories such as version control systems, bug trackers and mail archives. While abundant information is usually present in such repositories, successful information extraction is often challenged by the necessity to simultaneously analyze different repositories and to combine the information obtained. We propose to apply process mining techniques, originally developed for business process analysis, to address this challenge. However, in order for process mining to become applicable, different software repositories should be combined, and “related” software development events should be matched: e.g., mails sent about a file, modifications of the file and bug reports that can be traced back to it. The combination and matching of events has been implemented in FRASR (Framework for Analyzing Software Repositories), augmenting the process mining framework ProM. FRASR has been successfully applied in a series of case studies addressing such aspects of the development process as roles of different developers and the way bug reports are handled.
Human Factors in Computing Systems | 2015
Bogdan Vasilescu; Daryl Posnett; Baishakhi Ray; Mark van den Brand; Alexander Serebrenik; Premkumar T. Devanbu; Vladimir Filkov
Software development is usually a collaborative venture. Open Source Software (OSS) projects are no exception; indeed, by design, the OSS approach can accommodate teams that are more open, geographically distributed, and dynamic than commercial teams. This, we find, leads to OSS teams that are quite diverse. Team diversity, predominantly in offline groups, is known to correlate with team output, mostly with positive effects. How about in OSS? Using GitHub, the largest publicly available collection of OSS projects, we studied how gender and tenure diversity relate to team productivity and turnover. Using regression modeling of GitHub data and the results of a survey, we show that both gender and tenure diversity are positive and significant predictors of productivity, together explaining a sizable fraction of the data variability. These results can inform decision making on all levels, leading to better outcomes in recruiting and performance.
Empirical Software Engineering | 2014
Bogdan Vasilescu; Alexander Serebrenik; Mathieu Goeminne; Tom Mens
Most empirical studies of open source software repositories focus on the analysis of isolated projects, or restrict themselves to the study of the relationships between technical artifacts. In contrast, we have carried out a case study that focuses on the actual contributors to software ecosystems, that is, collections of software projects that are maintained by the same community. To this aim, we defined a new series of workload and involvement metrics, as well as a novel approach ($\widetilde{\mathbf{T}}$-graphs) for reporting the results of comparing multiple distributions. We used these techniques to statistically study how workload and involvement of ecosystem contributors vary across projects and across activity types, and we explored to what extent projects and contributors specialise in particular activity types. Using Gnome as a case study we observed that, next to coding, the activities of localization, development documentation and building are prevalent throughout the ecosystem. We also observed notable differences between frequent and occasional contributors in terms of the activity types they are involved in and the number of projects they contribute to. Occasional contributors and contributors who are involved in many different projects tend to be more involved in the localization activity, while frequent contributors tend to be more involved in the coding activity in a limited number of projects.
Applications and Theory of Petri Nets | 2006
Kees M. van Hee; Irina A. Lomazova; Olivia Oanea; Alexander Serebrenik; Natalia Sidorova; Marc Voorhoeve
Mining Software Repositories | 2014
Daniel Pletea; Bogdan Vasilescu; Alexander Serebrenik
Journal of Software: Evolution and Process | 2013
Karine Mordal; Nicolas Anquetil; Jannik Laval; Alexander Serebrenik; Bogdan Vasilescu; Stéphane Ducasse
With the growing need for quality assessment of entire software systems in the industry, new issues are emerging. First, because most software quality metrics are defined at the level of individual software components, there is a need for aggregation methods to summarize the results at the system level. Second, because a software evaluation requires the use of different metrics, with possibly widely varying output ranges, there is a need to combine these results into a unified quality assessment. In this paper we derive, from our experience on real industrial cases and from the scientific literature, requirements for an aggregation method. We then present a solution through the Squale model for metric aggregation, a model specifically designed to address the needs of practitioners. We empirically validate the adequacy of Squale through experiments on Eclipse. Additionally, we compare the Squale model both to traditional aggregation techniques (e.g., the arithmetic mean) and to econometric inequality indices (e.g., the Gini or the Theil indices), recently applied to aggregation of software metrics.
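The comparison techniques named in the abstract above (the arithmetic mean and the Gini and Theil inequality indices) can be sketched as follows. This is a minimal illustration using the standard textbook definitions, assuming non-negative metric values with a positive mean; the function names and sample values are hypothetical, and the sketch does not reproduce the Squale model itself.

```python
import math


def arithmetic_mean(values):
    """Traditional aggregation: the average of the component-level values."""
    return sum(values) / len(values)


def gini(values):
    """Gini index via the mean absolute difference:
    sum_i sum_j |x_i - x_j| / (2 * n^2 * mean)."""
    n = len(values)
    mean = arithmetic_mean(values)
    total_diff = sum(abs(x - y) for x in values for y in values)
    return total_diff / (2 * n * n * mean)


def theil(values):
    """Theil index: (1/n) * sum (x/mean) * ln(x/mean), taking 0 * ln 0 as 0."""
    n = len(values)
    mean = arithmetic_mean(values)
    return sum((x / mean) * math.log(x / mean) for x in values if x > 0) / n


# Hypothetical per-component metric values for two systems with the same mean.
evenly_spread = [1, 1, 1, 1]
concentrated = [0, 0, 0, 4]

for name, values in [("evenly spread", evenly_spread), ("concentrated", concentrated)]:
    print(name, arithmetic_mean(values), gini(values), theil(values))
```

Both sample systems have the same arithmetic mean, but the inequality indices separate them: they are zero when every component has the same value and grow as the metric concentrates in a few components.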