
Publications


Featured research published by Jacek Czerwonka.


International Conference on Software Testing, Verification and Validation | 2011

CRANE: Failure Prediction, Change Analysis and Test Prioritization in Practice -- Experiences from Windows

Jacek Czerwonka; Rajiv Das; Nachiappan Nagappan; Alex Tarvo; Alex Teterev

Building large software systems is difficult; maintaining them is equally hard. Making post-release changes requires not only a thorough understanding of the architecture of the software component about to be changed but also of its dependencies and interactions with other components in the system. Testing such changes in reasonable time and at reasonable cost is a difficult problem, as an effectively unlimited number of test cases can be executed for any modification. It is important to obtain a risk assessment of the impact of such post-release fixes. Further, testing of such changes is complicated by the fact that they apply to hundreds of millions of users; even the smallest mistakes can translate into very costly failures and rework. There has been a significant amount of research in the software engineering community on failure prediction, change analysis, and test prioritization. Unfortunately, there is little evidence on the use of these techniques in day-to-day software development in industry. In this paper, we present our experiences with CRANE: a failure prediction, change risk analysis, and test prioritization system at Microsoft Corporation that leverages existing research for the development and maintenance of Windows Vista. We describe the design of CRANE, the validation of its usefulness and effectiveness in practice, and the lessons we learned, to help other organizations implement similar tools and practices in their environments.


IEEE Software | 2013

CODEMINE: Building a Software Development Data Analytics Platform at Microsoft

Jacek Czerwonka; Nachiappan Nagappan; Wolfram Schulte; Brendan Murphy

The scale and speed of today's software development efforts impose unprecedented constraints on the pace and quality of decisions made during planning, implementation, and post-release maintenance and support. Decisions during planning include staffing levels and the choice of a development model, given the scope of a project and its timelines. Tracking progress, course-correcting, and identifying and mitigating risks are key in the development phase, as are monitoring and improving overall customer satisfaction in the maintenance and support phase. Availability of relevant data can greatly increase both the speed and the likelihood of making a decision that leads to a successful software system. This article outlines the process Microsoft went through in developing CODEMINE, a software development data analytics platform for collecting and analyzing engineering process data, along with its constraints and pivotal organizational and technical choices.


International Conference on Software Engineering | 2015

The art of testing less without sacrificing quality

Kim Herzig; Michaela Greiler; Jacek Czerwonka; Brendan Murphy

Testing is a key element of software development processes for the management and assessment of product quality. In most development environments, software engineers are responsible for ensuring the functional correctness of code. For large, complex software products, however, there is an additional need to check that changes do not negatively impact other parts of the software and that they comply with system constraints such as backward compatibility, performance, and security. Ensuring these system constraints may require complex verification infrastructure and test procedures. Although such tests are time-consuming, expensive, and rarely find defects, they act as an insurance process to ensure the software is compliant. However, long-running tests increasingly conflict with strategic aims to shorten release cycles. To decrease production costs and improve development agility, we created a generic test selection strategy called THEO that accelerates test processes without sacrificing product quality. THEO is based on a cost model that dynamically skips tests when the expected cost of running the test exceeds the expected cost of removing it. We replayed past development periods of three major Microsoft products, achieving a 50% reduction in test executions and saving millions of dollars per year while maintaining product quality.
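The cost rule described above (skip a test when the expected cost of running it exceeds the expected cost of removing it) can be sketched in a few lines. This is a minimal illustration only: the cost terms, function names, and numbers are assumptions, not the paper's actual model.

```python
# Sketch of a THEO-style skipping rule. All cost terms and values are
# illustrative assumptions, not taken from the paper.

def expected_execution_cost(runs: int, cost_per_run: float) -> float:
    """Cost of keeping the test: machine and engineer time across runs."""
    return runs * cost_per_run

def expected_removal_cost(defect_probability: float, escape_cost: float) -> float:
    """Cost of skipping the test: chance a defect escapes times its cost."""
    return defect_probability * escape_cost

def should_skip(runs, cost_per_run, defect_probability, escape_cost):
    """Skip when running the test is expected to cost more than removing it."""
    return (expected_execution_cost(runs, cost_per_run)
            > expected_removal_cost(defect_probability, escape_cost))

# An expensive test that almost never finds a defect gets skipped:
print(should_skip(runs=1000, cost_per_run=5.0,
                  defect_probability=0.001, escape_cost=10000.0))  # True
# A cheap test with a meaningful chance of catching a defect is kept:
print(should_skip(runs=10, cost_per_run=0.1,
                  defect_probability=0.05, escape_cost=10000.0))   # False
```

Under such a rule a test's fate changes as its observed failure history changes, which matches the dynamic skipping the abstract describes.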


Foundations of Software Engineering | 2009

Test case comparison and clustering using program profiles and static execution

Vipindeep Vangala; Jacek Czerwonka; Phani Kishore Talluri

Selecting diverse test cases and eliminating duplicates are two major problems in the product testing life cycle, especially in a sustained engineering environment. To address them, we introduce a framework of test case comparison metrics that quantitatively describes the distance between any pair of test cases in an existing test suite, enabling various test case analysis applications. We combine program profiles from test execution, static analysis, and statistical techniques to capture various aspects of test execution and compute a specialized test case distance measurement. Using these distance metrics, we drive a customized hierarchical test suite clustering algorithm that groups similar test cases together. We present an industrial-strength framework called SPIRiT that works at the binary level, implements metrics based on coverage, control, data, def-use, and temporal variances, and performs test case clustering. This is a step toward integrating runtime analysis, static analysis, statistical techniques, and machine learning to drive a new generation of test suite analysis algorithms.
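As a rough illustration of the distance-and-clustering idea (not SPIRiT itself), the sketch below uses a Jaccard distance over the code blocks each test executes and a greedy single-link grouping; the function names, the distance choice, and the threshold are all assumptions.

```python
# Illustrative only: a coverage-based test distance and a naive grouping
# of near-duplicate tests. Not the SPIRiT implementation.

def coverage_distance(cov_a, cov_b):
    """Jaccard distance between the sets of code blocks two tests execute."""
    union = cov_a | cov_b
    if not union:
        return 0.0
    return 1.0 - len(cov_a & cov_b) / len(union)

def cluster_tests(coverage, threshold=0.5):
    """Greedy single-link clustering: a test joins the first cluster that
    contains a member within the distance threshold."""
    clusters = []
    for name, cov in coverage.items():
        for cluster in clusters:
            if any(coverage_distance(cov, coverage[m]) <= threshold
                   for m in cluster):
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters

suite = {
    "t1": {1, 2, 3, 4},   # code blocks executed by each test
    "t2": {1, 2, 3, 5},   # near-duplicate of t1
    "t3": {9, 10, 11},    # exercises a different area
}
print(cluster_tests(suite))  # [['t1', 't2'], ['t3']]
```

Tests grouped into one cluster are candidates for duplicate elimination; picking one representative per cluster yields a smaller, still-diverse suite.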


Mining Software Repositories | 2015

Code ownership and software quality: a replication study

Michaela Greiler; Kim Herzig; Jacek Czerwonka

In a traditional sense, ownership determines rights and duties with regard to an object, for example, a property. The owner of source code usually refers to the person who originally wrote it. However, larger code artifacts, such as files, are usually composed by multiple engineers contributing to the entity over time through a series of changes. Frequently, the person with the highest contribution, e.g., the largest number of code changes, is defined as the code owner and takes responsibility for it. Thus, code ownership relates to the knowledge engineers have about code, and lacking responsibility for and knowledge about code can reduce its quality. In an earlier study, Bird et al. [1] showed that Windows binaries that lacked clear code ownership were more likely to be defect prone. However, recommendations for large artifacts such as binaries are usually not actionable: changing the concept of binaries and refactoring them to ensure strong ownership would violate system architecture principles. A recent replication study by Foucault et al. [2] on open-source software led to doubts about the general concept of ownership impacting code quality. In this paper, we replicate and extend the two previous ownership studies [1, 2] and reflect on their findings. Further, we define several new ownership metrics to investigate the dependency between ownership and code quality at the file and directory level for four major Microsoft products. The results confirm the original findings by Bird et al. [1] that code ownership correlates with code quality. Using new and refined code ownership metrics, we were able to classify source files that contained at least one bug with a median precision of 0.74 and a median recall of 0.38. At the directory level, we achieve a precision of 0.76 and a recall of 0.60.
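The ownership metrics discussed above can be made concrete with a small sketch in the style of Bird et al. [1]: a file's ownership is the top contributor's share of its changes, and contributors below a cutoff (5% in the original study) count as minor. The code and names here are illustrative, not the papers' implementation.

```python
from collections import Counter

def ownership_metrics(change_authors, minor_cutoff=0.05):
    """Ownership metrics for one file, given the author of each change.
    Contributors whose share of changes is below the cutoff count as minor."""
    counts = Counter(change_authors)
    total = sum(counts.values())
    shares = {author: n / total for author, n in counts.items()}
    return {
        "ownership": max(shares.values()),  # top contributor's share
        "minor": sum(1 for s in shares.values() if s < minor_cutoff),
        "major": sum(1 for s in shares.values() if s >= minor_cutoff),
    }

# 24 of 25 changes by one engineer: strong ownership, one minor contributor.
print(ownership_metrics(["alice"] * 24 + ["bob"]))
# {'ownership': 0.96, 'minor': 1, 'major': 1}
```

Low ownership combined with many minor contributors is the pattern the studies associate with higher defect proneness.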


International Conference on Software Engineering | 2015

Code reviews do not find bugs: how the current code review best practice slows us down

Jacek Czerwonka; Michaela Greiler; Jack Tilford

Because of their many uses and benefits, code reviews are a standard part of the modern software engineering workflow. Since they require the involvement of people, code reviewing is often the longest part of code integration activities. Using experience gained at Microsoft, and with the support of data, we posit (1) that code reviews often do not find the functionality issues that should block a code submission; (2) that effective code reviews should be performed by people with a specific set of skills; and (3) that the social aspect of code reviews cannot be ignored. We find that we need to be more sophisticated in our guidelines for the code review workflow. We show how our findings from code reviewing practice influence our code review tools at Microsoft. Finally, we assert that, due to its costs, code reviewing is a practice deserving to be better understood, systematized, and applied to the software engineering workflow with more precision than current best practice prescribes.


International Conference on Software Engineering | 2014

Transition from centralized to decentralized version control systems: a case study on reasons, barriers, and outcomes

Kıvanç Muşlu; Christian Bird; Nachiappan Nagappan; Jacek Czerwonka

In recent years, software development has started to transition from centralized version control systems (CVCSs) to decentralized version control systems (DVCSs). Although CVCSs and DVCSs have been studied extensively, there has been little research on the transition between these systems. This paper investigates the transition process, from the developer's view, in a large company. The paper captures the transition reasons, barriers, and outcomes through 10 developer interviews, and investigates these findings through a survey in which 70 developers participated. The paper identifies that the majority of developers need to work incrementally and offline, and to manage multiple contexts efficiently. DVCSs fulfill these needs; however, the transition comes at a cost that depends on the previous development workflow. The paper discusses the transition reasons, barriers, and outcomes, and provides recommendations for teams planning such a transition. It shows that lightweight branches and local, incremental commits were the main reasons developers wanted to move to a DVCS. Further, it identifies the main problems with the transition process as a steep DVCS learning curve, incomplete DVCS integration with the rest of the development workflow, and DVCS scaling issues.


International Conference on Software Testing, Verification and Validation Workshops | 2014

An Empirical Comparison of Combinatorial and Random Testing

Laleh Shikh Gholamhossein Ghandehari; Jacek Czerwonka; Yu Lei; Soheil Shafiee; Raghu N. Kacker; D. Richard Kuhn

Some conflicting results have been reported on the comparison between t-way combinatorial testing and random testing. In this paper, we report a new study that applies t-way and random testing to the Siemens suite. In particular, we investigate the stability of the two techniques. We measure both code coverage and fault detection effectiveness. Each program in the Siemens suite has a number of faulty versions. In addition, mutation faults are used to better evaluate fault detection effectiveness in terms of both the number and diversity of faults. The experimental results show that in most cases, t-way testing performed as well as or better than random testing. There are a few cases where random testing performed better, but only by a very small margin. Overall, the differences between the two techniques are not as significant as one might have expected. We discuss the practical implications of these results and believe that more studies are needed to better understand the comparison between the two techniques.
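To make the coverage notion behind t-way testing concrete for t = 2, the helper below measures what fraction of all 2-way parameter-value combinations a test suite covers. This is an illustrative sketch, not tooling from the study.

```python
from itertools import combinations, product

def pairwise_coverage(tests, domains):
    """Fraction of all 2-way parameter-value combinations covered by `tests`.
    `domains` maps each parameter index to its possible values; each test is
    a tuple assigning one value per parameter."""
    params = sorted(domains)
    # Every (parameter pair, value pair) combination that must be covered:
    required = {(p, v, q, w)
                for p, q in combinations(params, 2)
                for v, w in product(domains[p], domains[q])}
    # Combinations actually exercised by the suite:
    covered = {(p, t[p], q, t[q])
               for t in tests
               for p, q in combinations(params, 2)}
    return len(covered & required) / len(required)

# Three boolean parameters: 4 well-chosen tests cover every 2-way pair,
# while exhaustive testing would need all 8 value assignments.
domains = {0: [0, 1], 1: [0, 1], 2: [0, 1]}
suite = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(pairwise_coverage(suite, domains))  # 1.0
```

Random suites of the same size typically cover most, but not necessarily all, of the required pairs, which is the comparison the study quantifies.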


International Conference on Software Maintenance | 2011

An integration resolution algorithm for mining multiple branches in version control systems

Alexander Tarvo; Thomas Zimmermann; Jacek Czerwonka

The high cost of software maintenance necessitates methods to improve the efficiency of the maintenance process. Such methods typically need a vast amount of knowledge about a system, which is often mined from software repositories. Collecting this data becomes a challenge if the system was developed using multiple code branches. In this paper we present an integration resolution algorithm that facilitates data collection across multiple code branches. The algorithm tracks code integrations across different branches and associates code changes in the main development branch with the corresponding changes in other branches. We provide evidence for the practical relevance of this algorithm from the development of Windows Vista Service Pack 2.
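The association the algorithm computes can be pictured with a toy resolver: given records of which change each integration carried over, follow the links from a main-branch change back to the change that originated it. The data model here is an assumption for illustration, not the paper's algorithm.

```python
def resolve_origin(change_id, integrations):
    """Follow integration links until reaching a change with no recorded
    source; the `seen` set guards against cyclic records."""
    seen = set()
    while change_id in integrations and change_id not in seen:
        seen.add(change_id)
        change_id = integrations[change_id]
    return change_id

# Main-branch change 501 was integrated from feature-branch change 342,
# which itself originated as private-branch change 101.
links = {501: 342, 342: 101}
print(resolve_origin(501, links))  # 101
```

With such a mapping, metrics mined from the main branch (e.g. which change fixed a bug) can be attributed to the branch where the work actually happened.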


International Symposium on Software Reliability Engineering | 2013

Predicting risk of pre-release code changes with CheckinMentor

Alexander Tarvo; Nachiappan Nagappan; Thomas Zimmermann; Thirumalesh Bhat; Jacek Czerwonka

Code defects introduced during the development of a software system can result in failures after its release. Such post-release failures are costly to fix and have a negative impact on the reputation of the released software. In this paper we propose a methodology for the early detection of faulty code changes. We describe code changes with metrics and then use a statistical model that discriminates between faulty and non-faulty changes. The predictions are made not at the file or binary level but at the change level, thereby assessing the impact of each change. We also study the impact of code branches on collecting code metrics and on the accuracy of the model. The model has shown high accuracy and was developed into a tool called CheckinMentor, which was deployed to predict risk for the Windows Phone software. Our methodology is versatile, however, and can be used to predict risk in a variety of large, complex software systems.
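A toy version of change-level risk scoring in the spirit of CheckinMentor: describe a change with a few metrics and map them to a probability with a logistic function. The metrics, weights, and model form are illustrative assumptions, not the paper's actual model, which is learned from historical change data.

```python
import math

def change_risk(lines_changed, files_touched, author_recent_bugs):
    """Logistic risk score in (0, 1); higher means a riskier change.
    The weights below are made up for illustration."""
    z = (-3.0
         + 0.01 * lines_changed
         + 0.3 * files_touched
         + 0.5 * author_recent_bugs)
    return 1.0 / (1.0 + math.exp(-z))

# A small, isolated change scores low; a sprawling change by an author
# with recent defects scores high.
print(change_risk(lines_changed=20, files_touched=1, author_recent_bugs=0))
print(change_risk(lines_changed=400, files_touched=8, author_recent_bugs=3))
```

Scoring at the change level, as the abstract notes, lets reviewers and test systems prioritize individual check-ins rather than whole files or binaries.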
