Nuthan Munaiah
Rochester Institute of Technology
Publications
Featured research published by Nuthan Munaiah.
Empirical Software Engineering | 2017
Nuthan Munaiah; Steven Kroh; Craig Cabrey; Meiyappan Nagappan
Software forges like GitHub host millions of repositories. Software engineering researchers have been able to take advantage of such large corpora of potential study subjects with the help of tools like GHTorrent and Boa. However, the simplicity of querying comes with a caveat: there are limited means of separating the signal (e.g. repositories containing engineered software projects) from the noise (e.g. repositories containing homework assignments). The proportion of noise in a random sample of repositories could skew a study and lead researchers to unrealistic, potentially inaccurate, conclusions. We argue that it is imperative to have the ability to sieve out the noise in such large repository forges. We propose a framework, and present a reference implementation of the framework as a tool called reaper, to enable researchers to select GitHub repositories that contain evidence of an engineered software project. We identify software engineering practices (called dimensions) and propose means for validating their existence in a GitHub repository. We used reaper to measure the dimensions of 1,857,423 GitHub repositories. We then used manually classified data sets of repositories to train classifiers capable of predicting whether a given GitHub repository contains an engineered software project. The performance of the classifiers was evaluated using a set of 200 repositories with known ground-truth classifications. We also compared the performance of the classifiers to other approaches to classification (e.g. number of GitHub stargazers) and found our classifiers to outperform existing approaches. The stargazers-based classifier (with a threshold of 10 stargazers) exhibited high precision (97%) but low recall (32%). On the other hand, our best classifier exhibited both high precision (82%) and high recall (86%). The stargazer-based criterion offers precision but fails to recall a significant portion of the population.
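The classification step described above can be pictured with a minimal scikit-learn sketch. The dimension names, scores, and labels below are invented for illustration; reaper's actual dimensions, data, and models are defined in the paper and its reference implementation.

```python
# Illustrative sketch (not reaper itself): train a classifier that predicts
# whether a repository contains an engineered software project from
# per-repository "dimension" scores. Feature names and values are hypothetical.
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score

# Each row: [history, documentation, testing, ci, license] scores in [0, 1].
X_train = [
    [0.9, 0.8, 0.7, 1.0, 1.0],  # engineered project
    [0.1, 0.0, 0.0, 0.0, 0.0],  # homework assignment
    [0.8, 0.6, 0.9, 1.0, 0.0],  # engineered project
    [0.2, 0.1, 0.0, 0.0, 1.0],  # personal scratch repository
]
y_train = [1, 0, 1, 0]  # 1 = engineered, 0 = noise

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Score a held-out set with known ground truth, mirroring the paper's
# 200-repository validation step.
X_test = [[0.7, 0.9, 0.8, 1.0, 1.0], [0.0, 0.1, 0.0, 0.0, 0.0]]
y_test = [1, 0]
y_pred = clf.predict(X_test)
print("precision:", precision_score(y_test, y_pred))
print("recall:", recall_score(y_test, y_pred))
```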
Proceedings of the International Workshop on App Market Analytics | 2016
Iván Tactuk Mercado; Nuthan Munaiah; Andrew Meneely
Mobile app developers today face a hard decision: independently develop native apps for different operating systems, or develop a single app that is cross-platform compatible. The availability of different tools and approaches to support cross-platform app development only makes the decision harder. In this study, we used user reviews of apps to empirically understand the relationship (if any) between the approach used in the development of an app and its perceived quality. We used Natural Language Processing (NLP) models to classify 787,228 user reviews of the Android and iOS versions of 50 apps as complaints about one of four quality concerns: performance, usability, security, and reliability. We found that hybrid apps (on both the Android and iOS platforms) tend to be more prone to user complaints than interpreted/generated apps. In a study of Facebook, an app that underwent a change in development approach from hybrid to native, we found that the change was accompanied by a reduction in user complaints about performance and reliability.
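As a rough illustration of the review-classification step (not the authors' exact NLP models), a short scikit-learn pipeline can map review text to one of the four quality concerns. The training reviews below are invented.

```python
# Illustrative sketch: classify app-store reviews into one of four
# quality concerns. Training examples are invented stand-ins.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reviews = [
    "app is so slow it takes forever to load",       # performance
    "buttons are tiny and the menus are confusing",  # usability
    "it asks for way too many permissions",          # security
    "crashes every time I open the camera",          # reliability
]
concerns = ["performance", "usability", "security", "reliability"]

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(reviews, concerns)

# Classify an unseen review.
print(model.predict(["keeps freezing and crashing on startup"]))
```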
Empirical Software Engineering | 2017
Nuthan Munaiah; Felivel Camilo; Wesley Wigham; Andrew Meneely; Meiyappan Nagappan
As developers face ever-increasing pressure to engineer secure software, researchers are building an understanding of security-sensitive bugs (i.e. vulnerabilities). Research into mining software repositories has greatly increased our understanding of software quality via empirical study of bugs. Conceptually, however, vulnerabilities differ from bugs: they represent an abuse of functionality, as opposed to the insufficient functionality commonly associated with traditional, non-security bugs. We performed an in-depth analysis of the Chromium project to empirically examine the relationship between bugs and vulnerabilities. We mined 374,686 bugs and 703 post-release vulnerabilities over five Chromium releases that span six years of development. We used logistic regression analysis, ranking analysis, bug type classifications, developer experience, and vulnerability severity metrics to examine the overarching question: are bugs and vulnerabilities in the same files? While we found statistically significant correlations between pre-release bugs and post-release vulnerabilities, the association was weak. Number of features, source lines of code, and pre-release security bugs are, in general, more closely associated with post-release vulnerabilities than any of our non-security bug categories. In further analysis, we examined sub-types of bugs, such as stability-related bugs, and the associations did not improve. Even the files with the most severe vulnerabilities (by measure of CVSS score or bounty payout) did not show strong correlations with number of bugs. These results indicate that bugs and vulnerabilities are empirically dissimilar groups, motivating the need for security engineering research to target vulnerabilities specifically.
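A minimal sketch of the file-level regression idea follows, assuming invented per-file metrics; the paper's actual analysis spans 374,686 bugs, five releases, and a richer set of covariates.

```python
# Illustrative sketch, with invented numbers: does a file's pre-release
# bug count relate to whether it had a post-release vulnerability,
# alongside file size?
import numpy as np
from sklearn.linear_model import LogisticRegression

# Columns per file: [pre-release bug count, source lines of code].
X = np.array([[5, 120], [22, 900], [1, 80], [14, 650],
              [30, 400], [2, 150], [9, 700], [0, 60]])
# 1 = file had at least one post-release vulnerability.
y = np.array([0, 1, 0, 1, 0, 0, 1, 0])

model = LogisticRegression(max_iter=1000).fit(X, y)
print("coefficients (bugs, sloc):", model.coef_[0])
print("P(vulnerable) for a file with 10 bugs, 500 SLOC:",
      model.predict_proba([[10, 500]])[0, 1])
```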
Proceedings of the 2nd International Workshop on Software Analytics | 2016
Nuthan Munaiah; Andrew Meneely
The Common Vulnerability Scoring System (CVSS) is the de facto standard for vulnerability severity measurement today and is crucial in the analytics driving software fortification. Required by the U.S. National Vulnerability Database, over 75,000 vulnerabilities have been scored using CVSS. We compare how CVSS correlates with another, closely related measure of security impact: bounties. Recent economic studies of vulnerability disclosure processes show a clear relationship between black-market value and bounty payments. We analyzed the CVSS scores and bounties awarded for 703 vulnerabilities across 24 products. We found a weak (Spearman's ρ = 0.34) correlation between CVSS scores and bounties, with CVSS being more likely to underestimate bounty. We believe such a negative result is a cause for concern. We investigated why these measurements were so discordant by (a) analyzing the individual questions of CVSS with respect to bounties and (b) conducting a qualitative study to find the similarities and differences between CVSS and the publicly available criteria for awarding bounties. Among our findings were that the bounty criteria were more explicit about code execution and privilege escalation, whereas CVSS makes no explicit mention of those. We also found that bounty valuations are evaluated solely by project maintainers, whereas CVSS has little provenance in practice.
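The headline correlation can be reproduced in miniature with scipy; the CVSS scores and bounty amounts below are invented stand-ins for the paper's 703 real vulnerabilities.

```python
# Illustrative sketch: rank correlation between severity scores and
# bounty payouts, as in the paper's analysis. Data points are invented.
from scipy.stats import spearmanr

cvss_scores = [4.3, 6.8, 9.3, 5.0, 7.5, 6.8, 4.3, 10.0]
bounties_usd = [500, 1000, 3000, 3133.7, 500, 2000, 1500, 5000]

rho, p_value = spearmanr(cvss_scores, bounties_usd)
print(f"Spearman's rho = {rho:.2f} (p = {p_value:.3f})")
```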
IEEE International Conference on Requirements Engineering | 2017
Nuthan Munaiah; Andrew Meneely; Pradeep K. Murukannaiah
Existing work on identifying security requirements relies on training binary classification models using domain-specific data sets to achieve high accuracy. Considering that domain-specific data sets are often not readily available, we propose a domain-independent model for classifying security requirements based on two key ideas. First, we train our model on the descriptions of weaknesses from the Common Weakness Enumeration (CWE) data set. Although CWE does not describe requirements, it describes security weaknesses that are manifestations of unrealized security requirements. Second, we exploit a one-class classification model that relies only on positive samples (descriptions of weaknesses in CWE), eliminating the need for negative samples, which can be nontrivial to collect. We evaluated our model on three industrial requirements documents from different domains. We found that a One-Class Support Vector Machine trained on the domain-independent CWE data set outperforms a model from prior literature, identifying security requirements with an average precision, recall, and F-score of 67.35%, 70.48%, and 67.68%, respectively. Further, considering data sets from prior literature (consisting of both positive and negative examples), we found that one-class classifiers trained with only positive examples outperformed binary classifiers trained with both positive and negative examples on two of three evaluation data sets, demonstrating the potential value of one-class classification for security requirements identification.
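A minimal sketch of the one-class approach follows, assuming paraphrased stand-ins for CWE weakness descriptions rather than the real data set.

```python
# Illustrative sketch: one-class classification trained only on positive
# (weakness-like) text. Descriptions below are paraphrased stand-ins.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import OneClassSVM

weakness_descriptions = [
    "the software does not neutralize special elements in an SQL command",
    "the software stores passwords in a recoverable plaintext format",
    "the software does not verify the authenticity of the communication peer",
    "the software allows a buffer to be written past its allocated bounds",
]

vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(weakness_descriptions)
classifier = OneClassSVM(kernel="linear", nu=0.1).fit(X_train)

requirements = [
    "the system shall encrypt all stored user credentials",     # security-relevant wording
    "the system shall display the dashboard within 2 seconds",  # performance requirement
]
# predict() returns +1 for samples inside the learned (security) class,
# -1 for samples outside it.
print(classifier.predict(vectorizer.transform(requirements)))
```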
Engineering Secure Software and Systems | 2017
Nuthan Munaiah; Benjamin S. Meyers; Cecilia Ovesdotter Alm; Andrew Meneely; Pradeep K. Murukannaiah; Emily Prud’hommeaux; Josephine Wolff; Yang Yu
Engineering secure software is challenging. Software development organizations leverage a host of processes and tools to enable developers to prevent vulnerabilities in software. Code reviewing is one such approach, and it has been instrumental in improving the overall quality of a software system. In a typical code review, developers critique a proposed change to uncover potential vulnerabilities. Despite developers' best efforts, some vulnerabilities inevitably slip through the reviews. In this study, we characterized linguistic features (inquisitiveness, sentiment, and syntactic complexity) of conversations between developers in code reviews to identify factors that could explain developers missing a vulnerability. We used natural language processing to collect these linguistic features from 3,994,976 messages in 788,437 code reviews from the Chromium project. We collected 1,462 Chromium vulnerabilities to empirically analyze the linguistic features. We found that code reviews with lower inquisitiveness, higher sentiment, and lower complexity were more likely to miss a vulnerability. We used a Naive Bayes classifier to assess whether the words (or lemmas) in the code reviews could differentiate reviews that are likely to miss vulnerabilities. The classifier used a subset of all lemmas (over 2 million) as features and their corresponding TF-IDF scores as values. The average precision, recall, and F-measure of the classifier were 14%, 73%, and 23%, respectively. We believe that our linguistic characterization will help developers identify problematic code reviews before they result in a vulnerability being missed.
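The lemma-based classifier can be sketched in a few lines of scikit-learn; the review messages and labels below are invented, whereas the study drew on over 2 million lemmas from real Chromium reviews.

```python
# Illustrative sketch: Naive Bayes over TF-IDF text features, as in the
# study's lemma-based classifier. Messages and labels are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

review_messages = [
    "lgtm",                                                        # terse, uncritical
    "looks good to me, ship it",
    "why is this buffer size unchecked? what happens on overflow?",
    "have we validated this input? could a renderer abuse this path?",
]
# 1 = review missed a vulnerability, 0 = it did not (invented labels).
missed_vulnerability = [1, 1, 0, 0]

classifier = make_pipeline(TfidfVectorizer(), MultinomialNB())
classifier.fit(review_messages, missed_vulnerability)

# Flag an unseen review that shows little inquisitiveness.
print(classifier.predict(["lgtm, just fix the typo"]))
```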
Proceedings of the International Workshop on App Market Analytics | 2016
Nuthan Munaiah; Casey Klimkowsky; Shannon McRae; Adam Blaine; Samuel A. Malachowsky; Cesar Perez; Daniel E. Krutz
The Android platform comprises the vast majority of the mobile market. Unfortunately, Android apps are not immune to the issues that plague conventional software, including security vulnerabilities, bugs, and permission-based problems. In order to address these issues, we need a better understanding of the apps we use every day. Over the course of more than a year, we collected and reverse engineered 64,868 Android apps from the Google Play store as well as 1,669 malware samples collected from several sources. Each app was analyzed using several static analysis tools to collect a variety of quality- and security-related information. The apps spanned 41 different categories and constituted a total of 576,174 permissions, 39,780 unique signing keys, and 125,159 over-permissions. We present the dataset of these apps, and a sample set of analytics, on our website (http://darwin.rit.edu), with the option of downloading the dataset for offline evaluation.
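One of the analyses behind the over-permission counts can be illustrated with a simple set difference; the declared and used permission sets below are invented, whereas the real pipeline derived them by statically analyzing each reverse-engineered APK.

```python
# Illustrative sketch: flag "over-permissions", i.e. permissions an app
# declares in its manifest but never exercises in code. Sets are invented.
declared = {
    "android.permission.INTERNET",
    "android.permission.CAMERA",
    "android.permission.READ_CONTACTS",
    "android.permission.ACCESS_FINE_LOCATION",
}
# Permissions actually exercised by API calls found via static analysis.
used = {
    "android.permission.INTERNET",
    "android.permission.CAMERA",
}

over_permissions = declared - used
for permission in sorted(over_permissions):
    print("over-permission:", permission)
```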
International Conference on Software Engineering | 2018
Nuthan Munaiah
As more aspects of our daily lives rely on technology, the software that enables the technology must be secure. Developers rely on practices such as threat modeling, static and dynamic analyses, code review, and fuzz and penetration testing to engineer secure software. These practices, while effective at identifying vulnerabilities in software, are limited in their ability to describe the potential reasons for the existence of vulnerabilities. In order to overcome this limitation, researchers have proposed empirically validated metrics to identify factors that may have led to the introduction of vulnerabilities in the past. Developers must be made aware of these factors so that they can proactively consider the security implications of each line of code that they contribute. The goal of our research is to assist developers in engineering secure software by providing a technique that generates scientific, interpretable, and actionable feedback on security as the software evolves. In this paper, we provide an overview of our proposed approach to accomplish this research goal through a series of three research studies in which we (1) systematize the knowledge on vulnerability discovery metrics, (2) leverage the metrics to generate feedback on security, and (3) implement a framework for providing automatically generated feedback on security using code reviews as a medium.
Information & Software Technology | 2018
Christopher Theisen; Nuthan Munaiah; Mahran Al-Zyoud; Jeffrey C. Carver; Andrew Meneely; Laurie Williams
Context: Michael Howard conceptualized the attack surface of a software system as a metaphor for risk assessment during the development and maintenance of software. While the phrase attack surface is used in a variety of contexts in cybersecurity, professionals have different conceptions of what the phrase means. Objective: The goal of this systematic literature review is to aid researchers and practitioners in reasoning about security in terms of attack surface by exploring various definitions of the phrase attack surface. Method: We reviewed 644 works from prior literature, including research papers, magazine articles, and technical reports, that use the phrase attack surface and categorized them into those that provided their own definition; cited another definition; or expected the reader to intuitively understand the phrase. Results: In our study, 71% of the papers used the phrase without defining it or citing another paper. Additionally, we found six themes of definitions for the phrase attack surface. Conclusion: Based on our analysis, we recommend practitioners choose a definition of attack surface appropriate for their domain based on the six themes we identified in our study.
Proceedings of the International Workshop on App Market Analytics | 2016
Daniel E. Krutz; Nuthan Munaiah; Andrew Meneely; Samuel A. Malachowsky