
Publication


Featured research published by Shubhashis Sengupta.


Proceedings of the Second International Workshop on CrowdSourcing in Software Engineering | 2015

Crowd Build: A Methodology for Enterprise Software Development Using Crowdsourcing

Anurag Dwarakanath; Upendra Chintala; N. C. Shrikanth; Gurdeep Virdi; Alex Kass; Anitha Chandran; Shubhashis Sengupta; Sanjoy Paul

We present and evaluate a software development methodology that addresses key challenges in applying crowdsourcing to enterprise application development. Our methodology provides a mechanism to systematically break the overall business application into small tasks that can be completed independently and in parallel by the crowd. It also supports automated testing and automatic integration. We evaluate the methodology by developing a web application through crowdsourcing, under two crowdsourcing models: one based on contests and the other on hiring freelancers. We present various metrics from the crowdsourcing experiment and compare them against estimates for a traditional software development methodology.
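As a rough illustration of the decomposition idea (not the paper's implementation), the sketch below models crowd tasks that carry a self-contained spec and an automated acceptance test, and groups them into dependency-free batches that could be posted to the crowd in parallel. The names CrowdTask and independent_batches, and all fields, are hypothetical.

```python
# Hypothetical sketch of crowd-task decomposition with automated acceptance tests.
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class CrowdTask:
    task_id: str
    spec: str                          # self-contained specification given to the crowd
    acceptance: Callable[[str], bool]  # automated test applied to a crowd submission
    depends_on: List[str] = field(default_factory=list)

def independent_batches(tasks: List[CrowdTask]) -> List[List[CrowdTask]]:
    """Group tasks into batches whose members have all dependencies met,
    so each batch can be completed independently and in parallel."""
    done, batches = set(), []
    remaining = list(tasks)
    while remaining:
        batch = [t for t in remaining if all(d in done for d in t.depends_on)]
        if not batch:
            raise ValueError("cyclic dependency between tasks")
        batches.append(batch)
        done.update(t.task_id for t in batch)
        remaining = [t for t in remaining if t.task_id not in done]
    return batches
```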


Future Generation Computer Systems | 2014

Multi-site data distribution for disaster recovery: A planning framework

Shubhashis Sengupta; K. M. Annervaz

In this paper, we present DDP-DR: a Data Distribution Planner for Disaster Recovery. DDP-DR provides an optimal way of backing up critical business data into data centers (DCs) across several geographic locations. It produces a plan for replicating backup data across a potentially large number of data centers so that (i) the client data is recoverable in the event of catastrophic failure at one or more data centers (disaster recovery) and (ii) the client data is replicated and distributed optimally with respect to major business criteria such as cost of storage, protection level against site failures, and other business and operational parameters like recovery point objective (RPO) and recovery time objective (RTO). The planner uses Erasure Coding (EC) to divide and codify data chunks into fragments and distribute the fragments across DR sites or storage zones, so that the failure of one or more sites or zones can be tolerated and the data can be regenerated. We describe data distribution planning approaches for both single-customer and multiple-customer scenarios. Highlights: we describe a fault-tolerant multi-cloud data backup scheme using erasure coding; the data is distributed using a plan driven by multi-criteria optimization; the plan uses parameters such as cost, replication level, and recoverability objectives; both single-customer and multiple-customer cases are tackled; simulation results for the plans and sensitivity analyses are discussed.
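To make the erasure-coding placement idea concrete, here is a minimal sketch, not DDP-DR itself: data is coded into n = k + m fragments (any k suffice to rebuild), and one fragment is placed per data center. The real planner solves a multi-criteria optimization; this greedy variant minimizes storage cost alone, and the Site fields are illustrative assumptions.

```python
# Simplified sketch of EC(k, m) fragment placement across DR sites.
from dataclasses import dataclass
from typing import List

@dataclass
class Site:
    name: str
    cost_per_gb: float   # monthly storage cost (assumed metric)
    meets_rto: bool      # whether the site satisfies the recovery-time objective

def plan_placement(sites: List[Site], k: int, m: int) -> List[Site]:
    """Pick k + m sites, one fragment each, so up to m site failures
    are tolerated; greedily minimizes cost among RTO-eligible sites."""
    eligible = sorted((s for s in sites if s.meets_rto),
                      key=lambda s: s.cost_per_gb)
    n = k + m
    if len(eligible) < n:
        raise ValueError(f"need {n} eligible sites, have {len(eligible)}")
    return eligible[:n]
```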


Working Conference on Reverse Engineering | 2012

Software Clustering: Unifying Syntactic and Semantic Features

Janardan Misra; K. M. Annervaz; Vikrant Kaulgud; Shubhashis Sengupta; Gary Titus

Software clustering is an important technique for extracting a high-level component architecture from the underlying source code. One limitation of existing approaches is that most use only features of a similar type for estimating the distance between source code elements; when the selected features are only weakly present in the source code, these techniques may not produce good-quality results for lack of adequate input. In this paper we propose an approach to overcome this limitation. The proposed approach uses a combination of multiple types of features together and applies automated weighting to the extracted features to enhance their information quality and reduce noise. We define a way to estimate the distance between code elements in terms of a combination of multiple feature types. Weighted graph partitioning with a multi-objective global modularity criterion is used to select the clusters as architectural components. We describe methods for automated labeling of the extracted components and for generating inter-component interactions. We further discuss how the suggested approach extends to clustering at multiple hierarchical levels, to application portfolios, and even to improving precision for the feature location problem.
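A minimal sketch of the unified-features idea follows: structural and lexical similarity are combined into a single weighted graph, which is then cut by modularity-based community detection. The fixed alpha weighting and toy class names are assumptions; the paper learns feature weights automatically.

```python
# Sketch: combine two feature-based similarities and cluster by modularity.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def combined_similarity(syntactic: float, semantic: float, alpha: float = 0.5) -> float:
    # Fixed weighting here; the paper derives weights automatically.
    return alpha * syntactic + (1 - alpha) * semantic

G = nx.Graph()
pairs = {("Order", "Invoice"): (0.8, 0.6),    # (syntactic, semantic) similarity
         ("Order", "Logger"): (0.1, 0.05),
         ("Invoice", "Logger"): (0.05, 0.1)}
for (a, b), (syn, sem) in pairs.items():
    G.add_edge(a, b, weight=combined_similarity(syn, sem))

# Each community approximates one architectural component.
for component in greedy_modularity_communities(G, weight="weight"):
    print(sorted(component))
```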


International Conference on Cloud Computing | 2012

ReLoC: A Resilient Loosely Coupled Application Architecture for State Management in the Cloud

Vibhu Saujanya Sharma; Shubhashis Sengupta; K. M. Annervaz

Maintaining the state of applications and user sessions is difficult in large-scale web-based software systems. The problem is particularly accentuated in Cloud computing, as Cloud providers, especially Platform as a Service (PaaS) vendors, do not explicitly support state-management infrastructure such as clustering. In a PaaS environment, a user has little or no access to, or control over, the server platform and session-management layer. Additionally, the platform tiers are generally loosely coupled and service-oriented. Together, these make traditional session-state management techniques unusable. In this work, we present ReLoC, a session-state management architecture for the Cloud that uses loosely coupled services and platform-agnostic, scalable messaging technology to propagate and save session states. Preliminary experiments show a very high level of tolerance to failures of the platform tiers without corresponding disruptions to user sessions. We argue that, in the context of PaaS Clouds, the ReLoC architecture will be more scalable than traditional clustering environments.
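A toy sketch of the loosely coupled pattern, under stated assumptions: tiers never hold session state locally; every state change is published to a messaging layer and can be replayed after a tier fails. Python's in-process queue stands in for a durable broker, and the function names are illustrative, not ReLoC's API.

```python
# Sketch: session state propagated through messaging instead of sticky sessions.
import queue

state_topic = queue.Queue()   # stand-in for a durable pub/sub topic

def publish_state(session_id: str, key: str, value: str) -> None:
    """Publish a session-state change instead of storing it in the tier."""
    state_topic.put((session_id, key, value))

def rebuild_session(session_id: str) -> dict:
    """Replay the topic to reconstruct a session after a tier failure."""
    session = {}
    for sid, key, value in list(state_topic.queue):  # non-destructive replay
        if sid == session_id:
            session[key] = value
    return session

publish_state("s1", "cart", "book-42")
publish_state("s1", "user", "alice")
print(rebuild_session("s1"))   # {'cart': 'book-42', 'user': 'alice'}
```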


Proceedings of the 4th International Workshop on Twin Peaks of Requirements and Architecture | 2014

A framework for identifying and analyzing non-functional requirements from text

Vibhu Saujanya Sharma; Roshni R. Ramnani; Shubhashis Sengupta

Early identification of Non-Functional Requirements (NFRs) is important, as NFRs have a direct bearing on the design and architecture of the system. NFRs form the basis on which architects create the technical architecture of the system, which acts as the scaffolding in which its functionality is delivered. Failure to identify and analyze NFRs early on can result in unclassified, incomplete, or conflicting NFRs, which typically leads to costly rework in later stages of software development. In practice, this activity is primarily done manually. In this paper, we present a framework to automatically detect and classify non-functional requirements in textual natural language requirements. Our approach to identifying NFRs extracts multiple features by parsing the natural language requirement; the presence of a certain combination of, and relationships among, the features uniquely identifies the requirement as an NFR of a particular category. These features are specified as pattern-based rules written in a human-readable form through a domain-specific language that we have defined, which makes rules easy to create and extend. Our approach has been implemented as a prototype tool, and we also present the results of applying it to a publicly available requirements corpus.
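The following minimal sketch shows the flavor of pattern-based NFR classification. The paper's rules are written in a dedicated DSL over parse features; plain regular expressions stand in for them here, and the categories and patterns are illustrative assumptions.

```python
# Sketch: rule-based tagging of requirement sentences with NFR categories.
import re

RULES = {
    "performance":  re.compile(r"\b(within \d+ (ms|seconds?)|response time|throughput)\b", re.I),
    "security":     re.compile(r"\b(encrypt\w*|authenticat\w+|authoriz\w+|audit)\b", re.I),
    "availability": re.compile(r"\b(uptime|availab\w+|fail[- ]?over|24/7)\b", re.I),
}

def classify(requirement: str) -> list:
    """Return every NFR category whose pattern fires, else 'functional'."""
    return [category for category, pattern in RULES.items()
            if pattern.search(requirement)] or ["functional"]

print(classify("The system shall respond within 500 ms under peak load."))
# ['performance']
```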


International Conference on Software Maintenance | 2017

Towards Accurate Duplicate Bug Retrieval Using Deep Learning Techniques

Jayati Deshmukh; K. M. Annervaz; Sanjay Podder; Shubhashis Sengupta; Neville Dubash

Duplicate bug detection is the problem of identifying whether a newly reported bug is a duplicate of an existing bug in the system and retrieving the original or similar bugs from the past. This is required to avoid costly rediscovery and redundant work. In typical software projects, the number of duplicate bug reports may run into the thousands, making manual intervention expensive in both cost and time. This makes duplicate or similar bug detection an important problem in the software engineering domain. However, automated solutions are not yet accurate enough in practice, despite many reported approaches using various machine learning techniques. In this work, we propose a retrieval and classification model using Siamese Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks for accurate detection and retrieval of duplicate and similar bugs. We report an accuracy close to 90% and a recall rate close to 80%, which makes the practical use of such a system possible. We describe our model in detail along with related discussion from the deep learning domain. Through detailed experimental results, we illustrate the effectiveness of the model in practical systems, including for repositories where supervised training data is not available.
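A compact sketch of the Siamese idea, assuming PyTorch: one shared encoder embeds both bug reports, and cosine similarity scores the pair. The paper's full model also has CNN branches and a trained retrieval pipeline; the hyperparameters and random token ids below are illustrative only.

```python
# Sketch: Siamese LSTM encoder scoring a pair of bug reports for duplication.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BugEncoder(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):                  # (batch, seq_len)
        _, (h_n, _) = self.lstm(self.embed(token_ids))
        return h_n[-1]                             # final hidden state: (batch, hidden_dim)

encoder = BugEncoder()                             # shared weights make it Siamese
report_a = torch.randint(0, 10000, (1, 40))        # token ids of two reports (toy data)
report_b = torch.randint(0, 10000, (1, 40))
similarity = F.cosine_similarity(encoder(report_a), encoder(report_b))
print(float(similarity))   # a score near 1.0 would suggest a likely duplicate
```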


Applications of Natural Language to Data Bases | 2012

Litmus: generation of test cases from functional requirements in natural language

Anurag Dwarakanath; Shubhashis Sengupta

Generating test cases from natural language requirements poses a formidable challenge, as requirements often do not follow a defined structure. In this paper, we present a tool that generates test cases from a functional requirement document, imposing no restriction on sentence structure. The tool works on each requirement sentence and generates one or more test cases through a five-step process: 1) the sentence is analyzed by a syntactic parser to identify whether it is testable; 2) a compound or complex testable sentence is split into individual simple sentences; 3) test intents are generated from each simple sentence (test intents map to the aspects on which the requirement is to be tested); 4) the test intents are grouped and sequenced in temporal order to generate positive test cases, which verify the affirmative actions of the system; 5) wherever applicable, boundary value analysis and other techniques are used to generate negative test cases, which verify the behavior of the system under exception conditions. The automated generation of test cases has been implemented in a tool called Litmus. We provide experimental results of our tool on actual requirement documents across domains and discuss the advantages and shortcomings of our approach.
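An illustrative sketch of that pipeline on a single sentence follows. Litmus uses a real syntactic parser; crude string heuristics stand in for it here, and the function name and output shape are assumptions.

```python
# Sketch: testability check, sentence splitting, positive intents, boundary negatives.
import re

def generate_test_cases(requirement: str) -> dict:
    # Step 1: a modal verb is a crude proxy for the parser's testability check.
    if not re.search(r"\b(shall|must|should)\b", requirement, re.I):
        return {"testable": False}
    # Step 2: split compound sentences (very rough stand-in for parse-based splitting).
    simple_sentences = re.split(r"\band\b|;", requirement)
    # Steps 3-4: one positive intent per simple sentence.
    positive = [f"Verify that {s.strip()}" for s in simple_sentences if s.strip()]
    # Step 5: boundary value analysis around any numeric literal.
    negative = [f"Verify behavior at values {int(b) - 1} and {int(b) + 1}"
                for b in re.findall(r"\d+", requirement)]
    return {"testable": True, "positive": positive, "negative": negative}

print(generate_test_cases(
    "The system shall lock the account after 3 failed logins and shall notify the admin."))
```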


2016 IEEE/ACM 5th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE) | 2016

Topic cohesion preserving requirements clustering

Janardan Misra; Shubhashis Sengupta; Sanjay Podder

This paper focuses on the problem of generating human-interpretable clusters of semantically related plain-text requirements. The presented approach applies techniques from information retrieval, natural language processing, network analysis, and machine learning to identify semantically central terms as themes and to cluster requirements into semantically coherent groups, together with meaningful explanatory themes associated with the clusters to assist users in comprehending them. The approach is generic in nature and can be used in other phases of the SDLC (Software Development Life Cycle), including code comprehension and architectural discovery. It is particularly suitable for developing automated tool support for requirements management and analysis.
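A minimal sketch of theme-labeled clustering, assuming scikit-learn: requirements become TF-IDF vectors, k-means groups them, and each cluster's highest-weight terms are reported as its explanatory theme. The toy corpus and k=2 are assumptions; the paper's method is richer, drawing on network analysis to pick central terms.

```python
# Sketch: cluster requirements and label each cluster with its top TF-IDF terms.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

requirements = [
    "The system shall encrypt passwords at rest.",
    "All traffic must use TLS encryption.",
    "Search results shall load within two seconds.",
    "Page response time must not exceed one second.",
]
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(requirements)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

terms = vectorizer.get_feature_names_out()
for cluster in range(2):
    top = km.cluster_centers_[cluster].argsort()[::-1][:3]
    print(f"cluster {cluster} theme:", [terms[i] for i in top])
```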


Automated Software Engineering | 2013

Natural language requirements quality analysis based on business domain models

K. M. Annervaz; Vikrant Kaulgud; Shubhashis Sengupta; Milind Savagaonkar

The quality of requirements written in natural language has always been a critical concern in software engineering. Poorly written requirements lead to ambiguity and misinterpretation in different phases of a software delivery project, and incomplete requirements lead to partial implementation of the desired system behavior. In this paper, we present a model for harvesting domain (functional or business) knowledge. We then present natural language processing and ontology-based techniques that leverage the model to analyze requirements quality and aid requirements comprehension. The prototype also provides an advisory to business analysts so that requirements can be aligned to the expected domain standard. The prototype is currently being used in practice, and the initial results are very encouraging.
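A small sketch of the general idea, not the paper's ontology machinery: requirements are scanned for ambiguous phrasing and for the absence of concepts from a harvested domain model. The word lists and the review function are illustrative assumptions.

```python
# Sketch: advisory checks for ambiguity and domain-model coverage.
AMBIGUOUS = {"fast", "user-friendly", "appropriate", "flexible", "etc"}
domain_model = {"order", "invoice", "payment", "customer"}   # harvested concepts (toy)

def review(requirement: str) -> list:
    findings = []
    words = set(requirement.lower().replace(".", "").split())
    for term in sorted(words & AMBIGUOUS):
        findings.append(f"ambiguous term: '{term}'")
    if not words & domain_model:
        findings.append("no known domain concept mentioned; requirement may be incomplete")
    return findings

print(review("The system shall be fast and user-friendly."))
```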


Advances in Computing and Communications | 2013

Detecting SOQL-injection vulnerabilities in SalesForce applications

Amitabh Saxena; Shubhashis Sengupta; Pradeepkumar Duraisamy; Vikrant Kaulgud; Amit Chakraborty

The two most common web attacks used to steal data are SQL injection and cross-site scripting (XSS). These are examples of taint vulnerabilities, where maliciously crafted code (for example, a SQL query) is injected into a web application by embedding it inside innocuous-looking user inputs. We present the design of TRAP (Taint Removal and Analysis Platform), a static data-flow analysis tool to detect SOQL-injection problems in SalesForce applications. TRAP is designed to be language independent: the analysis is done on an XML intermediate language called STAC (STatic Analysis Code). Currently, we have implemented STAC compilers for Apex and Java.
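To illustrate the kind of analysis TRAP performs, here is a toy taint-propagation sketch: values from user-input sources are marked tainted, assignments propagate the mark, sanitizers clear it, and a tainted value reaching a query sink is flagged. The three-tuple instruction format is an illustrative stand-in for STAC, not its actual schema.

```python
# Sketch: forward taint propagation over a toy three-address program.
program = [
    ("assign", "name",  "source:user_input"),    # name <- user input (tainted)
    ("assign", "query", "name"),                 # taint propagates through assignment
    ("assign", "safe",  "sanitize:query"),       # sanitizer yields an untainted value
    ("sink",   "soql_execute", "query"),         # tainted value reaches a sink: flag it
    ("sink",   "soql_execute", "safe"),          # sanitized value: no finding
]

tainted = set()
for op, dst, src in program:
    if op == "assign":
        if src.startswith("source:"):
            tainted.add(dst)
        elif src.startswith("sanitize:"):
            tainted.discard(dst)
        elif src in tainted:
            tainted.add(dst)
    elif op == "sink" and src in tainted:
        print(f"possible SOQL injection: tainted '{src}' reaches {dst}")
```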
