Debabrata Dey | Researchain

Publication

Featured research published by Debabrata Dey.

ACM Transactions on Database Systems | 1996

A probabilistic relational model and algebra

Debabrata Dey; Sumit Sarkar

Although the relational model for databases provides a great range of advantages over other data models, it lacks a comprehensive way to handle incomplete and uncertain data. Uncertainty in data values, however, is pervasive in all real-world environments and has received much attention in the literature. Several methods have been proposed for incorporating uncertain data into relational databases. However, the current approaches have many shortcomings and have not established an acceptable extension of the relational model. In this paper, we propose a consistent extension of the relational model. We present a revised relational structure and extend the relational algebra. The extended algebra is shown to be closed, a consistent extension of the conventional relational algebra, and reducible to the latter.

IEEE Transactions on Knowledge and Data Engineering | 2002

A distance-based approach to entity reconciliation in heterogeneous databases

Debabrata Dey; Sumit Sarkar; Prabuddha De

In modern organizations, decision makers must often be able to quickly access information from diverse sources in order to make timely decisions. A critical problem facing many such organizations is the inability to easily reconcile the information contained in heterogeneous data sources. To overcome this limitation, an organization must resolve several types of heterogeneity problems that may exist across different sources. We examine one such problem called the entity heterogeneity problem, which arises when the same real-world entity type is represented using different identifiers in different applications. A decision-theoretic model to resolve the problem is proposed. Our model uses a distance measure to express the similarity between two entity instances. We have implemented the model and tested it on real-world data. The results indicate that the model performs quite well in terms of its ability to predict whether two entity instances should be matched or not. The model is shown to be computationally efficient. It also scales well to large relations from the perspective of the accuracy of prediction. Overall, the test results imply that this is certainly a viable approach in practical situations.
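The idea of matching entity instances by distance can be illustrated with a minimal sketch. This is not the authors' decision-theoretic model: the `Customer` record, the per-attribute distances, the weights, and the threshold are all hypothetical choices made only to show how a weighted distance can drive a match or no-match decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Hypothetical entity instance held in one of two data sources."""
    name: str
    city: str
    birth_year: int


# Assumed attribute weights (sum to 1) and decision threshold; in practice
# these would be estimated from data rather than fixed by hand.
WEIGHTS = [0.5, 0.2, 0.3]
THRESHOLD = 0.25


def attribute_distances(a, b):
    """Return per-attribute distances, each scaled to the range [0, 1]."""
    name_d = 0.0 if a.name.lower() == b.name.lower() else 1.0
    city_d = 0.0 if a.city.lower() == b.city.lower() else 1.0
    # Birth years ten or more years apart count as maximally distant.
    year_d = min(abs(a.birth_year - b.birth_year) / 10.0, 1.0)
    return [name_d, city_d, year_d]


def distance(a, b):
    """Weighted sum of the per-attribute distances."""
    return sum(w * d for w, d in zip(WEIGHTS, attribute_distances(a, b)))


def is_match(a, b, threshold=THRESHOLD):
    """Declare a match when the overall distance is within the threshold."""
    return distance(a, b) <= threshold
```

Under these assumed weights, two records that agree on name but differ in city and by one birth year fall just inside the threshold, while records with different names do not.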

ACM Transactions on Database Systems | 1999

Improving database design through the analysis of relationships

Debabrata Dey; Veda C. Storey; Terence M. Barron

Much of the work on conceptual modeling involves the use of an entity-relationship model in which binary relationships appear as associations between two entities. Relationships involving more than two entities are considered rare and, therefore, have not received adequate attention. This research provides a general framework for the analysis of relationships in which binary relationships simply become a special case. The framework helps a designer to identify ternary and other higher-degree relationships that are commonly represented, often inappropriately, as either entities or binary relationships. Generalized rules are also provided for representing higher-degree relationships in the relational model. This uniform treatment of relationships should significantly ease the burden on a designer by enabling him or her to extract more information from a real-world situation and represent it properly in a conceptual design.

ACM Transactions on Database Systems | 1997

Database design with common sense business reasoning and learning

Veda C. Storey; Roger H. L. Chiang; Debabrata Dey; Robert C. Goldstein; Shankar Sundaresan

Automated database design systems embody knowledge about the database design process. However, their lack of knowledge about the domains for which databases are being developed significantly limits their usefulness. A methodology for acquiring and using general world knowledge about business for database design has been developed and implemented in a system called the Common Sense Business Reasoner, which acquires facts about application domains and organizes them into a hierarchical, context-dependent knowledge base. This knowledge is used to make intelligent suggestions to a user about the entities, attributes, and relationships to include in a database design. A distance function approach is employed for integrating specific facts, obtained from individual design sessions, into the knowledge base (learning) and for applying the knowledge to subsequent design problems (reasoning).

IEEE Transactions on Knowledge and Data Engineering | 2011

Efficient Techniques for Online Record Linkage

Debabrata Dey; Vijay S. Mookerjee; Dengpan Liu

The need to consolidate the information contained in heterogeneous data sources has been widely documented in recent years. In order to accomplish this goal, an organization must resolve several types of heterogeneity problems, especially the entity heterogeneity problem that arises when the same real-world entity type is represented using different identifiers in different data sources. Statistical record linkage techniques could be used for resolving this problem. However, the use of such techniques for online record linkage could pose a tremendous communication bottleneck in a distributed environment (where entity heterogeneity problems are often encountered). In order to resolve this issue, we develop a matching tree, similar to a decision tree, and use it to propose techniques that reduce the communication overhead significantly, while providing matching decisions that are guaranteed to be the same as those obtained using the conventional linkage technique. These techniques have been implemented, and experiments with real-world and synthetic databases show significant reduction in communication overhead.

Management Science | 2013

Effects of Piracy on Quality of Information Goods

Atanu Lahiri; Debabrata Dey

It is commonly believed that piracy of information goods leads to lower profits, which translate to lower incentives to invest in innovation and eventually to lower-quality products. Manufacturers, policy makers, and researchers all claim that inadequate piracy enforcement efforts translate to lower investments in product development. However, we find many practical examples that contradict this claim. Therefore, to examine this claim more carefully, we develop a rigorous economic model of the manufacturer's quality decision problem in the presence of piracy. We consider a monopolist who does not have any marginal costs but has a product development cost quadratic in the quality level produced. The monopolist faces a consumer market heterogeneous in its preference for quality and offers a quality level that maximizes its profit. We also allow for the possibility that the manufacturer may use versioning to counter piracy. We unexpectedly find that in certain situations, lower piracy enforcement increases the monopolist's incentive to invest in quality. We explain the reasons and welfare implications of our findings. This paper was accepted by Lorin Hitt, information systems.
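The setup described in the abstract (zero marginal cost, quadratic development cost, consumers who differ in their taste for quality) can be made concrete for the benchmark case without piracy. The sketch below is not the paper's model. It assumes, purely for illustration, that taste θ is uniform on [0, 1], that a consumer buys when θ·q − p ≥ 0, and that the development cost is k·q². Under those assumptions demand is 1 − p/q, the best price is q/2, and the profit-maximizing quality is 1/(8k). Piracy and versioning, which drive the paper's result, are deliberately left out.

```python
def profit(quality, price, cost_coeff):
    """Monopolist profit with no piracy under the assumptions stated above."""
    if quality <= 0:
        return 0.0
    # Consumers with taste theta >= price / quality buy the product.
    demand = max(0.0, 1.0 - price / quality)
    return price * demand - cost_coeff * quality ** 2


def closed_form_optimum(cost_coeff):
    """Optimal (quality, price): revenue q/4 minus cost k*q^2 peaks at q = 1/(8k)."""
    quality = 1.0 / (8.0 * cost_coeff)
    return quality, quality / 2.0


def grid_search(cost_coeff, steps=200):
    """Brute-force check of the closed form over a (quality, price) grid."""
    # Profit cannot be positive at or beyond this quality level.
    q_max = 1.0 / (4.0 * cost_coeff)
    best_profit, best_q, best_p = 0.0, 0.0, 0.0
    for i in range(1, steps + 1):
        q = i * q_max / steps
        for j in range(steps + 1):
            p = j * q / steps
            value = profit(q, p, cost_coeff)
            if value > best_profit:
                best_profit, best_q, best_p = value, q, p
    return best_profit, best_q, best_p
```

Calling `grid_search(0.5)` and `closed_form_optimum(0.5)` should agree on the optimal quality, which gives a quick sanity check on the algebra before any piracy term is added.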

Hawaii International Conference on System Sciences | 1998

Entity matching in heterogeneous databases: a distance-based decision model

Debabrata Dey; Sumit Sarkar; Prabuddha De

The need to leverage the information contained in heterogeneous data sources has been widely documented. In order to accomplish this goal, an organization must resolve several types of heterogeneity problems that may exist across different data sources. We investigate one such problem called the entity heterogeneity problem. This problem arises when the same real-world entity type is represented using different identifiers in different applications. We propose a decision theoretic model to resolve the problem. Our model uses a distance-based measure to express the similarity between two entity instances. We have implemented the model, and our experimental results indicate that this is a viable approach in real-world situations.

Very Large Data Bases | 1996

A complete temporal relational algebra

Debabrata Dey; Terence M. Barron; Veda C. Storey

Various temporal extensions to the relational model have been proposed. All of these, however, deviate significantly from the original relational model. This paper presents a temporal extension of the relational algebra that is not significantly different from the original relational model, yet is at least as expressive as any of the previous approaches. This algebra employs multidimensional tuple time-stamping to capture the complete temporal behavior of data. The basic relational operations are redefined as consistent extensions of the existing operations in a manner that preserves the basic algebraic equivalences of the snapshot (i.e., conventional static) algebra. A new operation, namely temporal projection, is introduced. The complete update semantics are formally specified and aggregate functions are defined. The algebra is closed, and reduces to the snapshot algebra. It is also shown to be at least as expressive as the calculus-based temporal query language TQuel. In order to assess the algebra, it is evaluated using a set of twenty-six criteria proposed in the literature, and compared to existing temporal relational algebras. The proposed algebra appears to satisfy more criteria than any other existing algebra.
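What it means for a time-stamped algebra to reduce to the snapshot algebra can be shown with a small sketch. It is a simplification, not the paper's algebra: it assumes a single valid-time dimension with integer instants and half-open intervals, whereas the paper uses multidimensional time-stamping. The names `TemporalTuple`, `timeslice`, `temporal_select`, and `snapshot_select` are hypothetical.

```python
from typing import NamedTuple


class TemporalTuple(NamedTuple):
    """A tuple stamped with one valid-time interval [start, end)."""
    employee: str
    dept: str
    start: int  # first instant at which the fact holds (inclusive)
    end: int    # first instant at which it no longer holds (exclusive)


def timeslice(relation, t):
    """Return the conventional (snapshot) relation that holds at instant t."""
    return {(r.employee, r.dept) for r in relation if r.start <= t < r.end}


def temporal_select(relation, predicate):
    """Selection on the time-stamped relation; time stamps are carried along."""
    return [r for r in relation if predicate((r.employee, r.dept))]


def snapshot_select(snapshot, predicate):
    """Ordinary relational selection on a snapshot relation."""
    return {row for row in snapshot if predicate(row)}
```

In this simplified setting, selecting on the time-stamped relation and then taking the snapshot at `t` gives the same rows as taking the snapshot first and then applying the ordinary selection. That equality is the kind of consistency with the snapshot algebra the abstract refers to.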

Journal of Management Information Systems | 2013

Consumer Learning and Time-locked Trials of Software Products

Debabrata Dey; Atanu Lahiri; Dengpan Liu

Manufacturers of information goods often offer free trial versions of their products. Information goods are experience goods, and trials often promote consumer learning with respect to quality. However, the downside of this strategy is that trials may cannibalize sales in the after-trial period. Recent research in information systems has identified this trade-off but has stopped short of comprehensively analyzing it. As a result, it has drawn unexpected and unrealistic conclusions, such as that offering a free time-locked trial of the fully functional version is optimal for “any” information good that does not exhibit significant network effects. We show that, when this trade-off is considered, a time-locked trial may not be optimal even in situations in which there is no network effect and the overall impact on consumers’ valuations is positive. With a simple model, we characterize the conditions necessary for optimality and explain their implications. The main insight is that, unless learning effects are appropriately incorporated in the analysis, there is a risk of overestimating the benefits of free trials. Using extensions to the basic model, we find that this insight is quite robust and applies to a wider context.

Decision Support Systems | 2009

Price competition with service level guarantee in web services

Zhongju Zhang; Yong Tan; Debabrata Dey

Web services have become quite popular over the last few years as they allow easier development and integration of business applications. Unlike traditional software systems, web services are self-contained modular software components that are delivered over a network (such as the Internet) and executed on a remote system hosting the requested services. However, the network and processing overhead associated with web services has also presented a significant challenge to their performance. As a result, a web service provider often announces a service-level agreement when launching a service. The service-level agreement provides a guarantee to the consumers that they can get the service they pay for at an assured level of quality. In this paper, we study the competition between two such providers offering functionally the same web services. Each provider needs to decide on a service level (standard or premium) she would offer and a corresponding price for the selected service level to meet the QoS guarantee (in terms of an average response time of the service). We first analyze the case where the providers choose service levels and prices simultaneously, and then extend it to a sequential-move situation. Finally, we examine strategic choices of providers when the processing capacity is endogenized into the model.
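A guarantee stated as an average response time ties the service level to processing capacity. As a rough illustration only, the sketch below assumes an M/M/1 queue, which the abstract does not specify; under that assumption the mean response time is 1/(μ − λ) for service rate μ and arrival rate λ. The function names and the example rates are hypothetical.

```python
def mean_response_time(service_rate, arrival_rate):
    """Mean time in system for an M/M/1 queue (assumed model, see lead-in)."""
    if service_rate <= arrival_rate:
        raise ValueError("queue is unstable: service rate must exceed arrival rate")
    return 1.0 / (service_rate - arrival_rate)


def min_service_rate(arrival_rate, guaranteed_response_time):
    """Smallest service rate whose mean response time meets the guarantee."""
    if guaranteed_response_time <= 0:
        raise ValueError("guaranteed response time must be positive")
    return arrival_rate + 1.0 / guaranteed_response_time
```

For example, with 8 requests per second arriving, a premium guarantee of 0.5 seconds would need a service rate of at least `min_service_rate(8.0, 0.5)` requests per second, and a looser standard guarantee would need less capacity.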
