Daniel Barbará | Researchain

Publication

Featured researches published by Daniel Barbará.

Journal of the ACM | 1985

How to assign votes in a distributed system

Hector Garcia-Molina; Daniel Barbará

In a distributed system, one strategy for achieving mutual exclusion of groups of nodes without communication is to assign to each node a number of votes. Only a group with a majority of votes can execute the critical operations, and mutual exclusion is achieved because at any given time there is at most one such group. A second strategy, which appears to be similar to votes, is to define a priori a set of groups that intersect each other. Any group of nodes that finds itself in this set can perform the restricted operations. In this paper, both of these strategies are studied in detail and it is shown that they are not equivalent in general (although they are in some cases). In doing so, a number of other interesting properties are proved. These properties will be of use to a system designer who is selecting a vote assignment or a set of groups for a specific application.
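The vote-based strategy in this abstract can be illustrated with a minimal sketch. The node names, the vote assignment, and the helper functions below are hypothetical and are not taken from the paper; the sketch only checks that any two groups holding a strict majority of votes must share a node, which is the property that yields mutual exclusion.

```python
from itertools import combinations


def has_majority(group, votes):
    """Return True if the nodes in `group` hold a strict majority of all votes."""
    total = sum(votes.values())
    return 2 * sum(votes[node] for node in group) > total


def majority_groups(votes):
    """Enumerate every group of nodes that holds a strict majority of votes."""
    nodes = sorted(votes)
    return [
        frozenset(candidate)
        for size in range(1, len(nodes) + 1)
        for candidate in combinations(nodes, size)
        if has_majority(candidate, votes)
    ]


def pairwise_intersecting(groups):
    """Check that every pair of groups shares at least one node."""
    return all(a & b for a, b in combinations(groups, 2))


# Hypothetical assignment: node "a" gets two votes, the others one each.
votes = {"a": 2, "b": 1, "c": 1, "d": 1}
groups = majority_groups(votes)
print(pairwise_intersecting(groups))
```

The second strategy in the abstract corresponds to supplying a hand-picked list of groups to `pairwise_intersecting` instead of deriving the groups from votes.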

IEEE Transactions on Knowledge and Data Engineering | 1992

The management of probabilistic data

Daniel Barbará; Hector Garcia-Molina; Daryl Porter

It is often desirable to represent in a database entities whose properties cannot be deterministically classified. The authors develop a data model that includes probabilities associated with the values of the attributes. The notion of missing probabilities is introduced for partially specified probability distributions. This model offers a richer descriptive language, allowing the database to more accurately reflect the uncertain real world. Probabilistic analogs to the basic relational operators are defined and their correctness is studied. A set of operators that have no counterpart in conventional relational systems is presented.
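One way to picture a partially specified distribution is sketched below. The dictionary representation and the function names are assumptions made for illustration, not the paper's model or operators: an attribute maps known values to probabilities, and whatever mass is left over is treated as missing, so the probability of any single value can only be bounded.

```python
def missing_mass(dist):
    """Probability mass not assigned to any specific value."""
    total = sum(dist.values())
    if total > 1.0 + 1e-9:
        raise ValueError("probabilities sum to more than 1")
    return max(0.0, 1.0 - total)


def probability_bounds(dist, value):
    """Lower and upper bounds on P(attribute == value).

    The lower bound is the explicitly stated probability; the upper bound
    additionally assumes all missing mass belongs to `value`.
    """
    low = dist.get(value, 0.0)
    return low, min(1.0, low + missing_mass(dist))


# Hypothetical attribute: colour is red with 0.5, blue with 0.3, 0.2 unknown.
colour = {"red": 0.5, "blue": 0.3}
print(probability_bounds(colour, "red"))
print(probability_bounds(colour, "green"))
```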

International Conference on Management of Data | 1994

Sleepers and workaholics: caching strategies in mobile environments

Daniel Barbará; Tomasz Imielinski

In the mobile wireless computing environment of the future, a large number of users equipped with low-powered palmtop machines will query databases over wireless communication channels. Palmtop-based units will often be disconnected for prolonged periods of time due to battery power-saving measures; palmtops will also frequently relocate between different cells and connect to different data servers at different times. Caching of frequently accessed data items will be an important technique that will reduce contention on the narrow-bandwidth wireless channel. However, cache invalidation strategies will be severely affected by the disconnection and mobility of the clients. The server may no longer know which clients are currently residing under its cell and which of them are currently on. We propose a taxonomy of different cache invalidation strategies and study the impact of clients' disconnection times on their performance. We determine that for the units which are often disconnected (sleepers), the best cache invalidation strategy is based on signatures previously used for efficient file comparison. On the other hand, for units which are connected most of the time (workaholics), the best cache invalidation strategy is based on the periodic broadcast of changed data items.
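The periodic-broadcast idea for mostly connected clients can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's protocol: the `Client` class, the `window` parameter, and the report format (item id mapped to last update time) are all hypothetical. A client that hears reports regularly drops only the items listed as changed; a client that has been silent for longer than the report window cannot trust its cache and discards it.

```python
class Client:
    """Hypothetical mobile client that keeps a cache valid using broadcast reports."""

    def __init__(self, window):
        self.window = window      # seconds of update history each report covers
        self.cache = {}           # item id -> (value, time the value was cached)
        self.last_report = 0.0    # time of the last report this client heard

    def on_report(self, report_time, changed):
        """Apply a report; `changed` maps item id -> time of its last update."""
        if report_time - self.last_report > self.window:
            # Disconnected too long: some updates may have been missed entirely.
            self.cache.clear()
        else:
            for item, updated in changed.items():
                if item in self.cache and self.cache[item][1] < updated:
                    del self.cache[item]
        self.last_report = report_time


client = Client(window=10)
client.cache = {"x": ("old", 1.0), "y": ("ok", 1.0)}
client.on_report(5.0, {"x": 3.0})    # connected: only "x" is invalidated
client.on_report(30.0, {})           # slept past the window: cache dropped
```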

IEEE Transactions on Knowledge and Data Engineering | 1999

Mobile computing and databases-a survey

Daniel Barbará

The emergence of powerful portable computers, along with advances in wireless communication technologies, has made mobile computing a reality. Among the applications that are finding their way to the market of mobile computing, those that involve data management hold a prominent position. In the past few years, there has been a tremendous surge of research in the area of data management in mobile computing. This research has produced interesting results in areas such as data dissemination over limited bandwidth channels, location-dependent querying of data, and advanced interfaces for mobile computers. This paper is an effort to survey these techniques and to classify this research in a few broad areas.

Conference on Information and Knowledge Management | 2002

COOLCAT: an entropy-based algorithm for categorical clustering

Daniel Barbará; Yi Li; Julia Couto

In this paper we explore the connection between clustering categorical data and entropy: clusters of similar points have lower entropy than those of dissimilar ones. We use this connection to design an incremental heuristic algorithm, COOLCAT, which is capable of efficiently clustering large data sets of records with categorical attributes, and data streams. In contrast with other categorical clustering algorithms published in the past, COOLCAT's clustering results are very stable for different sample sizes and parameter settings. Also, the criterion for clustering is a very intuitive one, since it is deeply rooted in the well-known notion of entropy. Most importantly, COOLCAT is well equipped to deal with clustering of data streams (continuously arriving streams of data points), since it is an incremental algorithm capable of clustering new points without having to look at every point that has been clustered so far. We demonstrate the efficiency and scalability of COOLCAT by a series of experiments on real and synthetic data sets.
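The link between entropy and incremental clustering can be sketched as below. This is an illustrative reading of the abstract, not the published COOLCAT implementation: it assumes clusters have already been seeded, treats attributes as independent when computing a cluster's entropy, and recomputes entropies from scratch rather than maintaining counts. The function names are hypothetical.

```python
from collections import Counter
from math import log2


def cluster_entropy(points):
    """Sum of per-attribute entropies, treating attributes as independent."""
    if not points:
        return 0.0
    n = len(points)
    total = 0.0
    for column in zip(*points):
        total -= sum((c / n) * log2(c / n) for c in Counter(column).values())
    return total


def expected_entropy(clusters):
    """Size-weighted average of the cluster entropies."""
    size = sum(len(c) for c in clusters)
    return sum(len(c) / size * cluster_entropy(c) for c in clusters)


def assign(clusters, point):
    """Add `point` to the cluster that gives the lowest expected entropy."""
    best, best_score = None, float("inf")
    for i in range(len(clusters)):
        trial = [c + [point] if j == i else c for j, c in enumerate(clusters)]
        score = expected_entropy(trial)
        if score < best_score:
            best, best_score = i, score
    clusters[best].append(point)
    return best


# Two seeded clusters of records with categorical attributes (colour, size).
clusters = [[("red", "small")], [("blue", "large")]]
print(assign(clusters, ("red", "small")))
```

Each new point is compared only against the current clusters, which is what makes this style of assignment suitable for a stream; a practical version would keep per-attribute value counts per cluster instead of re-scanning the points.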

International Conference on Data Mining | 2008

On-line LDA: Adaptive Topic Models for Mining Text Streams with Applications to Topic Detection and Tracking

Loulwah AlSumait; Daniel Barbará; Carlotta Domeniconi

This paper presents the online topic model (OLDA), a topic model that automatically captures the thematic patterns and identifies emerging topics of text streams and their changes over time. Our approach allows the topic modeling framework, specifically the latent Dirichlet allocation (LDA) model, to work in an online fashion such that it incrementally builds an up-to-date model (mixture of topics per document and mixture of words per topic) when a new document (or a set of documents) appears. A solution based on the empirical Bayes method is proposed. The idea is to incrementally update the current model according to the information inferred from the new stream of data with no need to access previous data. The dynamics of the proposed approach also provide an efficient means to track the topics over time and detect the emerging topics in real time. Our method is evaluated both qualitatively and quantitatively using benchmark datasets. In our experiments, OLDA discovered interesting patterns by analyzing just a fraction of the data at a time. Our tests also prove the ability of OLDA to align the topics across the epochs with which the evolution of the topics over time is captured. OLDA is also comparable to, and sometimes better than, the original LDA in predicting the likelihood of unseen documents.

International Conference on Management of Data | 2001

ADAM: a testbed for exploring the use of data mining in intrusion detection

Daniel Barbará; Julia Couto; Sushil Jajodia; Ningning Wu

Intrusion detection systems have traditionally been based on the characterization of an attack and the tracking of the activity on the system to see if it matches that characterization. Recently, new intrusion detection systems based on data mining are making their appearance in the field. This paper describes the design and experiences with the ADAM (Audit Data Analysis and Mining) system, which we use as a testbed to study how useful data mining techniques can be in intrusion detection.

Archive | 2002

Applications of Data Mining in Computer Security

Daniel Barbará; Sushil Jajodia

List of Figures. List of Tables. Preface.
1. Modern Intrusion Detection, Data Mining, and Degrees of Attack Guilt S. Noel, et al.
2. Data Mining for Intrusion Detection K. Julisch.
3. An Architecture for Anomaly Detection D. Barbará, et al.
4. A Geometric Framework for Unsupervised Anomaly Detection E. Eskin, et al.
5. Fusing a Heterogeneous Alert Stream into Scenarios O. Dain, K. Cunningham.
6. Using MIB II Variables for Network Intrusion Detection Xinzhou Qin, et al.
7. Adaptive Model Generation A. Honig, et al.
8. Proactive Intrusion Detection J.B.D. Cabrera, et al.
9. References. Index.

Electronic Commerce | 2001

Preserving QoS of e-commerce sites through self-tuning: a performance model approach

Daniel A. Menascé; Daniel Barbará; Ronald Dodge

The Quality of Service (QoS) of e-commerce sites plays a crucial role in attracting and retaining customers. The workload experienced by these sites tends to vary in a very dynamic way. The complexity of the sites, combined with the large short-term variations of the workload, calls for automated methods for site configuration. This paper describes a method for dynamically monitoring and tuning e-commerce sites so that desired QoS levels are attained. Our approach uses hill-climbing techniques combined with analytic queuing models to guide the search for the best combination of configuration parameters. We validate our approach in an experimental setting by comparing the QoS levels of a TPC-W e-commerce site with and without control. We show that under increasing loads, the controlled system meets its QoS goals, while the uncontrolled site fails to do so.
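The combination of hill climbing with an analytic queuing model can be illustrated with a toy sketch. Everything in it is an assumption made for the example rather than the paper's model: the site is reduced to tiers that each behave like an M/M/1 queue, a tier's service rate is taken to scale linearly with the capacity units assigned to it, and the only configuration parameter is how a fixed number of units is split across tiers.

```python
def response_time(config, arrival_rate, rates):
    """Predicted response time: sum of M/M/1 residence times, one per tier."""
    total = 0.0
    for units, rate in zip(config, rates):
        capacity = units * rate
        if capacity <= arrival_rate:
            return float("inf")          # tier is saturated
        total += 1.0 / (capacity - arrival_rate)
    return total


def neighbors(config):
    """Configurations reachable by moving one unit from one tier to another."""
    for i in range(len(config)):
        for j in range(len(config)):
            if i != j and config[i] > 1:
                moved = list(config)
                moved[i] -= 1
                moved[j] += 1
                yield tuple(moved)


def hill_climb(start, arrival_rate, rates):
    """Move to the best neighbor until no neighbor improves the prediction."""
    current = tuple(start)
    score = response_time(current, arrival_rate, rates)
    while True:
        best = min(
            neighbors(current),
            key=lambda c: response_time(c, arrival_rate, rates),
            default=current,
        )
        best_score = response_time(best, arrival_rate, rates)
        if best_score >= score:
            return current, score
        current, score = best, best_score


# Hypothetical two-tier site: 8 units in total, 8 requests/s arriving.
config, predicted = hill_climb((7, 1), arrival_rate=8.0, rates=(10.0, 5.0))
print(config, predicted)
```

In a self-tuning setting, a search of this kind would be rerun whenever the monitored arrival rate changes, with the model standing in for costly trial-and-error on the live site.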

SIGKDD Explorations | 2002

Requirements for clustering data streams

Daniel Barbará

Scientific and industrial examples of data streams abound in astronomy, telecommunication operations, banking and stock-market applications, e-commerce and other fields. A challenge imposed by continuously arriving data streams is to analyze them and to modify the models that explain them as new data arrives. In this paper, we analyze the requirements needed for clustering data streams. We review some of the latest algorithms in the literature and assess if they meet these requirements.
