Marcos R. Vieira | Researchain

Publication

Featured research published by Marcos R. Vieira.

very large data bases | 2012

On the spatiotemporal burstiness of terms

Theodoros Lappas; Marcos R. Vieira; Dimitrios Gunopulos; Vassilis J. Tsotras

Thousands of documents are made available to users via the web on a daily basis. One of the most extensively studied problems in the context of such document streams is burst identification. Given a term t, a burst is generally exhibited when an unusually high frequency is observed for t. While spatial and temporal burstiness have been studied individually in the past, our work is the first to simultaneously track and measure spatiotemporal term burstiness. In addition, we use the mined burstiness information to build an efficient document-search engine: given a user's query of terms, our engine returns a ranked list of documents discussing influential events with a strong spatiotemporal impact. We demonstrate the efficiency of our methods with an extensive experimental evaluation on real and synthetic datasets.
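As a rough illustration of "an unusually high frequency is observed for t", the sketch below flags time intervals whose term count sits well above the term's own average. It is a simplified, purely temporal z-score test, not the spatiotemporal method of the paper; the function name `bursty_intervals` and the `threshold` parameter are assumptions made for this example.

```python
from statistics import mean, pstdev


def bursty_intervals(counts, threshold=2.0):
    """Return indices of intervals whose count is unusually high.

    counts: occurrences of one term per interval (e.g. per day) at one location.
    threshold: hypothetical cut-off, in standard deviations above the mean.
    """
    mu = mean(counts)
    sigma = pstdev(counts)
    if sigma == 0:
        # A perfectly flat series has no bursts.
        return []
    return [i for i, c in enumerate(counts) if (c - mu) / sigma > threshold]


daily_counts = [2, 3, 2, 3, 2, 30, 3, 2]
print(bursty_intervals(daily_counts))
```

Running the same test separately per location would give a crude spatial dimension, but tracking bursts jointly over space and time is exactly what the paper contributes beyond such a baseline.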

very large data bases | 2007

The Omni-family of all-purpose access methods: a simple and effective way to make similarity search more efficient

Caetano Traina; Roberto F. Santos Filho; Agma J. M. Traina; Marcos R. Vieira; Christos Faloutsos

Similarity search operations require executing expensive algorithms, and although broadly useful in many new applications, they rely on specific structures not yet supported by commercial DBMS. In this paper we discuss the new Omni-technique, which allows building a variety of dynamic Metric Access Methods based on a number of selected objects from the dataset, used as global reference objects. We call them the Omni-family of metric access methods. This technique enables building similarity search operations on top of existing structures, significantly improving their performance in terms of the number of disk accesses and distance calculations. Additionally, our methods scale up well, exhibiting sub-linear behavior with growing database size.
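The idea of global reference objects can be sketched with a standard pivot-based filter: distances from every object to a few reference objects are precomputed, and the triangle inequality then rules out candidates without computing their distance to the query. This is a minimal in-memory sketch under that assumption, not the Omni-family implementation; the names `build_coordinates` and `range_query` are hypothetical.

```python
import math


def build_coordinates(objects, references, dist=math.dist):
    """Precompute each object's distances to the global reference objects."""
    return [[dist(o, r) for r in references] for o in objects]


def range_query(objects, coords, references, query, radius, dist=math.dist):
    """Return (matches, number of full distance computations performed)."""
    q_coords = [dist(query, r) for r in references]
    matches, computed = [], 0
    for obj, o_coords in zip(objects, coords):
        # Triangle inequality: |d(q, r) - d(o, r)| <= d(q, o).
        # If that lower bound already exceeds the radius, skip the object.
        if any(abs(qc - oc) > radius for qc, oc in zip(q_coords, o_coords)):
            continue
        computed += 1
        if dist(query, obj) <= radius:
            matches.append(obj)
    return matches, computed


points = [(0, 0), (1, 0), (5, 5), (10, 10), (0.5, 0.5)]
refs = [(0, 0), (10, 10)]
coords = build_coordinates(points, refs)
print(range_query(points, coords, refs, query=(0, 0), radius=1.0))
```

In this toy run two of the five points are discarded using only the precomputed distances. In a disk-based setting the precomputed coordinates could be indexed by an existing structure, which is the kind of reuse the abstract refers to.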

international conference on social computing | 2010

Characterizing Dense Urban Areas from Mobile Phone-Call Data: Discovery and Social Dynamics

Marcos R. Vieira; Vanessa Frias-Martinez; Nuria Oliver; Enrique Frias-Martinez

The recent adoption of ubiquitous computing technologies (e.g. GPS, WLAN networks) has enabled capturing large amounts of spatio-temporal data about human motion. The digital footprints computed from these datasets provide complementary information for the study of social and human dynamics, with applications ranging from urban planning to transportation and epidemiology. A common problem for all these applications is the detection of dense areas, i.e. areas where individuals concentrate within a specific geographical region and time period. Nevertheless, the techniques used so far face an important limitation: they tend to identify as dense areas regions that do not respect the natural tessellation of the underlying space. In this paper, we propose a novel technique, called DADMST, to detect dense areas based on the Maximum Spanning Tree (MST) algorithm applied over the communication antennas of a cell phone infrastructure. We evaluate and validate our approach with a real dataset containing the Call Detail Records (CDR) of over one million individuals, and apply the methodology to study social dynamics in an urban environment.
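For readers unfamiliar with the building block named above, a maximum spanning tree can be computed with Kruskal's algorithm by visiting edges from heaviest to lightest. The sketch below shows only that generic step; how the paper weights the edges between antennas is not stated here, so the integer weights in the example are placeholders, and the function name is hypothetical.

```python
def maximum_spanning_tree(num_nodes, edges):
    """Kruskal's algorithm with edges taken in descending weight order.

    edges: list of (weight, u, v) tuples with nodes numbered 0..num_nodes-1.
    Returns the list of edges kept in the tree.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(edges, reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # The edge joins two components, so it cannot create a cycle.
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree


# Placeholder weights between four hypothetical antennas.
edges = [(1, 0, 1), (5, 1, 2), (3, 0, 2), (4, 2, 3), (2, 1, 3)]
print(maximum_spanning_tree(4, edges))
```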

conference on information and knowledge management | 2006

Efficient processing of complex similarity queries in RDBMS through query rewriting

Caetano Traina; Agma J. M. Traina; Marcos R. Vieira; Adriano S. Arantes; Christos Faloutsos

Multimedia and complex data are usually queried by similarity predicates. Whereas many works deal with algorithms to answer basic similarity predicates, there are no generic algorithms that efficiently handle complex similarity queries combining several basic similarity predicates. In this work we propose a simple and effective set of algorithms that can be combined to answer complex similarity queries, and a set of algebraic rules useful for rewriting similarity query expressions into a format suitable for those algorithms. These rules and algorithms allow relational database management systems to turn complex queries into efficient query execution plans. We present experiments that highlight interesting scenarios. They show that the proposed algorithms are orders of magnitude faster than the traditional similarity algorithms. Moreover, they scale linearly with the database size.

Journal of the Brazilian Computer Society | 2005

DBM-Tree: trading height-balancing for performance in metric access methods

Marcos R. Vieira; Caetano Traina; Fabio Jun Takada Chino; Agma J. M. Traina

Metric Access Methods (MAM) are employed to accelerate the processing of similarity queries, such as range and k-nearest neighbor queries. Current methods, such as the Slim-tree and the M-tree, improve query performance by minimizing the number of disk accesses while keeping the height of the structures stored on disk constant (height-balanced trees). However, the overlap between their nodes has a very strong influence on their performance. This paper presents a new dynamic MAM called the DBM-tree (Density-Based Metric tree), which can minimize the overlap between high-density nodes by relaxing the height-balancing of the structure. Thus, the height of the tree is larger in denser regions, in order to keep a tradeoff between breadth-searching and depth-searching. An underpinning for cost estimation on tree structures is their height, so we show a non-height-dependent cost model that can be applied to the DBM-tree. Moreover, an optimization algorithm called Shrink is also presented, which improves the performance of an already built DBM-tree by reorganizing the elements among its nodes. Experiments performed over both synthetic and real-world datasets showed that the DBM-tree is, on average, 50% faster than traditional MAM and reduces the number of distance calculations by up to 72% and disk accesses by up to 66%. After performing the Shrink algorithm, the performance improves up to 40% regarding the number of disk accesses for range and k-nearest neighbor queries. In addition, the DBM-tree scales up well, exhibiting linear performance with growing number of elements in the database.

symposium on large spatial databases | 2011

FlexTrack: a system for querying flexible patterns in trajectory databases

Marcos R. Vieira; Petko Bakalov; Vassilis J. Tsotras

We describe the FlexTrack system for querying trajectories using flexible pattern queries. Such queries are composed of a sequence of simple spatio-temporal predicates, e.g., range and nearest-neighbors, as well as complex motion pattern predicates, e.g., predicates that contain variables and constraints. Users can interactively select spatio-temporal predicates to construct such pattern queries using a hierarchy of regions that partition the spatial domain. Several different query processing algorithms are currently implemented and available in the FlexTrack system.

Geoinformatica | 2015

High performance FPGA and GPU complex pattern matching over spatio-temporal streams

Roger Moussalli; Ildar Absalyamov; Marcos R. Vieira; Walid A. Najjar; Vassilis J. Tsotras

The wide and increasing availability of collected data in the form of trajectories has led to research advances in behavioral aspects of the monitored subjects (e.g., wild animals, people, and vehicles). Using trajectory data harvested by devices, such as GPS, RFID and mobile devices, complex pattern queries can be posed to select trajectories based on specific events of interest. In this paper, we present a study on FPGA- and GPU-based architectures processing complex patterns on streams of spatio-temporal data. Complex patterns are described as regular expressions over a spatial alphabet that can be implicitly or explicitly anchored to the time domain. More importantly, variables can be used to substantially enhance the flexibility and expressive power of pattern queries. Here we explore the challenges in handling several constructs of the assumed pattern query language, with a study on the trade-offs between expressiveness, scalability and matching accuracy. We show an extensive performance evaluation where FPGA and GPU setups outperform the current state-of-the-art (single-threaded) CPU-based approaches, by over three orders of magnitude for FPGAs (for expressive queries) and by up to two orders of magnitude for certain datasets on GPUs (with slowdowns in some cases). Unlike software-based approaches, the performance of the proposed FPGA and GPU solutions is only minimally affected by the increased pattern complexity.
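To make "regular expressions over a spatial alphabet" concrete, the software-only sketch below maps each trajectory point to the symbol of the region containing it and matches the resulting string with Python's `re` module. The four-region grid, the symbols A to D, and the use of a named backreference to imitate a pattern variable are all assumptions for illustration; this is neither the paper's query language nor its FPGA or GPU design.

```python
import re

# Hypothetical partition of the plane into four square regions,
# each given as (x_min, y_min, x_max, y_max).
REGIONS = {
    "A": (0, 0, 10, 10),
    "B": (10, 0, 20, 10),
    "C": (0, 10, 10, 20),
    "D": (10, 10, 20, 20),
}


def region_of(x, y):
    """Return the symbol of the region containing (x, y), or '?' if none."""
    for symbol, (x0, y0, x1, y1) in REGIONS.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return symbol
    return "?"


def encode(trajectory):
    """Turn a sequence of (x, y) points into a string over the alphabet."""
    return "".join(region_of(x, y) for x, y in trajectory)


def matches(trajectory, pattern):
    """True if the whole encoded trajectory matches the pattern."""
    return re.fullmatch(pattern, encode(trajectory)) is not None


trip = [(1, 1), (5, 5), (12, 3), (15, 15)]
print(encode(trip), matches(trip, r"A+B+D+"))

# A named group stands in for a variable: start and end in the same region.
round_trip = [(1, 1), (12, 3), (2, 2)]
print(matches(round_trip, r"(?P<x>[A-D]).*(?P=x)"))
```

Here time is only implicit in the order of the points; anchoring predicates to explicit time intervals would need timestamps in the encoding.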

social network mining and analysis | 2014

Building Socially Connected Skilled Teams to Accomplish Complex Tasks

Ana Paula Appel; Victor Fernandes Cavalcante; Marcos R. Vieira; Vagner Figueredo de Santana; Rogerio Abreu De Paula; Steven K. Tsukamoto

Solving today's problems demands more than the effort of an individual mind, however brilliant. Collaboration and teamwork are fundamental skills for tackling such problems. The ability of team members to work together and communicate with one another thus becomes a foremost concern. In this context, assembling an effective team requires an approach that goes beyond the analysis of individual skills. This paper proposes and examines a problem that takes into account different skill attributes and social ties to build an interconnected team. Our proposed solution is evaluated by building one team to defeat an opposing team defined in the same social network. Our experimental results show that our algorithms produce meaningful socially collaborative skilled teams.

international conference on data mining | 2014

Bus Travel Time Predictions Using Additive Models

Matthias Kormaksson; Luciano Barbosa; Marcos R. Vieira; Bianca Zadrozny

Many factors can affect the predictability of public bus services, such as traffic, weather, day of week, and hour of day. However, the exact nature of the relationships between travel times and predictor variables is, in most situations, not known. In this paper we develop a framework that allows for flexible modeling of bus travel times through the use of Additive Models. The proposed class of models provides a principled statistical framework that is highly flexible in terms of model building. The experimental results demonstrate uniformly superior performance of our best model as compared to previous prediction methods when applied to a very large GPS data set obtained from buses operating in the city of Rio de Janeiro.
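For reference, the generic form of an additive model is shown below. This is the textbook definition only; the specific terms and predictors used in the paper are not given in this abstract, so the symbols are assumptions chosen for illustration.

```latex
% y_i: observed travel time for observation i
% x_{ij}: value of predictor j (e.g. hour of day, day of week) for observation i
% f_j: unknown smooth function estimated from the data
% \varepsilon_i: zero-mean error term
y_i = \beta_0 + \sum_{j=1}^{p} f_j(x_{ij}) + \varepsilon_i
```

Because each f_j is estimated from the data rather than fixed in advance, the model does not require the relationship between a predictor and travel time to be known beforehand.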

symposium on large spatial databases | 2013

Stream-Mode FPGA acceleration of complex pattern trajectory querying

Roger Moussalli; Marcos R. Vieira; Walid A. Najjar; Vassilis J. Tsotras

The wide and increasing availability of collected data in the form of trajectories has led to research advances in behavioral aspects of the monitored subjects (e.g., wild animals, people, vehicles). Using trajectory data harvested by devices, such as GPS, RFID and mobile devices, complex pattern queries can be posed to select trajectories based on specific events of interest. In this paper, we present a study on FPGA-based architectures processing complex patterns on streams of spatio-temporal data. Complex patterns are described as regular expressions over a spatial alphabet that can be implicitly or explicitly anchored to the time domain. More importantly, variables can be used to substantially enhance the flexibility and expressive power of pattern queries. Here we explore the challenges in handling several constructs of the assumed pattern query language, with a study on the trade-offs between expressiveness, scalability and matching accuracy. We show an extensive performance evaluation where FPGA setups outperform the current state-of-the-art CPU-based approaches by over three orders of magnitude. Unlike software-based approaches, the performance of the proposed FPGA solution is only minimally affected by the increased pattern complexity.
