Pedro Furtado
University of Coimbra
Publication
Featured research published by Pedro Furtado.
data warehousing and olap | 2004
Pedro Furtado
Parallelism can be used for major performance improvement in large data warehouses (DW) with performance and scalability challenges. A simple low-cost shared-nothing architecture with horizontally fully-partitioned facts can be used to speed up the response time of the data warehouse significantly. However, extra overheads related to processing large replicated relations and repartitioning requirements between nodes can significantly degrade speedup performance for many query patterns if special care is not taken during placement to minimize such overheads. In this paper we show these problems experimentally with the help of the performance evaluation benchmark TPC-H and identify simple modifications that can minimize such undesirable extra overheads. We analyze experimentally a simple and easy-to-apply partitioning and placement decision that achieves good performance improvement results.
ACM Transactions on Sensor Networks | 2013
Tony O'Donovan; James Brown; Felix Büsching; Alberto Cardoso; José Cecílio; Jose Manuel do Ó; Pedro Furtado; Paulo Gil; Anja Jugel; Wolf-Bastian Pöttner; Utz Roedig; Jorge Sá Silva; Ricardo Silva; Cormac J. Sreenan; Vasos Vassiliou; Thiemo Voigt; Lars C. Wolf; Zinon Zinonos
Today's industrial facilities, such as oil refineries, chemical plants, and factories, rely on wired sensor systems to monitor and control the production processes. The deployment and maintenance of such cabled systems is expensive and inflexible. It is, therefore, desirable to replace or augment these systems using wireless technology, which requires us to overcome significant technical challenges. Process automation and control applications are mission-critical and require timely and reliable data delivery, which is difficult to provide in industrial environments with harsh radio environments. In this article, we present the GINSENG system which implements performance control to allow us to use wireless sensor networks for mission-critical applications in industrial environments. GINSENG is a complete system solution that comprises on-node system software, network protocols, and back-end systems with sophisticated data processing capability. GINSENG assumes that a deployment can be carefully planned. A TDMA-based MAC protocol, tailored to the deployment environment, is employed to provide reliable and timely data delivery. Performance debugging components are used to unintrusively monitor the system performance and identify problems as they occur. The article reports on a real-world deployment of GINSENG in an especially challenging environment of an operational oil refinery in Sines, Portugal. We provide experimental results from this deployment and share the experiences gained. These results demonstrate the use of GINSENG for sensing and actuation and allow an assessment of its ability to operate within the required performance bounds. We also identify shortcomings that manifested during the evaluation phase, thus giving a useful perspective on the challenges that have to be overcome in these harsh application settings.
data warehousing and knowledge discovery | 2004
Pedro Furtado
Data warehouses (DW) with enormous quantities of data pose major performance and scalability challenges. The Node-Partitioned Data Warehouse (NPDW) divides the DW into cheap computer nodes for scalability. Partitioning and data placement strategies are relevant to the performance of complex queries on the NPDW. In this paper we propose a partitioning, placement and join processing strategy to boost the performance of costly joins in NPDW, compare alternative strategies using the performance evaluation benchmark TPC-H and draw conclusions.
International Journal of Data Warehousing and Mining | 2009
Pedro Furtado
Data Warehouses are a crucial technology for current competitive organizations in the globalized world. Size, speed and distributed operation are major challenges concerning those systems. Many data warehouses have huge sizes and the requirement that queries be processed quickly and efficiently, so parallel solutions are deployed to render the necessary efficiency. Distributed operation, on the other hand, concerns global commercial and scientific organizations that need to share their data in a coherent distributed data warehouse. In this article we review the major concepts, systems and research results behind parallel and distributed data warehouses.
International Journal of Data Warehousing and Mining | 2013
Florian Waas; Robert Wrembel; Tobias Freudenreich; Maik Thiele; Christian Koncilia; Pedro Furtado
In a typical BI infrastructure, data, extracted from operational data sources, is transformed, cleansed, and loaded into a data warehouse by a periodic ETL process, typically executed on a nightly basis, i.e., a full day's worth of data is processed and loaded during off-hours. However, it is desirable to have fresher data for business insights at near real-time. To this end, the authors propose to leverage a data warehouse's capability to directly import raw, unprocessed records and defer the transformation and data cleaning until needed by pending reports. At that time, the database's own processing mechanisms can be deployed to process the data on-demand. Event-processing capabilities are seamlessly woven into our proposed architecture. Besides outlining an overall architecture, the authors also developed a roadmap for implementing a complete prototype using conventional database technology in the form of hierarchical materialized views.
international database engineering and applications symposium | 2007
R.L. de Carvalho Costa; Pedro Furtado
Database servers typically offer a best-effort model of service to submitted commands, that is, they try to process every command as fast as possible. Hence, they are not prepared to provide differentiation for quality of service. In this paper we consider the distributed grid-DWPA architecture context, which fragments and replicates data into several sites to provide an efficient grid data warehouse solution. Instead of offering best-effort service to every query, we propose the use of a performance prediction model that is used in conjunction with QoS-oriented scheduling to enable the establishment of service level agreements (SLA).
IEEE Transactions on Industrial Informatics | 2014
José Cecílio; Pedro Furtado
Deployment of embedded systems in industrial environments requires preconfiguration for operation, and, in some contexts, easy reconfiguration capabilities are also desirable. It is therefore useful to define a mechanism for embedded devices that will operate in sensor and actuator networks to be remotely (re)configured and to have flexible computation capabilities. We propose such a configuration, reconfiguration, and processing mechanism in the form of a software architecture. A node component should be deployed in any embedded device and implements an application programming interface (API), configuration, processing, and communication. The resulting system provides remote configuration and processing of data in any node in a most flexible way, since every node has the same uniform API, processing, and access functionalities. The experimental section shows a working deployment of this concept in an industrial refinery setting, as part of the EU FP7 project Ginseng.
international parallel and distributed processing symposium | 2005
Pedro Furtado
Commercial database systems must typically rely on fast hardware platforms and interconnects to deal efficiently with data in parallel. However, cheap computing power can be applied for flexibility and scalability in managing large data volumes if the right choices are made concerning data placement and processing. Our work concentrates on the use of cheap computing power in possibly slow, non-dedicated local networks to achieve a computing power over demanding query-intensive databases that would be unachievable without expensive specialized hardware and massively parallel systems. The Node Partitioned Data Management System (NPDM) works on computing nodes on non-dedicated local networks. In this paper we concentrate on query transformations required for efficient processing over a specialized query-intensive schema. The decision support benchmark TPC-H is used as a study case for the transformations and for experimental analysis.
distributed computing in sensor systems | 2011
W-B. Pöttner; Lars C. Wolf; José Cecílio; Pedro Furtado; R. Silva; J. Sa Silva; Anderson dos Santos; Paulo Gil; Alberto Cardoso; Zinon Zinonos; Ben McCarthy; James Brown; Utz Roedig; Tony O'Donovan; Cormac J. Sreenan; Zhitao He; Thiemo Voigt; A. Jugel
The GINSENG project develops performance-controlled wireless sensor networks that can be used for time-critical applications in hostile environments such as industrial plant automation and control. GINSENG aims at integrating wireless sensor networks with existing enterprise resource management solutions using a middleware. A cornerstone is the evaluation in a challenging industrial environment — an oil refinery in Portugal. In this paper we first present our testbed. Then we introduce our solution to access, debug and flash the sensor nodes remotely from an operations room in the plant or from any location with internet access. We further present our experimental methodology and show some exemplary results from the refinery testbed.
data warehousing and knowledge discovery | 2011
João Pedro Costa; José Cecílio; Pedro Martins; Pedro Furtado
The star schema model has been widely used as the de facto DW storage organization on relational database management systems (RDBMS). The physical division in normalized fact tables (with metrics) and denormalized dimension tables allows a trade-off between performance and storage space while at the same time offering a simple business understanding of the overall model as a set of metrics (facts) and attributes for business analysis (dimensions). However, the underlying premises of such trade-off between performance and storage have changed. Nowadays, storage capacity increased significantly at affordable prices (below 50