Publications
Featured research published by Brian K. Caufield.
Performance Evaluation and Benchmarking | 2009
Len Wyatt; Brian K. Caufield; Daniel Pol
Conditions in the marketplace for ETL tools suggest that an industry standard benchmark is needed. The benchmark should provide useful data for comparing the performance of ETL systems, be based on a meaningful scenario, and be scalable over a wide range of data set sizes. This paper gives a general scoping of the proposed benchmark and outlines some key decision points. The Transaction Processing Performance Council (TPC) has formed a development subcommittee to define and produce such a benchmark.
Very Large Data Bases | 2014
Meikel Poess; Tilmann Rabl; Hans-Arno Jacobsen; Brian K. Caufield
Historically, the process of synchronizing a decision support system with data from operational systems has been referred to as Extract, Transform, Load (ETL), and the tools supporting this process have been referred to as ETL tools. Recently, ETL was replaced by the more comprehensive term data integration (DI). DI describes the process of extracting and combining data from a variety of data source formats, transforming that data into a unified data model representation, and loading it into a data store. This is done in the context of a variety of scenarios, such as data acquisition for business intelligence, analytics, and data warehousing, but also synchronization of data between operational applications, data migrations and conversions, master data management, enterprise data sharing, and delivery of data services in a service-oriented architecture context, amongst others. With these scenarios relying on up-to-date information, it is critical to implement a high-performing, scalable, and easy-to-maintain data integration system. This is especially important as the complexity, variety, and volume of data are constantly increasing and the performance of data integration systems is becoming critical. Despite the significance of having a high-performing DI system, there has been no industry standard for measuring and comparing the performance of such systems. The TPC, acknowledging this void, has released TPC-DI, an innovative benchmark for data integration. This paper motivates the reasons behind its development, describes its main characteristics, including its workload, run rules, and metric, and explains key decisions.
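The extract, transform, and load steps described in this abstract can be illustrated with a minimal sketch. The CSV feed, field names, and SQLite target below are illustrative assumptions for a toy pipeline, not part of TPC-DI or of any particular DI tool.

    # Minimal sketch of an extract-transform-load flow.
    # The CSV layout, field names, and SQLite target are hypothetical.
    import csv
    import io
    import sqlite3

    # Extract: read records from a hypothetical CSV trade feed.
    source = io.StringIO("cust_id,amount,currency\n101,250.00,USD\n102,99.50,EUR\n")
    rows = list(csv.DictReader(source))

    # Transform: map each source record into a unified target model
    # (typed fields, amounts normalized to cents).
    unified = [
        (int(r["cust_id"]), int(float(r["amount"]) * 100), r["currency"])
        for r in rows
    ]

    # Load: write the transformed records into a data store (here, in-memory SQLite).
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE fact_trade (cust_id INTEGER, amount_cents INTEGER, currency TEXT)")
    con.executemany("INSERT INTO fact_trade VALUES (?, ?, ?)", unified)
    con.commit()
    print(con.execute("SELECT COUNT(*) FROM fact_trade").fetchone()[0])  # -> 2

A real DI workload, as measured by TPC-DI, adds many source formats, incremental loads, and scale factors on top of this basic extract-transform-load pattern.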
Technology Conference on Performance Evaluation and Benchmarking | 2012
Len Wyatt; Brian K. Caufield; Marco Vieira; Meikel Poess
The proposed TPC-DI benchmark measures the performance of Data Integration systems (a.k.a. ETL systems) given the task of integrating data from an OLTP system and other data sources to create a data warehouse. This paper describes the scenario, structure, and timing principles used in TPC-DI. Although failure recovery is very important in real deployments of Data Integration systems, certain complexities made it difficult to specify in the benchmark; hence, failure recovery aspects have been scoped out of the current version of TPC-DI. The issues around failure recovery are discussed in detail and some options are described. Finally, the audience is invited to offer additional suggestions.
Archive | 2012
Brian K. Caufield; Fan Ding; Mi Wan Shum; Dong Jie Wei; Samuel H. K. Wong
Archive | 2013
Brian K. Caufield; Ajay Sood; Julian J. Vizor
Archive | 2012
Brian K. Caufield; Yong Li; Xiaoyan Pu
Archive | 2015
Brian K. Caufield; Lawrence A. Greene; Eric A. Jacobson; Yong Li; Shyam R. Mudambi; Xiaoyan Pu; Dong J. Wei
Archive | 2008
Aarti D. Borkar; Arron J. Harden; Brian K. Caufield
Archive | 2014
Brian K. Caufield; Ron E. Liu; DongJie Wei; Xin Ying Yang
Archive | 2009
Brian K. Caufield; Hung B. Nguyen