Grace Au
Teradata
Publication
Featured research published by Grace Au.
international conference on management of data | 1996
William T. O'Connell; I. T. Ieong; David Schrader; C. Watson; Grace Au; Alexandros Biliris; S. Choo; P. Colin; G. Linderman; Euthimios Panagos; J. Wang; T. Walter
The Teradata Multimedia Object Manager is a general-purpose content analysis multimedia server designed for symmetric multiprocessing and massively parallel processing environments. The Multimedia Object Manager defines and manipulates user-defined functions (UDFs), which are invoked in parallel to analyze or manipulate the contents of multimedia objects. Several computationally intensive applications of this technology, which use large persistent datasets, include fingerprint matching, signature verification, face recognition, and speech recognition/translation.
international conference on data engineering | 2017
Mohammed Al-Kateb; Paul Sinclair; Alain Crolotte; Lu Ma; Grace Au; Sanjay Nair
The UNION ALL set operator is useful for combining data from multiple sources. With the emergence of big data ecosystems in which data is typically stored on multiple systems, UNION ALL has become even more important. In this paper, we present optimization techniques implemented in Teradata Database for join queries with UNION ALL. Instead of spooling all branches of UNION ALL before performing join operations, we propose cost-based pushing of joins into branches. Join pushing not only addresses the prohibitive cost of spooling all branches, but also helps in exposing more efficient join methods (e.g., direct hash-based joins) which, otherwise, would not be considered by the query optimizer. The geography of relations being pushed to UNION ALL branches is also adjusted to avoid unnecessary redistributions and duplications of data. We conclude the paper with a performance study that demonstrates the impact of the proposed optimization techniques on query performance.
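The rewrite described in this abstract rests on the fact that an inner join distributes over UNION ALL. The following is a minimal sketch of that equivalence only, not Teradata's implementation: the helper names (`hash_join`, `join_after_union`, `join_pushed_into_branches`) and the sample tables are hypothetical, and the cost-based decision and geography adjustment mentioned in the abstract are not modeled.

```python
from collections import Counter


def hash_join(left, right, key):
    """Inner equi-join of two lists of dicts on `key`, using a hash table on `right`."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def join_after_union(branches, other, key):
    """Baseline: copy every branch into one common spool, then join once."""
    spool = [row for branch in branches for row in branch]
    return hash_join(spool, other, key)


def join_pushed_into_branches(branches, other, key):
    """Rewrite: join each branch on its own, then UNION ALL the join results."""
    result = []
    for branch in branches:
        result.extend(hash_join(branch, other, key))
    return result


def as_bag(rows):
    """Order-insensitive multiset view of a result, so duplicates are preserved."""
    return Counter(tuple(sorted(row.items())) for row in rows)


# Hypothetical sample data: two UNION ALL branches and one joined relation.
sales_a = [{"cust": 1, "amt": 10}, {"cust": 2, "amt": 20}]
sales_b = [{"cust": 1, "amt": 10}, {"cust": 3, "amt": 30}]
customers = [{"cust": 1, "name": "a"}, {"cust": 2, "name": "b"}]

baseline = join_after_union([sales_a, sales_b], customers, "cust")
pushed = join_pushed_into_branches([sales_a, sales_b], customers, "cust")
assert as_bag(baseline) == as_bag(pushed)
```

Both plans return the same bag of rows, duplicates included. The pushed form never builds the combined spool, which is the cost the abstract calls prohibitive.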
very large data bases | 2016
Mohammed Al-Kateb; Paul Sinclair; Grace Au; Carrie Ballinger
Data partitioning is an indispensable ingredient of database systems due to the performance improvement it can bring to any given mixed workload. Data can be partitioned horizontally or vertically. While some commercial proprietary and open source database systems have one flavor or mixed flavors of these partitioning forms, Teradata Database offers a unique hybrid row-column store solution that seamlessly combines both of these partitioning schemes. The key feature of this hybrid solution is that row, column, and combined partitions are all stored and handled in the same way internally by the underlying file system storage layer. In this paper, we present the main characteristics and explain the implementation approach of Teradata's row-column store. We also discuss query optimization techniques applicable specifically to partitioned tables. Furthermore, we present a performance study that demonstrates how different partitioning options impact the performance of various queries.
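As a rough illustration of the two partitioning forms named in this abstract, and of combining them, the sketch below works on an in-memory list of rows. It is a toy model under stated assumptions: the function names and sample table are hypothetical, and nothing here reflects how Teradata's file system layer actually stores partitions.

```python
# Hypothetical sample table.
rows = [
    {"id": 1, "region": "east", "amt": 10},
    {"id": 2, "region": "west", "amt": 20},
    {"id": 3, "region": "east", "amt": 30},
]


def partition_rows(table, column):
    """Horizontal partitioning: group whole rows by the value of one column."""
    parts = {}
    for row in table:
        parts.setdefault(row[column], []).append(row)
    return parts


def partition_columns(table, groups):
    """Vertical partitioning: one list per column group, aligned by row position."""
    return [[{c: row[c] for c in group} for row in table] for group in groups]


def partition_both(table, column, groups):
    """Combined: partition by rows first, then split each row partition by column group."""
    return {
        value: partition_columns(part, groups)
        for value, part in partition_rows(table, column).items()
    }


by_region = partition_rows(rows, "region")
by_column = partition_columns(rows, [["id"], ["region", "amt"]])
hybrid = partition_both(rows, "region", [["id"], ["region", "amt"]])
```

A query that filters on `region` only needs one entry of `by_region`, while a query that reads only `id` only needs the first list in `by_column`. The combined form allows both kinds of pruning at once.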
international conference on management of data | 2018
Mohammed Al-Kateb; Paul Sinclair; Grace Au; Sanjay Nair; Mark Sirek; Lu Ma; Mohamed Y. Eltabakh
The UNION ALL set operator is useful for combining data from multiple sources. With the emergence and prevalence of big data ecosystems in which data is typically stored on multiple systems, UNION ALL has become even more important in many analytical queries. In this project, we demonstrate novel cost-based optimization techniques implemented in Teradata Database for join queries involving UNION ALL views and derived tables. Instead of the naive and traditional way of spooling each UNION ALL branch to a common spool prior to performing join operations, which can be prohibitively expensive, we demonstrate new techniques developed in Teradata Database including: 1) Cost-based pushing of joins into UNION ALL branches, 2) Branch grouping strategy prior to join pushing, 3) Geography adjustment of the pushed relations to avoid unnecessary redistribution or duplication, 4) Iterative join decomposition of a pushed join to multiple joins, and 5) Combining multiple join steps into a single multisource join step. In the demonstration, we use the Teradata Visual Explain tool, which offers a rich set of visual rendering capabilities of query plans, the display of various metadata information for each plan step, and several interactive GUI options for end-users.
international conference on computer design | 1996
William O'Connell; Grace Au; David Schrader
This paper introduces a novel approach to optimizing and monitoring database queries which involve operations on multiple data types in a parallel multimedia engine. Our approach uses dataflow graphs to represent the multimedia operations in a query. We have extended the Actors parallel programming model by designing an agent model for query execution that incorporates extensions for efficient data streaming, agent migration, and agent cloning. We incorporate algorithms for dynamic run-time workload control of both the number of queries in the system as well as the number of instances of agents executing multimedia operators. We describe an approach for capturing cost statistics for user-defined functions that is used by the system to estimate and schedule the execution of nested user-defined functions.
Archive | 2008
Carlos Bouloy; Grace Au; Hong Gui
Archive | 2012
Lu Ma; Grace Au
Archive | 2006
Grace Au; Bhashyam Ramesh; Haiyan Chen
Archive | 2009
Grace Au; Rama Krishna Korlapati; Haiyan Chen
Archive | 2004
Hong Gui; Grace Au; Curt J. Ellmann