Publication


Featured research published by Pete Smith.


Oracle 10g Data Warehousing | 2005

5 – Loading Data into the Warehouse

Lilian Hobbs; Susan Hillson; Shilpa Lawande; Pete Smith

This chapter presents the extraction, transformation, and load (ETL) process. Data are extracted from the source operational systems and transported to the staging area, a temporary holding place used to prepare the data. There the data are integrated with other data, cleansed, and transformed into a common representation. They are then loaded into the target data warehouse tables. This chapter discusses various techniques to identify rows that changed on the source system as part of the extraction process, and introduces both synchronous and asynchronous forms of change data capture. It also discusses a number of types of transformations that are common to a data warehouse, with examples of performing transformations during the load process, using staging tables inside the Oracle database, and using the new regular expression functionality. This chapter further illustrates the techniques used to load the warehouse, including SQL*Loader, Data Pump, external tables, and transportable tablespaces.
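One generic way to identify changed rows, when the source keeps no change log, is to compare two successive extracts by primary key. The sketch below illustrates only that idea; it is not Oracle's change data capture mechanism, and the function name `diff_snapshots` and the sample rows are hypothetical.

```python
from typing import Dict, Hashable, Tuple

# A snapshot maps a primary key to the remaining column values of the row.
Snapshot = Dict[Hashable, Tuple]


def diff_snapshots(previous: Snapshot, current: Snapshot) -> Dict[str, Snapshot]:
    """Classify rows as inserted, updated, or deleted between two extracts."""
    inserted = {k: row for k, row in current.items() if k not in previous}
    deleted = {k: row for k, row in previous.items() if k not in current}
    updated = {
        k: row
        for k, row in current.items()
        if k in previous and previous[k] != row
    }
    return {"inserted": inserted, "updated": updated, "deleted": deleted}


# Hypothetical extracts taken on two consecutive days.
yesterday = {1: ("Alice", "Boston"), 2: ("Bob", "Denver"), 3: ("Carol", "Austin")}
today = {1: ("Alice", "Boston"), 2: ("Bob", "Seattle"), 4: ("Dave", "Miami")}

changes = diff_snapshots(yesterday, today)
for kind, rows in changes.items():
    print(kind, rows)
```

Only the rows reported here would need to be carried into the staging area; unchanged rows such as key 1 are skipped.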


Oracle 10g Data Warehousing | 2005

14 – Data Warehousing and the Web

Lilian Hobbs; Susan Hillson; Shilpa Lawande; Pete Smith

This chapter discusses the integration of data warehousing with the web. It covers both the Internet and intranets: the Internet is the global connection of public computers, whereas an intranet is accessible only to people within an organization. Oracle is a good example of a company using its intranet to communicate and conduct internal business. Oracle offers a complete solution that facilitates easy publishing of data from the data warehouse onto an intranet or the Internet, providing a complete customized experience. Everything needed to build this environment can be found within Oracle Application Server (AS) 10g. This chapter explains how the web can change the information used and accessed by the data warehouse. Oracle Application Server 10g provides the entire technology stack needed to build and implement e-business portals, web services, and transaction-based applications. This chapter also explains how the web is used to publish information from Oracle Discoverer and Oracle Reports using OracleAS Portal.


Oracle 10g Data Warehousing | 2005

3 – Architecture of a Data Warehouse

Lilian Hobbs; Susan Hillson; Shilpa Lawande; Pete Smith

This chapter introduces the technical architecture of a data warehouse and discusses the significant changes that Oracle has implemented with 10g and how they can be beneficially deployed in the data warehouse. An important property of a data warehouse architecture is its ability to scale. A data warehouse grows as users and reporting requirements increase and as more data are loaded to address new business areas. The architecture must be able to handle this growth and process the new data without degrading query response times for the growing user community. There are various approaches that can be used to scale a system. Many servers scale simply by allowing more processors and memory to be added. An alternative method of scaling is clustering, where multiple, possibly smaller, servers operate together in a coordinated fashion to service the increased demands. Oracle provides the Real Application Clusters technology for clustering the database.


Oracle 10g Data Warehousing | 2005

9 – Query Rewrite

Lilian Hobbs; Susan Hillson; Shilpa Lawande; Pete Smith

This chapter discusses various techniques used by Oracle to rewrite queries using materialized views. With query rewrite, queries are transparently rewritten to use the materialized views. This chapter explains how a single materialized view can be used to rewrite a large class of queries, thereby reducing the space and maintenance resources required for materialized views. It also explains how to troubleshoot problems with query rewrite and how to use query rewrite to improve the performance of refreshing the materialized views. Finally, it explains how to determine which materialized views to create. Determining the optimal set of materialized views for a large number of queries can be tricky, and, if not done correctly, the disk space requirements and refresh overhead can quickly become prohibitive. Oracle Database 10g provides a tool called the SQL Access Advisor, which is designed to choose the best set of materialized views and indexes for an application.
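The idea that one summary can serve many queries can be sketched outside the database. In the example below a precomputed monthly summary (standing in for a materialized view) answers a yearly query by rolling up, without touching the detail rows. This illustrates the principle only and is not how the Oracle optimizer implements query rewrite; the data and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical detail rows: (product, year, month, amount)
SALES = [
    ("widget", 2004, 1, 10),
    ("widget", 2004, 1, 5),
    ("widget", 2004, 2, 7),
    ("gadget", 2004, 1, 3),
    ("gadget", 2005, 3, 4),
]


def build_monthly_summary(rows):
    """Precompute total sales by (product, year, month)."""
    summary = defaultdict(int)
    for product, year, month, amount in rows:
        summary[(product, year, month)] += amount
    return dict(summary)


def yearly_from_detail(rows):
    """Answer the yearly query by scanning every detail row."""
    totals = defaultdict(int)
    for product, year, _month, amount in rows:
        totals[(product, year)] += amount
    return dict(totals)


def yearly_from_summary(summary):
    """Answer the same query by rolling up the smaller monthly summary."""
    totals = defaultdict(int)
    for (product, year, _month), amount in summary.items():
        totals[(product, year)] += amount
    return dict(totals)


monthly = build_monthly_summary(SALES)
print(yearly_from_summary(monthly))
```

Both routes return the same totals, which is the property that makes a transparent rewrite safe: the caller asks the yearly question and does not need to know which data set answered it.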


Oracle 10g Data Warehousing | 2005

4 – Physical Design of the Data Warehouse

Lilian Hobbs; Susan Hillson; Shilpa Lawande; Pete Smith

This chapter discusses various considerations and techniques for the physical design of a data warehouse. It opens with a discussion of data partitioning, a technique used in data warehouses, and explains how partitioning can improve the manageability, performance, and availability of a data warehouse. A significant benefit of partitioning the data is that data maintenance operations can be performed at the partition level. A table is partitioned using a column called the partition key, whose value determines the partition into which a row of data will be placed. This chapter explores various index types that are suitable for a data warehouse and how they can be partitioned. One factor that significantly influences the physical design choices is how data are loaded into the warehouse. This chapter also discusses data compression, which can help reduce the storage requirements in a data warehouse.
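How a partition key routes a row can be sketched for the range-partitioned case, where each partition holds keys below an exclusive upper bound. This is a minimal illustration, assuming date-based range partitioning; the partition names, bounds, and the function `partition_for` are hypothetical and do not reflect Oracle internals.

```python
import bisect
from datetime import date

# Exclusive upper bounds, in ascending order; the last partition is a catch-all.
UPPER_BOUNDS = [date(2004, 1, 1), date(2004, 4, 1), date(2004, 7, 1)]
PARTITION_NAMES = ["p_before_2004", "p_2004_q1", "p_2004_q2", "p_max"]


def partition_for(sale_date: date) -> str:
    """Return the partition whose range contains the given partition key."""
    # bisect_right counts the bounds that are <= the key, so a key equal to a
    # bound falls into the next partition (the bound itself is exclusive).
    return PARTITION_NAMES[bisect.bisect_right(UPPER_BOUNDS, sale_date)]


print(partition_for(date(2003, 12, 31)))
print(partition_for(date(2004, 1, 1)))
print(partition_for(date(2004, 6, 30)))
print(partition_for(date(2005, 2, 14)))
```

Because every row for a given quarter lands in one partition, maintenance such as dropping or reloading that quarter touches a single partition rather than the whole table.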


Oracle 10g Data Warehousing | 2005

10 – Tuning Query Performance

Lilian Hobbs; Susan Hillson; Shilpa Lawande; Pete Smith

This chapter discusses various aspects of query performance tuning. Query performance tuning is an ongoing process that is needed throughout the life cycle of any database application. The first step in tuning query performance is to monitor the database and identify queries that are not performing adequately. Oracle Database 10g provides tuning tools such as the SQL Access Advisor and the SQL Tuning Advisor, which can be invaluable assistants to a database administrator (DBA) in simplifying the ongoing tasks of performance tuning. With these tools, indexes and materialized views can be created to speed up queries, and the optimizer's ability to create good execution plans can be improved using profiles. This chapter also discusses how to find and fix some common parallel execution problems and how to tune the process global area (PGA) memory so that queries execute with the optimal amount of memory. It concludes with a discussion of how plan stability is used to keep query performance predictable over time.


Oracle 10g Data Warehousing | 2005

17 – High Availability and a Data Warehouse

Lilian Hobbs; Susan Hillson; Shilpa Lawande; Pete Smith

This chapter discusses various aspects of improving the availability of the data warehouse. Oracle Database 10g provides features such as Real Application Clusters, Automatic Storage Management, Recovery Manager, and Data Guard, which provide fault-tolerant operation in case of hardware and software failures and allow logical and physical reorganization of data without requiring downtime. This chapter begins with the key features of a highly available system, that is, one with very little downtime. Availability is measured by its impact on the users of the system. This chapter discusses the role of disaster recovery in a data warehouse and explains how the data warehouse fits into an enterprise disaster recovery strategy using Data Guard. It also discusses techniques that help balance cost against the availability and protection of the data. It concludes with a discussion of life-cycle management, which ensures that the data warehouse continues to be cost effective even as data sizes grow.


Oracle 10g Data Warehousing | 2005

1 – Data Warehousing

Lilian Hobbs; Susan Hillson; Shilpa Lawande; Pete Smith

This chapter discusses the evolution of data warehouses and data marts. A data warehouse is a database containing data from multiple operational systems that have been consolidated, integrated, aggregated, and structured so that they can be used to support the analysis and decision-making process of a business. This chapter highlights Oracle Database 10g and some of the many challenges faced by warehouse developers. It also discusses the evolution of computing, from the mainframe in the 1970s to minicomputers in the 1980s to client/server in the 1990s. In the late 1990s and early 2000s, Internet computing began to change things again, making it possible to deploy business intelligence applications to large, geographically distributed user populations, both within the enterprise and outside it to suppliers and customers. After heavy technology investments in the late 1990s, many companies have found that they have underutilized assets and are looking for ways to reduce operating costs. Consolidation and grid computing, based on low-cost commodity hardware, can provide substantial savings.


Oracle 10g Data Warehousing | 2005

11 – Managing the Warehouse

Lilian Hobbs; Susan Hillson; Shilpa Lawande; Pete Smith

This chapter discusses some of the tasks required to manage a warehouse, examines new features available in Oracle Database 10g to help with them, and uses Oracle Enterprise Manager (OEM) to simplify management of the warehouse. It makes use of the various graphical user interface (GUI) tools provided in OEM to manage the database. With Oracle Database 10g, Enterprise Manager changed from a Java-based GUI tool to an interface that can be accessed from any browser on a wide area network. There are two named variants of Enterprise Manager: Database Control and Grid Control. EM Database Control is used for managing an individual database and its ASM storage. This chapter discusses tasks such as reorganizing the warehouse, gathering optimizer statistics, maintaining security, and monitoring space usage, and introduces techniques for periodic reorganization using partition maintenance operations and online redefinition. It also treats developing a test system and a business continuity plan as important considerations.


Oracle 10g Data Warehousing | 2005

6 – Querying the Data Warehouse

Lilian Hobbs; Susan Hillson; Shilpa Lawande; Pete Smith

This chapter discusses several features in the Oracle database for querying and analysis in a data warehouse. Oracle provides several mechanisms to improve query performance, such as star transformation, partition-wise join, partition pruning, and parallel execution. This chapter begins with the query optimizer, whose job is to determine a plan that executes a query in the fastest possible time. The query optimizer in Oracle Database 10g is known as the cost-based optimizer. This chapter explains some of the features of the cost-based optimizer and how they work for queries in a data warehouse. It also discusses several SQL functions that are useful for decision-support applications to answer business queries involving computations such as period-over-period comparisons and cumulative aggregations. These SQL functions allow users to express complex queries simply and process them efficiently. The new spreadsheet technology in Oracle Database 10g is also presented in the chapter.
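The two computations named above, cumulative aggregation and period-over-period comparison, can be sketched as follows. This only shows what the calculations produce, not the SQL functions the chapter covers; the monthly figures and function names are hypothetical.

```python
from itertools import accumulate

# Hypothetical monthly sales totals, in chronological order.
MONTHS = ["2004-01", "2004-02", "2004-03"]
SALES = [100, 120, 90]


def cumulative(values):
    """Running total: each entry is the sum of all values up to that period."""
    return list(accumulate(values))


def period_over_period(values):
    """Change versus the previous period; the first period has no predecessor."""
    return [None] + [curr - prev for prev, curr in zip(values, values[1:])]


running = cumulative(SALES)
change = period_over_period(SALES)

for month, total, delta in zip(MONTHS, running, change):
    print(month, total, delta)
```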
