Publication


Featured research published by Brad Nichols.


TruCluster Server Handbook | 2003

6 – Tru64 UNIX Cluster Hooks: File System Hierarchy, CDSL, & PID

Scott Fafrak; Jim Lola; Dennis O'Brien; Greg Yates; Brad Nichols

The Tru64 UNIX operating system enables a relatively seamless transition from a standalone system to a cluster. Once a system joins or forms a cluster, it becomes a member of that cluster. Beginning with Tru64 UNIX version 5.0, the /etc/rc.config file has been split into two files: rc.config and rc.config.common. A CDSL (Context-Dependent Symbolic Link) is a symbolic link with the “{memb}” variable as part of the path; “{memb}” is the “context”, or more precisely, what is resolved to determine the context. Configuring a network interface is a member-specific task, and the rcmgr command is the recommended method for modifying the rc.config files on Tru64 UNIX. The file system hierarchy in Tru64 UNIX version 5 has been modified slightly from the version 4 layout. A CDSL is a behind-the-scenes feature that makes an administrator’s life easier, provided the CDSLs themselves are managed carefully. Since a CDSL is a symbolic link, the ln command can be used with the -s option, although it is highly recommended to use the mkcdsl command instead: mkcdsl is designed to create not just a CDSL, but a CDSL that can be maintained. Process identifiers (PIDs) were expanded to a 32-bit integer beginning in Tru64 UNIX version 5. In a standalone environment there is no appreciable difference; for instance, PID 0 is still [kernel idle] and PID 1 is still the init daemon. In a TruCluster environment, however, the PID structure is defined to give each member a unique range of process IDs (524,288 PIDs per member).
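To make the CDSL mechanism concrete, here is a minimal shell sketch; the commands come from the chapter, but the paths and variable names are illustrative, and exact options should be verified against the mkcdsl(8) and rcmgr(8) reference pages:

    # A CDSL is an ordinary symbolic link whose target contains {memb}:
    ls -l /etc/rc.config
    #   ... /etc/rc.config -> ../cluster/members/{memb}/etc/rc.config  (illustrative)
    # Each member resolves {memb} to its own member directory, so one
    # path name yields a member-specific file on every member.

    # Preferred: create a maintainable CDSL with mkcdsl rather than ln -s.
    mkcdsl /etc/example.conf       # /etc/example.conf is a hypothetical path

    # Preferred: change rc.config variables with rcmgr rather than an editor.
    rcmgr get HOSTNAME             # read a member-specific variable
    rcmgr set EXAMPLE_ENABLED 1    # EXAMPLE_ENABLED is a hypothetical variable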


TruCluster Server Handbook | 2003

2 – Tru64 UNIX & TruCluster Server Overview

Scott Fafrak; Jim Lola; Dennis O'Brien; Greg Yates; Brad Nichols

Tru64 UNIX is a UNIX-based, 64-bit operating system that supports a Single System Image (SSI) cluster option, providing a huge virtual memory space and large file offsets. At its foundation are elements of the Mach kernel, including the notions of “tasks” and “threads”: a task represents a running program, and a thread is a schedulable entity within that program. The operating system uses virtual addresses that are translated into physical addresses, providing access to input and output space. The Unified Buffer Cache (UBC) handles the memory caching needs of the operating system. Shared libraries are supported, providing sharing of code at the function level. Physical memory is pageable, meaning that the contents of memory pages may be paged out to swap space when the system’s free page list becomes very low. Tru64 UNIX supports Non-Uniform Memory Access (NUMA) systems and runs on single-CPU as well as multiple-CPU systems. Asymmetric Multiprocessing (ASMP) provides synchronization in multiple-CPU systems, while Symmetric Multiprocessing (SMP) is a “tightly coupled” arrangement in which two or more CPUs share common memory. Tru64 UNIX uses the TruCluster Server software to provide interaction between multiple systems. The Connection Manager (CNX), built into the Tru64 UNIX kernel, runs on each member of the cluster and provides the software glue that holds the cluster together. The Logical Storage Manager (LSM) helps create logical storage devices referred to as volumes.
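As a small illustration of the last point, LSM volumes are managed with the vol* command family; a minimal sketch, assuming a default rootdg disk group (names and sizes are illustrative; verify options against the LSM reference pages):

    volassist make vol01 2g        # create a 2 GB logical volume named vol01
    volprint -ht                   # display the LSM object hierarchy
    # The volume appears as a block device (e.g. /dev/vol/rootdg/vol01)
    # and can then hold an AdvFS domain or UFS file system like any disk.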


TruCluster Server Handbook | 2003

Tru64 UNIX Cluster Hooks: Device Naming & Hardware Management

Scott Fafrak; Jim Lola; Dennis O'Brien; Greg Yates; Brad Nichols

This chapter looks more deeply into the modified Tru64 UNIX directory hierarchy and explores how device locations have been altered. The subjects discussed in light of the cluster hooks for device management are device special file names and locations, worldwide identifiers (WWIDs), the hardware management databases, the hardware manager (hwmgr) command, and the device special file manager (dsfmgr) command. In Tru64 UNIX prior to version 5.0, all devices were located in the /dev subdirectory. In version 5.0 and above this has changed, although to the casual observer it will look the same. Unique device special file names are achieved by using a SCSI worldwide identifier (WWID) to identify storage devices on a standalone system or in a cluster. A WWID is a unique identifier, not unlike a serial number, that most SCSI devices should have. There are exceptions, however, including how Tru64 UNIX identifies units from HSZ RAID controllers and older SCSI devices. Every device is assigned a name the first time it is detected by the operating system; the name starts with a base name (dsk, cdrom, tape, etc.) followed by a number, and the numbering begins with zero. The chapter answers many questions, including how to reinitialize the device directories, how to check and create the device directory hierarchy, and how tape device special files are handled.
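A sketch of how these facilities are typically used to inspect a system (option spellings vary across Tru64 UNIX versions; check hwmgr(8) and dsfmgr(8)):

    hwmgr -view devices            # list devices and their dsk/cdrom/tape names
    hwmgr -show scsi               # SCSI devices, including their WWIDs
    dsfmgr -s                      # show the device special file databases
    dsfmgr -v                      # verify the /dev hierarchy after changes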


TruCluster Server Handbook | 2003

10 – Creating a Single-Node Cluster

Scott Fafrak; Jim Lola; Dennis O'Brien; Greg Yates; Brad Nichols

This chapter focuses on the installation and configuration of TruCluster Server and the creation of a single-node cluster. Careful planning and preparation come together to create something quite tangible: a highly available, highly scalable UNIX cluster. All of the planning and preparation done for installing and configuring TruCluster Server makes the cluster creation process smoother. Specific disks are required to build a single-node cluster. One or more disks on the shared bus are needed to hold the clusterwide AdvFS file systems: cluster_root (/), cluster_usr (/usr), and cluster_var (/var). One boot disk per cluster member is required on the shared bus. Depending on the number of members in the cluster, one entire disk on the shared bus may also be needed to act as the quorum disk. This list does not include the original system disk for Tru64 UNIX. Checking the Cluster File System (CFS) subsystem, the Cluster Application Availability (CAA) subsystem, and the network aliases are all necessary steps in creating a single-node cluster. It is also important to verify the overall cluster configuration, the cluster alias subsystem, the connection manager subsystem, and the device request dispatcher (DRD) subsystem to maintain cluster integrity.
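In outline, and with the arguments abbreviated (the commands exist in TruCluster Server, but dsk1 and the exact flags shown are illustrative), the flow looks like this:

    clu_create                     # interactive: cluster name, alias IP, the
                                   # cluster_root/_usr/_var disks, boot disk, etc.
    # After the single-node cluster boots, verify the subsystems:
    clu_get_info -full             # overall cluster and member configuration
    cfsmgr                         # CFS: which member serves each file system
    caa_stat -t                    # CAA resource states in tabular form
    drdmgr dsk1                    # DRD attributes for one device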


TruCluster Server Handbook | 2003

System Administration Tasks

Scott Fafrak; Jim Lola; Dennis O'Brien; Greg Yates; Brad Nichols

This chapter helps the reader understand the differences and similarities in the life of a system administrator once the cluster is up. The typical work of a system administrator consists of many tasks: responding to user requests, responding to management requests, attending meetings pertaining to those requests, adding and maintaining accounts, maintaining file systems, maintaining disks and other devices, assisting the network administration team, and handling performance, network, and hardware problems. In many ways, the cluster should be thought of as a single system: a cluster is a closely coupled set of machines sharing a common management interface and common resources. Tru64 UNIX provides several options for system administration tasks, most of which came into existence in an attempt to standardize, homogenize, or otherwise improve the life of a system administrator. The sysman suite actually consists of three tools: “sysman -menu”, “sysman -station” (which can also be launched with the sms command), and sysman itself on the command line. The chapter covers many commands and how they behave in a cluster.
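For example (the shutdown accelerator is one of many; see sysman(8)):

    sysman -menu                   # hierarchical menu of administration tasks
    sysman -station                # graphical SysMan Station, also launched as sms
    sysman shutdown                # jump straight to a task via its accelerator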


TruCluster Server Handbook | 2003

The Cluster Alias Subsystem (CLUA)

Scott Fafrak; Jim Lola; Dennis O'Brien; Greg Yates; Brad Nichols

The alternative to having names for network locations is to always use 32-bit Internet Protocol (IP) addresses to refer to targets. All of the cluster members have network interfaces, and all of the interfaces have IP addresses associated with them. The cluster alias name must be associated with an IP address that is not tied to a single network interface on a single member of the cluster; otherwise, the bulk of the cluster’s network services would execute on one cluster member. An IP address is usually associated with a 48-bit Media Access Control (MAC) address. MAC addresses are “burned” into a PROM chip on the network device to uniquely identify each device, and when an IP address is assigned to the network device, the MAC address becomes associated with that IP address. The cluster alias therefore uses an IP address that is not associated with any particular interface. To avoid overloading the member on which the device using the MAC address physically resides, a selected cluster member is designated the “proxy ARP master” for the cluster. The proxy ARP master is responsible for handling incoming requests for access to the cluster alias and routing the requests to other cluster members where appropriate. When mastering moves, the cluster software compensates by sending out a gratuitous ARP broadcast to inform all network nodes that they need to replace an entry in their ARP cache. This eliminates the potential havoc caused by a client ARP cache that is behind the times. The goal of the virtual MAC (vMAC) feature is to allow client systems to reference an IP address that resolves to a MAC address that does not change as the proxy ARP mastering responsibilities move from one cluster member to another.
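A brief sketch of how this can be observed (the address 10.0.0.50 is a placeholder for a cluster alias IP, and cluamgr options vary by version):

    cluamgr -s all                 # on a member: status of the cluster aliases
    # From a client, watch the alias's MAC mapping as mastering moves:
    arp -a | grep 10.0.0.50        # entry changes unless vMAC is enabled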


TruCluster Server Handbook | 2003

Tru64 UNIX Cluster Hooks: Event Manager

Scott Fafrak; Jim Lola; Dennis O'Brien; Greg Yates; Brad Nichols

The Event Manager (EVM) is a new component added to Tru64 UNIX in the version 5.0 release of the operating system, in response to customer requests for common access to all system event information. An event is something of interest to the system or cluster; the kernel subsystems of Tru64 UNIX generate events. Beyond human consumption, EVM is used by many applications that wait for events to occur and then act according to the information received. The EVM components are located in the root (/), /usr, and /var directory trees. The three primary components of EVM are the EVM daemon (evmd), the EVM channel manager (evmchmgr), and the EVM logger (evmlogger). The EVM configuration files are located in the /etc directory. EVM channels are sources of events, and there are two types: active channels and passive channels. EVM filter files are located in the /usr/share/evm/filters directory and end with the suffix “.evf”. EVM template files for Tru64 UNIX are located in the /usr/share/evm/templates/sys directory; template file names end with the suffix “.evt”, and the files are owned by root or bin. The event viewer is the GUI for EVM and is invoked through the sysman program. Event filters help narrow down the number of events, and Tru64 UNIX provides many keywords to assist in that narrowing. Complex filters can be built on top of the event keywords and stored in filter files for reuse. Event security is maintained with specific commands, and EVM remote access is disabled by default. The registered events can be retrieved by using the evmwatch command.
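A few common EVM idioms, as a sketch (the filter string is illustrative; see evmwatch(1), evmget(1), and evmshow(1)):

    # Subscribe to events as they are posted, formatted with a template:
    evmwatch | evmshow -t "@timestamp @@"
    # Retrieve stored events that match a filter:
    evmget -f '[name *.syslog]' | evmshow
    # List the events registered on the system:
    evmwatch -i | evmshow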


TruCluster Server Handbook | 2003

The Device Request Dispatcher (DRD)

Scott Fafrak; Jim Lola; Dennis O'Brien; Greg Yates; Brad Nichols

The Device Request Dispatcher (DRD) is the subsystem that dispatches input/output (I/O) requests to any storage device in the cluster. Its main function is to coordinate access to the storage devices in a cluster, and it is the mechanism that enables every member in the cluster to see every storage device. The three occasions on which DRD looks for devices are boot time, device open, and device detection. All I/O to a storage device in a cluster passes through the DRD subsystem. DRD is a client/server implementation on a per-device basis. A device can be served by more than one member if the device is capable of direct access. All disk devices supported by TruCluster Server versions 5.0A, 5.1, and 5.1A are Direct Access I/O (DAIO) capable. A device is called a served device if one member in the cluster actively serves the device to the other cluster members. I/O barriers, a combination of hardware, software, and firmware, are used in a cluster to approximate the way a single system handles I/O on shutdown or power failure. The DRD subsystem is designed to auto-configure and does not require any setup. The access member name can be changed via the drdmgr command, and statistics on a device can be reset so that a member’s connectivity can be tested. Block I/O is processed at the Cluster File System layer before being sent to DRD. DRD events are defined in the /usr/share/evm/templates/clu/drd.evt template file.
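For instance (dsk5 and member2 are illustrative names; the attribute list is in drdmgr(8)):

    drdmgr dsk5                    # show the DRD attributes of a device
    drdmgr -a server dsk5          # which member(s) currently serve it
    drdmgr -h member2 dsk5         # another member's view of the same device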


TruCluster Server Handbook | 2003

4 – Cluster Configuration Planning

Scott Fafrak; Jim Lola; Dennis O'Brien; Greg Yates; Brad Nichols

This chapter provides ideas for carefully and logically planning and preparing a successful TruCluster Server implementation, both for the short term and for the long term. The memory requirement for a TruCluster Server cluster member is the amount of memory needed to install Tru64 UNIX plus at least an additional 64 MB for the cluster software. A lack of planning creates serious problems in both the short and the long term; at best, it ends in rebuilding the system before it has any users, wasting valuable time in the process. For versions of TruCluster Server prior to version 5.1A Patch Kit 1, it was recommended that, when routing configuration is performed on the Tru64 UNIX server as part of the configuration setup, gated be chosen in preparation for implementing TruCluster Server, which requires only the TCS-UA license PAK. As part of the planning process for installing either Tru64 UNIX or TruCluster Server, it is necessary to verify that the hardware, firmware, and console variables are set appropriately. The chapter examines both the Memory Channel interconnect and the Ethernet LAN interconnect, covering the hardware requirements for each interconnect, their configuration, and how to obtain the interconnect information required to create a cluster.


TruCluster Server Handbook | 2003

23 – Cluster Application Availability (CAA)

Scott Fafrak; Jim Lola; Dennis O'Brien; Greg Yates; Brad Nichols

This chapter discusses the Cluster Application Availability (CAA) framework. CAA is used to make an application that would normally be restricted to running on only one member of a cluster capable of automatically relocating to another cluster member if the member where it is running fails or is shut down for maintenance. Many applications, known as single-instance applications, are written without any thought to concurrency; these are candidates for CAA. A multi-instance application is designed to run multiple copies at the same time, whereas a single-instance application is not designed to run more than one copy on any cluster member at a time. If an application would normally be restricted to running on one cluster member at a time, CAA can be used to relocate it from one cluster member to another to keep it running within the cluster at all times. The CAA architecture includes components such as the resource manager, resource monitors, the resource registry database, the directory layout, and the CAA commands. CAA supports four resource types: application, changer, network, and tape. Application resources must have a resource profile and an action script. CAA events are defined in the /usr/share/evm/templates/clu/caa/caa.evt template file.
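A sketch of the usual lifecycle for a single-instance application under CAA; the resource name myapp is hypothetical, and exact options are documented in the caa_* reference pages:

    caa_profile -create myapp -t application   # create the resource profile
    # ...supply an action script with start/stop/check entry points...
    caa_register myapp             # add the resource to the registry database
    caa_start myapp                # start it on the best available member
    caa_relocate myapp -c member2  # manually move it to member2
    caa_stat myapp                 # check its state and current location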
