Antares RAID SparcLinux Howto

Thom Coates [1], Carl Munio, Jim Ludemann

Revised 25 May 2002

--------------------------------------
[1] Thomas D. Coates, Jr. PhD, c/o Neuropunk.Org, PO Box 910385, Lexington KY 40591-0385, tcoates@neuropunk.org

Table of Contents

  * Chapter 1  Preliminaries
      + 1.1  Preamble
      + 1.2  Acknowledgements and Thanks
      + 1.3  New Versions
  * Chapter 2  Introduction
      + 2.1  5070 Main Features
      + 2.2  Background
          o 2.2.1  RAID Levels
          o 2.2.2  RAID Linear
          o 2.2.3  Level 1
          o 2.2.4  Striping
          o 2.2.5  Level 0
          o 2.2.6  Level 2 and 3
          o 2.2.7  Level 4
          o 2.2.8  Level 5
  * Chapter 3  Installation
      + 3.1  SBUS Controller Compatibility
      + 3.2  Hardware Installation Procedure
          o 3.2.1  Serial Terminal
          o 3.2.2  Hard Drive Plant
      + 3.3  5070 Onboard Configuration
          o 3.3.1  Main Screen Options
          o 3.3.2  [Q]uit
          o 3.3.3  [R]aidSets:
          o 3.3.4  [H]ostports:
          o 3.3.5  [S]pares:
          o 3.3.6  [M]onitor:
          o 3.3.7  [G]eneral:
          o 3.3.8  [P]robe
          o 3.3.9  Example RAID Configuration Session
      + 3.4  Linux Configuration
          o 3.4.1  Existing Linux Installation
          o 3.4.2  New Linux Installation
      + 3.5  Maintenance
          o 3.5.1  Activating a spare
          o 3.5.2  Re-integrating a repaired drive into the RAID (levels 3 and 5)
      + 3.6  Troubleshooting / Error Messages
          o 3.6.1  Out of band temperature detected...
          o 3.6.2  ... failed ... cannot have more than 1 faulty backend.
          o 3.6.3  When booting I see: ... Sun disklabel: bad magic 0000 ... unknown partition table.
      + 3.7  Bugs
      + 3.8  Frequently Asked Questions
          o 3.8.1  How do I reset/erase the onboard configuration?
          o 3.8.2  How can I tell if a drive in my RAID has failed?
      + 3.9  Advanced Topics: 5070 Command Reference
          o 3.9.1  AUTOBOOT - script to automatically create all raid sets and scsi monitors
          o 3.9.2  AUTOFAULT - script to automatically mark a backend faulty after a drive failure
          o 3.9.3  AUTOREPAIR - script to automatically allocate a spare and reconstruct a raid set
          o 3.9.4  BIND - combine elements of the namespace
          o 3.9.5  BUZZER - get the state or turn on or off the buzzer
          o 3.9.6  CACHE - display information about and delete cache ranges
          o 3.9.7  CACHEDUMP - Dump the contents of the write cache to battery backed-up ram
          o 3.9.8  CACHERESTORE - Load the cache with data from battery backed-up ram
          o 3.9.9  CAT - concatenate files and print on the standard output
          o 3.9.10  CMP - compare the contents of 2 files
          o 3.9.11  CONS - console device for Husky
          o 3.9.12  DD - copy a file (disk, etc)
          o 3.9.13  DEVSCMP - Compare a file's size against a given value
          o 3.9.14  DFORMAT - Perform formatting functions on a backend disk drive
          o 3.9.15  DIAGS - script to run a diagnostic on a given device
          o 3.9.16  DPART - edit a scsihd disk partition table
          o 3.9.17  DUP - open file descriptor device
          o 3.9.18  ECHO - display a line of text
          o 3.9.19  ENV - environment variables file system
          o 3.9.20  ENVIRON - RaidRunner Global environment variables - names and effects
          o 3.9.21  EXEC - cause arguments to be executed in place of this shell
          o 3.9.22  EXIT - exit a K9 process
          o 3.9.23  EXPR - evaluation of numeric expressions
          o 3.9.24  FALSE - returns the K9 false status
          o 3.9.25  FIFO - bi-directional fifo buffer of fixed size
          o 3.9.26  GET - select one value from list
          o 3.9.27  GETIV - get the value of an internal RaidRunner variable
          o 3.9.28  HELP - print a list of commands and their synopses
          o 3.9.29  HUSKY - shell for K9 kernel
          o 3.9.30  HWCONF - print various hardware configuration details
          o 3.9.31  HWMON - monitoring daemon for temperature, fans, PSUs.
          o 3.9.32  INTERNALS - Internal variables used by RaidRunner to change dynamics of running kernel
          o 3.9.33  KILL - send a signal to the nominated process
          o 3.9.34  LED - turn on/off LED's on RaidRunner
          o 3.9.35  LFLASH - flash a led on RaidRunner
          o 3.9.36  LINE - copies one line of standard input to standard output
          o 3.9.37  LLENGTH - return the number of elements in the given list
          o 3.9.38  LOG - like zero with additional logging of accesses
          o 3.9.39  LRANGE - extract a range of elements from the given list
          o 3.9.40  LS - list the files in a directory
          o 3.9.41  LSEARCH - find a pattern in a list
          o 3.9.42  LSUBSTR - replace a character in all elements of a list
          o 3.9.43  MEM - memory mapped file (system)
          o 3.9.44  MDEBUG - exercise and display statistics about memory allocation
          o 3.9.45  MKDIR - create directory (or directories)
          o 3.9.46  MKDISKFS - script to create a disk filesystem
          o 3.9.47  MKHOSTFS - script to create a host port filesystem
          o 3.9.48  MKRAID - script to create a raid given a line of output of rconf
          o 3.9.49  MKRAIDFS - script to create a raid filesystem
          o 3.9.50  MKSMON - script to start the scsi monitor daemon smon
          o 3.9.51  MKSTARGD - script to initialize a scsi target daemon for a given raid set
          o 3.9.52  MSTARGD - monitor for stargd
          o 3.9.53  NICE - Change the K9 run-queue priority of a K9 process
          o 3.9.54  NULL - file to throw away output in
          o 3.9.55  PARACC - display information about hardware parity accelerator
          o 3.9.56  PEDIT - Display/modify SCSI backend Mode Parameters Pages
          o 3.9.57  PIPE - two way interprocess communication
          o 3.9.58  PRANKS - print or set the accessible backend ranks for the current controller
          o 3.9.59  PRINTENV - print one or all GLOBAL environment variables
          o 3.9.60  PS - report process status
          o 3.9.61  PSCSIRES - print SCSI-2 reservation table for all or specific monikers
          o 3.9.62  PSTATUS - print the values of hardware status registers
          o 3.9.63  RAIDACTION - script to gather/reset stats or stop/start a raid set's stargd
          o 3.9.64  RAID0 - raid 0 device
          o 3.9.65  RAID1 - raid 1 device
          o 3.9.66  RAID3 - raid 3 device
          o 3.9.67  RAID4 - raid 4 device
          o 3.9.68  RAID5 - raid 5 device
          o 3.9.69  RAM - ram based file system
          o 3.9.70  RANDIO - simulate random reads and writes
          o 3.9.71  RCONF, SPOOL, HCONF, MCONF, CORRUPT-CONFIG - raid configuration and spares management
          o 3.9.72  REBOOT - exit K9 on target hardware + return to monitor
          o 3.9.73  REBUILD - raid set reconstruction utility
          o 3.9.74  REPAIR - script to allocate a spare to a raid set's failed backend
          o 3.9.75  REPLACE - script to restore a backend in a raid set
          o 3.9.76  RM - remove the file (or files)
          o 3.9.77  RMON - Power-On Diagnostics and Bootstrap
          o 3.9.78  RRSTRACE - disassemble scsihpmtr monitor data
          o 3.9.79  RSIZE - estimate the memory usage for a given raid set
          o 3.9.80  SCN2681 - access a scn2681 (serial IO device) as console
          o 3.9.81  SCSICHIPS - print various details about a controller's scsi chips
          o 3.9.82  SCSIHD - SCSI hard disk device (a SCSI initiator)
          o 3.9.83  SCSIHP - SCSI target device
          o 3.9.84  SET - set (or clear) an environment variable
          o 3.9.85  SCSIHPMTR - turn on host port debugging
          o 3.9.86  SETENV - set a GLOBAL environment variable
          o 3.9.87  SDLIST - Set or display an internal list of attached disk drives
          o 3.9.88  SETIV - set an internal RaidRunner variable
          o 3.9.89  SHOWBAT - display information about battery backed-up ram
          o 3.9.90  SHUTDOWN - script to place the RaidRunner into a shutdown or quiescent state
          o 3.9.91  SLEEP - sleep for the given number of seconds
          o 3.9.92  SMON - RaidRunner SCSI monitor daemon
          o 3.9.93  SOS - pulse the buzzer to emit sos's
          o 3.9.94  SPEEDTST - Generate a set number of sequential writes then reads
          o 3.9.95  SPIND - Spin up or down a disk device
          o 3.9.96  SPINDLE - Modify Spindle Synchronization on a disk device
          o 3.9.97  SRANKS - set the accessible backend ranks for a controller
          o 3.9.98  STARGD - daemon for SCSI-2 target
          o 3.9.99  STAT - get status information on the named files (or stdin)
          o 3.9.100  STATS - Print cumulative performance statistics on a Raid Set or Cache Range
          o 3.9.101  STRING - perform a string operation on a given value
          o 3.9.102  SUFFIX - Suffixes permitted on some big decimal numbers
          o 3.9.103  SYSLOG - device to send system messages for logging
          o 3.9.104  SYSLOGD - initialize or access messages in the system log area
          o 3.9.105  TEST - condition evaluation command
          o 3.9.106  TIME - Print the number of seconds since boot (or reset of clock)
          o 3.9.107  TRAP - intercept a signal and perform some action
          o 3.9.108  TRUE - returns the K9 true status
          o 3.9.109  STTY or TTY - print the user's terminal mount point or terminfo status
          o 3.9.110  UNSET - delete one or more environment variables
          o 3.9.111  UNSETENV - unset (delete) a GLOBAL environment variable
          o 3.9.112  VERSION - print out the version of the RaidRunner kernel
          o 3.9.113  WAIT - wait for a process (or my children) to terminate
          o 3.9.114  WARBLE - periodically pulse the buzzer
          o 3.9.115  XD - dump given file(s) in hexadecimal to standard out
          o 3.9.116  ZAP - write zeros to a file
          o 3.9.117  ZCACHE - Manipulate the zone optimization IO table of a Raid Set's cache
          o 3.9.118  ZERO - file when read yields zeros continuously
          o 3.9.119  ZLABELS - Write zeros to the front and end of Raid Sets
      + 3.10  Advanced Topics: SCSI Monitor Daemon (SMON)
      + 3.11  Further Reading

Chapter 1  Preliminaries

This document describes how to install, configure, and maintain a hardware RAID built around the 5070 SBUS host based RAID controller by Antares Microsystems. Other topics of discussion include RAID levels, the 5070 controller GUI, and the 5070 command line.
A complete command reference for the 5070's K9 kernel and Bourne-like shell is included.

1.1  Preamble

Copyright 2000 by Thomas D. Coates, Jr. This document's source is licensed under the terms of the GNU General Public License agreement. Permission to use, copy, modify, and distribute this document without fee for any purpose, commercial or non-commercial, is hereby granted, provided that the authors' names and this notice appear in all copies and/or supporting documents, and that the location where a freely available unmodified version of this document may be obtained is given.

This document is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY, either expressed or implied. While every effort has been taken to ensure the accuracy of the information documented herein, the author(s)/editor(s)/maintainer(s)/contributor(s) assume NO RESPONSIBILITY for any errors, or for any damages, direct or consequential, as a result of the use of the information documented herein.

A complete copy of the GNU Public License agreement may be obtained from: Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

Portions of this document are adapted and/or re-printed from the 5070 installation guide and man pages with permission of Antares Microsystems, Inc., Campbell CA.

1.2  Acknowledgements and Thanks

  * Carl and Jim at Antares for the hardware, man pages, and other support/contributions they provided during the writing of this document.
  * Penn State University - Hershey Medical Center, Department of Radiology, Section of Clinical Image Management (My home away from my home away from home).
  * The software-raid-HOWTO Copyright 1997 by Linas Vepstas under the GNU public license agreement.
    The software-raid-HOWTO is available from: http://www.linuxdoc.org

1.3  New Versions

  * The location of the most recent version of this document is posted on my homepage: http://members.iglou.com/tcoates/
  * Other versions may be found in different formats at the LDP homepage: http://www.linuxdoc.org and mirror sites.

Chapter 2  Introduction

The Antares 5070 is a high performance, versatile, yet relatively inexpensive host based RAID controller. Its embedded operating system (K9 kernel) is modelled on the Plan 9 operating system, whose design is discussed in several papers from AT&T (see the "Further Reading" section). K9 is a kernel targeted at embedded controllers of small to medium complexity (e.g. ISDN-ethernet bridges, RAID controllers, etc). It supports multiple lightweight processes (i.e. without memory management) on a single CPU with a non-preemptive scheduler. Device driver architecture is based on Plan 9 (and Unix SVR4) streams. Concurrency control mechanisms include semaphores and signals.

The 5070 has three single ended ultra 1 SCSI channels and two onboard serial interfaces, one of which provides command line access via a connected serial terminal or modem. The other is used to upgrade the firmware. The command line is robust, implementing many of the essential Unix commands (e.g. dd, ls, cat, etc.) and a scaled down Bourne shell for scripting. The Unix command set is augmented with RAID specific configuration commands and scripts. In addition to the command line interface, an ASCII text based GUI is provided to permit easy configuration of level 0, 1, 3, 4, and 5 RAIDs.

2.1  5070 Main Features

  * RAID levels 0, 1, 3, 4, and 5 are supported.
  * Text based GUI for easy configuration of all supported RAID levels.
  * A multidisk RAID volume appears as an individual SCSI drive to the operating system and can be managed with the standard utilities (fdisk, mkfs, fsck, etc.).
    RAID volumes may be assigned to different SCSI IDs or to the same SCSI ID but different LUNs.
  * No special RAID drivers required for the host operating system.
  * Multiple RAID volumes of different levels can be mixed among the drives forming the physical plant. For example, in a hypothetical drive plant consisting of 9 drives:
      + 2 drives form a level 3 RAID assigned to SCSI ID 5, LUN 0
      + 2 drives form a level 0 RAID assigned to SCSI ID 5, LUN 1
      + 5 drives form a level 5 RAID assigned to SCSI ID 6, LUN 0
  * Three single ended SCSI channels which can accommodate 6 drives each (18 drives total).
  * Two serial interfaces. The first permits configuration/control/monitoring of the RAID from a local serial terminal. The second serial port is used to upload new programming into the 5070 (using PPP and TFTP).
  * Robust Unix-like command line and NVRAM based file system.
  * Configurable ASCII SCSI communication channel for passing commands to the 5070's command line interpreter. This allows programs running on the host OS to directly configure/control/monitor all parameters of the 5070.

2.2  Background

Much of the information/knowledge pertaining to RAID levels in this section is adapted from the software-raid-HOWTO by Linas Vepstas. See the acknowledgements section for the URL where the full document may be obtained.

RAID is an acronym for "Redundant Array of Inexpensive Disks" and is used to create large, reliable disk storage systems out of individual hard disk drives. There are two basic ways of implementing a RAID: software or hardware. The main advantage of a software RAID is low cost. However, since the OS of the host system must manage the RAID directly, there is a substantial penalty in performance. Furthermore, if the RAID is also the boot device, a drive failure could prove disastrous since the operating system and utility software needed to perform the recovery are located on the RAID.
The primary advantages of hardware RAID are performance and improved reliability. Since all RAID operations are handled by a dedicated CPU on the controller, the host system's CPU is never bothered with RAID related tasks. In fact the host OS is completely oblivious to the fact that its SCSI drives are really virtual RAID drives. When a drive fails on the 5070 it can be replaced on-the-fly with a drive from the spares pool and its data reconstructed without the host's OS ever knowing anything has happened.

2.2.1  RAID Levels

The different RAID levels have different performance, redundancy, storage capacity, reliability and cost characteristics. Most, but not all, levels of RAID offer redundancy against drive failure. There are many different levels of RAID which have been defined by various vendors and researchers. The following describes the first 7 RAID levels in the context of the Antares 5070 hardware RAID implementation.

2.2.2  RAID Linear

RAID-linear is a simple concatenation of drives to create a larger virtual drive. It is handy if you have a number of small drives and wish to create a single, large drive. This concatenation offers no redundancy, and in fact decreases the overall reliability: if any one drive fails, the combined drive will fail.

SUMMARY

  * Enables construction of a large virtual drive from a number of smaller drives
  * No protection, less reliable than a single drive
  * RAID 0 is a better choice due to better I/O performance

2.2.3  Level 1

Also referred to as "mirroring". Two (or more) drives, all of the same size, each store an exact copy of all data, disk-block by disk-block. Mirroring gives strong protection against drive failure: if one drive fails, there is another with an exact copy of the same data. Mirroring can also help improve performance in I/O-laden systems, as read requests can be divided up between several drives.
Unfortunately, mirroring is also one of the least efficient in terms of storage: two mirrored drives can store no more data than a single drive.

SUMMARY

  * Good read/write performance
  * Inefficient use of storage space (half the total space available for data)
  * RAID 5 may be a better choice when storage capacity matters more than I/O performance.

2.2.4  Striping

Striping is the underlying concept behind all of the other RAID levels. A stripe is a contiguous sequence of disk blocks. A stripe may be as short as a single disk block, or may consist of thousands. The RAID drivers split up their component drives into stripes; the different RAID levels differ in how they organize the stripes, and what data they put in them. The interplay between the size of the stripes, the typical size of files in the file system, and their location on the drive is what determines the overall performance of the RAID subsystem.

2.2.5  Level 0

Similar to RAID-linear, except that the component drives are divided into stripes and then interleaved. Like RAID-linear, the result is a single larger virtual drive. Also like RAID-linear, it offers no redundancy, and therefore decreases overall reliability: a single drive failure will knock out the whole thing. However, the 5070 hardware RAID 0 is the fastest of any of the schemes listed here.

SUMMARY

  * Use RAID 0 to combine smaller drives into one large virtual drive.
  * Best read/write performance of all the schemes listed here.
  * No protection from drive failure.
  * ADVICE: Buy very reliable hard disk drives if you plan to use this scheme.

2.2.6  Level 2 and 3

RAID-2 is seldom used anymore, and to some degree has been made obsolete by modern hard disk technology. RAID-2 is similar to RAID-4, but stores ECC information instead of parity. Since all modern disk drives incorporate ECC under the covers, this offers little additional protection.
RAID-2 can offer greater data consistency if power is lost during a write; however, battery backup and a clean shutdown can offer the same benefits. RAID-3 is similar to RAID-4, except that it uses the smallest possible stripe size.

SUMMARY

  * RAID 2 is largely obsolete
  * Use RAID 3 to combine separate drives together into one large virtual drive.
  * Protection against single drive failure.
  * Good read/write performance.

2.2.7  Level 4

RAID-4 interleaves stripes like RAID-0, but it requires an additional drive to store parity information. The parity is used to offer redundancy: if any one of the drives fails, the data on the remaining drives can be used to reconstruct the data that was on the failed drive. Given N data disks and one parity disk, the parity stripe is computed by taking one stripe from each of the data disks and XOR'ing them together. Thus, the storage capacity of an (N+1)-disk RAID-4 array is N, which is a lot better than mirroring (N+1) drives, and is almost as good as a RAID-0 setup for large N. Note that for N=1, where there is one data disk and one parity disk, RAID-4 is a lot like mirroring, in that each of the two disks is a copy of the other. However, RAID-4 does NOT offer the read-performance of mirroring, and offers considerably degraded write performance. In brief, this is because updating the parity requires a read of the old parity before the new parity can be calculated and written out. In an environment with lots of writes, the parity disk can become a bottleneck, as each write must access the parity disk.

SUMMARY

  * Similar to RAID 0
  * Protection against single drive failure.
  * Poorer I/O performance than RAID 3
  * Less of the combined storage space is available for data [than RAID 0] since an additional drive is needed for parity information.

2.2.8  Level 5

RAID-5 avoids the write-bottleneck of RAID-4 by alternately storing the parity stripe on each of the drives.
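
The XOR parity scheme described for RAID-4 can be illustrated with a short sketch. This is not code from the 5070 or its firmware; the function name xor_blocks and the sample stripe contents are made up for illustration, and the stripes are shortened to four bytes. It shows the parity stripe being computed from N = 3 data stripes, a lost stripe being rebuilt from the survivors, and the read-modify-write parity update that makes small writes expensive.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Return the byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must be the same size")
    # zip(*blocks) walks the blocks column by column (one byte from each).
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three hypothetical data stripes (N = 3), one per data disk.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]

# The parity stripe is the XOR of one stripe from each data disk.
parity = xor_blocks(*data)

# Simulate losing disk 1: XOR the surviving stripes with the parity stripe.
rebuilt = xor_blocks(data[0], data[2], parity)
assert rebuilt == data[1]

# Small write to disk 1: the old parity (and old data) must be read first,
# because new parity = old parity XOR old data XOR new data.
new_block = b"\x00\x00\xff\xff"
new_parity = xor_blocks(parity, data[1], new_block)
assert new_parity == xor_blocks(data[0], new_block, data[2])
```

RAID-5 performs the same XOR computation and changes only which drive holds the parity stripe for each row of stripes.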
However, write performance is still not as good as for mirroring, as the parity stripe must still be read and XOR'ed before it is written. Read performance is also not as good as it is for mirroring, as, after all, there is only one copy of the data, not two or more. RAID-5's principal advantage over mirroring is that it offers redundancy and protection against single-drive failure, while offering far more storage capacity when used with three or more drives.

SUMMARY

  * Use RAID 5 if you need to make the best use of your available storage space while gaining protection against single drive failure.
  * Slower I/O performance than RAID 3

Chapter 3  Installation

NOTE: The installation procedure given here for the SBUS controller is similar to that found in the manual. It has been modified so minor variations in the SPARCLinux installation may be included.

3.1  SBUS Controller Compatibility

The 5070 / Linux 2.2 combination was tested on SPARCstation (5, 10, & 20), Ultra 1, and Ultra 2 Creator. The 5070 was also tested on Linux with Symmetrical Multiprocessing (SMP) support on a dual processor Ultra 2 Creator 3D with no problems. Other 5070 / Linux / hardware combinations may work as well.

3.2  Hardware Installation Procedure

If your system is already up and running, you must halt the operating system.

GNOME:
  1. From the login screen right click the "Options" button.
  2. On the popup menu select System -> Halt.
  3. Click "Yes" when the verification box appears.

KDE:
  1. From the login screen right click shutdown.
  2. On the popup menu select shutdown by right clicking its radio button.
  3. Click OK.

XDM:
  1. Login as root.
  2. Left click on the desktop to bring up the pop-up menu.
  3. Select "New Shell".
  4. When the shell opens type "halt" at the prompt and press return.

Console Login (systems without X windows):
  1. Login as root.
  2. Type "halt".

All Systems: Wait for the message "power down" or "system halted" before proceeding.
Turn off your SPARCstation system (Note: Your system may have turned itself off following the power down directive), its video monitor, external disk expansion boxes, and any other peripherals connected to the system. Be sure to check that the green power LED on the front of the system enclosure is not lit and that the fans inside the system are not running. Do not disconnect the system power cord.

SPARCstation 4, 5, 10, 20 & UltraSPARC Systems:
  1. Remove the top cover on the CPU enclosure. On a SPARCstation 10, this is done by loosening the captive screw at the top right corner of the back of the CPU enclosure, then tilting the top of the enclosure forward while using a Phillips screwdriver to press the plastic tab on the top left corner.
  2. Decide which SBUS slot you will use. Any slot will do. Remove the filler panel for that slot by removing the two screws and rectangular washers that hold it in.
  3. Remove the SBUS retainer (commonly called the handle) by pressing outward on one leg of the retainer while pulling it out of the hole in the printed circuit board.
  4. Insert the board into the SBUS slot you have chosen. To insert the board, first engage the top of the 5070 RAIDium backpanel into the backpanel of the CPU enclosure, then rotate the board into a level position and mate the SBUS connectors. Make sure that the SBUS connectors are completely engaged.
  5. Snap the nylon board retainers inside the SPARCstation over the 5070 RAIDium board to secure it inside the system.
  6. Secure the 5070 RAIDium SBUS backpanel to the system by replacing the rectangular washers and screws that held the original filler panel in place.
  7. Replace the top cover by first mating the plastic hooks on the front of the cover to the chassis, then rotating the cover down over the unit until the plastic tab in back snaps into place. Tighten the captive screw on the upper right corner.

Ultra Enterprise Servers, SPARCserver 1000 & 2000 Systems, SPARCserver 6X0 MP Series:
  1. Remove the two Allen screws that secure the CPU board to the card cage. These are located at each end of the CPU board backpanel.
  2. Remove the CPU board from the enclosure and place it on a static-free surface.
  3. Decide which SBUS slot you will use. Any slot will do. Remove the filler panel for that slot by removing the two screws and rectangular washers that hold it in. Save these screws and washers.
  4. Remove the SBUS retainer (commonly called the handle) by pressing outward on one leg of the retainer while pulling it out of the hole in the printed circuit board.
  5. Insert the board into the SBUS slot you have chosen. To insert the board, first engage the top of the 5070 RAIDium backpanel into the backpanel of the CPU enclosure, then rotate the board into a level position and mate the SBUS connectors. Make sure that the SBUS connectors are completely engaged.
  6. Secure the 5070 RAIDium board to the CPU board with the nylon screws and standoffs provided on the CPU board. The standoffs may have to be moved so that they match the holes used by the SBUS retainer, as the standoffs are used in different holes for an MBus module. Replace the screws and rectangular washers that originally held the filler panel in place, securing the 5070 RAIDium SBUS backpanel to the system enclosure.
  7. Re-insert the CPU board into the CPU enclosure and re-install the Allen-head retaining screws that secure the CPU board.

All Systems:
  1. Mate the external cable adapter box to the 5070 RAIDium and gently tighten the two screws that extend through the cable adapter box.
  2. Connect the three cables from your SCSI devices to the three 68-pin SCSI-3 connectors on the Antares 5070 RAIDium. The three SCSI cables must always be reconnected in the same order after a RAID set has been established, so you should clearly mark the cables and disk enclosures for future disassembly and reassembly.
  3. Configure the attached SCSI devices to use SCSI target IDs other than 7, as that is taken by the 5070 RAIDium itself. Configuring the target number is done differently on various devices. Consult the manufacturer's installation instructions to determine the method appropriate for your device.
  4. As you are likely to be installing multiple SCSI devices, make sure that all SCSI buses are properly terminated. This means a terminator is installed only at each end of each SCSI bus daisy chain.

Verifying the Hardware Installation:

These steps are optional but recommended. First, power-on your system and interrupt the booting process by pressing the "Stop" and "a" keys simultaneously (or the "break" key if you are on a serial terminal) as soon as the Solaris release number is shown on the screen. This will force the system to run the Forth Monitor in the system EPROM, which will display the "ok" prompt. This gives you access to many useful low-level commands, including:

    ok show-devs
    . . .
    /iommu@f,e0000000/sbus@f,e000100SUNW, isp@1,8800000
    . . .

The first line in the response shown above means that the 5070 RAIDium host adapter has been properly recognized. If you don't see a line like this, you may have a hardware problem.

Next, to see a listing of all the SCSI devices in your system, you can use the probe-scsi-all command, but first you must prepare your system as follows:

    ok setenv auto-boot? false
    ok reset
    ok probe-scsi-all

This will tell you the type, target number, and logical unit number of every SCSI device recognized in your system. The 5070 RAIDium board will report itself attached to an ISP controller at target 0 with two Logical Unit Numbers (LUNs): 0 for the virtual hard disk drive, and 7 for the connection to the Graphical User Interface (GUI). Note: the GUI communication channel on LUN 7 is currently unused under Linux. See the discussion under "SCSI Monitor Daemon (SMON)" in the "Advanced Topics" section for more information.
REQUIRED: Perform a reconfiguration boot of the operating system:

    ok boot -r

If no image appears on your screen within a minute, you most likely have a hardware installation problem. In this case, go back and check each step of the installation procedure. This completes the hardware installation procedure.

3.2.1  Serial Terminal

If you have a serial terminal at your disposal (e.g. DEC-VT420) it may be connected to the controller's serial port using a 9 pin DIN male to DB25 male serial cable. Otherwise you will need to supplement the above cable with a null modem adapter to connect the RAID controller's serial port to the serial port on either the host computer or a PC. The terminal emulators I have successfully used include Minicom (on Linux), Kermit (on Caldera's Dr. DOS), and Hyperterminal (on a Windows CE palmtop); however, any decent terminal emulation software should work. The basic settings are 9600 baud, no parity, 8 data bits, and 1 stop bit.

3.2.2  Hard Drive Plant

Choosing the brand and capacity of the drives that will form the hard drive physical plant is up to you. I do have some recommendations:

  * Remember, you generally get what you pay for. I strongly recommend paying the extra money for better (i.e. more reliable) hardware, especially if you are setting up a RAID for a mission critical project. For example, consider purchasing drive cabinets with redundant hot-swappable power supplies, etc.
  * You will also want a UPS for your host system and drive cabinets. Remember, RAID levels 3 and 5 protect you from data loss due to drive failure, NOT power failure.
  * The drive cabinet you select should have hot swappable drive bays. These cost more but are definitely worth it when you need to add/change drives.
  * Make sure the cabinet(s) have adequate cooling when fully loaded with drives.
* Keep your SCSI cables (internal and external) as short as possible   * Mark the drives/cabinet(s) in such a way that you will be able to reconnect them to the controller in their original configuration. Once the RAID is configured you cannot re-organize you drives without re-configuring the RAID (and subsequently erasing the data stored on it).   * Keep in mind that although it is physically possible to connect/configure up to 6 drives per channel, performance will sharply decrease for RAIDs with more than three drives per channel. This is due to the 25 MHz bandwidth limitation of the SBUS. Therefore, if read/write performance is an issue go with a small number of large drives. If you need a really large RAID (~ 1 terabyte) then you will have no other choice but to load the channels to capacity and pay the performance penalty. NOTE: if you are serving files over a 10/100 Base T network you may not notice the performance decrease since the network is usually the bottleneck not the SBUS. 3.3  5070 Onboard Configuration Before diving into the RAID configuration I need to define a few terms.   * "RaidRunner" is the name given to the the 5070 controller board.   * "Husky" is the name given to the shell which produces the ":raid;" command prompt. It is a command language interpreter that executes commands read from the standard input or from a file. Husky is a scaled down model of Unix's Bourne shell (sh). One major difference is that husky has no concept of current working directory. For more information on the husky shell and command prompt see the "Advanced Topics" section   * The "host port" is the SCSI ID assigned to the controller card itself. This is usually ID 7.   * A "backend" is a drive attached to the controller on a given channel.   * A "rank" is a collection of all the backends from each channel with the same SCSI ID (i.e. 
rank 0 would consist of all the drives with SCSI ID 0 on each channel).
  * Each of the backends is identified by a three-part number where the first part is the channel, the second the SCSI ID of the drive, and the third the LUN of the drive. The numbers are separated by a period. The identifier is prefixed with a "D" if it is a disk or "T" if it is a tape (e.g. D0.1.0). This scheme is referred to as <device_type_c.s.l> in the following documentation.
  * A "RAID set" consists of a given number of backends (there are certain requirements which I'll come to later).
  * A "spare" is a drive which is unused until there is a failure in one of the RAID drives. At that time the damaged drive is automatically taken offline and replaced with the spare. The data is then reconstructed on the spare and the RAID resumes normal operation.
  * Spares may either be "hot" or "warm" depending on user configuration. Hot spares are spun up when the RAID is started, which shortens the replacement time when a drive failure occurs. Warm spares are spun up when needed, which saves wear on the drive.

The text based GUI can be started by typing "agui" at the husky prompt on the serial terminal (or emulator):

    :raid; agui

Agui is a simple ASCII based GUI that runs on the RaidRunner console port and lets you configure the RaidRunner. The only argument agui takes is the type of terminal connected to the RaidRunner console. Currently supported terminals are dtterm, vt100 and xterm. The default is dtterm.

Each agui screen is split into two areas, data and menu. The data area, which generally uses all but the last line of the screen, displays the details of the information under consideration. The menu area, which generally is the bottom line of the screen, displays a strip menu with a title followed by a list of options or sub-menus. Each option has one character enclosed in square brackets (e.g. [Q]uit), which is the character to type to select that option. 
Each menu line allows you to refresh the screen data (in case another process on the RaidRunner writes to the console). The refresh character may also be used during data entry if the screen is overwritten. The refresh character is either or . When agui starts, it reads the configuration of the RaidRunner and probes for every possible backend. As it probes for each backend, its "name" is displayed in the bottom left corner of the screen.

3.3.1  Main Screen Options

The Main screen (Figure 3.1) is the first screen displayed. It provides a summary of the RaidRunner configuration. At the top is the RaidRunner model, version and serial number. Next is a line displaying, for each controller, the SCSI IDs for each host port (labeled A, B, C, etc) and the total and currently available amounts of memory. The next set of lines displays the ranks of devices on the RaidRunner. Each device follows the nomenclature <device_type_c.s.l>, where device_type is D for disk or T for tape, c is the internal channel the device is attached to, s is the SCSI ID (rank) of the device on that channel, and l is the SCSI LUN of the device (typically 0).

Figure 3.1: The main screen of the 5070 onboard configuration utility

The next set of lines provides a summary of the Raid Sets configured on the RaidRunner. The summary includes the raid set name, its type, its size, the amount of cache allocated to it and a comma separated list of its backends. See rconf in the "Advanced Topics" section for a full description of the above.

Next, the configured spare devices are listed. Each spare is named (device_type_c.s.l format), followed by its size (in 512-byte blocks), its spin state (Hot or Warm), its controller allocation, and finally its current status (Used/Unused, Faulty/Working). If used, the raid set that uses it is nominated.
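As a small illustration of the device naming scheme described above, the following shell sketch splits a backend name such as D1.5.0 into its parts. The function name parse_backend is hypothetical and is not part of the RaidRunner or husky command set; it is meant to run in a Bourne-style shell on the Linux host, and it checks only the leading type letter, not the rest of the name.

```shell
#!/bin/sh
# parse_backend: hypothetical helper (not a RaidRunner command).
# Splits a name of the form <device_type><c>.<s>.<l>, e.g. D1.5.0.
parse_backend() {
    name=$1
    case $name in
        D*) kind=disk ;;
        T*) kind=tape ;;
        *)  echo "unknown device type in '$name'" >&2; return 1 ;;
    esac
    rest=${name#?}          # drop the leading type letter
    channel=${rest%%.*}     # text before the first period
    rest=${rest#*.}
    scsi_id=${rest%%.*}     # the SCSI ID, which is also the rank
    lun=${rest#*.}
    echo "type=$kind channel=$channel rank=$scsi_id lun=$lun"
}

parse_backend D1.5.0
```

For D1.5.0 this reports a disk on channel 1 in rank 5 with LUN 0, which matches the way the Main screen groups devices into ranks by SCSI ID.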
At the bottom of the data area, the number of controllers, channels, ranks and devices is displayed. The menu line allows one to quit agui or select further actions or sub-menus.

  * [Q]uit: Exit the main screen and return to the husky prompt.
  * [R]aidSets: Enter the RaidSet configuration screen.
  * [H]ostports: Enter the Host Port configuration screen.
  * [S]pares: Enter the Spare Device configuration screen.
  * [M]onitor: Enter the SCSI Monitor configuration screen.
  * [G]eneral: Enter the General configuration/information screen.
  * [P]robe: Re-probe the device backends on the RaidRunner. As each backend is probed its "name" (c.s.l format) is displayed in the bottom left corner of the screen.

These selections are described in detail below.

3.3.2  [Q]uit

Exit the agui main screen and return to the husky ( :raid; ) prompt.

3.3.3  [R]aidSets:

The Raid Set Configuration screen (Figure 3.2) displays a Raid Set in the data area and provides a menu which allows you to Add, Delete, Modify, Install (changes) and Scroll through all other raid sets (First, Last, Next and Previous). If no raid sets have been configured, only the screen title and menu are displayed. All attributes of the raid set are displayed. For information on each attribute of the raid set, see the rconf command in the "Advanced Topics" section. The menu line allows one to leave the Raid Set Configuration screen or select further actions.

Figure 3.2: The RAIDSet configuration screen.

  * [Q]uit: Exit the Raid Set Configuration screen and return to the Main screen. If you have modified, deleted or added a raid set and have not installed the changes you will be asked to confirm this. If you select Yes to continue the exit, all changes made since the last install action will be discarded.
  * [I]nst: This action installs (into the RaidRunner configuration area) any changes that may have been made to raid sets, be that deletion, addition or modification. If you exit prior to installing, all changes made since the last installation will be discarded. The installation process takes time. It is complete once the typed "i" character is cleared from the menu line.
  * [M]od: This action allows you to modify the displayed raid set. You will be prompted for each Raid Set attribute that can be changed. The prompt includes the allowable options or required formats. If you don't wish to change a particular attribute, press the RETURN or TAB key. The attributes you can change are the raid set name, I/O mode, status (Active to Inactive), bootmode, spares usage, backend zone table usage, IO size (only if the raid set has never been used, i.e. it has just been added), cache size, I/O queue length, host interfaces and additional stargd arguments. If you wish to change a single attribute, use the RETURN or TAB key to skip all other options. The changed attribute will be re-displayed as soon as you press the RETURN key. When specifying cache size, you may suffix the number with 'm' or 'M' to indicate the number is in Megabytes or with 'k' or 'K' to indicate the number is in Kilobytes. Note that you can only enter whole integer values. When specifying IO size, you may suffix the number with 'k' or 'K' to indicate the number is in Kilobytes. When you enter data, it is checked for correctness; if it is incorrect, a message is displayed, all changes are discarded and you will have to start again. Remember you must install ([I]nst.) any changes.
  * [A]dd: When this option is selected you will be prompted for various attributes of the new raid set. 
    These attributes are the raid set name, the raid set type, the initial host interface the raid set is to appear on (in c.h.l format where c is the controller number, h is the host port (0, 1, 2 etc) and l is the SCSI LUN) and finally a list of backends. When backends are to be entered, the screen displays a list of available backends, each with a numeric index (commencing at 0). You select each backend by entering its index and, once complete, enter q for Quit. As each backend index is entered, its backend name is displayed in a comma separated list. When you enter data, it is checked for correctness; if it is incorrect, a message is displayed, the addition is ignored and you will have to start again. Once the backends are complete, the newly created raid set will be displayed on the screen with the supplied and default attributes. You can then modify the raid set to change other attributes. Remember you must install ([I]nst.) any new raid sets.
  * [D]elete: This action will delete the currently displayed raid set. If this raid set is Active, you will not be allowed to delete it. You will have to make it Inactive (via the [M]od. option) and then delete it. You will be prompted to confirm the deletion. Once you confirm the deletion, the screen will be cleared and the next raid set will be displayed, if one is configured. Remember you must install ([I]nst.) any changes.
  * [F]irst, [L]ast, [N]ext and [P]rev allow you to scroll through the configured raid sets.

3.3.4  [H]ostports:

The Host Port Configuration screen (Figure 3.3) displays, for each controller, each host port (labelled A, B, C, etc for port number 0, 1, 2, etc) and the assigned SCSI ID. If your RaidRunner has external switches for host port SCSI ID selection, you may only exit ([Q]uit) from this screen. If your RaidRunner does NOT have external switches for host port SCSI ID selection, then you may modify (and hence install) the SCSI ID for any host port.
The menu line allows one to leave the Host Port Configuration screen or select further actions (if there are NO external host port SCSI ID switches).

Figure 3.3: The host port configuration screen.

  * [Q]uit: Exit the Host Port Configuration screen and return to the Main screen. If you have modified a host port SCSI ID assignment and have not installed the changes you will be asked to confirm this. If you select Yes to continue the exit, all changes made since the last install action will be discarded.
  * [I]nstall: This action installs (into the RaidRunner configuration area) any changes that may have been made to host port SCSI ID assignments. If you exit prior to installing, all changes made since the last installation will be discarded. The installation process takes time. It is complete once the typed "i" character is cleared from the menu line.
  * [M]odify: This action allows you to modify the host port SCSI ID assignments for each host port on each controller (if there are NO external host port SCSI ID switches). You will be prompted for the SCSI ID for each host port. You can enter either a SCSI ID (0 thru 15), the minus "-" character to clear the SCSI ID assignment, or RETURN to skip. As you enter data, it is checked for correctness; if it is incorrect, a message will be printed, although previously entered correct data will be retained. Remember you must install ([I]nstall) any changes.

3.3.5  [S]pares:

The Spare Device Configuration screen (Figure 3.4) displays all configured spare devices in the data area and provides a menu which allows you to Add, Delete, Modify and Install (changes) spare devices. If no spare devices have been configured, only the screen title and menu are displayed.
Each spare device displayed shows its name (in device_type_c.s.l format), its size in 512-byte blocks, its spin status (Hot or Warm), its controller allocation, and finally its current status (Used/Unused, Faulty/Working). If used, the raid set that uses it is nominated. For information on each attribute of a spare device, see the rconf command in the "Advanced Topics" section. The menu line allows one to leave the Spare Device Configuration screen or select further actions.

Figure 3.4: The spare device configuration screen.

  * [Q]uit: Exit the Spare Device Configuration screen and return to the Main screen. If you have modified, deleted or added a spare device and have not installed the changes you will be asked to confirm this. If you select Yes to continue the exit, all changes made since the last install action will be discarded.
  * [I]nstall: This action installs (into the RaidRunner configuration area) any changes that may have been made to the spare devices, be that deletion, addition or modification. If you exit prior to installing, all changes made since the last installation will be discarded. The installation process takes time. It is complete once the typed "i" character is cleared from the menu line.
  * [M]odify: This action allows you to modify the unused spare devices. You will be prompted for each spare device attribute that can be changed. The prompt includes the allowable options or required formats. If you don't wish to change a particular attribute, press the RETURN key. The attributes you can change are the new size (in 512-byte blocks), the spin state (H for Hot or W for Warm), and the controller allocation (A for any, 0 for controller 0, 1 for controller 1, etc).
    If you wish to change a single attribute of a spare device, use the RETURN key to skip all other attributes for each spare device. The changed attribute will not be re-displayed until the last prompted attribute is entered (or skipped). When you enter data, it is checked for correctness; if it is incorrect, a message is displayed, all changes are discarded and you will have to start again. Remember you must install ([I]nstall) any changes.
  * [A]dd: When adding a spare device, the list of available devices is displayed and you are required to type in the device name. Once entered, the spare is added with defaults which you can change, if required, via the [M]odify option. Remember you must install ([I]nstall) any changes.
  * [D]elete: When deleting a spare device, the list of spare devices allowed to be deleted is displayed and you are required to type in the required device name. Once entered, the spare is deleted from the screen. Remember you must install ([I]nstall) any changes.

3.3.6  [M]onitor:

The SCSI Monitor Configuration screen (Figure 3.5) displays a table of SCSI monitors configured for the RaidRunner. Up to four SCSI monitors may be configured. The table columns are entitled Controller, Host Port, SCSI LUN and Protocol, and each line of the table shows the appropriate SCSI Monitor attribute. For details on SCSI Monitor attributes, see the rconf command in the "Advanced Topics" section. The menu line allows one to leave the SCSI Monitor Configuration screen or modify and install the table.

Figure 3.5: The SCSI monitor configuration screen.

  * [Q]uit: Exit the SCSI Monitor Configuration screen and return to the Main screen. If you have made changes and have not installed them you will be asked to confirm this.
    If you select Yes to continue the exit, all changes made since the last install action will be discarded.
  * [I]nstall: This action installs (into the RaidRunner configuration area) any changes that may have been made to the SCSI Monitor configuration. If you exit prior to installing, all changes made since the last installation will be discarded. The installation process takes time. It is complete once the typed "i" character is cleared from the menu line.
  * [M]odify: This action allows you to modify the SCSI Monitor configuration. The cursor will be moved around the table, prompting you for input. If you do not want to change an attribute, enter RETURN to skip. If you want to delete a SCSI monitor, enter the minus "-" character when prompted for the controller number. If you want to use the default protocol list, enter RETURN at the Protocol List prompt. As you enter data, it is checked for correctness; if it is incorrect, a message will be printed and any previously entered data is discarded. You will have to re-enter the data. Remember you must install ([I]nstall) any changes.

3.3.7  [G]eneral:

The General screen (Figure 3.6) has a blank data area and a menu which allows one to Quit and return to the main screen, or to select further sub-menus which provide information about Devices, the System Message Logger, Global Environment variables and throughput Statistics.

Figure 3.6: The General Screen. The options accessible from here allow you to view information on the attached devices (SCSI hard drives and tape units), browse the system logs, and examine environment variables.

  * [Q]uit: Exit the General screen and return to the Main screen.
  * [D]evices: Enter the Device information screen (Figure 3.7). The Devices screen displays the names of all devices on the RaidRunner.
    The menu line allows one to Quit and return to the General screen or display information about the devices.

    Figure 3.7: The device information screen.

    + [Q]uit: Exit the Devices screen and return to the General screen.
    + [I]nformation: The Device Information screen (Figure 3.8) displays information about each device. You can scroll through the devices. For disks, the information displayed includes the device name, serial number, vendor name, product id, speed, version, sector size, sector count, total device size in MB, number of cylinders, heads and sectors per track, and the zone/notch partitions. The menu line allows one to leave the Device Information screen or browse through devices.

      Figure 3.8: Example of the information displayed for a hard drive device.

      o [Q]uit: Exit the Device Information screen and return to the Devices screen.
      o [F]irst, [L]ast, [N]ext and [P]rev allow you to scroll through the devices and hence display their current data.
  * Sys[L]og: Enter the System Logger Messages screen (Figure 3.9).

    Figure 3.9: The system logger messages screen. An example message is shown; there is one message per screen.

    + [Q]uit: Exit the System Logger Messages screen and return to the General screen.
    + [F]irst, [L]ast, [N]ext and [P]rev allow you to scroll through the system log.
  * [E]nvironment: Enter the Global Environment Variable configuration screen (Figure 3.10).
    The Environment Variable Configuration screen displays all configured Global Environment Variables and provides a menu which allows you to Add, Delete, Modify and Install (changes) variables. Each variable name is displayed followed by an equals sign "=" and the value assigned to that variable enclosed in braces - "{" .. "}". The menu line allows you to Quit and return to the General screen or select further actions.

    Figure 3.10: The global environment variable configuration screen.

    + [Q]uit: Exit the Environment Variable Configuration screen and return to the General screen. If you have modified, deleted or added an environment variable and have not installed the changes you will be asked to confirm this. If you select Yes to continue the exit, all changes made since the last install action will be discarded.
    + [I]nst: This action installs (into the RaidRunner configuration area) any changes that may have been made to environment variables, be that deletion, addition or modification. If you exit prior to installing, all changes made since the last installation will be discarded. The installation process takes time. It is complete once the typed "i" character is cleared from the menu line.
    + [M]od: This action allows you to modify an environment variable's value. You will be prompted for the name of the environment variable and then for its new value. If the environment variable entered is not found, a message will be printed and you will not be prompted for a new value. If you do not enter a new value (i.e. you just press RETURN), no change will be made. Remember you must install ([I]nst) any changes.
    + [A]dd: When adding a new environment variable, you will be prompted for its name and value.
      Provided the variable name is not already used and you enter a value, the new variable will be added and displayed. Remember you must install ([I]nst) any changes.
    + [D]elete: When deleting an environment variable, you will be prompted for the variable name and, if it is valid, the environment variable will be deleted. Remember you must install ([I]nst) any changes.
  * [S]tats: Enter the Statistics monitoring screen (Figure 3.11). The Statistics screen displays various general and specific statistics about raid sets configured and running on the RaidRunner. The first section of the data area displays the current temperature in degrees Celsius and the current speed of the fans in the RaidRunner. The next section of the data area displays various statistics about the named raid set. The statistics are: the current cache hit rate; the cumulative number of reads, read failures, writes and write failures for each backend of the raid set; and finally the read and write throughput for each stargd process (indicated by its process id) that fronts the raid set. The menu line allows one to leave the Statistics screen or select further actions.

    Figure 3.11: The statistics monitoring screen

    + [Q]uit: Exit the Statistics screen and return to the General screen.
    + [F]irst, [L]ast, [N]ext and [P]rev allow you to scroll through the statistics.
    + [R]efresh: This option will get the statistics for the given raid set and re-display the current statistics on the screen.
    + [Z]ero: This option will zero the cumulative statistics for the currently displayed raid set.
    + [C]ontinuous: This option will start a background process that will update the statistics of the currently displayed raid set every 2 seconds. A loop counter is also created and updated every 2 seconds.
      To interrupt this continuous mode of gathering statistics, just press any character. If you need to refresh the display, then press the refresh characters - or .

3.3.8  [P]robe

The probe option re-scans the SCSI channels and updates the backend list with the hardware it finds.

3.3.9  Example RAID Configuration Session

The generalized procedure for configuration consists of three steps arranged in the following order:

 1. Configuring the Host Port(s)
 2. Assigning Spares
 3. Configuring the RAID set

Note that there is a minimum number of backends required for the various supported RAID levels:

  * Level 0 : 2 backends
  * Level 3 : 2 backends
  * Level 5 : 3 backends

In this example we will configure a RAID 5 using six 2.04 gigabyte drives. The total capacity of the virtual drive will be about 10 gigabytes (5 x 2.04, since the equivalent of one drive is used for redundancy). This same configuration procedure can be used to configure other levels of RAID sets by changing the type parameter.

 1. Power on the computer with the serial terminal connected to the RaidRunner's serial port.
 2. When the husky ( :raid; ) prompt appears, start the GUI by typing "agui" and pressing return.
 3. When the main screen appears, select "H" for [H]ostport configuration.
 4. On some models of RaidRunner the host port is not configurable. If you have only a [Q]uit option here then there is nothing further to be done for the host port configuration; note the values and skip to step 6. If you have add/modify options then your host port is software configurable.
 5. If there is no entry for a host port on this screen, add an entry with the parameters: controller=0, hostport=0, SCSI ID=0. Don't forget to [I]nstall your changes. If there is already an entry present, note the values (they will be used in a later step).
 6. From this point onward I will assume the following hardware configuration:
    a. There are seven 2.04 gig drives connected as follows:
       i. 
2 drives on SCSI channel 0 with SCSI IDs 0 and 1 (backends 0.0.0 and 0.1.0, respectively).
       ii. 3 drives on SCSI channel 1 with SCSI IDs 0, 1 and 5 (backends 1.0.0, 1.1.0, and 1.5.0).
       iii. 2 drives on SCSI channel 2 with SCSI IDs 0 and 1 (backends 2.0.0 and 2.1.0).
    b. Therefore:
       i. Rank 0 consists of backends 0.0.0, 1.0.0, 2.0.0
       ii. Rank 1 consists of backends 0.1.0, 1.1.0, 2.1.0
       iii. Rank 5 contains only the backend 1.5.0
    c. The RaidRunner is assigned to controller 0, hostport 0
 7. Press Q to [Q]uit the hostports screen and return to the Main screen.
 8. Press S to enter the [S]pares screen.
 9. Select A to [A]dd a new spare to the spares pool. A list of available backends will be displayed and you will be prompted for the following information:
    Enter the device name to add to spares - from above: enter D1.5.0
10. Select I to [I]nstall your changes.
11. Select Q to [Q]uit the spares screen and return to the Main screen.
12. Select R from the Main screen to enter the [R]aidsets screen.
13. Select A to [A]dd a new RAID set. You will be prompted for each of the RAID set parameters. The prompts and responses are given below.
    a. Enter the name of Raid Set: cim_homes (or whatever you want to call it).
    b. Raid set type [0,1,3,5]: 5
    c. Enter initial host interface - ctlr,hostport,scsilun: 0.0.0
       Now a list of the available backends will be displayed in the form:
           0 - D0.0.0
           1 - D1.0.0
           2 - D2.0.0
           3 - D0.1.0
           4 - D1.1.0
           5 - D2.1.0
    d. Enter index from above - Q to Quit: enter each of the six indices in turn, pressing return after each one, then enter Q:
           0 press return
           1 press return
           2 press return
           3 press return
           4 press return
           5 press return
           Q
14. After pressing Q you will be returned to the Raid Sets screen. You should see the newly configured Raid set displayed in the data area (Figure 3.12).
15. Press I to [I]nstall the changes.

    Figure 3.12: The RaidSets screen of the GUI showing the newly configured RAID 5

16. 
Press Q to exit the RaidSet screen and return to the Main screen.
17. Press Q to [Q]uit agui and exit to the husky prompt.
18. Type "reboot" then press enter. This will reboot the RaidRunner (not the host machine).
19. When the RaidRunner reboots it will prepare the drives for the newly configured RAID. NOTE: Depending on the size of the RAID this could take a few minutes to a few hours. For the above example it takes the 5070 approximately 10 - 20 minutes to stripe the RAID set.
20. Once you see the husky prompt again the RAID is ready for use. You can then proceed with the Linux configuration.

3.4  Linux Configuration

These instructions cover setting up the virtual RAID drives on RedHat Linux 6.1. Setting it up under other Linux distributions should not be a problem; the same general instructions apply. If you are new to Linux you may want to consider installing Linux from scratch, since the RedHat installer will do most of the configuration work for you. If so, skip to the section titled "New Linux Installation." Otherwise go to the "Existing Linux Installation" section (next).

3.4.1  Existing Linux Installation

Follow these instructions if you already have RedHat Linux installed on your system and you do not want to re-install. If you are installing the RAID as part of a new RedHat Linux installation (or are re-installing) skip to the "New Linux Installation" section.

QLogic SCSI Driver

The driver can either be loaded as a module or compiled into your kernel. If you want to boot from the RAID then you may want to use a kernel with compiled-in QLogic support (see the kernel-HOWTO available from http://www.linuxdoc.org). To use the modular driver, become the superuser and add the following line to /etc/conf.modules:

    alias qlogicpti /lib/modules/preferred/scsi/qlogicpti

Change the above path to wherever your SCSI modules live.
Then add the following line to your /etc/fstab (with the appropriate changes for device and mount point; see the fstab man page if you are unsure):

    /dev/sdc1 /home ext2 defaults 1 2

Or, if you prefer to use a SYSV initialization script, create a file called "raid" in the /etc/rc.d/init.d directory with the following contents (NOTE: while there are a few good reasons to start the RAID using a script, one of the aforementioned methods would be preferable):

    #!/bin/bash
    case "$1" in
      start)
        echo "Loading raid module"
        /sbin/modprobe qlogicpti
        echo
        echo "Checking and Mounting raid volumes..."
        mount -t ext2 -o check /dev/sdc1 /home
        touch /var/lock/subsys/raid
        ;;
      stop)
        echo "Unmounting raid volumes"
        umount /home
        echo "Removing raid module(s)"
        /sbin/rmmod qlogicpti
        rm -f /var/lock/subsys/raid
        echo
        ;;
      restart)
        $0 stop
        $0 start
        ;;
      *)
        echo "Usage: raid {start|stop|restart}"
        exit 1
    esac
    exit 0

You will need to edit this example and substitute your device name(s) in place of /dev/sdc1 and mount point(s) in place of /home. The next step is to make the script executable by root:

    chmod 0700 /etc/rc.d/init.d/raid

Now use your run level editor of choice (tksysv, ksysv, etc.) to add the script to the appropriate run level.

Device mappings

Linux uses dynamic device mappings. You can determine if the drives were found by typing:

    more /proc/scsi/scsi

One or more of the entries should look something like this:

    Host: scsi1 Channel: 00 Id: 00 Lun: 00
      Vendor: ANTARES Model: CX106 Rev: 0109
      Type: Direct-Access ANSI SCSI revision: 02

There may also be one which looks like this:

    Host: scsi1 Channel: 00 Id: 00 Lun: 07
      Vendor: ANTARES Model: CX106-SMON Rev: 0109
      Type: Direct-Access ANSI SCSI revision: 02

This is the SCSI monitor communications channel, which is currently unused under Linux (see SMON in the advanced topics section below). To locate the drives (following reboot) type:

    dmesg | more

Locate the section of the boot messages pertaining to your SCSI devices.
You should see something like this:

    qpti0: IRQ 53 SCSI ID 7 (Firmware v1.31.32)(Firmware 1.25 96/10/15) [Ultra Wide, using single ended interface]
    QPTI: Total of 1 PTI Qlogic/ISP hosts found, 1 actually in use.
    scsi1 : PTI Qlogic,ISP SBUS SCSI irq 53 regs at fd018000 PROM node ffd746e0

This indicates that the SCSI controller was properly recognized. Below this, look for the disk section:

    Vendor: ANTARES Model: CX106 Rev: 0109
    Type: Direct-Access ANSI SCSI revision: 02
    Detected scsi disk sdc at scsi1, channel 0, id 0, lun 0
    SCSI device sdc: hdwr sector= 512 bytes. Sectors= 20971200 [10239 MB] [10.2 GB]

Note the line that reads "Detected scsi disk sdc ...". This tells you that this virtual disk has been mapped to device /dev/sdc. Following partitioning, the first partition will be /dev/sdc1, the second will be /dev/sdc2, etc. There should be one of the above disk sections for each virtual disk that was detected. There may also be an entry like the following:

    Vendor: ANTARES Model: CX106-SMON Rev: 0109
    Type: Direct-Access ANSI SCSI revision: 02
    Detected scsi disk sdd at scsi1, channel 0, id 0, lun 7
    SCSI device sdd: hdwr sector= 512 bytes. Sectors= 20971200 [128 MB] [128.2 MB]

BEWARE: this is not a drive. DO NOT try to fdisk, mkfs, or mount it!! Doing so WILL hang your system.

Partitioning

A virtual drive appears to the host operating system as a large but otherwise ordinary SCSI drive. Partitioning is performed using fdisk or your favorite utility. You will have to give the virtual drive a disk label when fdisk is started. Using the choice "Custom with autoprobed defaults" seems to work well. See the man page for the given utility for details.
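The device-mapping check described above can be sketched as a small shell filter. It reads boot messages in the format shown in this section, pairs each "Detected scsi disk" line with the Model field that precedes it, and flags the SMON channel so that it is not mistaken for a drive. The function name map_antares is made up for this example, and the parsing assumes the message layout shown above; other kernel versions may print these lines differently.

```shell
#!/bin/sh
# map_antares: hypothetical helper; reads dmesg-style text on stdin.
map_antares() {
    awk '
        /Vendor/ {
            # Remember whether this device is an ANTARES unit and its model.
            antares = ($0 ~ /ANTARES/)
            model = ""
            for (i = 1; i < NF; i++) if ($i == "Model:") model = $(i + 1)
        }
        /Detected scsi disk/ {
            dev = $4    # e.g. "sdc" in "Detected scsi disk sdc at ..."
            if (antares && model ~ /SMON/)
                print "/dev/" dev, model, "SMON channel: do NOT fdisk, mkfs or mount"
            else if (antares)
                print "/dev/" dev, model, "virtual disk"
            antares = 0
        }
    '
}

# Typical use on the host:  dmesg | map_antares
```

Each output line names a /dev entry, the reported model, and whether it is a usable virtual disk or the SMON channel.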
Installing a filesystem

Installing a filesystem is no different from any other SCSI drive:

    mkfs -t <filesystem type> /dev/<device>

for example:

    mkfs -t ext2 /dev/sdc1

Mounting

If QLogic SCSI support is compiled into your kernel OR you are loading the "qlogicpti" module at boot from /etc/conf.modules then add the following line(s) to /etc/fstab:

    /dev/<device> <mount point> ext2 defaults 1 1

If you are using a SystemV initialization script to load/unload the module you must mount/unmount the drives there as well. See the example script above.

3.4.2  New Linux Installation

This is the easiest way to install the RAID since the RedHat installer program will do most of the work for you.

 1. Configure the host port, RAID sets, and spares as outlined in "Onboard Configuration." Your computer must be on to perform this step since the 5070 is powered from the SBUS. It does not matter if the computer has an operating system installed at this point; all we need is power to the controller card.
 2. Begin the RedHat SparcLinux installation.
 3. The installation program will auto detect the 5070 controller and load the Qlogic driver.
 4. Your virtual RAID drives will appear as ordinary SCSI hard drives to be partitioned and formatted during the installation. NOTE: When using the graphical partitioning utility during the RedHat installation DO NOT designate any partition on the virtual drives as type RAID since they are already hardware managed virtual RAID drives. The RAID selection on the partitioning utility's screen is for setting up a software RAID. IMPORTANT NOTE: you may see a small SCSI drive (usually ~128 MB) on the list of available drives. DO NOT select this drive for use. It is the SMON communication channel, NOT a drive. If setup tries to use it, it will hang the installer.
 5. That's it, the installation program takes care of everything else!!
3.5  Maintenance

3.5.1  Activating a spare

When running a RAID 3 or 5 (if you configured one or more drives to be spares) the 5070 will detect when a drive goes offline and automatically select a spare from the spares pool to replace it. The data will be rebuilt on-the-fly. The RAID will continue operating normally during the re-construction process (i.e. it can be read from and written to just as if nothing has happened). When a backend fails you will see messages similar to the following displayed on the 5070 console:

    930 secs: Redo:1:1 Retry:1 (DIO_cim_homes_D1.1.0_q1) CDB=28(Read_10)Re-/Selection Time-out @682400+16
    932 secs: Redo:1:1 Retry:2 (DIO_cim_homes_D1.1.0_q1) CDB=28(Read_10)Re-/Selection Time-out @682400+16
    933 secs: Redo:1:1 Retry:3 (DIO_cim_homes_D1.1.0_q1) CDB=28(Read_10)Re-/Selection Time-out @682400+16
    934 secs: CIO_cim_homes_q3 R5_W(3412000, 16): Pre-Read drive 4 (D1.1.0) fails with result "Re-/Selection Time-out"
    934 secs: CIO_cim_homes_q2 R5: Drained alternate jobs for drive 4 (D1.1.0)
    934 secs: CIO_cim_homes_q2 R5: Drained alternate jobs for drive 4 (D1.1.0) RPT 1/0
    934 secs: CIO_cim_homes_q2 R5_W(524288, 16): Initial Pre-Read drive 4 (D1.1.0) fails with result "Re-/Selection Time-out"
    935 secs: Redo:1:0 Retry:1 (DIO_cim_homes_D1.0.0_q1) CDB=28(Read_10)SCSI Bus ~Reset detected @210544+16
    936 secs: Failed:1:1 Retry:0 (rconf) CDB=2A(Write_10)Re-/Selection Time-out @4194866+128
    ...

Then you will see the spare being pulled from the spares pool, spun up, tested, engaged, and the data reconstructed.

    937 secs: autorepair pid=1149 /raid/cim_homes: Spinning up spare device
    938 secs: autorepair pid=1149 /raid/cim_homes: Testing spare device /dev/hd/1.5.0/data
    939 secs: autorepair pid=1149 /raid/cim_homes: engaging hot spare ...
    939 secs: autorepair pid=1149 /raid/cim_homes: reconstructing drive 4 ...
    939 secs: 1054
    939 secs: Rebuild on /raid/cim_homes/repair: Max buffer 2800 in 7491 reads, priority 6 sleep 500
    ...
The rebuild script will print out its progress after every 10% of the job is completed:

    939 secs: Rebuild on /raid/cim_homes/repair @ 0/7491
    1920 secs: Rebuild on /raid/cim_homes/repair @ 1498/7491
    2414 secs: Rebuild on /raid/cim_homes/repair @ 2247/7491
    2906 secs: Rebuild on /raid/cim_homes/repair @ 2996/7491

3.5.2  Re-integrating a repaired drive into the RAID (levels 3 and 5)

After you have replaced the bad drive you must re-integrate it into the RAID set using the following procedure.

 1. Start the text GUI.
 2. Look at the list of backends for the RAID set(s).
 3. Backends that have been marked faulty will have a (-) to the right of their ID (e.g. D1.1.0-).
 4. If you set up spares, the ID of the faulty backend will be followed by the ID of the spare that has replaced it (e.g. D1.1.0-D1.5.0).
 5. Write down the ID(s) of the faulty backend(s) (NOT the spares).
 6. Press Q to exit agui.
 7. At the husky prompt type:

        replace <name> <backend>

Where <name> is whatever you named the raid set and <backend> is the ID of the backend that is being re-integrated into the RAID. If a spare was in use it will be automatically returned to the spares pool. Be patient: reconstruction can take from a few minutes to several hours depending on the RAID level and the size. Fortunately, you can use the RAID as you normally would during this process.

3.6  Troubleshooting / Error Messages

3.6.1  Out of band temperature detected...

  * Probable Cause: The 5070 SBUS card is not adequately cooled.
  * Solution: Try to improve cooling inside the case. Clean dust from the fans, re-organize the cards so the raid card is closest to the fan, etc. On some of the "pizza box" Sun cases (e.g. SPARC 20) you may need to add supplementary cooling fans, especially if you have it loaded with cards.

3.6.2  ... failed ... cannot have more than 1 faulty backend.

  * Cause: More than one backend in the RAID 3/4/5 has failed (i.e. there is no longer sufficient redundancy to enable the lost data to be reconstructed).
  * Solution: You're hosed ... Sorry. If you did not assign spares when you configured your RAID 3/4/5, now may be a good time to re-consider the wisdom of that decision. Hopefully you have been making regular backups, since now you will have to replace the defective drives, re-configure the RAID, and restore the data from a secondary source.

3.6.3  When booting I see: ... Sun disklabel: bad magic 0000 ... unknown partition table.

  * Suspected Cause: Incorrect settings in the disk label set by fdisk (or whatever partitioning utility you used). This message seems to happen when you choose one of the preset disk labels rather than "Custom with autoprobed defaults."
  * Solution: Since this error does not seem to affect the operation of the drive you can choose to do nothing and be ok. If you want to correct it you can try re-labeling the disk or re-partitioning the disk and choose "Custom with autoprobed defaults." If you are installing RedHat Linux from scratch the installer will get all of this right for you.

3.7  Bugs

None yet! Please send bug reports to tcoates@neuropunk.org

3.8  Frequently Asked Questions

3.8.1  How do I reset/erase the onboard configuration?

At the husky prompt issue the following command:

    rconf -init

This will delete all of the RAID configuration information but not the global variables and scsi monitors. To remove ALL configuration information type:

    rconf -fullinit

Use these commands with caution!

3.8.2  How can I tell if a drive in my RAID has failed?

In the text GUI faulty backends appear with a (-) to the right of their ID. For example the list of backends:

    D0.0.0,D1.0.0-,D2.0.0,D0.1.0,D1.1.0,D2.1.0

indicates that backend (drive) D1.0.0 is either faulty or not present. If you assigned spares (RAID 3 or 5) then you should also see that one or more spares are in use. Both the main and the RaidSets screens will show information on faulty/not present drives in a RAID set.
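If you copy a backend list from the text GUI into a file or a note, the faulty entries can be picked out mechanically. The sketch below runs on the Linux host, not on the 5070; the function name faulty_backends is hypothetical, and it assumes only the two notations described in this HOWTO: a trailing "-" for a faulty backend and "faulty-spare" (e.g. D1.1.0-D1.5.0) when a spare has taken over.

```shell
#!/bin/sh
# faulty_backends (hypothetical helper): given a comma-separated backend list
# as its first argument, print each faulty backend ID on its own line.
# An entry of the form FAULTY-SPARE is printed as "FAULTY (spare SPARE)".
faulty_backends() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r entry; do
        case "$entry" in
            # Trailing "-": faulty, no spare engaged.
            *-)  printf '%s\n' "${entry%-}" ;;
            # "-" in the middle: faulty backend followed by its spare.
            *-*) printf '%s (spare %s)\n' "${entry%%-*}" "${entry#*-}" ;;
        esac
    done
}
```

For the list shown above, "faulty_backends D0.0.0,D1.0.0-,D2.0.0,D0.1.0,D1.1.0,D2.1.0" prints D1.0.0, which is the ID you would later pass to replace once the drive has been swapped.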
3.9  Advanced Topics: 5070 Command Reference

In addition to the text based GUI, the RAID configuration may also be manipulated from the husky prompt (the : raid; prompt) of the onboard controller. This section describes commands that a user can input interactively or via a script file to the K9 kernel. Since K9 is an ANSI C Application Programming Interface (API), a shell is needed to interpret user input and form output. Only one shell is currently available and it is called husky. The K9 kernel is modelled on the Plan 9 operating system whose design is discussed in several papers from AT&T (see the "Further Reading" section for more information). K9 is a kernel targeted at embedded controllers of small to medium complexity (e.g. ISDN-ethernet bridges, RAID controllers, etc). It supports multiple lightweight processes (i.e. without memory management) on a single CPU with a non-pre-emptive scheduler. Device driver architecture is based on Plan 9 (and Unix SVR4) STREAMS. Concurrency control mechanisms include semaphores and signals. The husky shell is modelled on a scaled down Unix Bourne shell. Using the built-in commands the user can write new scripts, thus extending the functionality of the 5070. The commands (adapted from the 5070 man pages) are extensive and are described below.

3.9.1  AUTOBOOT - script to automatically create all raid sets and scsi monitors

  * SYNOPSIS: autoboot
  * DESCRIPTION: autoboot is a husky script which is typically executed when a RaidRunner boots. The following steps are taken:
    1. Start all configured scsi monitor daemons (smon).
    2. Test to see if the total cache required by all the raid sets that are to boot is not more than 90% of available memory.
    3. Start all the scsi target daemons (stargd) and set each daemon's mode to "spinning-up" which enables it to respond to all non medium access commands from the host.
       This is done to allow hosts to gain knowledge about the RaidRunner's scsi targets as quickly as possible.
    4. Bind into the root (ram) filesystem all unused spare backend devices.
    5. Build all raid sets.
    6. If battery backed-up ram is present, check for any saved writes and restore them into the just built raid sets.
    7. Finally, set the state of all scsi target daemons to "spun-up", enabling hosts to fully access the raid sets behind them.

3.9.2  AUTOFAULT - script to automatically mark a backend faulty after a drive failure

  * SYNOPSIS: autofault raidset
  * DESCRIPTION: autofault is a husky script which is typically executed by a raid file system upon the failure of a backend of that raid set when that raid file system cannot use spare backends or has been configured not to use spare backends. After parsing its arguments (command and environment) autofault issues an rconf command to mark a given backend as faulty.
  * OPTIONS:
    + raidset: The bind point of the raid set whose backend failed.
    + $DRIVE_NUMBER: The index of the backend that failed. The first backend in a raid set is 0. This option is passed as an environment variable.
    + $BLOCK_SIZE: The raid set's io block size in bytes. (Ignored). This option is passed as an environment variable.
    + $QUEUE_LENGTH: The raid set's queue length. (Ignored). This option is passed as an environment variable.
  * SEE ALSO: rconf

3.9.3  AUTOREPAIR - script to automatically allocate a spare and reconstruct a raid set

  * SYNOPSIS: autorepair raidset size
  * DESCRIPTION: autorepair is a husky script which is typically executed by either a raid type 1, 3 or 5 file system upon the failure of a backend of that raid set. After parsing its arguments (command and environment) autorepair gets a spare device from the RaidRunner's spares pool. It then engages it in write-only mode and reads the complete raid device, which reconstructs the data on the spare. The read is from the raid file system repair entrypoint.
Reading from this entrypoint causes a read of a block immediately followed by a write of that block. The read/write sequence is atomic (i.e. is not interruptible). Once the reconstruction has completed, a check is made to ensure the spare did not fail during reconstruction and if not, the access mode of the spare device is set to the access mode of the raid set. The process that reads the repair entrypoint is rebuild. This device reconstruction will take anywhere from 10 minutes to one and a half hours depending on both the size and speed of the backends and the amount of activity the host is generating. During device reconstruction, pairs of numbers will be printed indicating each 10% of data reconstructed. The pairs of numbers are separated by a slash character, the first number being the number of blocks reconstructed so far and the second being the number of blocks to be reconstructed. Further status about the rebuild can be gained from running rebuild. When the spare is allocated both the number of spares currently used on the backend and the spare device name are printed. The number of spares on a backend is referred to as the depth of spares on the backend. Thus prior to re-engaging the spare after a reconstruction a check can be made to see if the depth is the same. If it is not, then the spare reconstruction failed and reconstruction using another spare is underway (or no spares are available), and hence we don't re-engage the drive.
  * OPTIONS:
    + raidset: The bind point of the raid set whose backend failed.
    + size: The size of the raid set in 512 byte blocks.
    + $DRIVE_NUMBER: The index of the backend that failed. The first backend in a raid set is 0. This option is passed as an environment variable.
    + $BLOCK_SIZE: The raid set's io block size in bytes. This option is passed as an environment variable.
    + $QUEUE_LENGTH: The raid set's queue length. This option is passed as an environment variable.
  * SEE ALSO: rconf, rebuild

3.9.4  BIND - combine elements of the namespace

  * SYNOPSIS: bind [-k] new old
  * DESCRIPTION: Bind replaces the existing old file (or directory) with the new file (or directory). If the "-k" switch is given then new must be a kernel recognized device (file system). Section 7k of the manual pages documents the devices (sometimes called file systems) that can be bound using the "-k" switch.

3.9.5  BUZZER - get the state or turn on or off the buzzer

  * SYNOPSIS: buzzer or buzzer on|off|mute
  * DESCRIPTION: Buzzer will either print the state of the buzzer, turn on or off the buzzer or mute it. If no arguments are given then the state of the buzzer is printed, that is, on or off will be printed if the buzzer is currently on or off respectively. If the buzzer has been muted, then you will be informed of this. If the buzzer has not been used since the RaidRunner has booted then the special state, unused, is printed. If the argument on is given the buzzer is turned on; if off, the buzzer is turned off. If the argument mute is given then the muted state of the buzzer is changed.
  * SEE ALSO: warble, sos

3.9.6  CACHE - display information about and delete cache ranges

  * SYNOPSIS: cache [-D moniker] [-I moniker] [-F] [-g moniker first|last] lastoffset
  * DESCRIPTION: cache will print (to standard output) information about the given cache range, delete a given cache range, flush the cache or return the last offset of all cache ranges.
  * OPTIONS:
    + -F: Flush all cache buffers to their backends (typically raid sets).
    + -D moniker: Delete the cache range with moniker (name) moniker.
    + -I moniker: Invalidate the cache for the given cache range (moniker). This is only useful for debugging or elaborate benchmarks.
    + -g moniker first|last: Print either the first or last block number of a cache range with moniker (name) moniker.
    + lastoffset: Print the last offset of all cache ranges.
      The last offset is the last block number of all cache ranges.

3.9.7  CACHEDUMP - Dump the contents of the write cache to battery backed-up ram

  * SYNOPSIS: cachedump
  * DESCRIPTION: cachedump causes all unwritten data in the RaidRunner's cache to be written out to the battery backed-up ram. No data will be written to battery backed-up ram if there is currently valid data already stored there. This command is typically executed when there is something wrong with the data (or its organization) in battery backed-up ram and you need to re-initialize it. cachedump will always return a NULL status.
  * SEE ALSO: showbat, cacherestore

3.9.8  CACHERESTORE - Load the cache with data from battery backed-up ram

  * SYNOPSIS: cacherestore
  * DESCRIPTION: cacherestore will check the RaidRunner's battery backed-up ram for any data it has stored as a result of a power failure. It will copy any data directly into the cache. This command is typically executed automatically at boot time and prior to the RaidRunner making its data available to a host. Having successfully copied any data from battery backed-up ram into the cache, it flushes the cache and then re-initializes battery backed-up ram to indicate it holds no data. cacherestore will return a NULL status on success or 1 if an error occurred during the loading (with a message written to standard error).
  * SEE ALSO: showbat

3.9.9  CAT - concatenate files and print on the standard output

  * SYNOPSIS: cat [ file... ]
  * DESCRIPTION: cat writes the contents of each given file, or standard input if none are given or when a file named `-' is given, to standard output. If the nominated file is a directory then the filenames contained in that directory are sent to standard out (one per line). More information on a file (e.g. its size) can be obtained by using stat. The script file ls uses cat and stat to produce directory listings.
  * SEE ALSO: echo, ls, stat

3.9.10  CMP - compare the contents of 2 files

  * SYNOPSIS: cmp [-b blockSize] [-c count] [-e] [-x] file1 file2
  * DESCRIPTION: cmp compares the contents of the 2 named files. If file1 is "-" then standard input is used for that file. If the files are the same length and contain the same values then nothing is written to standard output and the exit status NIL (i.e. true) is set. Where the 2 files differ, the first bytes that differ and the position are output to standard out and the exit status is set to "differ" (i.e. false). The position is given by a block number (origin 0) followed by a byte offset within that block (origin 0).

    The optional "-b" switch allows the blockSize of each read operation to be set. The default blockSize is 512 (bytes). For big compares involving disks a relatively large blockSize may be useful (e.g. 64k). See suffix for allowable suffixes.

    The optional "-c" switch allows the count of blocks read to be fixed. A value of 0 for count is interpreted as read to the end of file (EOF). To compare the first 64 Megabytes of 2 files the switches "-b 64k -c 1k" could be used. See suffix for allowable suffixes.

    The optional "-e" switch instructs cmp to output to standard out (usually overwriting the same line) the count of blocks compared, each time a multiple of 100 is reached. The final block count is also output.

    The optional "-x" switch instructs cmp to continue after a comparison error (but not a file error) and keep a count of blocks in error. If any errors are detected only the last one will be output when the command exits. If the "-e" switch is also given then the current count of blocks in error is output to the right of the multiple of 100 blocks compared.

    This command is designed to compare very large files. Two buffers of blockSize are allocated dynamically so their size is bounded by the amount of memory (i.e. RAM in the target) available at the time of command execution.
    The count could be up to 2G. The number of bytes compared is the product of blockSize and count (i.e. big enough).
  * SEE ALSO: suffix

3.9.11  CONS - console device for Husky

  * SYNOPSIS: bind -k cons bind_point
  * DESCRIPTION: cons allows an interpreter (e.g. Husky) to route console input and output to an appropriate device. That console input and output is available at bind_point in the K9 namespace. The special file cons should always be available.
  * EXAMPLES: Husky does the following in its initialization:

        bind -k cons /dev/cons

    On a Unix system this is equivalent to:

        bind -k unixfd /dev/cons

    On a DOS system this is equivalent to:

        bind -k doscon /dev/cons

    On target hardware using a SCN2681 chip this is equivalent to:

        bind -k scn2681 /dev/cons

  * SEE ALSO: unixfd, doscon, scn2681

3.9.12  DD - copy a file (disk, etc)

  * SYNOPSIS: dd [if=file] [of=file] [ibs=bytes] [obs=bytes] [bs=bytes] [skip=blocks] [seek=blocks] [count=blocks] [flags=verbose]
  * DESCRIPTION: dd copies a file (from the standard input to the standard output, by default) with a user-selectable blocksize.
  * OPTIONS:
    + if=file: Read from file instead of the standard input.
    + of=file: Write to file instead of the standard output.
    + ibs=bytes: Read the given number of bytes at a time.
    + obs=bytes: Write the given number of bytes at a time.
    + bs=bytes: Read and write the given number of bytes at a time. Overrides ibs and obs.
    + skip=blocks: Skip the given number of ibs-sized blocks at start of input.
    + seek=blocks: By-pass the given number of obs-sized blocks at start of output.
    + count=blocks: Copy only the given number of ibs-sized input blocks.
    + flags=verbose: Print (to standard output) the number of blocks copied every ten percent of the copy. The output is of the form X/T where X is the number of blocks copied so far and T is the total number of blocks to copy. This option can only be used if both the count= and of= options are also given.

    The decimal numbers given to "ibs", "obs", "bs", "skip", "seek" and "count" must not be negative.
    These numbers can optionally have a suffix (see suffix). dd outputs to standard out in all cases. A successful copy of 8 (full) blocks would cause the following output:

        8+0 records in
        8+0 records out

    The number after the "+" is the number of fractional blocks (i.e. blocks that are less than the block size) involved. This number will usually be zero (and is otherwise when physical media with alignment requirements is involved). A write failure outputting the last block on the previous example would cause the following output:

        Write failed
        8+0 records in
        7+0 records out

  * SEE ALSO: suffix

3.9.13  DEVSCMP - Compare a file's size against a given value

  * SYNOPSIS: devscmp filename size
  * DESCRIPTION: devscmp will find the size of the given file and compare its size in 512-byte blocks to the given size (also in 512-byte blocks). If the size of the file is less than the given value, then -1 is printed; if equal, then 0 is printed; and if the size of the given file is greater than the given size, then 1 is printed. This routine is used in internal scripts to ensure that backends of raid sets are of an appropriate size.

3.9.14  DFORMAT - Perform formatting functions on a backend disk drive

  * SYNOPSIS:
    + dformat -p c.s.l -R bnum
    + dformat -p c.s.l -pdA|-pdP|-pdG
    + dformat -p c.s.l -S [-v] [-B firstbn]
    + dformat -p c.s.l -F
    + dformat -p c.s.l -D file
  * DESCRIPTION: In its first form dformat will reassign a block on a nominated disk drive via the SCSI-2 REASSIGN BLOCKS command. The second form will allow you to print out the current manufacturer's defect list (-pdP), the grown defect list (-pdG) or both defect lists (-pdA). Each printed list is sorted with one defect per line in Physical Sector Format - Cylinder Number, Head Number and Defect Sector Number. The third form causes the drive to be scanned in a destructive write/read/compare manner.
If a read or write or data comparison error occurs then an attempt is made to identify the bad sector(s). Typically the drive is scanned from block 0 to the last block on the drive. You can optionally give an alternative starting block number. The fourth form causes a low level format on the specified device. The fifth form allows you to download a device's microcode into the device.
  * OPTIONS:
    + -R bnum: Specify a logical block number to reassign to the drive's grown defect list.
    + -pdA: Print both the manufacturer's and grown defect list.
    + -pdP: Print the manufacturer's defect list.
    + -pdG: Print the grown defect list.
    + -S: Perform a destructive scan of the disk reporting I/O errors.
    + -B firstbn: Specify the first logical block number to start a scan from.
    + -v: Turn on verbose mode, which prints the current block number being scanned.
    + -F: Issue a low-level SCSI format command to the given device. This will take some time.
    + -D file: Download the given file into the specified device. The download is effected by a single SCSI Write-Buffer command in save microcode mode. This allows users to update a device's microcode. Use this command carefully as you could destroy the device by loading an incorrect file.
    + -p c.s.l: Identify the disk device by specifying its channel, SCSI ID (rank) and SCSI LUN provided in the format "c.s.l".
  * SEE ALSO: Product manual for disk drives used in your RAID.

3.9.15  DIAGS - script to run a diagnostic on a given device

  * SYNOPSIS: diags disk -C count -L length -M io-mode -T io-type -D device
  * DESCRIPTION: diags is a husky script which is used to run the randio diagnostic on a given device. When randio is executed, it is executed in verbose mode.
  * OPTIONS:
    + disk: This is the device type of diagnostic we are to run.
    + -C count: Specify the number of times to execute the diagnostic.
    + -L length: Specify the "length" of the diagnostic to execute.
      This can be either short, medium or long and specified with the letters s, m or l respectively. In the case of a disk, a short test will test the first 10% of the device, a medium the first 50% and long the whole (100%) of the disk.
    + -M io-mode: Specify a destructive (read-write) or non-destructive (read-only) test. Use either read-write or read-only.
    + -T io-type: Specify a type of io - either sequential or random.
    + -D device: Specify the device to test.
  * SEE ALSO: randio, scsihdfs

3.9.16  DPART - edit a scsihd disk partition table

  * SYNOPSIS:
    + dpart -a|d|l|m -D file [-N name] [-F firstblock] [-L lastblock]
    + dpart -a -D file -N name -F firstblock -L lastblock
    + dpart -d -D file -N name
    + dpart -l -D file
    + dpart -m -D file -N name -F firstblock -L lastblock
  * DESCRIPTION: Each scsihd device (typically a SCSI disk drive) can be divided up into eight logical partitions. By default when a scsihd device is bound into the RaidRunner's file system, it has four partitions: the whole device (raw), typically named bindpoint/raw; the partition file (bindpoint/partition); the RaidRunner backup configuration file (bindpoint/rconfig); and the "data" portion of the disk (bindpoint/data), which represents the whole device less the backup configuration area and partition file. For more information, see scsihdfs. If other partitions are added, then they will appear as bindpoint/partitionname. dpart allows you to edit or list the partition table on a scsihd device (typically a disk).
  * OPTIONS:
    + -a: Add a partition. When adding a partition, you need to specify the partition name (-N) and the partition range from the first block (-F) to the last block (-L).
    + -d: Delete a named (-N) partition.
    + -l: List all partitions.
    + -m: Modify an existing partition. You will need to specify the partition name (-N) and BOTH its first (-F) and last (-L) block numbers even if you are just modifying the last block number.
    + -D file: Specify the partition file to be edited. Typically, this is the bindpoint/partition file.
    + -N name: Specify the partition name.
    + -F firstblock: Specify the first block number of the partition.
    + -L lastblock: Specify the last block number of the partition.
  * SEE ALSO: scsihd

3.9.17  DUP - open file descriptor device

  * SYNOPSIS: bind -k dup bind_point
  * DESCRIPTION: The dup device makes a one level directory with an entry in that directory for every open file descriptor of the invoking K9 process. These directory "entries" are the numbers. Thus a typical process (script) binding a dup device would at least make these files in the namespace: "bind_point/0", "bind_point/1" and "bind_point/2". These would correspond to its open standard in, standard out and standard error file descriptors. A dup device allows other K9 processes to access the open file descriptors of the invoking process. To do this the other processes simply "open" the required dup device directory entry whose name (a number) corresponds to the required file descriptor.

3.9.18  ECHO - display a line of text

  * SYNOPSIS: echo [string ...]
  * DESCRIPTION: echo writes each given string to the standard output, with a space between them and a newline after the last one. Note that all the string arguments are written in a single write kernel call. The following backslash-escaped characters in the strings are converted as follows:

        \b      backspace
        \c      suppress trailing newline
        \f      form feed
        \n      new line
        \r      carriage return
        \t      horizontal tab
        \v      vertical tab
        \\      backslash
        \nnn    the character whose ASCII code is nnn (octal)

  * SEE ALSO: cat

3.9.19  ENV - environment variables file system

  * SYNOPSIS: bind -k env bind_point
  * DESCRIPTION: env file system associates a one level directory with the bind_point in the K9 namespace. Each file name in that directory is the name of the environment variable while the contents of the file is that variable's current value.
Conceptually each process sees its own copy of the env file system. This copy is either empty or inherited from this process's parent at spawn time (depending on the flags to spawn).

3.9.20  ENVIRON - RaidRunner Global environment variables - names and effects

  * DESCRIPTION: The RaidRunner uses GLOBAL environment variables to control the functionality of automatic actions. GLOBAL environment variables are saved in the Raid configuration area so they retain their values between reboots/power downs. Certain RaidRunner internal run-time variables can also be set as GLOBAL environment variables. See the internals manual entry for details. The list below describes those GLOBAL environment variables that are used by the RaidRunner in its normal operation.

    + RebuildPri: This variable, if set, controls the priority used when drive reconstruction occurs via the rebuild program. If the variable is not set then the default rebuild priority is used. The variable is to be a comma separated list of raid set names and their associated rebuild priorities and sleep periods (colon separated). The form is

          Rname_1:Pri_1:Sleep_1,Rname_2:Pri_2:Sleep_2,...,Rname_N:Pri_N:Sleep_N

      where Pri_1 is to be the priority the rebuild program runs with when run on raid set Rname_1, Sleep_1 is the period, in milliseconds, to sleep between each rebuild action on the raid set, Pri_2 is to be the priority for raid set Rname_2, and so forth. For example, if the value of RebuildPri is R:5:30000 then if a rebuild occurs (via replace, repair or autorepair) on raid set R then the rebuild will run with priority 5 (via the -p rebuild option) and will sleep 30000 milliseconds (30 seconds) between each rebuild action (specified via the -S rebuild option). The priority given must be valid for the rebuild program.

    + BackendRanks: On certain RaidRunners where multiple controllers may exist, you can restrict a controller's access to the backend ranks of devices available.
      For example, you may have 2 controllers and 4 ranks of backend devices. You can specify that the first controller can only access the first two ranks and the second controller, the second two ranks. This variable along with other associated commands allows you to set up this restriction. Additionally, you may only have a single controller RaidRunner which is in an enclosure with multiple ranks. By default the controller will attempt to probe for all devices on all ranks. If you have only populated the RaidRunner with, say, half its possible complement of backend devices, then the RaidRunner will still probe for the other half. Setting this variable appropriately will prevent this unneeded (and on occasion time consuming) process. This variable takes the form

          controller_id:ranklist controller_id:ranklist ...

      where controller_id is the controller number (from 0 upwards) and ranklist is a comma separated list of backend ranks which the given controller will access. Note that the backend rank is the scsi-id of that rank. For example, on a 2 rank (rank 1 and 2 - i.e. scsi id 1 for the first rank and scsi id 2 for the second), 1 controller RaidRunner where only the first rank has devices, you could prevent the controller from attempting to access the (empty) second rank by setting BackendRanks to

          0:1

      Typically, you would not set this variable directly, but use supporting commands to set it. These commands are pranks and sranks. See these manual entries for details.

    + RAIDn_reference_PBUFS: Raid types 3, 4 and 5 all make use of memory for temporary parity buffers when they need to create parity data. This memory is in addition to that allocated to a raid set's cache. When a raid set is created, it will also create a default number of parity buffers (which are the same size as a raid set's iosize).
    Sometimes, if the iosize of the raid set is large, there will not be enough memory to create this default number of parity buffers. To overcome this situation, you can set GLOBAL environment variables to override the default number of parity buffers that all raid sets of a particular type, or a specific raid set, will use. You need to set these variables before you define the raid set via agui. If you later delete the variables but not the raid set, then the affected raid sets may not boot and hence will not be accessible by a host. The variables are of the form
      RAIDn_reference_PBUFS
    where n is the raid type (3, 4 or 5), and reference is the raid set's name or the string 'Default'. You use the reference 'Default' to specify all raid sets of a particular type. For example, to override the number of parity buffers for a raid 5 named 'FRED' (and set 64 parity buffers):
      : raid ; setenv RAID5_FRED_PBUFS 64
    To override the number of parity buffers for ALL raid 3's (and set 128 parity buffers):
      : raid ; setenv RAID3_Default_PBUFS 128
    If you set a default for all raid sets of a particular type, but want ONE of them to be different, then set up a variable for that particular raid set, as its value will override the default. In the above example, where all Raid Type 3 will have 128 parity buffers, you could set the variable
      : raid ; setenv RAID3_Dbase_PBUFS 56
    which will allow the raid 3 raid set named 'Dbase' to have 56 parity buffers, while all other raid 3's defined on the RaidRunner will have 128.
  * SEE ALSO: setenv, printenv, rconf, rebuild, internals

3.9.21  EXEC - cause arguments to be executed in place of this shell
  * SYNOPSIS: exec [ arg ... ]
  * DESCRIPTION: exec causes the command specified by the first arg to be executed in place of this shell without creating a new process. Subsequent args are passed to the command specified by the first arg as its arguments. Shell redirection may appear and, if no other arguments are given, causes the shell input/output to be modified.
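Referring back to the ENVIRON entry (3.9.20), the RebuildPri value format and the RAIDn_reference_PBUFS override order can be sketched in a few lines of Python. This is only an illustrative model and not RaidRunner code: the function names, the dictionary standing in for the GLOBAL environment, and the builtin_default argument (a placeholder for the controller's own default buffer count, which this document does not state) are all assumptions.

```python
def parse_rebuild_pri(value):
    """Parse 'Rname:Pri:Sleep,...' into {raid_set_name: (priority, sleep_ms)}."""
    table = {}
    for entry in value.split(","):
        name, pri, sleep_ms = entry.split(":")
        table[name] = (int(pri), int(sleep_ms))
    return table


def parity_buffers(env, raid_type, raid_name, builtin_default):
    """Resolve the parity buffer count for one raid set.

    A variable naming the raid set wins over the 'Default' variable for
    its raid type. builtin_default is a hypothetical stand-in for the
    controller's own default, used when neither variable is set.
    """
    for reference in (raid_name, "Default"):
        key = "RAID%d_%s_PBUFS" % (raid_type, reference)
        if key in env:
            return int(env[key])
    return builtin_default


# The dictionary mirrors the setenv examples from the ENVIRON entry.
env = {"RAID3_Default_PBUFS": "128", "RAID3_Dbase_PBUFS": "56"}
print(parse_rebuild_pri("R:5:30000"))
print(parity_buffers(env, 3, "Dbase", 0))
print(parity_buffers(env, 3, "Other", 0))
```

With the values from the examples above, the raid 3 set named Dbase resolves to 56 buffers and any other raid 3 set falls back to the type-wide 128.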

3.9.22  EXIT - exit a K9 process
  * SYNOPSIS: exit [string]
  * DESCRIPTION: exit has an optional string argument. If the optional argument is given the current K9 process is terminated with the given string as its exit value. (If the string has embedded spaces then the whole string should be a quoted_string.) If no argument is given then the shell gets the string associated with the environment variable "status" and returns that string as the exit value. If the environment variable "status" is not found then the "true" exit status (i.e. NIL) is returned.
  * SEE ALSO: true, K9exit

3.9.23  EXPR - evaluation of numeric expressions
  * SYNOPSIS: expr numeric_expr ...
  * DESCRIPTION: expr evaluates each numeric_expr command line argument as a separate numeric expression. Thus a single expression must either contain no unescaped whitespace or be placed in a quoted string (i.e. between "{" and "}"). Arithmetic is performed on signed integers (currently numbers in the range from -2,147,483,648 to 2,147,483,647). Successful calculations cause no output (to either standard out/error or environment variables), so each useful numeric_expr needs to include an assignment (or op-assignment). Each numeric_expr argument supplied is evaluated in the order given (i.e. left to right) until they all evaluate successfully (returning a true status). If evaluating a numeric_expr fails (usually due to a syntax error) then the expr command fails with "error" as the exit status and the error message is written to the environment variable "error".
  * OPERATORS: The precedence of each operator is shown following the description in square brackets. "0" is the highest precedence. Within a single precedence group evaluation is left-to-right except for assignment operators which are right-to-left. Parentheses have higher precedence than all operators and can be used to change the default precedence shown below.
    UNARY OPERATORS
      +    Does nothing to expression/number to the right.
      -    Negates expression/number to the right.
      !    Logically negates expression/number to the right.
      ~    Bitwise negates expression/number to the right.
    BINARY ARITHMETIC OPERATORS
      *    Multiply enclosing expressions. [2]
      /    Integer division of enclosing expressions. [2]
      %    Modulus of enclosing expressions. [2]
      +    Add enclosing expressions. [3]
      -    Subtract enclosing expressions. [3]
      <<   Shift the left expression left by the number in the right expression. Equivalent to: left * (2 ** right) [4]
      >>   Shift the left expression right by the number in the right expression. Equivalent to: left / (2 ** right) [4]
      &    Bitwise AND of enclosing expressions. [7]
      ^    Bitwise exclusive OR of enclosing expressions. [8]
      |    Bitwise OR of enclosing expressions. [9]
    BINARY LOGICAL OPERATORS
    These logical operators yield the number 1 for a true comparison and 0 for a false comparison. For logical ANDs and ORs their left and right expressions are assumed to be false if 0, otherwise true. Both logical ANDs and ORs evaluate both their left and right expressions in all cases (cf. C's short-circuit action).
      <=   True when left less than or equal to right. [5]
      >=   True when left greater than or equal to right. [5]
      <    True when left less than right. [5]
      >    True when left greater than right. [5]
      ==   True when left equal to right. [6]
      !=   True when left not equal to right. [6]
      &&   Logical AND of enclosing expressions. [10]
      ||   Logical OR of enclosing expressions. [11]
    ASSIGNMENT OPERATORS
    In the following descriptions "n" is an environment variable while "r_exp" is an expression to the right. All assignment operators have the same precedence, which is lower than all other operators. N.B. Multiple assignment operators group right-to-left (i.e. same as C language).
      =    Assign right expression into environment variable on left.
      *=   n *= r_exp is equivalent to: n = n * r_exp
      /=   n /= r_exp is equivalent to: n = n / r_exp
      %=   n %= r_exp is equivalent to: n = n % r_exp
      +=   n += r_exp is equivalent to: n = n + r_exp
      -=   n -= r_exp is equivalent to: n = n - r_exp
      <<=  n <<= r_exp is equivalent to: n = n << r_exp
      >>=  n >>= r_exp is equivalent to: n = n >> r_exp
      &=   n &= r_exp is equivalent to: n = n & r_exp
      |=   n |= r_exp is equivalent to: n = n | r_exp
  * NUMBERS: All numbers are signed integers in the range stated in the description above. Numbers can be input in base 2 through to base 36. Base 10 is the default base. The default base can be overridden by:
      1. a leading "0": implies octal or hexadecimal
      2. a number of the form _base_#_num_
    Numbers prefixed with "0" are interpreted as octal. Numbers prefixed with "0x" or "0X" are interpreted as hexadecimal. For numbers using the "#" notation the _base_ must be in the range 2 through to 36 inclusive. For bases greater than 10 the letters "a" through "z" are utilised for the extra "digits". Upper and lower case letters are acceptable. Any single digit that exceeds (or is equal to) the base is considered an error. Only base 10 numbers may have a suffix. See suffix for a list of valid suffixes. Also note that since expr uses signed integers, "1G" is the largest magnitude number that can be represented with the "Gigabyte" suffix (assuming 32 bit signed integers, -2G is invalid due to the order of evaluation).
  * VARIABLES: The only symbolic variables allowed are K9 environment variables. Regardless of whether they are being read or written they should never appear preceded by a "$". Environment variables that did not previously exist and that appear as the left argument of an assignment are created. When a non-existent environment variable is read it is interpreted as the value 0.
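    The number formats described under NUMBERS can be modelled with a short Python sketch. This is not the expr implementation: the function name parse_number is hypothetical, suffixes are deliberately left out because their list lives in the suffix entry, and only unsigned literals are handled (negation is a separate unary operator in expr).

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1  # signed 32-bit range quoted for expr


def parse_number(text):
    """Convert one unsigned expr-style numeric literal to an int."""
    if "#" in text:
        # base#num form: the base itself is written in decimal
        base_text, digits = text.split("#", 1)
        base = int(base_text)
        if not 2 <= base <= 36:
            raise ValueError("base must be in the range 2..36")
    elif text[:2] in ("0x", "0X"):
        base, digits = 16, text[2:]
    elif text.startswith("0") and len(text) > 1:
        base, digits = 8, text[1:]
    else:
        base, digits = 10, text
    if not digits.isalnum():
        raise ValueError("invalid digits: %r" % digits)
    # int() rejects any digit that equals or exceeds the base and
    # accepts upper and lower case letters for bases above 10.
    value = int(digits, base)
    if not INT_MIN <= value <= INT_MAX:
        raise ValueError("outside the signed 32-bit range")
    return value


for literal in ("42", "017", "0x1F", "2#1010", "36#Z"):
    print(literal, "=", parse_number(literal))
```

    A literal such as 8#9 raises ValueError in this model, matching the rule that a digit equal to or greater than the base is an error.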
  * EXAMPLES: Some simple examples:
      expr {n = 1 + 2}        # create n
      echo $n
      3
      expr {n*=2}             # 3 * 2 result back into n
      echo $n
      6
      expr { k = n > 5 }      # 6 > 5 is true so create k = 1
      echo $k
      1
  * NOTE: expr is a Husky "built-in" command. See the "Note" section in "set" to see the implications.
  * SEE ALSO: husky, set, suffix, test

3.9.24  FALSE - returns the K9 false status
  * SYNOPSIS: false
  * DESCRIPTION: false does nothing other than return a K9 false status. K9 processes return a pointer to a C string (null terminated array of characters) on termination. If that pointer is NULL then a true exit value is assumed, while all other returned pointer values are interpreted as false (with the string being some explanation of what went wrong). This command returns a pointer to the string "false" as its return value.
  * EXAMPLE: The following script fragment will print "got here" to standard out:
      if false then
          echo impossible
      else
          echo got here
      end
  * SEE ALSO: true

3.9.25  FIFO - bi-directional fifo buffer of fixed size
  * SYNOPSIS:
  + bind -k {fifo size} bind_point
  + cat bind_point
  + bind_point/data
  + bind_point/ctl
  * DESCRIPTION: The fifo file system associates a one level directory with the bind_point in the K9 namespace, with a buffer size of size bytes. bind_point/data and bind_point/ctl are the data and control channels for the fifo. Data written to the bind_point/data file is available for reading from the same file on a first-in first-out basis. A write of x bytes to the bind_point/data file will either complete and transfer all the data, or will transfer bytes until the fifo buffer is full and then block until data is removed from the fifo buffer by reading. A read of x bytes from the bind_point/data file will transfer the lesser of the current amount of data in the fifo buffer or x bytes. A read from the bind_point/ctl file will return the size of the fifo buffer and the current usage.
    The number of opens (# Opens) is the number of processes that currently have the bind_point/data file open.
  * EXAMPLE:
      > /buffer
      bind -k {fifo 2048} /buffer
      ls -l /buffer
      /buffer:
      /buffer/ctl                     fifo    2 0x00000001    1 0
      /buffer/data                    fifo    2 0x00000002    1 0
      cat /buffer/ctl
      Max: 2048 Cur: 0, # Opens: 0
      echo hello > /buffer/data
      cat /buffer/ctl
      Max: 2048 Cur: 6, # Opens: 0
      dd if=/buffer/data bs=512 count=1
      hello
      0+1 records in
      0+1 records out
      cat /buffer/ctl
      Max: 2048 Cur: 0, # Opens: 0
  * SEE ALSO: pipe

3.9.26  GET - select one value from list
  * SYNOPSIS: get number [ value ... ]
  * DESCRIPTION: get uses the given number to select one value from the given list. Indexing is origin 0 (e.g. "get 0 aaa bb c" returns "aaa"). If the number is out of range for an index on the given list of values then nothing is returned.

3.9.27  GETIV - get the value an internal RaidRunner variable
  * SYNOPSIS:
  + getiv
  + getiv name
  * DESCRIPTION: getiv prints the current value of an internal RaidRunner variable or prints a list of all variables. When a variable name is given its current value is printed. If no name is given then all available internal variables are listed.
  * NOTES: As different models of RaidRunners have different internal variables, see your RaidRunner's Hardware Reference manual for a list of variables together with the meaning of their values. These variables are run-time variables and hence revert to their default value whenever the RaidRunner is booted.
  * SEE ALSO: setiv

3.9.28  HELP - print a list of commands and their synopses
  * SYNOPSIS: help or ?
  * DESCRIPTION: help, or the question mark character (?), will print a list of all commands available to the command interpreter. Along with each command, its synopsis is printed.

3.9.29  HUSKY - shell for K9 kernel
  * SYNOPSIS:
  + husky [-c command] [ file [ arg ... ] ]
  + hs [-c command] [ file [ arg ... ] ]
  * DESCRIPTION: husky and hs are synonyms.
    husky is a command language interpreter that executes commands read from the standard input or from a file. husky is a scaled down model of Unix's Bourne shell (sh). One major difference is that husky has no concept of current working directory. If the "-c" switch is present then the following command is interpreted by husky in a newly thrown shell nested in the current environment. This newly thrown shell exits back to the current environment when the command finishes. Otherwise if arguments are given the first one is assumed to be a file containing husky commands. Again a new shell is thrown to execute these commands. husky script files can access their command line arguments, and the 2nd and subsequent arguments to husky (if present) are passed to the file for that purpose. If no arguments are given to husky then commands are read from standard in (and the shell is considered interactive).
  * RETURN STATUS: husky places the K9 return status of a process (NIL if ok, otherwise a string explaining the error) in the file "/env/status". An example:
      dd if=/xx
      dd: could not open /xx
      cat /env/status
      open failed
      cat /env/status         # empty because previous "cat" worked
    As the file "/env/status" is an environment variable, the return status of a command is also available in the variable $status. The exit status of a pipeline is the exit status of the last command in the pipeline.
  * SIGNALS: If an interactive shell receives an interrupt signal (i.e. K9_SIGINT - usually a control-C on the console) then the shell exits. The "init" process will then start a new instance of the husky shell with all the previously running processes (with the exception of the just killed shell) still running. This allows the user to kill the process that caused the previous shell problems. Alternatively, a process that is accidentally run in foreground is effectively put in the background by sending an interrupt signal to the shell.
    Note that this is quite different to Unix shells, which would forward the signal on to the foreground process.
  * QUOTES, ESCAPING, STRING CONCATENATION, ETC: A quoted_string (as defined in the grammar) commences with a "{" and finishes with the matching "}". The term "matching" implies that all embedded "{" must have a corresponding embedded "}" before the final "}" is said to match the original "{". A quoted_string can be spread across several lines. No command line substitution occurs within quoted_strings. The character for escaping the following character is "\". If a "{" needs to be interpreted literally then it can be represented by "\{". If a string containing spaces (whitespace) needs to be interpreted as a single token then each space can be escaped (i.e. "\ "). If a "\" itself needs to be interpreted literally then it can be represented by "\\". The string concatenation character is "^". This is useful when a token such as "/d4" needs to be built up by a script when "/d" is fixed and the "4" is derived from some variable:
      set n 4
      > /d^$n
    This example would create the file "/d4". The output of another husky command or script can be made available inline by starting the sequence with "`" and finishing it with a "'". For example:
      echo {ps output follows: } `ps'
    This prints the string "ps output follows:" followed on the next line by the current output from the command "ps". That output from "ps" would have its embedded newlines replaced by whitespace.
  * COMMAND LINE FILE REDIRECTION:
  + Redirection should appear after a command and its arguments in a line to be interpreted by husky. A special case is a line that just contains "> filename", which creates the filename with zero length if it didn't previously exist or truncates it to zero length if it did.
  + Redirection of standard in to come from a file uses the token "<" with the filename appearing to its right. The default source of standard in is the console.
  + Redirection of standard out to go to a file uses the token ">" with the filename appearing to its right. The default destination of standard out is the console.
  + Redirection of standard error to go to a file uses the token ">[2]" with the filename appearing to its right. The default destination of standard error is the console.
  + Redirection of writes from within a command which uses a known file descriptor number (say "n") to go to a file uses the token ">[n]" with the filename appearing to its right.
  + Redirection of reads from within a command which uses a known file descriptor number (say "n") to come from a file uses the token "<[n]" with the filename appearing to its right.
  + Redirection of reads and writes from within a command which uses a known file descriptor number (say "n") to a file uses the token "<> [n]" with the filename appearing to its right.
    In order to redirect both standard out and standard error to the one file the form " > filename >[2=1]" can be used. This sequence first redirects standard out (i.e. file descriptor 1) to filename and then redirects what is written to file descriptor 2 (i.e. standard error) to file descriptor 1, which is now associated with filename.
  * ENVIRONMENT VARIABLES: Each process can access the name it was invoked by via the variable "arg0". The command line arguments (excluding the invocation name) can be accessed as a list in the variable "argv". The number of elements in the list "argv" is placed in "argc". The get command is useful for fetching individual arguments from this list. The pid of the current process can be fetched from the variable "pid". When a script launches a new process in the background, the child's pid can be accessed from the variable "child". The variable "ContollerId" is set to the RaidRunner controller number husky is running on. Environment variables are a separate "space" for each process.
    Depending on the way a process was created, its initial set of environment variables may be copied from its parent process at the "spawn" point.
  * SEE ALSO: intro

3.9.30  HWCONF - print various hardware configuration details
  * SYNOPSIS: hwconf [-D] [-M] [-I] [-d [-n]] [-f] [-h] [-i -p c.s.l] [-m] [-p c.s.l] [-s] [-S] [-t] [-T] [-P] [-W]
  * DESCRIPTION: hwconf prints details about the RaidRunner hardware and devices attached.
  * OPTIONS:
  + -h: Print the number of controllers, host interfaces per controller, the number of disk channels per controller, number of ranks of disks and the details of memory (in bytes) on each controller. Four memory figures are printed: the first is the total memory in the controller, next is the amount of memory at boot time, next is the amount currently available and last is the largest available contiguous area of memory. This is the default option.
  + -f: Print the number of fans in the RaidRunner and then the speed for each fan in the system. The speed values are in revolutions per minute (rpm). The fans in the system are labeled in the hardware specification sheet for your RaidRunner. The first speed printed by this command corresponds to fan number 0 on your specification sheet, the second is for fan 1, and so forth.
  + -d: Print out information on all the disk drives on the RaidRunner. For each disk on the RaidRunner, print out: the device name, in the format c.s.l where c is the channel, s is the SCSI ID (or rank) and l is the SCSI LUN of the device; the manufacturer's name (vendor id); the disk's model name (product id); the disk's version id; the disk serial number; the disk geometry - number of cylinders, heads and sectors; the last block number on the disk and the block size in bytes; the disk revolution count per minute (rpm); and the number of notches/zones available on the drive (if any).
  + -n: Print out the disk drive notch/zone tables if available. This is a sub-option to the -d option.
    Not all disks appear to correctly report the notch/zone partition tables. For each notch/zone, the following is printed: the zone number, the zone's starting cylinder, the zone's starting head, the zone's ending cylinder, the zone's ending head, the zone's starting logical block number, the zone's ending logical block number, and the zone's number of sectors per track.
  + -D: Print out the device names for all disk drives on the system.
  + -I: Initialize back-end NCR SCSI chips. This flag may be used in conjunction with any other option and will be done first. It has an effect only on the first call to hwconf that has not yet used a -d, -D or -I option, or on those chips that have not yet had a -p on the channel associated with that chip.
  + -m: Print out major flash and battery backed-up ram addresses (in hex). Additionally print out the size of the RaidRunner configuration area. Eight (8) addresses are printed in order: RaidRunner configuration area start and end addresses (FLASH RAM), RaidRunner Husky Scripts area start and end addresses (FLASH RAM), RaidRunner Binary Image area start and end addresses (FLASH RAM), and RaidRunner Battery Backed-up area start and end addresses. The size of the RaidRunner configuration area (in bytes) is then printed.
  + -p c.s.l: Probe a single device specified by the given channel, SCSI ID (rank) and SCSI LUN provided in the format "c.s.l". The output of this command is the same as the "-d" option but just for the given device. If the device is not present then nothing will be output and the exit status of the command will be 1.
  + -i -p c.s.l: Re-initialize the SCSI device driver specified by the given channel, SCSI ID (rank) and SCSI LUN provided in the format "c.s.l". Typically this command is used when, on a running RaidRunner, a new drive is plugged in, and it will be used prior to the RaidRunner's next reboot.
  + -M: Set the boottime memory. This option is executed internally by the contro