Sun ZFS Storage 7120, 7320, and 7420 Appliance Customer Service Manual

Part No: E38247
December 2012 E38247–01

Copyright © 2009, 2011, 2012, Oracle and/or its affiliates. All rights reserved.

This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.

The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.

If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, the following notice is applicable:

U.S. GOVERNMENT END USERS. Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S. Government.

This software or hardware is developed for general use in a variety of information management applications.
It is not developed or intended for use in any inherently dangerous applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications.

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group.

This software or hardware and documentation may provide access to or information on content, products, and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services.
Contents

Preface

1 Introduction
    Overview
        Introduction
    Hardware
        Hardware View
        BUI
        CLI
        Tasks

2 Hardware Maintenance
    Maintenance
        Introduction
    7120
        7120 Hardware Overview
        Chassis Overview
        Electrical Specifications
        Internal Components
        Standalone Controller Configurations
        Attached Storage
    7320
        7320 Hardware Overview
        Chassis Overview
        7320 Replaceable Components
        7320 Single and Cluster Controller Configurations
    7420
        7420 Hardware Overview
        Chassis Overview
        Internal Boards
        Components
        7420 Standalone and Cluster Controller Configurations
        Attached Storage
    7x20
        7x20 CRU Maintenance Procedures
        Prerequisites
        Safety Information
        Required Tools and Information
        Chassis Serial Number
        Controller Replacement Tasks
    Shelf
        Disk Shelf Overview
    Shelf
        Disk Shelf Maintenance Procedures
        Prerequisites
        Safety Information
        Electrostatic Discharge Precautions
        Removing Power from the Disk Shelf
        Tasks
    Faults
        Hardware Faults
    HBA Expansion pt.1
        Expanding from 2 to 3 HBAs
    HBA Expansion pt.2
        Expanding from 3 to 4 HBAs
    HBA Expansion pt.3
        Expanding from 4 to 5 HBAs
    HBA Expansion pt.4
        Expanding from 5 to 6 HBAs

3 System Maintenance
    System
        Introduction
        System Disks
        Support Bundles
        Initial Setup
        Factory Reset
    Updates
        System Updates
        Hardware Firmware Updates
        Rollback
        Cluster Upgrade
        Updating via the BUI
        Updating via the CLI
    Passthrough x
        Passthrough-x Deferred Update
    User Quotas
        User Quotas Deferred Update
    COMSTAR
        COMSTAR Deferred Update
    Triple Parity RAID
        Triple-Parity RAID Deferred Update
    Dedup
        Data Deduplication Deferred Update
    Replication
        Replication Deferred Update
    Received Properties
        Received Properties Deferred Update
    Slim ZIL
        Introduction
    Snapshot Deletion
        Snapshot Deletion Deferred Update
    Recursive Snapshots
        Recursive Snapshots Deferred Update
    Multi Replace
        Multi Replace Deferred Update
    RAIDZ Mirror
        RAIDZ/Mirror Deferred Update
    Optional Child Dir
        Introduction
    ConfigurationBackup
        Configuration Backup
    Problems
        Problems
        Active problems display
        Repairing problems
        Related features
    Logs
        Introduction
        BUI
        CLI

Glossary

Preface

The Sun ZFS Storage 7120, 7320, and 7420 Appliance Customer Service Manual contains hardware overviews and maintenance procedures for Oracle's Sun ZFS Storage 7x20 series of NAS appliances.

This documentation is also available while using the Browser User Interface, accessible via the Help button. The appliance documentation may be updated using the System Upgrade procedure documented in the System Maintenance chapter of this book.

Who Should Use This Book

These notes are for users and system administrators who service and use the Sun ZFS Storage 7x20 Appliances.

Related Documentation

Refer to the following documentation for installation instructions, hardware overviews, service procedures and software update notes.

■ Installation Guide, Analytics Guide and Administration Guide (http://www.oracle.com/technetwork/documentation/)

Third-Party Web Site References

Third-party URLs are referenced in this document and provide additional, related information.

Note – Oracle is not responsible for the availability of third-party Web sites mentioned in this document. Oracle does not endorse and is not responsible or liable for any content, advertising, products, or other materials that are available on or through such sites or resources. Oracle will not be responsible or liable for any actual or alleged damage or loss caused by or in connection with the use of or reliance on any such content, goods, or services that are available on or through such sites or resources.

Access to Oracle Support

Oracle customers have access to electronic support through My Oracle Support. For information, visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=info or visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=trs if you are hearing impaired.
Chapter 1
Introduction

Overview

Introduction

The Sun ZFS Storage 7000 family of products provides efficient file and block data services to clients over a network, and a rich set of data services that can be applied to the data stored on the system.

Controllers

■ 7120
■ 7320
■ 7420

Legacy platforms: 7110 | 7210 | 7310 | 7410

Expansion Storage

■ Disk Shelves

Legacy platforms: J4400/J4500

Protocols

Sun ZFS Storage appliances include support for a variety of industry-standard client protocols, including:

■ SMB
■ NFS
■ HTTP and HTTPS
■ WebDAV
■ iSCSI
■ FC
■ SRP
■ iSER
■ FTP
■ SFTP

Key Features

Sun ZFS Storage systems also include new technologies to deliver the best storage price/performance and unprecedented observability of your workloads in production, including:

■ Analytics, a system for dynamically observing the behavior of your system in real-time and viewing data graphically
■ The ZFS Hybrid Storage Pool, composed of optional Flash-memory devices for acceleration of reads and writes, low-power, high-capacity disks, and DRAM memory, all managed transparently as a single data hierarchy

Data Services

To manage the data that you export using these protocols, you can configure your Sun ZFS Storage system using the built-in collection of advanced data services, including:

LICENSE NOTICE: Remote Replication and Cloning may be evaluated free of charge, but each feature requires that an independent license be purchased separately for use in production. After the evaluation period, these features must either be licensed or deactivated. Oracle reserves the right to audit for licensing compliance at any time. For details, refer to the "Oracle Software License Agreement ("SLA") and Entitlement for Hardware Systems with Integrated Software Options."

■ RAID-Z (RAID-5 and RAID-6), mirrored, and striped disk configurations
■ Unlimited read-only and read-write snapshots, with snapshot schedules
■ Data deduplication
■ Built-in data compression
■ Remote replication of data for disaster recovery
■ Active-active clustering for high availability (7310, 7320, 7410, and 7420)
■ Thin provisioning of iSCSI LUNs
■ Virus scanning and quarantine
■ NDMP backup and restore

Availability

To maximize the availability of your data in production, Sun ZFS Storage appliances include a complete end-to-end architecture for data integrity, including redundancies at every level of the stack. Key features include:

■ Predictive self-healing and diagnosis of all system hardware failures: CPUs, DRAM, I/O cards, disks, fans, power supplies
■ ZFS end-to-end data checksums of all data and metadata, protecting data throughout the stack
■ RAID-6 (double- and triple-parity) and optional RAID-6 across disk shelves
■ Active-active clustering for high availability (7310, 7320, 7410, and 7420)
■ Link aggregations and IP multipathing for network failure protection
■ I/O Multipathing between the controller and disk shelves
■ Integrated software restart of all system software services
■ Phone-Home of telemetry for all software and hardware issues
■ Lights-out Management of each system for remote power control and console access

Browser User Interface (BUI)

Figure: The browser user interface

The BUI is the graphical tool for administration of the appliance. The BUI provides an intuitive environment for administration tasks, visualizing concepts, and analyzing performance data.

The management software is designed to be fully featured and functional on a variety of web browsers.
Direct your browser to the system using either the IP address or host name you assigned to the NET-0 port during initial configuration as follows: https://ipaddress:215 or https://hostname:215. The login screen appears.

The online help linked in the top right of the BUI is context-sensitive. For every top-level and second-level screen in the BUI, the associated help page appears when you click the Help button.

Command Line Interface (CLI)

The CLI is designed to mirror the capabilities of the BUI, while also providing a powerful scripting environment for performing repetitive tasks. The following sections describe details of the CLI. When navigating through the CLI, there are two principles to be aware of:

■ Tab completion is used extensively: if you are not sure what to type in any given context, pressing the Tab key will provide you with possible options. Throughout the documentation, pressing Tab is presented as the word "tab" in bold italics.
■ Help is always available: the help command provides context-specific help. Help on a particular topic is available by specifying the topic as an argument to help, for example help commands. Available topics are displayed by tab-completing the help command, or by typing help topics.

You can combine these two principles, as follows:

dory:> help tab
builtins    commands    general     help        properties  script

Hardware

Figure: Locating a disk

Hardware View

The Maintenance > Hardware screen (also known as the "hardware view") provides component status of the appliance and attached disk shelves. This information is available from both the BUI and the CLI.

BUI

The BUI hardware view provides interactive illustrations that enable you to browse through the appliance and attached disk shelf components. The screenshot at the top of this page shows a disk highlighted in a Sun Storage 7110, showing both its physical location and details.
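The hardware view is reached through the BUI, whose address takes the form https://ipaddress:215 or https://hostname:215 as described above. As a minimal sketch (not part of the appliance software), the hypothetical Python helper bui_url below assembles that address from a host name or IP address. The function name, the default-port parameter, and the bracketing of IPv6 literals are assumptions made for illustration; only the https scheme and port 215 come from this manual.

```python
def bui_url(host: str, port: int = 215) -> str:
    """Build the HTTPS address of the appliance BUI (hypothetical helper).

    host is the IP address or host name assigned to the NET-0 port
    during initial configuration; 215 is the BUI port named in the manual.
    """
    host = host.strip()
    if not host:
        raise ValueError("host must be a non-empty IP address or host name")
    # Assumption: an IPv6 literal is wrapped in brackets, as URLs require.
    if ":" in host and not host.startswith("["):
        host = f"[{host}]"
    return f"https://{host}:{port}"


if __name__ == "__main__":
    # "dory" is the host name used in the CLI prompt example above.
    print(bui_url("dory"))
    print(bui_url("192.0.2.10"))
```

The address 192.0.2.10 is a documentation-only example; substitute the address or host name configured on your own appliance.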
The buttons in the hardware view are:

■ Show a more detailed view of this component
■ Leave this detailed view
■ Click for more details
■ Toggle blinking of the locator LED for this component
■ Reboot the appliance
■ Power off the appliance
■ Hardware component is ok (green)
■ Hardware component is not present (grey)
■ Hardware component is faulted (amber)
■ Offline disk
■ Port active
■ Port inactive

System Overview

The main hardware page lists the system chassis, a summary of its contents, and any attached disk shelves (on supported systems). This provides an overview of the hardware present on the system, as well as controls to reset or power off the system.

System Chassis

The primary system chassis is shown on the top half of the view. At the top left, click the right-pointing arrow icon to get more detail about the chassis. The indicator notes if there are any faulted components within the chassis, and the name of the chassis. The chassis name is initially set to the appliance name during installation. To change the chassis name, use the entry field on the Configuration > Services > System Identity screen.

At the top right of the system chassis are the controls to light the locate LED, reboot the appliance, or power off the appliance.

A thumbnail of the controller is presented at left. Clicking on the thumbnail or the "Show Details" link takes you to a detailed view of the chassis, and is identical to clicking on the right-pointing arrow at the top left of the view.
The following information is presented in a summary view:

Property        Description
Manufacturer    Manufacturer of the system
Model           System model name
Serial          System chassis hardware serial number
Processors      Count and description of processors in the system
Memory          Total memory in the system
System          Size and number of system disks used for the system image
Data            Size and number of data disks in the system chassis. This is only valid for standalone systems. If there are no data disks present, "-" will be displayed.
Cache           Size and number of cache disks in the system chassis. This is only valid for expandable systems that support additional disk shelves. If there are no cache disks present, "-" will be displayed.
Log             Size and number of log disks in the system chassis. This is only valid for standalone systems. If there are no log devices present, "-" will be displayed.
Total           Total size and count of all disks in the system.

Disk Shelves

A list of disk shelves, if supported, is displayed at the bottom of the view. The thumbnail to the left represents the front of the currently selected disk shelf. Clicking on the right-pointing arrow or double-clicking on a row within the list will provide complete details about the disk shelf. The state indicator will be orange if the chassis contains any faulted components. The following fields are displayed in the list:

Property        Description
Name            Name of the disk shelf, used in faults and alerts. This is initially set to the serial number of the disk shelf, but can be changed by clicking on the name within the list.
Manufacturer    Disk shelf manufacturer
Model           Disk shelf model
Data            Total size of all data disks within the disk shelf.
Cache           Total size of all read-optimized cache devices ("Readzillas") within the drive shelf. There are currently no supported disk shelves with read cache devices, but this may not always be the case. If there are no cache devices within the shelf, then "-" is displayed.
Log             Total size of all write-optimized cache devices ("Logzillas") within the drive shelf. If there are no log devices within the shelf, then "-" is displayed.
Paths           Total number of I/O paths to the disk shelf. The only supported configurations are those with multiple paths to all disks, so this should read "2" under normal operating circumstances. Clicking the icon will bring up a dialog with information about each path. This includes which HBAs are connected to the disk shelf, and the state of any paths. If the disks within the disk shelf are not currently configured as part of a storage pool, complete path information will not be available, although two paths to the chassis are displayed.
Locate          Toggle the locate LED for this disk shelf. If the LED is currently on, then this indicator will be flashing.

Chassis Detail

To view the chassis details, click on the icon (or one of the alternative forms described above). This view includes some of the same controls in the upper left (state, name, locate, reset, poweroff), as well as listings of all the components in the chassis.

At the left is a set of images describing the chassis. If there are multiple views, then you can switch between them by clicking on the name of the view above the image. For each view, faulted components will be highlighted in red. In addition, the currently selected component will be highlighted in the image. Clicking on a component within the image will select the corresponding component in the list to the right.

A tab is present for each component type in the following list. Each component type has a state icon which will be orange if there is a faulted component of the given type.
■ Disk
■ Slot
■ CPU (controller only)
■ Memory (controller only)
■ Fan
■ Power supply (PSU)
■ Service processor (SP) (controller only)

Clicking on a component type will display a list of all physical locations within the chassis where components may be present. Clicking on a component within the list will highlight it within the appropriate chassis image. Clicking on the icon while over a row, or double-clicking a row, will bring up a dialog with detailed information about the component. The information displayed in the list depends on the component type, but is a subset of the information available in the component detail. Disks and service processors support additional operations described below. Each component can report any or all of the following properties:

Property         Description
Label            Human-readable identifier for this component within the chassis. This is typically, but not necessarily, equivalent to the label printed on the physical chassis.
FMRI             Fault managed resource identifier (FMRI) for the component. This is an internal identifier used to identify the component within faults and is intended for service personnel.
Active Problems  For a faulted component, links to active problems affecting the component.
Manufacturer     Component manufacturer.
Model            Component model.
Build            Manufacturing build identifier. This is used to identify a particular location or batch where the component was manufactured.
Part             Component part number, or core factory part number. The orderable part number may differ, depending on whether a component is for replacement or expansion, and whether it's part of a larger assembly. Your service provider should be able to refer you to the appropriate orderable part. For components without part numbers, the model number should be used instead.
Serial           Component serial number.
Revision         Firmware or hardware revision of the component.
Size             Total memory or storage, in bytes.
Type             Disk type. Can be one of 'system', 'data', 'log', 'cache', or 'spare'. When a spare is active, it will be displayed as 'spare '.
Speed            Processor speed, in gigahertz.
Cores            Number of CPU cores.
GUID             Hardware global unique identifier.

Disks

Disks support the following additional options:

Action    Description
Locate    Toggle the locate indicator for the disk. If the LED is currently turned on, this icon will be blinking.
Offline   Offline the disk. This option is only available for disks that are part of a configured storage pool (including the system pool). Offlining a disk prevents the system from reading or writing to it. Faulted devices are already avoided, so this option should only be required if a disk is exhibiting performance problems that do not result in pathological failure. It is not possible to offline a disk if doing so would prevent access to data (for example, offlining both halves of a mirror). If the device is an active hot spare, this will also give the option of detaching the hot spare completely. Once a hot spare is detached, it cannot be activated except through another fault or hotplug event.
Online    Online the disk. Reverses the above operation.

Infiniband Host Controller Adapters

Infiniband Host Controller Adapters (HCA) report additional properties for the list of available ports:

Property  Description
State     When "active", the active port icon is displayed. Other valid port states ("down", "init", and "arm") are denoted by the inactive port icon. Mousing over the port icon will display the current port state in the tip pop-up.
GUID      The hardware-assigned port GUID.
Speed     The current port speed enabled: Single Data Rate (SDR), Dual Data Rate (DDR), or Quad Data Rate (QDR).

Service Processor

The service processor behaves differently from other component nodes. Instead of providing a list of components, it presents a set of network properties that can be configured from the storage appliance.
The following properties control the behavior of the service processor network management port.

Property           Description
MAC Address        Hardware MAC address. This is read-only.
IP Address Source  Either 'DHCP' or 'Static'. Controls whether DHCP should be used on the interface.
IP Address         IPv4 address, when using static IP configuration. IPv6 is not supported.
Subnet             Dotted decimal subnet, when using static IP configuration.
Default Gateway    IPv4 default gateway address.

Changing multiple values in conflicting ways (such as changing static IP assignments while in DHCP mode) has undefined behavior.

CLI

Hardware status details are available in the CLI under the maintenance hardware section. Use the show command to list the status of all components. The list command will list available chassis, which can be selected and then viewed using show.

tarpon:> maintenance hardware show
Chassis:

              NAME          STATE    MANUFACTURER             MODEL
chassis-000   0839QCJ01A    ok       Sun Microsystems, Inc.   Sun Storage 7410

cpu-000       CPU 0         ok       AMD                      Quad-Core AMD Op
cpu-001       CPU 1         ok       AMD                      Quad-Core AMD Op
cpu-002       CPU 2         ok       AMD                      Quad-Core AMD Op
cpu-003       CPU 3         ok       AMD                      Quad-Core AMD Op
disk-000      HDD 0         ok       STEC                     MACH8 IOPS
disk-001      HDD 1         ok       STEC                     MACH8 IOPS
disk-002      HDD 2         absent
disk-003      HDD 3         absent
disk-004      HDD 4         absent
disk-005      HDD 5         absent
disk-006      HDD 6         ok       HITACHI                  HTE5450SASUN500G
disk-007      HDD 7         ok       HITACHI                  HTE5450SASUN500G
fan-000       FT 0          ok       unknown                  ASY,FAN,BOARD,H2
fan-001       FT 0 FM 0     ok       Sun Microsystems, Inc.   541-2068
fan-002       FT 0 FM 1     ok       Sun Microsystems, Inc.   541-2068
fan-003       FT 0 FM 2     ok       Sun Microsystems, Inc.   541-2068
fan-004       FT 1          ok       unknown                  ASY,FAN,BOARD,H2
fan-005       FT 1 FM 0     ok       Sun Microsystems, Inc.   541-2068
fan-006       FT 1 FM 1     ok       Sun Microsystems, Inc.   541-2068
fan-007       FT 1 FM 2     ok       Sun Microsystems, Inc.   541-2068
memory-000    DIMM 0/0      ok       HYNIX                    4096MB DDR-II 66
memory-001    DIMM 0/1      ok       HYNIX                    4096MB DDR-II 66
...

A 5th column for serial number ("SERIAL") has been truncated in the above example, as has the length of this list.

Component Properties

If a particular component is selected, detailed information about its properties is reported. The following properties are supported, with the corresponding BUI property name. For a description of a particular property, see the description above.

CLI Property         BUI Property
build                Build
cores                Cores
device               N/A
faulted              (status indicator)
label                Label
locate (writable)    (status indicator)
manufacturer         Manufacturer
model                Model
offline (writable)   (status indicator)
part                 Part
present              (status indicator)
revision             Revision
serial               Serial
size                 Size
speed                Speed
type                 (combined with use)
use                  Type

When viewing a disk that is active as a hot spare, the detach command is also available.

Viewing CPU details

For example, the following shows details for component "CPU 0":

tarpon:maintenance hardware> select chassis-000
tarpon:maintenance chassis-000> select cpu
tarpon:maintenance chassis-000 cpu> select cpu-000
tarpon:maintenance chassis-000 cpu-000> show
Properties:
                         label = CPU 0
                       present = true
                       faulted = false
                  manufacturer = AMD
                         model = Quad-Core AMD Opteron(tm) Processor 8356
                          part = 1002
                      revision = 03
                         cores = 4
                         speed = 2.14G

Tasks

BUI

Locating a failed component

1 Go to the Maintenance > Hardware screen.
2 Click the right-pointing arrow icon on the Storage System or Disk Shelf which has the fault icon.
3 Locate the fault icon in the lists of hardware components, and click it. The image should be updated to show where that component is physically located.
4 Optionally, click the locate icon for that component, if the component has it. The LED on the component will begin to flash.

CLI

To turn on the locate LED using the CLI, run the following commands.

Go to the maintenance hardware context:

hostname:> maintenance hardware

List the appliance components:

hostname:maintenance hardware> list
              NAME          STATE     MODEL              SERIAL
chassis-000   hostname      ok        Sun Storage 7410   unknown
chassis-001   000000000C    faulted   J4400              000000000C

Select the chassis and list its components:

hostname:maintenance hardware> select chassis-001
hostname:maintenance chassis-001> list
disk
fan
psu
slot

Select the component type and show all available disks:

hostname:maintenance chassis-001> select disk
hostname:maintenance chassis-001 disk> show
Disks:

           LABEL    STATE     MANUFACTURER   MODEL         SERIAL
disk-000   HDD 0    ok        ST3500630NS    ST3500630NS   9QG1ACNJ
disk-001   HDD 1    faulted   ST3500630NS    ST3500630NS   9QG1A77R
disk-002   HDD 2    ok        ST3500630NS    ST3500630NS   9QG1AC3Z
disk-003   HDD 3    ok        ST3500630NS    ST3500630NS   9QG1ACKW
disk-004   HDD 4    ok        ST3500630NS    ST3500630NS   9QG1ACKF
disk-005   HDD 5    ok        ST3500630NS    ST3500630NS   9QG1ACPM
disk-006   HDD 6    ok        ST3500630NS    ST3500630NS   9QG1ACRR
disk-007   HDD 7    ok        ST3500630NS    ST3500630NS   9QG1ACGD
disk-008   HDD 8    ok        ST3500630NS    ST3500630NS   9QG1ACG4
disk-009   HDD 9    ok        ST3500630NS    ST3500630NS   9QG1ABDZ
disk-010   HDD 10   ok        ST3500630NS    ST3500630NS   9QG1A769
disk-011   HDD 11   ok        ST3500630NS    ST3500630NS   9QG1AC27
disk-012   HDD 12   ok        ST3500630NS    ST3500630NS   9QG1AC41
disk-013   HDD 13   ok        ST3500630NS    ST3500630NS   9QG1ACQ5
disk-014   HDD 14   ok        ST3500630NS    ST3500630NS   9QG1ACKA
disk-015   HDD 15   ok        ST3500630NS    ST3500630NS   9QG1AC5Y
disk-016   HDD 16   ok        ST3500630NS    ST3500630NS   9QG1ACQ2
disk-017   HDD 17   ok        ST3500630NS    ST3500630NS   9QG1A76S
disk-018   HDD 18   ok        ST3500630NS    ST3500630NS   9QG1ACDY
disk-019   HDD 19   ok        ST3500630NS    ST3500630NS   9QG1AC3Y
disk-020   HDD 20   ok        ST3500630NS    ST3500630NS   9QG1ACG6
disk-021   HDD 21   ok        ST3500630NS    ST3500630NS   9QG1AC3X
disk-022   HDD 22   ok        ST3500630NS    ST3500630NS   9QG1ACHL
disk-023   HDD 23   ok        ST3500630NS    ST3500630NS   9QG1ABLW

Select the faulted disk and turn on the locate LED:

hostname:maintenance chassis-001 disk> select disk-001
hostname:maintenance chassis-001 disk-001> set locate=true
                        locate = true (uncommitted)
hostname:maintenance chassis-001 disk-001> commit

Chapter 2

Hardware Maintenance

Maintenance

Introduction

This section describes concepts and procedural instructions for performing hardware and software maintenance tasks. The graphic above illustrates locating a spare disk within the chassis by highlighting its name in the BUI Hardware Maintenance list.

The Maintenance > Hardware screen of the BUI provides visual representations of the physical system components, allowing you to visually identify and locate hardware components and verify their status. Software updates can be applied in the System section of the interface, where you can also view Logs and current Problems.
■ Hardware Overview - identify hardware components and verify their status
■ Controllers
  ■ 7120 | 7320 | 7420 Overviews - component diagrams and specifications
  ■ 7x20 Maintenance Procedures - replace controller drives, fans, power supplies, memory, cards, risers, and batteries
  ■ 7110 | 7210 | 7310 | 7410 Overviews - component diagrams and specifications
  ■ 7x10 Maintenance Procedures - replace controller drives, fans, power supplies, memory, cards, risers, and batteries
■ Expansion Storage
  ■ Disk Shelf Overview - component diagrams and specifications for Oracle Storage Drive Enclosure DE2-24, Sun Disk Shelf, and J4400/J4500
  ■ Disk Shelf Maintenance Procedures - replace disk shelf chassis components
  ■ Expanding from 2 to 3 HBAs | 3 to 4 HBAs | 4 to 5 HBAs | 5 to 6 HBAs
■ Hardware Faults - connect to ILOM to diagnose hardware faults
■ System - view system disks, manage support bundles
■ Updates - manage appliance software
■ Configuration Backup - back up and restore appliance configuration
■ Problems - view current problems
■ Logs - view appliance logs
■ Workflows - manage and execute workflows

7120

7120 Hardware Overview

Use the information in this section as a reference when preparing to service replaceable components of the Sun ZFS Storage 7120. Refer to the following for procedural instructions:

■ Controller Tasks - replace system controller components
■ Disk Shelf Tasks - replace disk shelf components

Chassis Overview

The Sun ZFS Storage 7120 is an enterprise-class two-socket rackmount x64 system powered by the Intel Xeon processor. It packs high performance and room for growth with four PCIe slots and 18 DIMM slots into a compact 2U footprint.
Refer to http://www.oracle.com/us/products/servers-storage/storage/nas/overview/index.html for the most recent component specification.

Refer to the Implementing Fibre Channel SAN Boot with Oracle's Sun ZFS Storage Appliance whitepaper at http://www.oracle.com/technetwork/articles/servers-storage-admin/fbsanboot-365291.html for details on FC SAN boot solutions using the Sun ZFS Storage 7120.

The 7120 is a standalone controller that consists of an internal SAS-2 HBA providing disk shelf expansion, write flash acceleration, and 11 x 300GB 15K, 600GB 15K, 1TB 7.2K, 2TB 7.2K, or 3TB 7.2K hard drive storage. The SAS-2 storage fabric supports a greater number of targets, greater bandwidth, higher reliability, and bigger scale.

The 2U chassis form factor dimensions are as follows:

Dimension   Measurement
Height      87.6 mm/3.45 in
Width       436.8 mm/17.2 in
Depth       765.25 mm/30.13 in
Weight      29.54 kg/65 lb

Front Panel

The following figure and legend show the front panel and the drive locations. The Logzilla 3.5" SSD belongs in slot 3 and is not supported in controllers configured with the internal Sun Aura flash HBA Logzilla.

Figure Legend
1 Locator LED/button (white)
2 Service Action Required LED (amber)
3 Power button
4 Power/OK LED (green)
5 HDD 0
6 HDD 1
7 HDD 2
8 HDD or SSD 3
9 HDD 4
10 HDD 5
11 HDD 6
12 HDD 7
13 HDD 8
14 HDD 9
15 HDD 10
16 HDD 11
17 Drive map

Rear Panel

The following figure and legend show the rear panel.

Note: Optional Sun Dual Port 40Gb/sec 4x Infiniband QDR HCAdapter PCIe cards (375-3606-01) may be located in slots 1, 2, or 3. 375-3606-01 HCA expansion cards are not supported in the 10Gb network configurations.

Figure Legend
1 Power Supply Unit 1
2 Power Supply Unit 0
3 PCIe 0
4 PCIe 3
5 PCIe 1
6 PCIe 4
7 Boot HDD 1
8 Boot HDD 0
9 Rear Panel System Status LEDs
10 Serial Management port
11 Network Management port
12 Gbit Ethernet ports NET 0, 1, 2, 3
13 USB 2.0 ports (0, 1)
14 HD15 Video port

The serial management connector (SER MGT) is an RJ-45 port and provides a terminal connection to the SP console.

The network management connector (NET MGT) is an RJ-45 port and provides an alternate terminal interface to the SP console.

There are four RJ-45 Gigabit Ethernet ports (NET0, NET1, NET2, NET3) located on the motherboard that operate at 10/100/1000 Mbit/sec. These network interfaces must be configured before use.

Electrical Specifications

The following list shows the electrical specifications for the 7120. Note that the power dissipation numbers listed are the maximum rated power numbers for the power supply. The numbers are not a rating of the actual power consumption of the appliance.

Connectors
■ Two C13 connectors which work on 110-220V outlets

Input
■ Nominal frequencies: 50/60Hz
■ Nominal voltage range: 100-120/200-240 VAC
■ Maximum current AC RMS: 13.8A @ 100 VAC
■ AC operating range: 90-264 VAC

Output
■ 3.3 VDC STBY: 3.0A
■ +12 VDC: 86.7A

Power dissipation
■ Max power consumption: 1235.3 W
■ Max heat output: 4212 BTU/hr
■ Volt-Ampere rating: 1261 VA @ 240 VAC, 0.98P.F.

Internal Components

The chassis has the following boards installed.

Note: Field-replaceable units (FRUs) should only be replaced by trained Oracle service technicians.

■ PCIe Risers - Each riser supports two PCIe cards that are customer-replaceable. There are two risers per system, each attached to the rear of the motherboard.
■ Motherboard - The motherboard is a FRU and includes CPU modules, slots for 18 DIMMs, memory control subsystems, and the service processor (SP) subsystem. The SP subsystem controls the host power and monitors host system events (power and environmental). The SP controller draws power from the host 3.3V standby supply rail, which is available whenever the system is receiving AC input power, even when the system is turned off.
■ Power Distribution Board - The power distribution board is a FRU and distributes main 12V power from the power supplies to the rest of the storage controller. It is directly connected to the connector break out board and to the motherboard through a bus bar and ribbon cable. It also supports a top cover interlock kill switch. The power supplies connect directly to the power distribution board.
■ Connector Break Out Board - The connector break out board is a FRU and serves as the interconnect between the power distribution board and the fan power boards, storage drive backplane, and I/O board. It also contains the top-cover interlock "kill" switch.
■ Fan Power Boards - The two fan power boards are FRUs and carry power to the system fan modules. In addition, they contain fan module status LEDs and transfer I2C data for the fan modules.
■ Storage Drive Backplane - The storage drive backplane is a FRU and includes the connectors for the storage drives, as well as the interconnect for the I/O board, power and locator buttons, and system/component status LEDs. The system has a 12-disk backplane. Each drive has an LED indicator for Power/Activity, Fault, and Locate.

I/O Components

The following figure and legend show the I/O components of the 7120 system.
Figure Legend
1 Top Cover
2 Right Control Panel Light Pipe Assembly
3 Hard Disk Drives
4 Left Control Panel Light Pipe Assembly

Cables

The following figure and legend show the storage controller internal cables.

Note: The rear boot drives are not depicted in this illustration.

Cable                        Connection
1 Storage Drive Data Cable   Connection between the HBA PCI-Express card and the storage drive backplane.
2 Ribbon cable               Connection between the power distribution board and the motherboard.

CPU and Memory

The 7120 motherboard has 18 slots in two groups that hold industry-standard DDR3 DIMMs. The standard memory configuration is 48GB, 6x8GB DDR3-1333 low voltage (LV) DIMMs.

Following are the replaceable CPU and memory components of the 7120 system.

Part Number    Description                  FRU/CRU
F371-4966-01   DIMM, 8GB, DDR3, 2RX4, 13    CRU
F371-4885-01   Intel E5620, 2.40G           FRU

All sockets must be occupied by either a filler or a DDR3 DIMM. All DDR3 DIMMs must be identical. DIMMs are pre-installed in P0 slots D1, D2, D4, D5, D7, and D8.

Power Distribution, Fan Module and Disk Components

The fan modules and LEDs are shown in the following illustration.

The following figure and legend show the power distribution and associated components.

Figure Legend
1 Fan Board
2 SAS Expander Board
3 Disk Backplane
4 Front Control Panel Light Pipe Assembly
5 Power Distribution Board
6 Connector Board
7 Power Supply Backplane

Standalone Controller Configurations

The following table shows the configuration options for a 7120 controller. All PCIe cards are low-profile, and must be fitted with low-profile mounting brackets.

This table describes base configurations for the 7120 with Aura Logzilla.
Mktg Part Number   Description                  Mfg Part Number
TA7120-12TB        S7120, 1xCPU, 24GB, 12TB     597-0754-01
TA7120-24TB        S7120, 1xCPU, 24GB, 24TB     597-0755-01

The following table describes base configurations for the 7120 with Logzilla 3.5" SSD.

Mktg Part Number   Description                  Mfg Part Number
7101282            S7120, 1xCPU, 24GB, 3.3TB    7014523
7101284            S7120, 1xCPU, 24GB, 6.6TB    7014525

NIC/HBA Options

The following table describes NIC/HBA options for the 7120.

Mktg Part Number     Description                                      Mfg Part Number
SG-XPCIESAS-GEN2-Z   2-port External Sun Thebe SAS (x4) HBA, PCIe     594-5889-01
SG-XPCIE2FC-QF8-Z    2-port FC HBA, 8Gb, PCIe                         594-5684-01
X4446A-Z             4-port PCI-E Quad GigE UTP                       594-4024-01
X4237A-N             2-port 4X IB HCA PCIe                            594-5862-02
X1109A-Z             2-port 10Gig SFP+ NIC, PCIe                      594-6039-01

PCIe Options

The following table describes the supported PCIe configuration option summary for the 7120.

Slot  Type  Sun Part Number  Vendor Part Number    Description                   Note
0     PCIe  540-7975-03      Sun Aura              Internal Flash HBA Logzilla   Base Configuration (OBSOLETE)
0     PCIe  375-3481-01      Intel EXPI9404PT      QP Copper NIC                 Optional Recommended Front-end
0     PCIe  375-3617-01      Intel Niantic         DP Optical 10GE NIC           Optional Recommended Front-end
0     PCIe  371-4325-01      QLogic                8Gb DP FC HBA                 Optional FC Target or Initiator (Backup)
0     PCIe  375-3606-01      Mellanox MHJH29-XTC   InfiniBand HCA                Optional Recommended Front-end
1     PCIe  375-3617-01      Intel Niantic         DP Optical 10GE NIC           Optional Recommended Front-end
1     PCIe  375-3606-01      Mellanox MHJH29-XTC   InfiniBand HCA                Optional Recommended Front-end
1     PCIe  375-3481-01      Intel EXPI9404PT      QP Copper NIC                 Optional Recommended Front-end
1     PCIe  371-4325-01      QLogic                8Gb DP FC HBA                 Optional FC Target or Initiator (Backup)
3     PCIe  375-3665-01      Sun Thebe (INT)       Internal SAS HBA              Base Configuration
4     PCIe  375-3481-01      Intel EXPI9404PT      QP Copper NIC                 Optional Recommended Front-end
4     PCIe  371-4325-01      QLogic                8Gb DP FC HBA                 Optional FC Target or Initiator (Backup)
4     PCIe  375-3609-03      Sun Thebe (EXT)       8P 6Gb/s SAS HBA              Additional Optional Back-end

Attached Storage

The 7120 standalone configurations allow a single chain of 1 or 2 disk shelves. Write-optimized (Logzilla) SSDs are not supported in the expansion storage for the 7120. The disk shelves must be fully populated with 24 HDDs. Half-populated shelf configurations are not supported.

7320

7320 Hardware Overview

Use the information in this section to prepare to service replaceable components of the 7320 system. After you have reviewed this section, refer to these procedural instructions:

■ Controller Tasks - replace storage controller components
■ Disk Shelf Tasks - replace disk shelf components

Chassis Overview

The Sun ZFS Storage 7320 consists of either a single storage controller or two storage controllers in a high availability cluster configuration. Both the single and clustered configurations support one to six disk shelves.

The 7320 controller base configuration includes two CPUs, built-in 4 x 1Gb/s front-end GigE ports, redundant power supplies, NIC options for expanded front-end support, tape backup, InfiniBand, and dual port SAS HBA for storage expansion.

The CPUs are Intel Xeon 5600 series, 2.40GHz, 80W, four-core processors. The standard memory configuration is 96GB, 6 x 8GB DDR3-1333 low voltage (LV) DIMMs per CPU. Memory can be upgraded to 144GB using 9 x 8GB DDR3-1333 LV DIMMs per CPU (for a total of 18 x 8GB for two CPUs). Earlier versions of the 7320 controller included 24GB (base), 48GB, or 72GB memory options. The clustered configuration simply uses two servers and a cluster card in each server for a heartbeat connection between the servers.

All user-accessible storage is provided by one to six disk shelves that are external to the server(s). RAID functions are managed by software.
Solid state 18GB SAS-1 drives (7320 SAS-2) are used for a high-performance write cache (known as LogZilla) or ZFS intent log (ZIL) devices, and are used in place of up to four of the 24 drives in a disk shelf. The remaining 20 drives are available for storage.

Refer to http://www.oracle.com/us/products/servers-storage/storage/unified-storage/index.html for the most recent component specification.

The 7320 is a SAS-2 (Serial Attached SCSI 2.0) device that consists of an HBA, disk shelf, and disks (1TB and 2TB SAS-2). The SAS-2 storage fabric supports a greater number of targets, greater bandwidth, higher reliability, and bigger scale than the SAS-1 fabric.

Boards

The 7320 storage controller chassis has the following boards installed.

Note: Field-replaceable units (FRUs) should only be replaced by trained Oracle service technicians.

■ PCIe Risers - The storage controller contains three PCIe risers that are customer-replaceable units (CRUs) and are attached to the rear of the motherboard. Each riser supports one PCIe card.
■ Motherboard - The motherboard is a FRU and includes CPU modules, slots for 18 DIMMs, memory control subsystems, and the service processor (SP) subsystem. The SP subsystem controls the host power and monitors host system events (power and environmental). The SP controller draws power from the host 3.3V standby supply rail, which is available whenever the system is receiving AC input power, even when the appliance is turned off.
■ Power Distribution Board - The power distribution board is a FRU and distributes main 12V power from the power supplies to the rest of the storage controller. It is directly connected to the paddle board and to the motherboard through a bus bar and ribbon cable. It also supports a top cover interlock kill switch.
■ Paddle Board - The paddle board is a FRU and serves as the interconnect between the power distribution board and the fan power boards, hard drive backplane, and I/O board.
■ Fan Board - The fan boards are FRUs and carry power to the storage controller fan modules. In addition, they contain fan module status LEDs and transfer I2C data for the fan modules.
■ Disk Backplane - The hard drive backplane is a FRU and includes the connectors for the hard disk drives, as well as the interconnect for the I/O board, Power and Locator buttons, and system/component status LEDs. The storage controller has an eight-disk backplane. Each drive has an LED indicator for Power/Activity, Fault, and OK-to-Remove (not supported).

The following list contains the replaceable system boards for the 7320 storage controller.

Part Number    Description                     FRU/CRU
F541-2883-01   X8 PCIe Riser Card 1U           CRU
F541-2885-01   X16 PCIe Riser Card 1U          CRU
F541-4081-01   RoHS Motherboard and Tray       FRU
F511-1489-01   DB, Power Distribution Board    FRU
F511-1548-01   PCB, 8 Disk 1U Backplane        FRU
F541-4275-02   PCBA, Connector Board, 1U       FRU

Cables

The following list contains the replaceable cables for the 7320 storage controller.

Part Number    Description                                           FRU/CRU
F530-4228-01   Cable, Mini SAS                                       FRU (internal)
F530-3927-01   FRU,CBL,PDB,MB,1U+2U,RIBBON                           FRU (internal)
F530-4431-01   Cable, Fan data                                       FRU (internal)
F530-4417-01   FRU Cable, Fan paddle                                 FRU (internal)
F530-3880-01   Cable, Assembly, Ethernet, Shielded, RJ45-RJ45, 6m    CRU (external)
F530-3883-01   FRU,2M,4X Mini SAS Cable                              CRU (external)

7320 I/O Components

The following figure and legend identify the I/O components of the storage controller.
Figure Legend
1 Top cover
2 Left Control Panel Light Pipe Assembly
3 Drive Cage
4 Solid State Drives
5 blank/USB Module
6 Right Control Panel Light Pipe Assembly

7320 CPU and Memory Components

The following list contains the replaceable CPU and memory components of the 7320.

Part Number    Description                  FRU/CRU
F371-4966-01   DIMM, 8GB, DDR3, 2RX4, 13    CRU
F371-4885-01   Intel E5620, 2.40G           FRU

The storage controller motherboard has 18 slots in two groups that hold industry-standard DDR3 DIMM memory cards. All sockets must be occupied by either a filler or a DDR3 DIMM.

7320 Power Distribution and Fan Module Components

The following figure and legend identify the Power Distribution/Fan Module components of the storage controller.

Figure Legend
1 Fan Modules
2 Fan Board
3 Paddle Board
4 Power Distribution/Bus Bar Assembly
5 Power Supplies

Electrical Specifications

The following list shows the electrical specifications for the 7320.

Note: The power dissipation numbers listed are the maximum rated power numbers for the power supply. The numbers are not a rating of the actual power consumption of the appliance.

Connectors
■ Two C13 connectors which work on 110-220V outlets

Input
■ Nominal frequencies: 50/60Hz
■ Nominal voltage range: 100-120/200-240 VAC
■ Maximum current AC RMS: 9.0 amps Max
■ AC operating range: 90-264 VAC

Output
■ 3.3 VDC STBY: 3.6A
■ +12 VDC: 62.3A

Power dissipation
■ Max power consumption: 873 W
■ Max heat output: 2977 BTU/hr
■ Volt-Ampere rating: 891 VA @ 240 VAC, 0.98P.F.

7320 Front Panel

The following figure and legend identify the front panel LEDs.
Figure Legend
1 Locate Button/LED
2 Service Required LED (amber)
3 Power/OK LED (green)
4 Power Button
5 Rear Power Supply
6 System Overtemperature LED
7 Top Fan

The following figure and legend identify the 7320 front panel drive locations. Two mirrored hard disk drives (HDDs) that store the operating system reside in slots 0 and 1. Up to four solid state drives (ReadZilla SSDs), which store the read cache, fill slots 2 through 5, in order. Slots 6 and 7 are empty and must contain drive fillers.

Disk Drive Locations
HDD1 HDD3 HDD5 HDD0 HDD2 HDD4 HDD6 HDD7

7320 Replaceable Components
The following list contains all of the replaceable power distribution, disk, and fan module components of the 7320. Note that power supplies, disks, and fan modules are hot-pluggable on the storage controller.

Part Number     Description                  FRU/CRU
F300-2233-02    RoHS 760W Power Supply       CRU
F541-2075-04    Buss Bar Power, 1U           FRU
F542-0184-01    DR, 3Gb SATA                 CRU
F542-0330-01    2.5" 512GB ReadZilla SSD     CRU
F541-276-01     ASSY,FAN Module              CRU
F541-4274-02    Fan Board (1U)               FRU

7320 PCIe Cards and Risers
Following is the complete list of replaceable PCIe cards for the 7320 system.

Part Number     Description                                    FRU/CRU
F371-4325-01    8Gb FC HBA (PCIe)                              CRU
F375-3609-02    PCA, SAS 6GBS 8 Port (PCIe)                    CRU
F375-3606-03    Dual Port (x4) IB HCA (PCIe)                   CRU
F375-3696-01    Dual Port CX2 4XQDR (PCIe)                     CRU
F375-3617-01    2X10GbE SFP+, X8 (PCIe)                        CRU
F375-3481-01    NIC Card Quad Port 1GigE Cu (PCIe)             CRU
F511-1496-04    Sun Fishworks Cluster Controller 200 (PCIe)    FRU

7320 Rear Panel
Following is an illustration of the 7320 storage controller rear panel. The Sun 375-3609 belongs in slot 2, cannot be installed in any other slots, and a second is not offered as an option.
Figure Legend
1 Power supplies
2 SC summary status LEDs
3 Serial management port
4 Network management port
5 Ethernet ports
6 PCIe slots

7320 Single and Cluster Controller Configurations
The single controller base configuration is 96GB RAM, 2x2.4GHz Quad-Core processors, one external SAS HBA, and four 10/100/1000 Ethernet ports.
The following table describes base configurations for the 7320.

Mktg Part Number    Description                     Mfg Part Number
TA7320-24A          S7320, 2xCPU, 24GB, Single      597-1060-01
7104054             S7320, 2xCPU, 96GB, Single      7045900
TA7320-24A-HA       S7320, 2xCPU, 24GB, Cluster     597-1061-01
7104055             S7320, 2xCPU, 96GB, Cluster     7045903

Following are the PCIe configuration options for a single controller. All PCIe cards are low profile and must be fitted with low-profile mounting brackets.

Slot  Type  Part Number   Vendor Part        Description           Note
0     PCIe  375-3617-01   Intel Niantic      DP Optical 10GE NIC   Optional Recommended Front-end
0     PCIe  375-3696-01   Mellanox           InfiniBand HCA        Optional Recommended Front-end
0     PCIe  375-3606-03   MHJH29-XTC         InfiniBand HCA        Optional Recommended Front-end
0     PCIe  375-3481-01   Intel EXPI9404PT   QP Copper NIC         Optional Recommended Front-end
0     PCIe  371-4325-01   QLogic             8Gb DP FC HBA         Optional FC Target or Initiator (Backup)
1     PCIe  375-3617-01   Intel Niantic      DP Optical 10GE NIC   Optional Recommended Front-end
1     PCIe  375-3696-01   Mellanox           InfiniBand HCA        Optional Recommended Front-end
1     PCIe  375-3606-03   MHJH29-XTC         InfiniBand HCA        Optional Recommended Front-end
1     PCIe  375-3481-01   Intel EXPI9404PT   QP Copper NIC         Optional Recommended Front-end
1     PCIe  371-4325-01   QLogic             8Gb DP FC HBA         Optional FC Target or Initiator (Backup)
2     PCIe  375-3609-03   Sun Thebe          External SAS HBA      Base Configuration

7320 Cluster Configurations
The 7320 cluster base configuration is 96GB RAM, 2x2.4GHz Quad-Core processors, one external SAS HBA, four 10/100/1000 Ethernet ports, and a Cluster card.
The Sun Storage 7420C Cluster Upgrade Kit (XOPT 594-4680-01) contains two cluster cards with cables for converting two 7320 or two 7420 controllers to a cluster.

The following options are available for clustered storage controllers.
Note: When you cluster a 7320, you must configure the cards identically in both clustered storage controllers. This includes all optional NIC/HBA cards used in either chassis.

Slot  Type  Part Number   Vendor Part        Description                 Note
0     PCIe  375-3617-01   Intel Niantic      DP Optical 10GE NIC         Optional Recommended Front-end
0     PCIe  375-3696-01   Mellanox           InfiniBand HCA              Optional Recommended Front-end
0     PCIe  375-3606-03   MHJH29-XTC         InfiniBand HCA              Optional Recommended Front-end
0     PCIe  375-3481-01   Intel EXPI9404PT   QP Copper NIC               Optional Recommended Front-end
0     PCIe  371-4325-01   QLogic             8Gb DP FC HBA               Optional FC Target or Initiator (Backup)
1     PCIe  542-0298-01   Sun                Fishworks Cluster Card 2    Cluster Base Configuration
2     PCIe  375-3609-03   Sun Thebe          External SAS HBA            Cluster Base Configuration

7320 Connector Pinouts
The serial management connector (SERIAL MGT) is an RJ-45 connector and is a terminal connection to the SP console.
The network management connector (NET MGT) is an RJ-45 connector on the motherboard and provides an alternate terminal interface to the SP console.
There are four RJ-45 Gigabit Ethernet connectors (NET0, NET1, NET2, NET3) located on the motherboard that operate at 10/100/1000 Mbit/sec. These network interfaces must be configured before use.

7320 Storage Disk Shelf
The 7320 single and cluster controller configurations allow a single chain of one to six disk shelves. Any combination of disk-only and Logzilla-capable shelves may be combined within the chain in any order. The cabling configurations are unchanged.
Half-populated shelf configurations are not supported.

See Also
■ Controller Details
■ Disk Shelf Overview
■ Disk Shelf Maintenance Procedures

7420
7420 Hardware Overview
Use the information on this page as a preparation reference for servicing replaceable components of 7420 controllers. Refer to the following for procedural instructions:
■ Controller Tasks - replace system controller components
■ Disk Shelf Tasks - replace disk shelf components

Chassis Overview
The Sun ZFS Storage 7420 Appliance consists of either a single storage controller, or two storage controllers in a high availability cluster configuration, and one to 36 disk shelves. Refer to http://www.oracle.com/us/products/servers-storage/storage/unified-storage/index.html for the most recent component specification.
The 3U chassis form factor dimensions are as follows:

Dimension   Measurement
Height      13.3 cm/5.25 in
Width       43.7 cm/17.19 in
Depth       70.6 cm/27.8 in
Weight      16.36 kg/96 lbs

Front Panel
Figure Legend
1 Locator LED and button (white)
2 Service Required LED (amber)
3 Power/OK LED (green)
4 Power button
5 Service Processor (SP) OK LED (green)
6 Fan/CPU/Memory Service Required LED
7 Power Supply (PS) Service Required LED
8 Over Temperature Warning LED
9 USB 2.0 Connectors
10 DB-15 video connector
11 Boot drive 0
12 Boot drive 1 (required)
13 Solid state drive 2 (optional)
14 Solid state drive 3 (optional)
15 Solid state drive 4 (optional)
16 Solid state drive 5 (optional)

The 500GB boot drives (HDDs) reside in slots 0 and 1 as a mirrored set, and Sun Storage Readzilla 512GB solid state drives (SSDs) may optionally fill slots 2 through 5, in order. Each storage controller may have 0, 2, 3, or 4 Readzilla devices.
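The front drive rules above (boot drives in slots 0 and 1, Readzilla SSDs filling slots 2 through 5 in order, and only 0, 2, 3, or 4 Readzilla devices) can be expressed as a small check. The following Python sketch is illustrative only: the function name, the slot labels "boot" and "readzilla", and the dictionary layout are assumptions made for this example and are not part of the appliance software.

```python
# Illustrative only: the slot roles and names below are assumptions for this
# sketch, not part of the appliance software.
BOOT_SLOTS = (0, 1)
READZILLA_SLOTS = (2, 3, 4, 5)
SUPPORTED_READZILLA_COUNTS = {0, 2, 3, 4}


def check_front_drives(slots):
    """Return a list of problems for a 7420 front drive layout.

    `slots` maps a slot number to "boot", "readzilla", or None (empty).
    """
    problems = []
    for slot in BOOT_SLOTS:
        if slots.get(slot) != "boot":
            problems.append(f"slot {slot} must hold a boot drive")

    populated = [slots.get(slot) == "readzilla" for slot in READZILLA_SLOTS]
    count = sum(populated)
    if count not in SUPPORTED_READZILLA_COUNTS:
        problems.append(f"{count} Readzilla device(s) is not a supported count")
    # "In order" means the populated slots form an unbroken run from slot 2.
    if populated != [True] * count + [False] * (len(populated) - count):
        problems.append("Readzilla SSDs must fill slots 2 through 5 in order")
    return problems


if __name__ == "__main__":
    layout = {0: "boot", 1: "boot", 2: "readzilla", 3: "readzilla"}
    print(check_front_drives(layout))  # prints []
```

A layout with SSDs in slots 3 and 4 but not slot 2 would be reported as out of order, and a single SSD in slot 2 would be reported as an unsupported count.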
Figure Legend
1 Locate (white)
2 Service action required (amber)
3 OK/Activity (green)

Rear Panel
The following graphic shows the 7420 rear panel. Base configuration HBAs are not depicted in this illustration.

Figure Legend
1 Power supply unit 0 status LEDs (OK: green; Power Supply Fail: amber; AC OK: green)
2 Power supply unit 0 AC inlet
3 Power supply unit 1 status LEDs (OK: green; Power Supply Fail: amber; AC OK: green)
4 Power supply unit 1 AC inlet
5 System status LEDs (Power: green; Attention: amber; Locate: white)
6 PCIe slots 0-4
7 Cluster card slot
8 Network (NET) 10/100/1000 ports: NET0-NET3
9 USB 2.0 ports
10 PCIe slots 5-9
11 Network management (NET MGT) port
12 Serial management (SER MGT) port
13 DB-15 video connector

Internal Boards
The 7420 storage controller chassis has the following boards installed. Field-replaceable units (FRUs) should only be replaced by trained Oracle service technicians.
■ Motherboard - The motherboard is a FRU and includes CPU modules, slots for eight DIMM risers, memory control subsystems, and the service processor (SP) subsystem. The SP subsystem controls the host power and monitors host system events (power and environmental). The SP controller draws power from the host's 3.3V standby supply rail, which is available whenever the system is receiving AC input power, even when the system is turned off.
■ Power Distribution Board - The power distribution board is a FRU and distributes main 12V power from the power supplies to the rest of the system. It is directly connected to the Vertical PDB card, and to the motherboard through a bus bar and ribbon cable. It also supports a top cover interlock ("kill") switch. In the storage controller, the power supplies connect to the power supply backplane which connects to the power distribution board.
■ Vertical PDB Card - The vertical power distribution board, or Paddle Card, is a FRU and serves as the interconnect between the power distribution board and the fan power boards, hard drive backplane, and I/O board.
■ Power Supply Backplane Card - This board connects the power distribution board to power supplies 0 and 1.
■ Fan Power Boards - The two fan power boards are FRUs and carry power to the storage controller fan modules. In addition, they contain fan module status LEDs and transfer I2C data for the fan modules.
■ Drive Backplane - The six-drive backplane is a FRU and includes the connectors for the drives, as well as the interconnect for the I/O board, Power and Locator buttons, and system/component status LEDs. Each drive has an LED indicator for Power/Activity, Fault, and Locate.

Components
The components of the storage controller are shown in the following figure and identified in the table.

Figure Legend
1 Motherboard
2 Low-profile PCIe cards
3 Power supplies
4 Power supply backplane
5 Drive backplane
6 System lithium battery
7 CPUs and heatsinks
8 Memory risers
9 Fan board
10 Fan modules
11 Boot drives and SSDs

CPU and Memory
The 7420 appliance supports two or four CPUs, with two memory risers required by each CPU. Four or eight 4GB or 8GB DDR3 DIMMs are installed on each riser, accommodating up to 256GB of memory for two CPUs, or up to 512GB for four CPUs. Empty CPU sockets must have memory riser fillers installed for proper cooling.
The new 7420 controller has different CPU options and memory risers, but is visually identical to the existing 7420 controller (with 1.86GHz or 2.00GHz CPUs). The new 7420 controller supports the following configurations:
■ Two, four, or eight 8GB DDR3 DIMMs installed on each riser, accommodating 128GB, 256GB, or 512GB of memory for 2.0GHz CPUs.
■ Four or eight 8GB DDR3 DIMMs installed on each riser, accommodating 256GB or 512GB of memory for 2.0GHz and 2.4GHz CPUs.
■ Four or eight 16GB DDR3 DIMMs installed on each riser, accommodating 512GB or 1TB of memory for 2.4GHz CPUs.

Refer to the service label on the cover for DIMM placement information. On every memory riser, slots D0, D2, D4, and D6 must be populated; optionally, slots D1, D3, D5, and D7 may be populated as a group on all installed memory risers. All DIMMs in the system must be identical.
DIMM names in appliance logs and the Maintenance > Hardware view are displayed with the full name, such as /SYS/MB/P0/D7.

Fan Modules
The Fan Modules and Fan Module LEDs of the storage controller are shown in the following figure. The following LEDs are lit when a fan module fault is detected:
■ Front and rear Service Action Required LEDs
■ Fan Module Service Action Required (TOP) LED on the front of the server
■ Fan Fault LED on or adjacent to the faulty fan module
The system Overtemp LED might light if a fan fault causes an increase in system operating temperature.

PCIe Cards
The Sun Fishworks Cluster Controller 200 belongs in the Cluster slot (C) only. SAS HBAs must all be of the same type, installed in slots 1 and 8, with an optional third SAS HBA in slot 2, and an optional fourth SAS HBA in slot 7.
PCIe slots should be populated in the following order: 9 (if used), 0 (if used), 7, 2, 6, 3, 5, 4.

Connectors
The serial management connector (SER MGT) is an RJ-45 connector and provides a terminal connection to the SP console.
The network management connector (NET MGT) is an RJ-45 connector and provides a LAN interface to the SP console.
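As a cross-check of the memory capacities quoted in the CPU and Memory section, the totals follow from multiplying CPUs, risers per CPU, DIMMs per riser, and DIMM size. The Python sketch below is a worked example of that arithmetic. The function name is hypothetical, and the assumption that the new-controller figures (128GB through 1TB) describe a four-CPU system with eight risers is an inference from the numbers, not a statement from the manual.

```python
RISERS_PER_CPU = 2  # from the text: two memory risers are required by each CPU


def total_memory_gb(cpus, dimms_per_riser, dimm_gb):
    """Total installed memory in GB, assuming identical DIMMs on every riser."""
    return cpus * RISERS_PER_CPU * dimms_per_riser * dimm_gb


if __name__ == "__main__":
    # Existing 7420: eight 8GB DIMMs per riser.
    print(total_memory_gb(2, 8, 8))  # 256
    print(total_memory_gb(4, 8, 8))  # 512
    # New 7420, assuming four CPUs (eight risers) for every listed option.
    for dimms, size in [(2, 8), (4, 8), (8, 8), (4, 16), (8, 16)]:
        print(dimms, size, total_memory_gb(4, dimms, size))
```

Under the four-CPU assumption, the loop reproduces the 128GB, 256GB, 512GB, 512GB, and 1TB (1024GB) figures listed for the new controller.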
There are four RJ-45 Gigabit Ethernet connectors (NET0, NET1, NET2, NET3) located on the motherboard that operate at 10/100/1000 Mbit/sec. These network interfaces must be configured before use.

7420 Standalone and Cluster Controller Configurations
The following tables show the configuration options for a single standalone 7420 controller or two clustered 7420 controllers. All PCIe cards are low-profile, and must be fitted with low-profile mounting brackets.

Standalone Base Options
This table describes 7420 standalone base configurations. Note: Both 7100566 and 7100568 include a cluster card and can be configured as a single standalone or two clustered configuration.

Mktg Part Number    Description                       Mfg Part Number
TA7420-26A          S7420, no DIMMs, 2x1.86GHz-6C     597-0789-01
TA7420-28A          S7420, no DIMMs, 2x2.00GHz-8C     597-0790-01

Cluster Base Options
This table describes 7420 cluster base configurations.

Mktg Part Number    Description                                 Mfg Part Number
TA7420-26AR00HA     S7420, no DIMMs, 2x1.86GHz-6C, Cluster      597-0795-01
TC7420-28AR00HA     S7420, no DIMMs, 2x2.00GHz-8C, Cluster      597-0792-01
7100566             S7420, no DIMMs, 4x2GHz-8C, Cluster         7014572
7100568             S7420, no DIMMs, 4x2.40GHz-10C, Cluster     7014573

NIC/HBA Options
This table describes NIC/HBA options for 7420 single and cluster configurations.
Mktg Part Number      Description                                      Mfg Part Number
SG-XPCIESAS-GEN2-Z    2-port External Sun Thebe SAS (x4) HBA, PCIe     F375-3609-03
SG-XPCIE2FC-QF8-Z     2-port FC HBA, 8Gb, PCIe                         594-5684-01
X4446A-Z              4-port PCIe Quad GigE UTP                        594-4024-01
X4242A                2-port CX2 4xQDR, HCA PCIe                       594-6776-01
X4237A                2-port 4X IB HCA PCIe                            594-5862-02
X1109A-Z              2-port 10Gig SFP+ NIC, PCIe                      594-6039-01
X2129A                XCVRm 850NM, 1/10GPS, Short Reach, SFP           594-6508-01
X5562A-Z              XCVR 1300NM, 1/10GPS, Long Reach, SFP            594-6689-01

PCIe Options
This table summarizes the supported PCIe configuration options for single and clustered 7420 controllers. The 7420 supports a maximum of six dual-port optical 10Gb Ethernet NICs, six 8Gb FC HBAs, six Quad Port Copper Gb Ethernet NICs, and four InfiniBand HCAs.

Slot  Type  Sun Part Number  Description             Max  Note
0     PCIe  371-4325-01      8Gb DP FC HBA           6    Optional FC Target or Initiator (Backup)
1     PCIe  375-3609-02      DP SAS-2 HBA            6    Base Configuration (2 minimum)
2     PCIe  375-3609-02      DP SAS-2 HBA            6    Additional Optional Back-end
2     PCIe  375-3481-01      QP Copper NIC           6    Optional Recommended Front-end
2     PCIe  371-4325-01      8Gb DP FC HBA           6    Optional FC Target or Initiator (Backup)
2     PCIe  375-3606-01      InfiniBand HCA          4    Optional Recommended Front-end
2     PCIe  375-3617-01      DP Optical 10GE NIC     6    Optional Recommended Front-end
2     PCIe  375-3696-01      CX2 InfiniBand HCA      4    Optional Recommended Front-end
3     PCIe  375-3609-02      DP SAS-2 HBA            6    Additional Optional Back-end
3     PCIe  375-3481-01      QP Copper NIC           6    Optional Recommended Front-end
3     PCIe  371-4325-01      8Gb DP FC HBA           6    Optional FC Target or Initiator (Backup)
3     PCIe  375-3606-01      InfiniBand HCA          4    Optional Recommended Front-end
3     PCIe  375-3617-01      DP Optical 10GE NIC     6    Optional Recommended Front-end
3     PCIe  375-3696-01      CX2 InfiniBand HCA      4    Optional Recommended Front-end
4     PCIe  375-3481-01      QP Copper NIC           6    Optional Recommended Front-end
4     PCIe  375-3606-01      InfiniBand HCA          4    Optional Recommended Front-end
4     PCIe  375-3617-01      DP Optical 10GE NIC     6    Optional Recommended Front-end
4     PCIe  371-4325-01      8Gb DP FC HBA           6    Optional FC Target or Initiator (Backup)
4     PCIe  375-3696-01      CX2 InfiniBand HCA      4    Optional Recommended Front-end
C     PCIe  511-1496-04      Cluster Controller 200  1    Cluster Base Configuration ONLY
5     PCIe  375-3481-01      QP Copper NIC           6    Optional Recommended Front-end
5     PCIe  375-3606-01      InfiniBand HCA          4    Optional Recommended Front-end
5     PCIe  375-3617-01      DP Optical 10GE NIC     6    Optional Recommended Front-end
5     PCIe  371-4325-01      8Gb DP FC HBA           6    Optional FC Target or Initiator (Backup)
6     PCIe  375-3609-02      DP SAS-2 HBA            6    Additional Optional Back-end
6     PCIe  375-3481-01      QP Copper NIC           6    Optional Recommended Front-end
6     PCIe  371-4325-01      8Gb DP FC HBA           6    Optional FC Target or Initiator (Backup)
6     PCIe  375-3606-01      InfiniBand HCA          4    Optional Recommended Front-end
6     PCIe  375-3617-01      DP Optical 10GE NIC     6    Optional Recommended Front-end
6     PCIe  375-3696-01      CX2 InfiniBand HCA      4    Optional Recommended Front-end
7     PCIe  375-3609-02      DP SAS-2 HBA            6    Additional Optional Back-end
7     PCIe  375-3481-01      QP Copper NIC           6    Optional Recommended Front-end
7     PCIe  371-4325-01      8Gb DP FC HBA           6    Optional FC Target or Initiator (Backup)
7     PCIe  375-3606-01      InfiniBand HCA          4    Optional Recommended Front-end
7     PCIe  375-3617-01      DP Optical 10GE NIC     6    Optional Recommended Front-end
7     PCIe  375-3696-01      CX2 InfiniBand HCA      4    Optional Recommended Front-end
8     PCIe  375-3609-02      DP SAS-2 HBA            6    Base Configuration (2 minimum)
9     PCIe  371-4325-01      8Gb DP FC HBA           6    Optional FC Target or Initiator (Backup)

Attached Storage
The 7420 does not contain primary storage within its chassis, and therefore connects to external storage shelves.
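The limits in the PCIe Options table (the Max column, plus the base SAS HBAs in slots 1 and 8) lend themselves to a simple pre-installation check. The Python sketch below is a minimal illustration under stated assumptions: the short card-type keys ("sas", "fc", "10gbe", "gbe", "ib") and the function name are invented for this example, both InfiniBand HCA models are counted together toward the limit of four, and per-slot eligibility and population order are not modelled.

```python
from collections import Counter

# Hypothetical type keys for this sketch; limits are taken from the Max column.
MAX_PER_TYPE = {"sas": 6, "fc": 6, "10gbe": 6, "gbe": 6, "ib": 4}
BASE_SAS_SLOTS = (1, 8)  # base configuration requires SAS HBAs in these slots


def check_pcie(cards):
    """Return a list of problems for a 7420 PCIe layout.

    `cards` maps a slot number to one of the type keys in MAX_PER_TYPE.
    """
    problems = []
    for slot in BASE_SAS_SLOTS:
        if cards.get(slot) != "sas":
            problems.append(f"slot {slot} must hold a base SAS HBA")

    counts = Counter(cards.values())
    for kind, installed in sorted(counts.items()):
        limit = MAX_PER_TYPE.get(kind)
        if limit is None:
            problems.append(f"unknown card type {kind!r}")
        elif installed > limit:
            problems.append(f"{installed} {kind!r} cards exceeds the maximum of {limit}")
    return problems


if __name__ == "__main__":
    layout = {1: "sas", 8: "sas", 9: "fc", 0: "fc", 7: "10gbe"}
    print(check_pcie(layout))  # prints []
```

A fuller check would also encode which card types each slot accepts, which the table above lists row by row.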
Disk Shelf
The 7420 single and cluster controller configurations allow one to six chains of one to six disk shelves. Any combination of disk-only and Logzilla-capable shelves may be combined within the chain in any order. The cabling configurations are unchanged. Half-populated shelf configurations are not supported. See Disk Shelf Overview for component specifications and diagrams.

7x20
7x20 CRU Maintenance Procedures
This section provides instructions on how to replace customer replaceable components (CRUs) in Oracle's Sun ZFS Storage 7120, 7320, and 7420 controllers.
Refer to Disk Shelf Details for replacing expansion storage shelf components.

Prerequisites
■ Read the information in the overview section for your controller to become familiar with the replaceable parts of the system: 7120 | 7320 | 7420
■ Follow the instructions in the Safety Information and Required Tools and Information sections.

Safety Information
This section contains safety information that you must follow when servicing the storage system. For your protection, observe the following safety precautions when setting up your equipment:
■ Do not remove the side panels, or run the storage system with the side panels removed. Hazardous voltage is present that could cause injury. The covers and panels must be in place for proper air flow to prevent equipment damage.
■ Follow all cautions, warnings, and instructions marked on the equipment and described in Important Safety Information for Sun Hardware Systems.
■ Ensure that the voltage and frequency of your power source match the voltage inscribed on the electrical rating label.
■ Follow the electrostatic discharge safety practices. Electrostatic discharge (ESD) sensitive devices, such as PCI cards, HDDs, SSDs, and memory cards, require special handling.
Circuit boards and HDDs contain electronic components that are extremely sensitive to static electricity. Ordinary amounts of static electricity from clothing or the work environment can destroy the components located on these boards. Do not touch the components without using antistatic precautions, especially along the connector edges.

Required Tools and Information
The following tools are needed to service the CRUs:
■ Antistatic wrist strap - Wear an antistatic wrist strap and use an antistatic mat when handling components such as HDDs or PCI cards. When servicing or removing storage controller components, attach an antistatic strap to your wrist and then to a metal area on the chassis. Following this practice equalizes the electrical potentials between you and the storage controller.
■ Antistatic mat - Place static-sensitive components on an antistatic mat.
■ No. 2 Phillips screwdriver
■ Nonconducting, No. 1 flat-blade screwdriver or equivalent
■ Nonconducting stylus or pencil (to power on the storage controller)

Chassis Serial Number
To obtain support for your storage controller or to order new parts, you need your chassis serial number. You can find a chassis serial number label on the storage controller front panel on the left side. Another label is on the top of the storage controller. Alternatively, click the Sun logo in the BUI masthead to obtain the serial number or issue the following command:
hostname: maintenance hardware show

7x20 Controller Replacement Tasks
HDD or SSD
1. Identify the failed HDD or SSD by going to the Maintenance > Hardware section of the BUI and clicking the drive details icon. If you are physically at the system, the amber Service Required indicator on the HDD or SSD should be illuminated.
2. If you are not physically at the system, turn on the locator indicator by clicking the locator icon.
3. Push the release button on the HDD or SSD to open the latch.
4.
Grasp the latch (2), and pull the drive out of the drive slot.
5. After 15 seconds, navigate to the Maintenance > Hardware screen, and click the details icon on the system controller to verify that the software has detected that the drive is not present.
6. Slide the replacement drive into the slot until it is fully seated.
7. Close the latch to lock the drive in place.
The Sun ZFS Storage system software automatically detects and configures the new drive. The device appears in the BUI Maintenance > Hardware screen when you view details for the controller or drive shelf.

Fan Module
Fan modules are hot-swappable and can be removed and installed while the storage controller is running without affecting other hardware capabilities.
Caution: Operating a controller for an extended period of time with fans removed reduces the effectiveness of the cooling system. For this reason, the replacement fan should be unpacked in advance and ready to insert into the controller chassis as soon as the faulted fan is removed.
7120 or 7320: The fan modules and status indicators are hidden under a fan door in the 7120 and 7320 storage controllers. Components may differ slightly between the 7120 and 7320; however, service procedures for each are identical. The illustration shows the 7320.
Leaving the door open for more than 60 seconds while the storage controller is running might cause it to overheat and shut down.
7420: The following illustration shows the fan modules in the 7420 storage controller.
1. To locate the chassis you want to service, click the associated locate icon on the Maintenance > Hardware screen of the BUI or issue the set /SYS/LOCATE status=on command at the service processor (SP) prompt. The locate LED will flash on the controller chassis.
2.
Verify that no cables will be damaged or will interfere when the storage controller is extended from the rack.
3. From the front of the storage controller, release the two slide release latches.
4. While squeezing the slide release latches, slowly pull the storage controller forward until the slide rails latch.
5. 7120 or 7320: To replace the fan module:
Open the fan module door while unlatching the release tabs on the door.
Identify the faulted fan module by locating the corresponding Service Required status indicator or by clicking the locate icon on the Maintenance > Hardware screen of the BUI for the fan you want to replace.
Using thumb and forefinger, pull the fan module up and out.
Install the replacement fan module into the storage controller fan slot.
Note: The fan must be replaced within one minute to avoid controller shutdown.
Apply firm pressure to fully seat the fan module.
Verify that the Fan OK status indicator is lit, and that the fault status indicator on the replaced fan module is dim.
Close the top cover door immediately after replacing the fan to maintain airflow in the storage controller.
6. 7420: To replace the fan module:
Identify the faulted fan module by locating the corresponding Service Required status indicator or by clicking the locate icon on the Maintenance > Hardware screen of the BUI for the fan you want to replace.
Lift the latch at the top of the fan module to unlock the fan module.
Pull the fan module out.
Unlock and insert the 7420 fan module.
Apply firm pressure to fully seat the fan module.
Verify that the Fan OK status indicator is lit and that the fault status indicator on the replaced fan module is dim.
7. Verify that the Top Fan status indicator, the Service Required status indicators, and the Locator status indicator/Locator button are dim.
8.
Push the release tabs on the side of each rail and slowly slide the storage controller into the rack.

Power Supply
Storage controllers are equipped with redundant hot-swappable power supplies. If a power supply fails and you do not have a replacement, leave the failed power supply installed to ensure proper air flow. A faulted power supply is indicated by an amber colored status LED.
1. Gain access to the rear of the storage controller where the faulted power supply is located.
2. If a cable management arm (CMA) is installed, press and hold the CMA release tab and rotate the arm out of the way.
3. Disconnect the power cord from the faulted power supply.
4. Remove the power supply.
7120 or 7320: Release the latch, then remove the power supply. Components may differ slightly between the 7120 and 7320; however, service procedures for each are identical. The illustration shows the 7320.
7420: Grasp the power supply handle and press the release latch to remove the power supply.
5. Align the replacement power supply with the empty power supply chassis bay.
6. Slide the power supply into the bay until it is fully seated. The following figure shows the 7420 power supply.
7. Connect the power cord to the power supply.
8. Verify that the green AC Present status indicator is lit.
9. Close the CMA, inserting the CMA into the rear left rail bracket.
10. Go to the Maintenance > Hardware screen of the BUI. Click the details icon for the controller and then click power supply to verify that the status icon is green for the newly installed power supply.

Memory
To identify a specific memory module that has faulted, you must open the storage controller and use the amber status LEDs on the motherboard. To identify a general memory fault, go to the Maintenance > Hardware screen of the BUI, and click on the details icon on the controller.
Then click DIMMs to locate the faulted component, indicated by the warning icon.
Caution: This procedure requires that you handle components that are sensitive to static discharge, which can cause the component to fail. To avoid damage, wear an antistatic wrist strap and use an antistatic mat when handling components.
You must shut down the appliance before beginning this task. Note that there will be a loss of access to the storage unless the system is in a clustered configuration. Shut down the appliance using one of the following methods:
■ Log in to the BUI and click the power icon on the left side of the masthead.
■ SSH into the appliance and issue the maintenance system poweroff command.
■ SSH or serial console into the service processor (SP) and issue the stop /SYS command.
■ Use a pen or nonconducting pointed object to press and release the Power button on the front panel.
■ To initiate emergency shutdown during which all applications and files will be closed abruptly without saving, press and hold the power button for at least four seconds until the Power/OK status indicator on the front panel flashes, indicating that the storage controller is in standby power mode.
1. Disconnect the AC power cords from the rear panel of the storage controller.
2. Verify that no cables will be damaged or will interfere when the storage controller is extended from the rack.
3. From the front of the storage controller, release the two slide release latches.
4. While squeezing the slide release latches, slowly pull the storage controller forward until the slide rails latch.
5. 7120 or 7320: Components may differ slightly between the 7120 and 7320; however, service procedures for each are identical. The illustration shows the 7320. To remove the top cover:
Unlatch the fan module door, pull the two release tabs back, rotate the fan door to the open position and hold it there.
Press the top cover release button and slide the top cover to the rear about a half-inch (1.3 cm).
Lift up and remove the top cover.
Also remove the air baffle by pressing the air baffle connectors outward and lifting the air baffle up and out of the server.
6. 7420: To remove the top cover:
Simultaneously lift both cover latches in an upward motion.
Lift up and remove the top cover.
7. To locate the DIMM you want to service, press the Fault Remind Button on the storage controller.
The following illustration shows the Fault Remind button on the 7120.
The following illustration shows the Fault Remind button on the 7420.
8. 7420: Identify the memory riser that hosts the faulted DIMM by the Service Required status indicator. Lift the memory riser straight up to remove it from the motherboard, and place it on an antistatic mat.
9. Rotate both DIMM slot ejectors outward as far as they will go and carefully lift the faulted DIMM straight up to remove it from the socket.
10. Line up the replacement DIMM with the connector, aligning the notch with the key to ensure that the component is oriented correctly.
11. Push the DIMM into the connector until the ejector tabs lock the component in place.
12. 7120 or 7320: Components may differ slightly between the 7120 and 7320; however, service procedures for each are identical. The illustration shows the 7320. To replace the cover:
Place the top cover on the chassis so that it hangs over the rear of the storage controller by about an inch (2.5 cm).
Slide the top cover forward until it seats.
Close the fan cover and engage the fan cover latches. The cover must be completely closed for the storage controller to power on.
13.
7420: To replace the cover: Push the memory riser module into the associated CPU memory riser slot until the riser module locks in place. Place the top cover on the chassis so that it is forward of the rear of the storage controller by about an inch (2.5 cm). Slide the top cover toward the rear of the chassis until it seats and press down on the cover with both hands until both latches engage.
14. Push the release tabs on the side of each rail and slowly push the storage controller into the rack. The following image shows the 7420 chassis.
15. Connect the power cords to the power supplies.
16. Verify that standby power is on, indicated by the Power/OK status indicator flashing on the front panel about two minutes after the power cords are plugged in.
17. Use a pen or other pointed object to press and release the recessed Power button on the storage controller front panel. The Power/OK status indicator next to the Power button lights and remains lit. The Maintenance > Hardware screen of the BUI provides status of the replacement on the Details page for DIMMs.

PCIe Cards and Risers

Go to the Maintenance > Hardware screen of the BUI and click the details icon on the controller, and then click Slots to locate the faulted component.

Caution: This procedure requires that you handle components that are sensitive to static discharge, which can cause the component to fail. To avoid damage, wear an antistatic wrist strap and use an antistatic mat when handling components.

Note that the 7120 Sun Flash Accelerator F20 card is a FRU and must be replaced by an Oracle service representative.

All HBAs must be of the same type. Ensure that you upgrade your system software before installing a newly-released HBA.

You must shut down the controller before beginning this task.
Note that there will be a loss of access to the storage unless the system is in a clustered configuration. Shut down the appliance using one of the following methods:

■ Log in to the BUI, and click the power icon on the left side of the masthead.
■ SSH into the storage system and issue the maintenance system poweroff command.
■ SSH or serial console into the service processor (SP) and issue the stop /SYS command.
■ Use a pen or non-conducting pointed object to press and release the Power button on the front panel.
■ To initiate emergency shutdown, wherein all applications and files will be closed abruptly without saving, press and hold the power button for at least four seconds until the Power/OK status indicator on the front panel flashes, indicating that the storage controller is in standby power mode.

1. Disconnect the AC power cords from the rear panel of the storage controller.
2. Verify that no cables will be damaged or will interfere when the storage controller is extended from the rack.
3. From the front of the storage controller, release the two slide release latches.
4. While squeezing the slide release latches, slowly pull the storage controller forward until the slide rails latch.
5. 7120 or 7320: To remove the top cover: Unlatch the fan module door, pull the two release tabs back, rotate the fan door to the open position and hold it there. Press the top cover release button and slide the top cover to the rear about a half-inch (1.3 cm). Lift up and remove the top cover.
6. 7420: To remove the top cover: Simultaneously lift both cover latches in an upward motion. Lift up and remove the top cover.
7. Locate the PCIe card position in the storage controller; see Single and Cluster Controller Configurations for the 7320, the 7120 Overview, or the 7420 Overview.
8.
7120 or 7320: To replace the PCIe card: Disconnect any data cables connected to the cards on the PCIe riser you want to replace. Label the cables for proper connection later. Loosen the two captive Phillips screws on the end of the rear panel crossbar and lift the crossbar up and back to remove it. Loosen the captive retaining screw holding the front end of the riser and the Phillips screw on the end of the riser. Lift the riser up to remove it from the storage controller. Carefully remove the PCIe card from the riser board connector and clean the slot with filtered, compressed air if necessary. Seat the replacement PCIe card in the slot of the riser and connect the cables. Align the riser, together with any attached PCIe cards, with the intended location on the motherboard, and carefully insert it into its slot. Slide the back of the riser into the motherboard rear panel stiffener. Tighten the screw that secures the riser to the motherboard. Replace the rear panel PCI crossbar by sliding it down over the PCIe risers, ensuring the crossbar is secured with two captive Phillips screws.
9. 7420: To replace the PCIe card: Disengage the PCIe card slot crossbar from its locked position and rotate the crossbar into an upright position. Remove the retaining screw that holds the PCIe card to the chassis. Carefully remove the PCIe card from the connector and clean the slot with filtered, compressed air if necessary. Install the replacement PCIe card into the PCIe card slot. Install the retaining screw to hold the PCIe card to the chassis. Return the crossbar to its closed and locked position.
10. 7120 or 7320: Components may differ slightly between the 7120 and 7320; however, service procedures for each are identical. The illustration shows the 7320.
To install the top cover: Place the top cover on the chassis so that it hangs over the rear of the storage controller by about an inch (2.5 cm), then slide the top cover forward until it seats. Close the fan cover and engage the fan cover latches. The cover must be completely closed for the storage controller to power on.
11. 7420: To install the top cover: Place the top cover on the chassis (1) so that it is forward of the rear of the storage controller by about an inch (2.5 cm). Slide the top cover toward the rear of the chassis (2) until it seats. Press down on the cover with both hands until both latches engage.
12. Push the release tabs on the side of each rail and slowly push the storage controller into the rack, making sure no cables obstruct the path of the controller.
13. Connect the power cords to the power supplies.
14. Verify that standby power is on, indicated by the Power/OK status indicator flashing on the front panel about two minutes after the power cords are plugged in.
15. Use a pen or other pointed object to press and release the recessed Power button on the storage controller front panel. The Power/OK status indicator next to the Power button lights and remains lit.
16. Connect data cables to the PCIe card, routing them through the cable management arm.
17. Go to the Maintenance > Hardware screen of the BUI, and click the details icon on the controller. Then, click Slots to verify the status of the new component. The status indicator should appear green.
18. Install the disk shelf and connect the expansion storage.

Battery

You might need to replace the battery if the storage controller fails to maintain the proper time when powered off and disconnected from the network. You will need a small (No.1 flat-blade) non-metallic screwdriver or equivalent.

You must shut down the appliance before beginning this task.
Note that there will be a loss of access to the storage unless the system is in a clustered configuration. Shut down the appliance using one of the following methods:

■ Log in to the BUI and click the power icon on the left side of the masthead.
■ SSH into the storage system and issue the maintenance system poweroff command.
■ SSH or serial console into the service processor and issue the stop /SYS command.
■ Use a pen or non-conducting pointed object to press and release the Power button on the front panel.
■ To initiate emergency shutdown, wherein all applications and files will be closed abruptly without saving, press and hold the power button for at least four seconds until the Power/OK status indicator on the front panel flashes, indicating that the storage controller is in standby power mode.

1. Disconnect the AC power cords from the rear panel of the storage controller.
2. Verify that no cables will be damaged or will interfere when the storage controller is extended from the rack.
3. From the front of the storage controller, release the two slide release latches.
4. While squeezing the slide release latches, slowly pull the storage controller forward until the slide rails latch.
5. 7120 or 7320: To remove the top cover: Unlatch the fan module door, pull the two release tabs back, rotate the fan door to the open position and hold it there. Press the top cover release button and slide the top cover to the rear about a half-inch (1.3 cm). Lift up and remove the top cover.
6. 7420: To remove the top cover: Simultaneously lift both cover latches in an upward motion.
7. Lift up and remove the top cover.
8. Using a small, non-metallic screwdriver, press the latch and remove the battery from the motherboard. The 7420 battery is shown here. The following figure shows the 7120 battery.
9.
Press the replacement battery into the motherboard with the positive side (+) facing upward.
10. 7120 or 7320: Components may differ slightly between the 7120 and 7320; however, service procedures for each are identical. The illustration shows the 7320. To install the top cover: Place the top cover on the chassis so that it hangs over the rear of the storage controller by about an inch (2.5 cm), then slide the top cover forward until it seats. Close the fan cover and engage the fan cover latches. The cover must be completely closed for the storage controller to power on.
11. 7420: To install the top cover: Place the top cover on the chassis (1) so that it is forward of the rear of the storage controller by about an inch (2.5 cm). Slide the top cover toward the rear of the chassis (2) until it seats. Press down on the cover with both hands until both latches engage.
12. Push the release tabs on the side of each rail and slowly push the storage controller into the rack, making sure no cables obstruct the path of the controller.
13. Connect the power cords to the power supplies.
14. Verify that standby power is on, indicated by the Power/OK status indicator flashing on the front panel about two minutes after the power cords are plugged in.
15. Use a pen or other pointed object to press and release the recessed Power button on the storage controller front panel. The Power/OK status indicator next to the Power button lights and remains lit.
16. Connect data cables to the PCIe card, routing them through the cable management arm.
17. When the system has finished booting, log in and set the time using the steps in the BUI Clock task.

Shelf

Disk Shelf Overview

Oracle disk shelves are high-availability serial attached SCSI (SAS) devices that provide expanded storage.
The main components are hot-swappable, including drives, I/O Modules (IOMs) or SAS Interface Module (SIM) boards for connecting to controllers and other disk shelves, and dual load-sharing power supply with fan modules. This provides a fault-tolerant environment with no single point of failure. Component status is indicated with lights on the disk shelf, and in the Maintenance > Hardware screen of the BUI.

Refer to Disk Shelf Tasks for procedural information about replacing disk shelf components. For J4500 service information, refer to http://download.oracle.com/docs/cd/E19122-01/j4500.array/index.html.

Oracle Storage Drive Enclosure DE2-24P

The Oracle Storage Drive Enclosure DE2-24P is a 2U chassis that supports 24 2.5" SAS-2 drives. The high-performance HDDs provide reliable storage, and the SSDs provide accelerated write operations. This disk shelf features dual, redundant I/O Modules (IOMs), and dual power supply with fan modules.

Oracle Storage Drive Enclosure DE2-24C

The Oracle Storage Drive Enclosure DE2-24C is a 4U chassis that supports 24 3.5" SAS-2 drives. The SSDs provide accelerated write operations, and the high-capacity HDDs provide reliable storage. This disk shelf features dual, redundant I/O Modules (IOMs), and dual power supply with fan modules.

Sun Disk Shelf 24x3.5" SAS-2

The Sun Disk Shelf is a 4U chassis that supports 24 3.5" SAS-2 drives. The SSDs provide accelerated write operations, and the high-capacity HDDs provide reliable storage. This disk shelf features dual, redundant SAS Interface Module (SIM) boards, and dual power supply with fan modules.

Sun Storage J4400 Array

The Sun Storage J4400 Array is a 4U chassis that supports 24 3.5" SAS-1 or SATA II drives.
It features dual, redundant SAS Interface Module (SIM) boards, and dual power supply with fan modules.

Sun Storage J4500 Array

The Sun Storage J4500 Array is a 4U chassis that supports 48 3.5" SATA II drives in different available capacities. It features redundant mini-SAS ports, dual power supply modules, and dual cooling fan modules.

SAS-1

The Sun Storage J4400 Array supports the SAS-1 (Serial Attached SCSI 1.0) storage fabric.

SAS-2

The SAS-2 (Serial Attached SCSI 2.0) storage fabric supports a greater number of targets, greater bandwidth, higher reliability, and larger scale. The scale and reliability improvements are achieved with SAS-2 disks, which you can daisy-chain across as many as 36 shelves for certain systems, for a total of 864 disks. In addition, the high-performance SAS-2 HBA is designed for the Sun ZFS Storage 7000 series with a standard chip set that supports a high density of target devices and is capable of attachment to 1024 targets.

With this fabric, you are encouraged to apply entire shelves to pools, so that you gain the benefits of No Single Point of Failure configurations and of striping across the maximum possible number of devices.

The following shelves implement SAS-2 disks:

■ Oracle Storage Drive Enclosure DE2-24P
■ Oracle Storage Drive Enclosure DE2-24C
■ Sun Disk Shelf

Front Panel

The front panel consists of the drives and indicator lights.

Drive Locations

The following figures show the location of the drives.

Oracle Storage Drive Enclosure DE2-24P

Up to four Logzilla SSDs are supported per disk shelf. Logzilla SSDs should be populated in order of slots 20, 21, 22, and 23.
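The SAS-2 total quoted above is simple arithmetic: 36 shelves of 24 drives each gives 864 disks. The following Python sketch reproduces that figure and lists the DE2-24P Logzilla population order. The constant and function names are illustrative assumptions only; they are not part of any Oracle tool, and the values are taken from the text above.

```python
# Illustrative names only; the numbers come from the text above.
DRIVES_PER_SAS2_SHELF = 24                  # each SAS-2 shelf listed above holds 24 drives
MAX_SAS2_SHELVES = 36                       # maximum chained shelves "for certain systems"
DE2_24P_LOGZILLA_SLOTS = (20, 21, 22, 23)   # population order for the DE2-24P


def max_sas2_disks(shelves=MAX_SAS2_SHELVES, drives_per_shelf=DRIVES_PER_SAS2_SHELF):
    """Return the total number of disks across a chain of shelves."""
    return shelves * drives_per_shelf


def logzilla_slots(ssd_count, order=DE2_24P_LOGZILLA_SLOTS):
    """Return the slots to populate, in order, for the given number of Logzilla SSDs."""
    if not 0 <= ssd_count <= len(order):
        raise ValueError("a disk shelf supports at most %d Logzilla SSDs" % len(order))
    return list(order[:ssd_count])


if __name__ == "__main__":
    print(max_sas2_disks())    # 864
    print(logzilla_slots(2))   # [20, 21]
```

A different population order can be passed through the order argument when a shelf model documents one.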
Oracle Storage Drive Enclosure DE2-24C, Sun Disk Shelf, and J4400

Up to four Logzilla SSDs are supported per disk shelf. Logzilla SSDs should be populated in order of slots 20, 21, 22, and 23, except for the J4400, which should have SSDs populated in slots 8, 4, 16, and 20. (The Oracle Storage Drive Enclosure DE2-24C is shown and represents all three models.)

Front Panel Indicators

The following figures show the front panel indicators.

Oracle Storage Drive Enclosure DE2-24P

Figure Legend
1 System power indicator
2 Module fault indicator
3 Locate indicator
4 Drive fault indicator
5 Power / Activity indicator

Oracle Storage Drive Enclosure DE2-24C

Figure Legend
1 System power indicator
2 Module fault indicator
3 Locate indicator
4 Power / Activity indicator
5 Drive fault indicator

Sun Disk Shelf

Figure Legend
1 Locate button and indicator
2 System fault indicator
3 System power indicator
4 Disk ready to be removed indicator
5 Disk fault indicator
6 Disk activity indicator
7 Over temperature warning indicator
8 SIM board fault indicator
9 Power supply fault indicator

Sun Storage J4400 Array

Figure Legend
1 OK status indicator
2 Fault indicator

Rear Panel

The rear panel consists of hot-pluggable I/O Modules (IOMs) or SIM boards, and power supply with fan modules.
Oracle Storage Drive Enclosure DE2-24P

Figure Legend
1 Power Supply with Fan Module 0
2 I/O Module 1
3 I/O Module 0
4 Power Supply with Fan Module 1

Oracle Storage Drive Enclosure DE2-24C

Figure Legend
1 Power Supply Filler Panel, Slot 0
2 Power Supply with Fan Module 1
3 Power Supply with Fan Module 2
4 Power Supply Filler Panel, Slot 3
5 I/O Module Filler Panel
6 I/O Module 0
7 I/O Module Filler Panel
8 I/O Module 1

Note: It is especially important that power supplies and their filler panels are in the correct slots.

Sun Disk Shelf and Sun Storage J4400 Array

Figure Legend
1 Power supply modules with built-in fans. Power supply 0 is on the left and power supply 1 is on the right.
2 Removable SAS Interface Module (SIM) Boards. SIM 0 is on the left, and SIM 1 is on the right.

I/O Module Indicators

The following disk shelves have I/O Modules (IOMs):

■ Oracle Storage Drive Enclosure DE2-24P
■ Oracle Storage Drive Enclosure DE2-24C

Figure Legend
1 Fault / Locate indicator
2 Power / OK indicator
3 SAS-2 Port 0
4 SAS-2 Port 1
5 SAS-2 Port 2
6 Host port activity indicators
7 For Oracle service only
8 For Oracle service only

SIM Board Indicators

The following disk shelves have SIM boards:

■ Sun Disk Shelf
■ Sun Storage J4400 Array
■ Sun Storage J4500 Array

The following figure shows the SIM board indicators for the Sun Disk Shelf.
Figure Legend
1 AC power indicator
2 DC power indicator
3 Fan fault indicator
4 Power supply fault indicator
5 Universal power connector
6 Power switch
7 Port fault indicator
8 Port OK indicator
9 SIM board OK indicator (green)/SIM board fault indicator (amber)
10 SIM locator indicator

The following figure shows the SIM board indicators for the Sun Storage J4400 Array and the Sun Storage J4500 Array.

Figure Legend
1 SIM link IN
2 SIM link Out
3 SIM link IN status indicator LEDs
4 SIM link Out status indicator LEDs
5 SIM board Power/OK LEDs

Power Supply Indicators

The following figure shows power supply with fan module indicators for these disk shelves:

■ Oracle Storage Drive Enclosure DE2-24P
■ Oracle Storage Drive Enclosure DE2-24C

Figure Legend
1 DC power fail indicator
2 Fan fail indicator
3 AC power fail indicator
4 Power supply status indicator
5 Power on/off switch
6 Universal power input connector
7 Power cord tie wrap

The following figure shows power supply with fan module indicators for these disk shelves:

■ Sun Disk Shelf
■ Sun Storage J4400 Array
■ Sun Storage J4500 Array

Figure Legend
1 Cooling fan status indicator
2 AC power status indicator
3 DC power status indicator
4 Power supply status indicator
5 Power on/off switch
6 Power cord tie wrap
7 Universal power input connector
8 Right ejection arm and captive screw latch

Disk Shelf Configurations

The following tables describe and provide part numbers for the supported expansion storage shelves.
Oracle Storage Drive Enclosure DE2-24P

Mktg Part Number      Description
7103910               Drive Enclosure DE2-24P Base Chassis
                      Disk Shelf Rail Kit
7103911               300GB 10Krpm, SAS-2, 2.5" HDD
7103912               900GB 10Krpm, SAS-2, 2.5" HDD
7103915               73GB SSD SAS-2, 2.5" Write Flash Accelerator
7103917               Filler Panel, Drive Enclosure DE2-24P

Oracle Storage Drive Enclosure DE2-24C

Mktg Part Number      Description
7103914               Drive Enclosure DE2-24C Base Chassis
                      Disk Shelf Rail Kit
7103913               3TB 7.2Krpm, SAS-2, 3.5" HDD
7103916               73GB SSD XATO SAS-2, 2.5" (2.5" to 3.5" Drive Adapter)
7103918               Filler Panel, Drive Enclosure DE2-24C

Sun Disk Shelf (DS2)

Mktg Part Number      Description
DS2-0BASE             Sun Disk Shelf (DS2) 24x3.5" SAS-2
DS2-HD2T              2TB 7.2Krpm, SAS-2, 3.5" HDD
7101765               3TB 7.2Krpm, SAS-2, 3.5" HDD
7101274               300GB 15Krpm, SAS-2, 3.5" HDD
7101276               600GB 15Krpm, SAS-2, 3.5" HDD
7101197               73GB SSD XATO, 3.5"
DS2-LOGFILLER         Sun Disk Shelf (DS2) 24x3.5", LOGFiller
DS2-4URK-19U          Disk Shelf Rail Kit

Sun Storage J4400 Array

Mktg Part Number      Description
XTA4400R00A2N12000    Sun Storage J4400, 12xHDD
XTA4400R00A2N24000    Sun Storage J4400, 24HDD
XTA4400A2N11SA18      Sun Storage J4400, 11xHDD, 1xSSD
XTA4400A2N10SA36      Sun Storage J4400, 10xHDD, 2xSSD
XTA4400A2N23SA18      Sun Storage J4400, 23xHDD, 1xSSD
XTA4400A2N22SA36      Sun Storage J4400, 22xHDD, 2xSSD
XTA4400A2N20SA72      Sun Storage J4400, 20xHDD, 4xSSD

Sun Storage J4500 Array

Mktg Part Number      Description
XTA4500R00A1A24TB     48x500/7K SATA HDD, 1xI/O card
XTA4500R00A1N48TB     48x1TB/7K SATA HDD, 1xI/O card

Shelf

Disk Shelf Maintenance Procedures

This section provides procedural details for customer replaceable units (CRUs) of any disk shelf or drive enclosure that attaches to the Sun ZFS Storage 7000 family of products. Refer to Disk Shelf Overview for component specifications and diagrams.
Prerequisites

Read the information in the overview section for your controller to become familiar with the replaceable parts of the system:

■ 7120 | 7320 | 7420 Overviews - component diagrams and specifications
■ 7110 | 7210 | 7310 | 7410 Overviews - component diagrams and specifications

Follow the instructions in the Electrostatic Discharge Precautions section.

Safety Information

Follow all cautions, warnings, and instructions marked on the equipment and described in Important Safety Information for Sun Hardware Systems located at http://docs.oracle.com/cd/E19446-01/.

Electrostatic Discharge Precautions

■ Remove all plastic, vinyl, and foam material from the work area.
■ Wear an antistatic wrist strap at all times when handling any CRU.
■ Before handling any CRU, discharge any static electricity by touching a grounded surface.
■ Do not remove a CRU from its antistatic protective bag until you are ready to install it.
■ After removing a CRU from the chassis, immediately place it in an antistatic bag or antistatic packaging.
■ Handle any card that is part of a CRU by its edges only and avoid touching the components or circuitry.
■ Do not slide a CRU over any surface.
■ Limit body movement (which builds up static electricity) during the removal and replacement of a CRU.

Removing Power from the Disk Shelf

Most disk shelf components are hot-swappable; you do not need to remove power when replacing them. Do not remove a component if you do not have an immediate replacement. The disk shelf must not be operated without all components in place.

Powering off or removing all SAS chains from a disk shelf will cause the controller(s) to panic to prevent data loss, unless the shelf is part of an NSPF (no single point of failure) data pool. To avoid this, shut down the controller(s) before decommissioning the shelf. For details on NSPF profiles, see Profile Configuration.

1.
Stop all input and output to and from the disk shelf.
2. Wait approximately two minutes until all disk activity indicators have stopped flashing.
3. Place the power supply on/off switches to the "O" off position.
4. Disconnect the power cords from the external power source for the cabinet.

Note: All power cords must be disconnected to completely remove power from the disk shelf.

Shelf Tasks

▼ Replacing a Drive

The shelf drives are hot-swappable and may be replaced without removing power from the shelf. The replacement drive must be of the same capacity and type as the drive to be replaced. To avoid possible data loss when removing non-faulted drives, label each drive with the number of the slot from which it was removed and reinstall each drive into the same slot.

Faulted drives are indicated by an amber LED. Go to the Maintenance > Hardware section of the BUI, click the right-arrow icon at the beginning of the appropriate disk shelf row, and click the information icon for the appropriate drive to view details, or click the locate icon to turn on the locator LED.

Important: Do not remove a component if you do not have an immediate replacement. The disk shelf must not be operated without all components in place.

1 Locate the failed disk drive at the front of the chassis.
2 Press the release button or latch to release the drive lever.
3 Pull the drive lever fully open to unlock and partially eject the drive from the chassis.
4 Grasp the middle of the drive body and pull it toward you to remove it from the chassis.
5 Ensure the new drive lever is in the fully extended position.
6 While constantly pushing toward the pivot point of the lever, slide the drive fully into the chassis slot.
7 Press the drive lever closed until it locks in place. For vertically oriented drives, push down on the top of the drive if it is higher than surrounding drives to properly seat it. The Activity LED will be steady green to indicate a ready state.
8 Go to the Maintenance > Hardware section of the BUI, click the right-arrow icon at the beginning of the appropriate disk shelf row, and then click Disk to verify that the disk icon is green for the newly installed disk.

▼ Replacing a Power Supply

Disk shelves are provided with redundant power supplies to prevent loss of service due to component failure. Each power supply is accompanied by one or more chassis cooling fans in one customer-replaceable unit (CRU). Power supplies are hot-swappable, meaning they can be replaced one at a time without removing power from the disk shelf.

The modules can produce a high-energy hazard and should only be replaced by instructed individuals with authorized access to the equipment.

Separate indicator LEDs on the rear panel represent the operational state of power supplies and fans individually; see the rear panel illustration for details. Failed components are indicated by amber LEDs as well as amber icons in the administrative BUI. Go to the Maintenance > Hardware section of the BUI, click the right-arrow icon at the beginning of the appropriate disk shelf row, then select PSU or Fan to view details on the respective components. You can also click the locate icon to flash the chassis locator LED.

Important: Do not remove a component if you do not have an immediate replacement. The disk shelf must not be operated without all components in place.

1 Locate the chassis and module containing the failed component.
2 Ensure the power supply on/off switch is in the "O" off position.
3 Disconnect the power cord tie strap from the power cord, and unplug the power cord from the power supply.
4 Release the lever/ejection arms.
Oracle Storage Drive Enclosure DE2-24P or DE2-24C: Grasp the latch and the opposite side of the module, and squeeze together to release the lever.
Sun Disk Shelf / J4500 / J4400: Using your thumb and forefinger, unscrew both ejection arm captive screws until loose and swing the ejection arms out until they are fully open.
5 Pull the module out of the chassis, being careful not to damage the connector pins in the back.
6 With the lever/ejection arms fully open, slide the new module into the chassis slot until it contacts the chassis backplane, and the lever/ejection arms begin to engage.
7 Close the lever/ejection arms.
Oracle Storage Drive Enclosure DE2-24P or DE2-24C: Push the lever fully closed until you hear or feel a click.
Sun Disk Shelf / J4500 / J4400: Push the ejection arms fully closed and secure both captive screws to seat and secure the module in the chassis.
8 Ensure the power supply on/off switch is in the "O" off position.
9 Plug the power cord into the new power supply and attach the power cord tie strap to the power cord.
10 Place the power supply on/off switch to the "I" on position. The Power/OK status LED should be a steady green, and all other indicators should be off.
11 Go to the Maintenance > Hardware section of the BUI, and click the right-arrow icon at the beginning of the appropriate disk shelf row. As appropriate for the failure, click either PSU or Fan to verify that the icon is green for the newly installed power supply with fan module.
▼ Replacing an I/O Module

The following disk shelves have I/O Modules (IOMs):

■ Oracle Storage Drive Enclosure DE2-24P
■ Oracle Storage Drive Enclosure DE2-24C

The I/O Modules (IOMs), which are similar to SIM boards, are hot-swappable so you can replace them without removing power to the system.

A faulted I/O Module is indicated by an amber LED. Go to the Maintenance > Hardware section of the BUI, click the right-arrow icon at the beginning of the appropriate disk shelf row, and then click Slot to view details, or click the locate icon to turn on the locator LED.

Important: Do not remove a component if you do not have an immediate replacement. The disk shelf must not be operated without all components in place.

1 Locate the failed I/O Module at the back of the disk shelf.
2 Label and disconnect the I/O Module interface cables.
3 Using your thumb and forefinger, squeeze the release button toward the lever hole to release the lever.
4 Grasp the lever and remove the I/O Module, being careful not to damage the connector pins in back.
5 With the lever of the new I/O Module in the open position, slide the I/O Module into the disk shelf, being careful of the connector pins.
6 Push the lever fully closed until you hear or feel a click.
7 Reconnect the interface cables to their original locations.
8 Wait approximately 60 seconds for the I/O Module to complete its boot process, at which time the Power LED should be solid green and the Fault/Locate LED should be off. All four activity LEDs should be solid green for each SAS-2 port that has an interface cable connected to it.
9 Go to the Maintenance > Hardware section of the BUI, click the right-arrow icon at the beginning of the appropriate disk shelf row, and then click Slot to verify that the I/O Module icon is green for the newly installed I/O Module.
▼ Replacing a SIM Board

The following disk shelves have SIM boards:

■ Sun Disk Shelf
■ Sun Storage J4500 Array
■ Sun Storage J4400 Array

The SIM boards, which are similar to I/O Modules, are hot-swappable so you can replace them without removing power to the system. The SIM boards are multi-pathed, so you can remove one of the SIM boards at any time, regardless of the state of the blue SIM OK indicator.

A faulted SIM board is indicated by an amber LED. Go to the Maintenance > Hardware section of the BUI, click the right-arrow icon at the beginning of the appropriate disk shelf row, and then click Slot to view details, or click the locate icon to turn on the locator LED.

Important: Do not remove a component if you do not have an immediate replacement. The disk shelf must not be operated without all components in place.

1 Locate the failed SIM at the back of the disk tray.
2 Label and disconnect the tray interface cables.
3 Use two hands to disconnect the SAS cable. Grasp the metal body of the connector with one hand. With the other hand, grasp the tab firmly and pull it gently toward the connector body, then pull the connector body outward to extract it from the bulkhead. Do not twist the tab or pull it in any direction other than parallel with the connector body, or it may break. If the tab breaks, use a small sharp object (such as a fine-tipped screwdriver) to lift the metal spring at the top of the connector shell to unlatch it.
4 Loosen the two extraction arm captive screws using your thumb and forefinger. If the captive screws are too tight to loosen by hand, use a No.2 Phillips screwdriver to loosen each screw.
5 Pull each ejector tab outward and push to the sides to release and partially eject the SIM from the chassis.
6 Grasp the middle of the SIM board and slide it out of the slot.
Chapter 2 • Hardware Maintenance 101 Shelf 7 With the ejector arms in the full open position, align the new SIM board with the open slot and slide it into the tray until the ejector arms contact the tray connectors and begin to swing closed. 102 8 Swing both ejector arms in until they are flush with the SIM board panel to seat the board. 9 Tighten both captive screws to secure the board. 10 Reconnect the SAS interface cables to their original locations. 11 Wait approximately 60 seconds for the SIM board to complete its boot process, at which time the Power LED should be solid green and the SIM locate LED should be off. 12 at the Go to the Maintenance > Hardware section of the BUI, click the right-arrow icon beginning of the appropriate disk shelf row, and then click Slot to verify that the SIM board icon is green for the newly installed SIM board. Sun ZFS Storage 7120, 7320, and 7420 Appliance Customer Service Manual • December 2012 E38247–01 Faults Faults Hardware Faults This section describes connecting to the controller Service Processor (SP) and configuration considerations for maximum serviceability. In rare cases, faults associated with uncorrectable CPU errors are not diagnosable or displayed in the controller. These faults will be preserved by and observable on the ILOM. The following sections describe how to connect to and manage faults for these cases. Connect to ILOM Connect to the server ILOM (Service Processor) on the server platform to diagnose hardware faults that do not appear in the BUI. In a cluster environment, an ILOM connection should be made to each controller. The server ILOM provides options for (i) network and (ii) serial port connectivity. Network connection is the preferred choice, as the ILOM serial port does not always allow adequate means of platform data collection. WARNING : Failure to configure ILOM connectivity may lead to longer than necessary hardware fault diagnosis and resolution times. 
Management Port Configuration
All standalone controllers should have at least one NIC port configured as a management interface. Select the Allow Admin option in the BUI to enable BUI connections on port 215 and CLI connections on ssh port 22.
All cluster installations should have at least one NIC port on each controller configured as a management interface as described above. In addition, the NIC instance number must be unique on each controller. For example, nodeA uses nge0 and nodeB uses nge1, so that neither may be used as a cluster data interface. These interfaces must also be locked to the controller using the Configuration > Cluster option in the BUI. In some cases, this may require installation of an additional network interface card on each controller in a cluster configuration.
If access to the appliance data interfaces is impossible for any reason, the management network interface will maintain BUI and CLI access. During a cluster takeover, interfaces are taken down on the failed controller, so a locked interface configuration is required to gather diagnostic information from a failed controller.
WARNING: Failure to configure locked management interfaces on a cluster may lead to longer than necessary fault diagnosis and resolution times.
Observing and Clearing CPU Faults from ILOM
Log in to the server as root using the ILOM CLI. To list all known faults on the system, type the following command:
-> show /SP/faultmgmt
The server lists all known faults, for example:
SP/faultmgmt
   Targets:
      0 (/SYS/MB/P0)
   Properties:
   Commands:
      cd
      show
To clear the CPU fault, type the following command:
-> set /SYS/MB/Pn clear_fault_action=true
For example, to clear a fault on CPU 0 (P0):
-> set /SYS/MB/P0 clear_fault_action=true
Are you sure you want to clear /SYS/MB/P0 (y/n)? y
See Also
Cluster Configuration

HBA Expansion pt.1
Expanding from 2 to 3 HBAs
Follow the steps below to migrate your clustered controllers from two to three HBAs without loss of service. To upgrade your standalone controller, you will need to power down the appliance, disconnect the expansion storage, and follow the instructions for installing PCI cards and connecting expansion storage to your appliance model, found elsewhere in this guide.
Note: The locations of HBAs in the diagrams below are not representative. Refer to the PCIe Options of your appliance model's Hardware Maintenance Overview for proper slot allocation.
Cabling Diagrams
fig.1 Cluster with two HBAs and chains of disk shelves.
fig.2 Power down and disconnect the first controller, then install the new HBA according to the instructions on installing PCI cards. Rack and cable together the new chain of disk shelves, then power on the shelves.
fig.3 Connect the controller to the second and third chain as shown.
fig.4 Reconnect the controller as shown to the remaining chains. Power on the controller and wait for it to regain control of connected storage.
fig.5 Power down and disconnect the second controller, then install the new HBA according to the instructions on installing PCI cards.
fig.6 Connect the controller to the second and third chain as shown.
fig.7 Reconnect the controller as shown to the remaining chains. Power on the controller and wait for it to regain control of connected storage.
fig.8 Cluster with three HBAs and chains of disk shelves.
HBA Expansion pt.2
Expanding from 3 to 4 HBAs
Follow the steps below to migrate your clustered controllers from three to four HBAs without loss of service. To upgrade your standalone controller, you will need to power down the appliance, disconnect the expansion storage, and follow the instructions for installing PCI cards and connecting expansion storage to your appliance model, found elsewhere in this guide.
Note: The locations of HBAs in the diagrams below are not representative. Refer to the PCIe Options of your appliance model's Hardware Maintenance Overview for proper slot allocation.
Cabling Diagrams
fig.1 Cluster with three HBAs and chains of disk shelves.
fig.2 Power down and disconnect the first controller, then install the new HBA according to the instructions on installing PCI cards. Rack and cable together the new chain of disk shelves, then power on the shelves.
fig.3 Connect the controller to the third and fourth chain as shown.
fig.4 Reconnect the controller as shown to the remaining chains. Power on the controller and wait for it to regain control of connected storage.
fig.5 Power down and disconnect the second controller, then install the new HBA according to the instructions on installing PCI cards.
fig.6 Connect the controller to the third and fourth chain as shown.
fig.7 Reconnect the controller as shown to the remaining chains. Power on the controller and wait for it to regain control of connected storage.
fig.8 Cluster with four HBAs and chains of disk shelves.
HBA Expansion pt.3
Expanding from 4 to 5 HBAs
Follow the steps below to migrate your clustered controllers from four to five HBAs without loss of service. To upgrade your standalone controller, you will need to power down the appliance, disconnect the expansion storage, and follow the instructions for installing PCI cards and connecting expansion storage to your appliance model, found elsewhere in this guide.
Note: The locations of HBAs in the diagrams below are not representative. Refer to the PCIe Options of your appliance model's Hardware Maintenance Overview for proper slot allocation.
Cabling Diagrams
fig.1 Cluster with four HBAs and chains of disk shelves.
fig.2 Power down and disconnect the first controller, then install the new HBA according to the instructions on installing PCI cards. Rack and cable together the new chain of disk shelves, then power on the shelves.
fig.3 Install the new HBA according to the instructions on installing PCI cards, and connect the controller to the first and third chain as shown.
fig.4 Reconnect the controller as shown to the remaining chains. Power on the controller and wait for it to regain control of connected storage.
fig.5 Power down and disconnect the second controller, then install the new HBA according to the instructions on installing PCI cards.
fig.6 Connect the controller to the first and third chain as shown.
fig.7 Reconnect the controller as shown to the remaining chains. Power on the controller and wait for it to regain control of connected storage.
fig.8 Cluster with five HBAs and chains of disk shelves.
HBA Expansion pt.4
Expanding from 5 to 6 HBAs
Follow the steps below to migrate your clustered controllers from five to six HBAs without loss of service. To upgrade your standalone controller, you will need to power down the appliance, disconnect the expansion storage, and follow the instructions for installing PCI cards and connecting expansion storage to your appliance model, found elsewhere in this guide.
Note: The locations of HBAs in the diagrams below are not representative. Refer to the PCIe Options of your appliance model's Hardware Maintenance Overview for proper slot allocation.
Cabling Diagrams
fig.1 Cluster with five HBAs and chains of disk shelves.
fig.2 Power down and disconnect the first controller, then install the new HBA according to the instructions on installing PCI cards. Rack and cable together the new chain of disk shelves, then power on the shelves.
fig.3 Install the new HBA according to the instructions on installing PCI cards, and connect the controller to the fourth and sixth chain as shown.
fig.4 Reconnect the controller as shown to the remaining chains. Power on the controller and wait for it to regain control of connected storage.
fig.5 Power down and disconnect the second controller, then install the new HBA according to the instructions on installing PCI cards.
fig.6 Connect the controller to the fourth and sixth chain as shown.
fig.7 Reconnect the controller as shown to the remaining chains. Power on the controller and wait for it to regain control of connected storage.
fig.8 Cluster with six HBAs and chains of disk shelves.
Chapter 3
System Maintenance

System
Introduction
The Maintenance > System screen provides several system-level features. The screen allows the administrator to:
■ View the status of the system disks
■ Manage software updates and update the system software
■ Create and restore appliance configuration backups
■ Create and upload a support bundle
■ Repeat the initial setup with existing settings
■ Reset the system to the factory defaults
■ View pending disk firmware updates
System Disks
The system disks section shows the status of the system disks and their current usage. The BUI displays this as a pie chart, and the CLI as a text list. For example:
tarpon:> maintenance system disks show
Properties:
                       profile = mirror
                          root = 1.14G
                           var = 52.4M
                        update = 2.52M
                         stash = 14.8M
                          dump = 16.0G
                         cores = 18K
                       unknown = 39.0G
                          free = 401G
System Disks:
DISK        LABEL      STATE
disk-000    HDD 7      healthy
disk-001    HDD 6      healthy
Note: The "disk" column is not required by the GUI.
Support Bundles
The appliance can generate support bundles containing system configuration information and core files for use by remote support in debugging system failures. Support bundles are generated automatically in response to faults if the Phone Home service is enabled. Administrators can manually generate and upload a support bundle from this section of the Maintenance > System screen. Once generated, support bundles are automatically uploaded to Oracle's Support files Service at http://support.oracle.com. To facilitate this, the appliance must be connected to the Internet, either directly or through the web proxy configured on the Phone Home service screen.
If the upload fails, the system will make another attempt. After a support bundle has been successfully uploaded, the support bundle and core files are automatically deleted from the system.
Managing Support Bundles Using the BUI
To generate a support bundle, click the icon next to Support Bundles on the Maintenance > System screen. You are presented with the randomly generated filename for the support bundle. Provide this filename to support personnel so that they can retrieve your support bundle.
For each support bundle that is currently being generated or uploaded, or that has failed to upload, the following options may be available, each represented by an icon:
■ Cancel the current operation. If the bundle is being generated, it will be deleted. If the bundle is being uploaded, the upload will be cancelled and the appliance will not retry it later.
■ Download the support bundle.
■ Try again to upload the bundle to support.
■ Cancel any pending operation and delete the support bundle.
Managing Support Bundles Using the CLI
To generate and upload a new support bundle, use the sendbundle command:
loader:> maintenance system
loader:maintenance system> sendbundle
A support bundle is being created and sent to Oracle. You will receive an alert
when the bundle has finished uploading. Please save the following filename, as
Oracle support personnel will need it in order to access the bundle:
    /cores/ak.9a4c3d7b-50c5-6eb9-c2a6-ec9808ae1cd8.tar.gz
As the message indicates, you must provide this filename to support personnel in order for them to retrieve your bundle.
Manage bundles from the maintenance system bundles context in the CLI, as follows:
loader:maintenance system> bundles
loader:maintenance system bundles> list
BUNDLE                                                   STATUS       PROGRESS
/cores/ak.9a4c3d7b-50c5-6eb9-c2a6-ec9808ae1cd8.tar.gz    Uploading    7%
loader:maintenance system bundles>
Bundles are identified by the filename without its directory, omitting the ak. prefix and the file type suffix. For the bundle listed above, the identifier is 9a4c3d7b-50c5-6eb9-c2a6-ec9808ae1cd8. To delete a support bundle, use the destroy command.
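The naming rule for bundle identifiers can be illustrated with a small script run on an administrator's workstation. This is only a sketch: the function name bundle_id is hypothetical and is not part of the appliance CLI, and it assumes the ak. prefix and .tar.gz suffix seen in the example output above.

```python
import posixpath


def bundle_id(path: str) -> str:
    """Derive the CLI bundle identifier from a support bundle filename.

    Hypothetical helper: assumes names of the form ak.<id>.tar.gz,
    as in the example output in this manual.
    """
    prefix, suffix = "ak.", ".tar.gz"
    name = posixpath.basename(path)  # drop the /cores/ directory part
    if not (name.startswith(prefix) and name.endswith(suffix)):
        raise ValueError(f"unexpected support bundle name: {path!r}")
    ident = name[len(prefix):-len(suffix)]
    if not ident:
        raise ValueError(f"support bundle name has no identifier: {path!r}")
    return ident


if __name__ == "__main__":
    # Filename taken from the sendbundle example in this section.
    print(bundle_id("/cores/ak.9a4c3d7b-50c5-6eb9-c2a6-ec9808ae1cd8.tar.gz"))
    # prints 9a4c3d7b-50c5-6eb9-c2a6-ec9808ae1cd8
```

The resulting string is the value that the bundles context expects when a bundle is referred to by identifier.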
To view details, use the select and list commands:
loader:maintenance system bundles> select 9a4c3d7b-50c5-6eb9-c2a6-ec9808ae1cd8
loader:maintenance system bundles 9a4c3d7b-50c5-6eb9-c2a6-ec9808ae1cd8> list
Properties:
                      filename = /cores/ak.9a4c3d7b-50c5-6eb9-c2a6-ec9808ae1cd8.tar.gz
                        status = uploading
                 step_progress = 14.709744730821669
These read-only properties indicate that the appliance is about 14% of the way through uploading the file. To retry a failed upload or cancel a pending operation, use the retry and cancel commands, respectively.
Initial Setup
Initial setup steps through the tasks performed as part of the initial configuration. This will not change any of the current settings unless explicitly requested. User data on the storage pool (including projects and shares) will not be affected.
To perform an initial setup:
■ BUI: click the "INITIAL SETUP" button on the Maintenance > System screen.
■ CLI: enter the maintenance system context, then enter the setup command.
Factory Reset
Factory reset restores the appliance configuration to the factory settings of the current software version and reboots the appliance. All configuration changes will be lost, and the appliance will need to go through initial configuration again, as when it was first installed. User data on the storage pool (including projects and shares) will not be affected; however, the pool will need to be imported as part of the initial setup process.
To perform a factory reset:
■ BUI: click the "FACTORY RESET" button on the Maintenance > System screen.
■ CLI: enter the maintenance system context, then issue the factoryreset command.
■ GRUB: Add -c to the GRUB menu on the line beginning with kernel.
Note: Factory reset of a single controller while configured into a cluster is not supported. The controller must be unclustered first.
Updates
System Updates
The system update feature provides customers, developers, and field personnel with the ability to update a system's software after the system is installed. Software updates are delivered as opaque binary downloads that contain some or all of:
■ Management and system software.
■ Firmware for internal components such as HBAs and network devices.
■ Firmware for disks and flash devices.
■ Firmware for external storage enclosure components.
In general, the update release notes describe what is in the update, and the update process automates all of the steps of activating the delivered components.
Procedure Overview
The procedure for updating the system is as follows:
■ First, the software update media is downloaded from an Oracle support website or retrieved from another official source. The media is a single compressed file named after the version number, such as ak-nas-2010-02-09-1-0.pkg.gz. The file can be renamed if needed, as the true version number is recorded internally within the image. The compressed media packages vary in size, but are typically on the order of several hundred megabytes.
■ Second, the software media is uploaded to the appliance. This can be done via either the BUI or the CLI; see below for details of this operation.
■ After the media is uploaded, it is unpacked and verified. If all verification checks pass, it appears in the list of update images as eligible for installation. Any number of images can be maintained on the appliance, subject to a system disk space quota, without actually applying them. If an update has not yet been applied (that is, it is not running and is not a rollback target), it can be deleted via either the BUI or the CLI. You might want to delete images to free up space for downloading new images.
■ Administrators should verify that the system is in a healthy state prior to applying the update. The details are described below in the preconditions section.
■ After the media is unpacked and verified, the update can be applied. During this process, an update health check is performed to verify that the appliance is ready to update. You may be asked to set update options and confirm. For more information on these questions, see the section on deferred updates. If the update is no longer appropriate for the system (because you have skipped past its version number), an error message may be provided. During the update, messages and a progress meter appear to indicate that the update is proceeding. The installation portion of the update takes about half an hour to complete; however, the full upgrade process may not be complete at that point. See below regarding additional firmware upgrades that may take place following the reboot.
■ While the upgrade is in progress, up until the reboot and following the reboot during any firmware upgrades, it is non-disruptive: the controller continues to provide data services to clients. If the system software fails during the upgrade, it will reboot and continue running the software from before the upgrade.
Important: Do not perform a cluster takeover operation or a reboot while an upgrade is in progress.
■ Following the post-upgrade reboot, component firmware is updated (see firmware updates below). This takes additional time that depends on the size of the system configuration and the amount of firmware that has changed since the previously installed version was delivered; very large Sun Storage 7410 configurations may take several hours to complete all firmware upgrades once the update itself has been applied.
For details on the update process using the BUI or CLI, review the sections below.
Preconditions
Best practices include verifying several preconditions prior to applying an update.
Whenever possible, administrators should ensure that these preconditions are satisfied immediately prior to applying an update on the storage controller. In a clustered environment, these should be verified on both storage controllers before applying the update to either one.
■ Ensure that any resilvering operations have completed. This can be observed in Configuration/Storage or the equivalent CLI context.
■ Ensure that there are no active problems.
■ Verify that firmware updates are not in progress.
■ Check the most recent product release notes for additional preconditions that should be observed for the software release to which you are upgrading.
Update Health Checks
System level health checks are provided to help ensure that no pathologies will interfere with the software update. If a problem is encountered, it is noted in the Alert Log and the update process is aborted. System software updates will not proceed until all problems have been corrected.
You can manually run the same health checks in advance of any planned update. This allows you to check the state of the system prior to scheduling an update maintenance window so you can correct any problems that could interfere with the update process. Any problem report that is issued by a manual health check is identical to that issued by the health checks integrated in the update process. As with the integrated health checks, you are presented with a link to the Alert Log when problems are found. If no problems are found, the System Ready state transitions to Yes to indicate that the system is ready for software updates.
Note: Running an update health check does not replace meeting required preconditions. Precondition checks must also be executed and problems resolved prior to updating the system software.
BUI
After you select and start an update, update health checks may be issued from the software update dialog box in the BUI.
The system remains in the Unchecked state until the Check button is clicked. During the health check operation, an indicator shows its progress. After completion, the System Ready state changes to Yes or No with a link to the Alert Log.
CLI
To run the update health checks via the CLI, execute the check command in the maintenance system updates context after selecting the update media:
gill:maintenance system updates> select [email protected],1-0
gill:maintenance system updates [email protected],1-0> check
You have requested to run checks associated with waiting upgrade media. This
will execute the same set of checks as will be performed as part of any upgrade
attempt to this media, and will highlight conditions that would prevent
successful upgrade. No actual upgrade will be attempted, and the checks
performed are of static system state and non-invasive. Do you wish to continue?
Are you sure? (Y/N)
Healthcheck running ... /
Healthcheck completed. There are no issues at this time which
would cause an upgrade to this media to be aborted.
Troubleshooting Update Health Check Failures
Health checks are performed automatically when an update is started, before the actual update begins. If an update health check fails, it can cause the update to abort (see Example 1). Update health checks only validate issues that can impact updates.
7000:> maintenance system updates select [email protected],1-1.19 upgrade
This procedure will consume several minutes and requires a system reboot upon
successful update, but can be aborted with at any time prior to reboot. A health
check will validate system readiness before an update is attempted, and may
also be executed independently using the check command.
Are you sure? (Y/N)
error: System is not in an upgradeable state: prerequisite healthcheck reports problems.
See alert log for more.
Example 1. BUI and CLI update health check failures
Actions to Take to Resolve Health Check Alerts
After an update health check failure, you can review the Alert Log and take action to resolve each failure based on the message in the log. The following table lists the update health check failures that can block an update, and gives the associated Alert Log message and the recommended order of steps you can take to resolve the issue. For component faults, follow the instructions for removal and installation found in the Customer Service Manual.
ID and Alert Log Message / Failure / Resolution Steps
B1 "System software update cannot proceed: Slot <label> in disk shelf <name> is reported as absent."
   Failure: SIM cannot be detected.
   Resolution Steps: 1, 2, 4
B2 "System software update cannot proceed: Slot <label> in disk shelf <name> is faulted."
   Failure: SIM is faulted.
   Resolution Steps: 1, 2, 4
C1 "System software update cannot proceed: Some slots of disk shelf <name> have no firmware revision information."
   Failure: SIM is missing firmware revision information.
   Resolution Steps: 1, 4
C2 "System software update cannot proceed: The slots of disk shelf <name> have non-uniform part numbers."
   Failure: SIMs report different part numbers.
   Resolution Steps: 2, 4
C5 "System software update cannot proceed: The slots of disk shelf <name> have mixed firmware revisions <rev1> and <rev2>."
   Failure: SIMs report different firmware revisions.
   Resolution Steps: 4
... zero paths>."
   Failure: Disk shelf does not have two paths.
   Resolution Steps: 1, 2, 4
E2 "System software update cannot proceed: Disk shelf <name> path <pathname> is <state>."
   Failure: Disk shelf path is not online.
   Resolution Steps: 1, 2, 4
... log> disk <label> in disk shelf <name> has <just one path | zero paths>."
   Failure: Disk or log device that is configured in a pool does not have two paths.
   Resolution Steps: 3, 4
PAN1 "Slot <slot> has a Revision B3 SAS HBA; Revision C0 (or later) required."
   Failure: A revision B3 SAS HBA is present.
   Resolution Steps: 4
Take the steps listed in the table for the failure, in the order given, to resolve the issue detected during the upgrade health check.
1. If a SAS port LED is unlit, check all connections and replace cables as needed.
2. Identify the affected chassis, then disconnect and remove the faulted SIM. After two minutes, re-seat the SIM and wait for a steady Power LED before reconnecting cables.
3. Identify the affected chassis, and remove the faulted disk. After 30 seconds, re-seat the disk and wait for a steady or flashing LED.
4. Contact Oracle Support for component service or replacement.
Deferred Updates
Each update may come with new firmware or updates to external resources. In general, these updates are backwards-compatible and applied automatically without user intervention. There are exceptions, however, for non-reversible updates. These updates involve updating a resource external to the system software in a way that is incompatible with older software releases. Once the update is applied, rolling back to previous versions will result in undefined behavior. For these updates, you will always be given an explicit option of applying them automatically during upgrade or applying them after the fact. They are therefore referred to as "deferred updates".
When applying an update to a version with incompatible version changes, you will be given an option to apply these version changes as part of the upgrade. For each version change, the benefits of applying the change will be presented to you. The default is to not apply them, requiring you to return to the updates view and apply them once the system has rebooted after the upgrade is applied. This allows you to verify that the rest of the software is functional and that a rollback is not required before applying the update.
If you elect to not apply deferred updates during an upgrade, you can return to the updates view at any point to apply the update.
If deferred updates are available for the current software version, they will appear as a list below the current set of available updates, with an 'Apply' button to apply the updates. Deferred updates in a cluster take effect on both storage controllers simultaneously, and can only be applied while both controllers are operational. Because deferred updates are listed only for resources present on the local storage controller, in a cluster it may be the case that deferred updates are available only for resources active on the peer controller. In a cluster, it is therefore necessary to check both storage controllers to determine the availability of deferred updates.
Note: Replication does not work across deferred updates. After deferred updates are applied that increment the stream format version, it is no longer possible to replicate to an older system. See Incompatible target Replication Failure for a description.
Feature                                  Version introduced
"passthrough-x" aclinherit property      2009.Q2.0.0
User quotas                              2009.Q3.0.0
COMSTAR                                  2009.Q3.0.0
Triple-Parity RAID                       2009.Q3.0.0
Dedup                                    2010.Q1.0.0
Replication                              2010.Q1.0.0
Received Properties                      2010.Q1.0.0
Slim ZIL                                 2010.Q3.1.0
Snapshot Deletion                        2010.Q3.1.0
Recursive Snapshots                      2010.Q3.1.0
Multi Replace                            2010.Q3.1.0
RAIDZ Mirror                             2011.1.0.0
Optional Child Directory                 2011.1.0.0
Reboot After an Update
Following the completion of the update process, the system will reboot automatically. If you have the serial console open, you will notice during this reboot that multiple GRUB menu entries are available, ordered from the newest software (at the top) to the oldest software (at the bottom). The default menu entry is the top one: the new software to which you just updated. If you do nothing, this entry will boot by default, completing the update.
The previous entries are rollback targets that can be used to initiate a rollback to previous versions of the system software. Rollback is discussed below.
    GNU GRUB version 0.95 (613K lower / 3537536K upper memory)
+-------------------------------------------------------------------------+
| Sun Storage 7110 2010.02.09,1-0                                         |
| Sun Storage 7110 2009.09.01,1-18                                        |
|                                                                         |
+-------------------------------------------------------------------------+
    Use the ^ and v keys to select which entry is highlighted.
    Press enter to boot the selected OS, 'e' to edit the
    commands before booting, or 'c' for a command-line.
As the system boots up using the new system software, you will see some special messages on the first boot indicating that an update is completing and noting the previous and new versions of the system software:
SunOS Release 5.11 Version ak/[email protected],1-0 64-bit
Copyright 1983-2010 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
System update in progress.
Updating from: ak/[email protected],1-18
Updating to: ak/[email protected],1-0
Updating system datasets ....... done.
Configuring network devices ... done.
Configuring devices.
Sun Storage 7110 Version ak/[email protected],1-0
Copyright 2010 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Reading ZFS config: done.
Mounting ZFS filesystems: (27/27)
monk console login:
Hardware Firmware Updates
Following the application of a software upgrade, any hardware for which the upgrade includes newer versions of firmware will be upgraded. There are several types of devices for which firmware upgrades may be made available; each has distinct characteristics.
Disks, storage enclosures, and certain internal SAS devices will be upgraded in the background.
When this is occurring, the firmware upgrade progress will be displayed in the left panel of the Maintenance/System BUI view, or in the maintenance system updates CLI context. These firmware updates are almost always hardware related, though the view may briefly show some number of outstanding updates when certain deferred updates are being applied to components other than hardware.

As of 2010Q3.4, when there are outstanding updates, an informational or warning icon will appear next to the number of updates remaining. Clicking the icon brings up the Firmware Updates dialog, which lists the current remaining updates. For each update, the dialog also shows the current version of the component, the time of the last attempted update, and the reason why the last attempt did not succeed.

Any outstanding update is in one of three states: Pending, In Progress, or Failed. An update begins in the Pending state and is periodically retried, at which time it moves into the In Progress state. If the upgrade fails because of a transient condition, it is moved back to the Pending state; otherwise it is moved to the Failed state.

In general, it is only an indication of a problem if:
■ There are updates in the Failed state.
■ Updates remain in the Pending state (or in limbo between the Pending and In Progress states) for an extended period of time (more than half an hour), without the number of remaining updates decreasing.

The following conditions don't indicate a problem:
■ Disk firmware updates are shown as pending for extended periods of time, with a status message indicating that the disks are not part of any pool. This is expected, because disk firmware is updated only for disks that are part of a pool. In order to update these disks, add them to a pool.
■ There are multiple chassis being updated, progress is being made (the number of remaining updates decreases), and some of the chassis transiently appear pending with a status indicating that some disk has only one path.
This is also expected, since when we update a chassis, we may reset one of its expanders. Resetting an expander causes some disks to temporarily have only one path, and as a result, upgrades to other chassis will be held back until it is safe to do so again non-disruptively.

Note that currently the Firmware Updates dialog doesn't automatically refresh, so you would have to close it and re-open it to get an updated view.

Applying hardware updates is always done in a completely safe manner. This means that the system may be in a state where hardware updates cannot be applied. This is particularly important in the context of clustered configurations. During takeover and failback operations, any in-progress firmware upgrade will be completed; pending firmware upgrades will be suspended until the takeover or failback has completed, at which time the restrictions described below will be reevaluated in the context of the new cluster state and, if possible, firmware upgrades will resume.

Important: Unless absolutely necessary, takeover and failback operations should not be performed while firmware upgrades are in progress.

The rolling upgrade procedure documented below meets all of these best practices and addresses the per-device-class restrictions described below. It should always be followed when performing upgrades in a clustered environment. In both clustered and standalone environments, these criteria will also be reevaluated upon any reboot or diagnostic system software restart, which may cause previously suspended or incomplete firmware upgrades to resume.

■ Components internal to the storage controller (such as HBAs and network devices) other than disks and certain SAS devices will generally be upgraded automatically during boot; these upgrades are not visible and will have completed by the time the management interfaces become available.
■ Upgrading disk or flash device firmware requires that the device be taken offline during the process. If there is insufficient redundancy in the containing storage pool to allow this operation, the firmware upgrade will not complete and may appear "stalled." Disks and flash devices that are part of a storage pool which is currently in use by the cluster peer, if any, will not be upgraded. Finally, disks and flash devices that are not part of any storage pool will not be upgraded.
■ Upgrading the firmware in a disk shelf requires that both back-end storage paths be active to all disks within all enclosures, and that storage be configured on all shelves to be upgraded. For clusters with at least one active pool on each controller, these restrictions mean that a disk shelf firmware upgrade can be performed only by a controller that is in the "owner" state.

During the firmware upgrade process, hardware may appear to be removed and inserted, or offlined and onlined. While alerts attributed to these actions are suppressed, if you are viewing the Maintenance/Hardware screen or the Configuration/Storage screen, you may see the effects of these upgrades in the UI in the form of missing or offline devices. This is not a cause for concern; however, if a device remains offline or missing for an extended period of time (several minutes or more) even after refreshing the hardware view, this may be an indication of a problem with the device. Check the Maintenance/Problems view for any relevant faults that may have been identified.

Additionally, in some cases, the controllers in the disk shelves may remain offline during firmware upgrade. If this occurs, no other controllers will be updated until this condition is fixed. If an enclosure is listed as only having a single path for an extended period of time, check the physical enclosure to determine whether the green link lights on the back of the SIM are active. If not, remove and re-insert the SIM to re-establish the connection.
Verify that all enclosures are reachable by two paths.

Rollback

The rollback procedure reverts all of the system software and all of the metadata settings of the system back to their state just prior to applying an update. This feature is implemented by taking a snapshot of various aspects of the system before the new update is applied, and rolling back this snapshot to implement the rollback. The implications of rollback are as follows:

■ Any appliance configuration changes are reverted and lost. For example, assume you are running version V, and then you update to V+1, and then you change your DNS server. If you execute a rollback, then your DNS server setting modification is effectively undone and removed from the system permanently.
■ Conversely, any changes made to user data are not reverted: if you update from V to V+1, and clients then create directories or modify shares in any way, those changes still exist after the rollback (as you would expect).
■ If the appliance is running version V, and has previous rollback targets V-1 and V-2, and you revert all the way to version V-2 (thereby "skipping" V-1), then you not only are removing the system software settings and system software for V, but also for V-1. That is, after a rollback to V-2, it is as if updates V-1 and V never happened. However, the software upload images for V-1 and V will still be saved on the system and you can apply them again after the rollback if you wish by re-executing the update.

If after applying an update, the system is back up and running, you can use either the BUI or the CLI to initiate a rollback to one of two previously applied updates. If the system is not able to run at all after an update, then use the fail-safe rollback procedure.

Fail-safe Rollback

Administrators can execute a fail-safe rollback of the system software from the serial console by selecting one of the other boot menu entries, if present.
Although rollback can also be requested from the BUI or CLI, rollback is offered from the boot menu because it is possible that rollback will be needed in scenarios where the new system software has completely failed, i.e. has failed to even boot. To roll back from the console, access the serial console as usual, and during boot, before the ten-second timeout, use the arrow key to move the menu selection down to one of the earlier entries:

    GNU GRUB version 0.95 (613K lower / 3537536K upper memory)
    +-------------------------------------------------------------------------+
    | Sun Storage 7110 2010.02.09,1-0                                         |
    | Sun Storage 7110 2009.09.01,1-18                                        |
    |                                                                         |
    +-------------------------------------------------------------------------+
    Use the ^ and v keys to select which entry is highlighted.
    Press enter to boot the selected OS, 'e' to edit the
    commands before booting, or 'c' for a command-line.

After the rollback boot menu entry is selected, the system will boot the old kernel software, but the rollback must be manually confirmed on the console in order to commit the rollback, which will effectively remove all changes to the system that have happened since, as described above. The confirmation step looks like this:

    SunOS Release 5.11 Version ak/[email protected],1-18 64-bit
    Copyright 1983-2009 Sun Microsystems, Inc. All rights reserved.
    Use is subject to license terms.
    System rollback in progress.
    Rollback to: ak/[email protected],1-18
    Proceed with system rollback [y,n,?]

Entering "y" proceeds with the rollback, and the system will complete boot using the prior snapshot. Entering "n" cancels the rollback and immediately reboots the system, allowing the administrator to select a different boot image (i.e. the current system software or an older snapshot).
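The rollback semantics described above can be summarised in a small model. The following Python sketch is purely illustrative and is not appliance code: the `Appliance` class, its fields, and its methods are hypothetical names invented for this example. It assumes only what the text states: a snapshot of the system settings is taken before each update, rolling back to an earlier target discards every later target, user data is not reverted, and uploaded update media remain on the system.

```python
from dataclasses import dataclass, field


@dataclass
class Appliance:
    """Toy model of update/rollback semantics (hypothetical, illustration only)."""
    version: str
    config: dict = field(default_factory=dict)
    user_data: list = field(default_factory=list)
    # Each rollback target is (version, copy of the config taken before the update).
    targets: list = field(default_factory=list)
    media: set = field(default_factory=set)

    def update(self, new_version):
        # Snapshot the system settings just before applying the update.
        self.targets.append((self.version, dict(self.config)))
        self.media.add(new_version)
        self.version = new_version

    def rollback(self, to_version):
        # Locate the requested target; raises ValueError if it does not exist.
        names = [v for v, _ in self.targets]
        idx = names.index(to_version)
        version, snapshot = self.targets[idx]
        self.version = version
        self.config = dict(snapshot)
        # Targets at or after the chosen one no longer apply.
        self.targets = self.targets[:idx]
        # user_data and media are deliberately left untouched.


box = Appliance(version="V-2", config={"dns": "10.0.0.1"})
box.update("V-1")
box.update("V")
box.config["dns"] = "10.0.0.2"           # configuration change made after the update
box.user_data.append("/export/new-dir")  # client change to user data
box.rollback("V-2")                      # skips V-1 entirely
```

After the final call, the model is back at version V-2 with the original DNS setting, the client-created directory is still present, and the media for V-1 and V remain available to be applied again.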
Cluster Upgrade

In a clustered system, a rolling upgrade can be performed, eliminating downtime while the upgrade is performed. This section assumes familiarity with the Sun Storage clustering model: if you are not familiar with the clustering concepts and terminology, please read about clustering concepts in the System Administration Guide first. To describe the rolling upgrade procedure, this document will refer to the two clustered storage controllers as A and B, where A is the controller that will be updated first, and B is the controller that will be updated second.

A key best practice in rolling upgrades is that each controller should be upgraded at a time when it is not providing service to clients. The procedure described here meets this requirement. In addition, all general upgrade best practices described above also apply to rolling upgrades.

Important: Do not perform a takeover operation while an upgrade is in progress.

1. Using either the CLI or the BUI, upload the update software image to both storage controllers.
2. If the cluster has a single storage pool, the controller to which that pool is assigned will be designated B; the one without a storage pool is designated A. If the cluster has two or more storage pools and each controller is assigned at least one of them, then decide at this time which controller will be designated A and which will be designated B. The choice is arbitrary, but A's storage pool(s) will be taken over first, so clients using those resources will experience a standard takeover-induced availability delay first.
3. Log in to controller B, go to Configuration/Cluster or the CLI equivalent, and perform a takeover, which will cause controller A to reboot. The software will not prevent you from beginning the upgrade without taking over.
However, if you do not perform the takeover, during the upgrade you will be unable to make any changes to the appliance's configuration even though that appliance will continue to provide service, and you will be performing an upgrade on a controller while it is providing service.
4. Using the serial console, or the CLI or BUI if you have dedicated private network interfaces assigned, log in to controller A. Go to Maintenance/System or the CLI equivalent, select the software update, and apply it. At the end of the upgrade procedure, controller A will reboot again, this time running the new system software.
5. Log in to controller A and perform a takeover as above. This will cause controller B to reboot and controller A to take control of all resources and provide service to clients.
6. Validate the behavior of the new software and ensure that firmware upgrades complete. Since controller A is now providing service using the new software while B remains on the previous version, this provides an opportunity to ensure that all services are working correctly as seen on client systems. If a serious problem is encountered, roll back the software on controller A, which will cause it to reboot; controller B (still running previous software) will take over, and when controller A recovers it will be running the previous version as well.
Important: Controller firmware updates will not proceed while the two controllers are running different versions of the system software. It is recommended to wait for all other firmware upgrades to complete before continuing.
7. Log in to controller B. Go to Maintenance/System or the CLI equivalent, select the desired update, and apply it. At the end of the procedure, storage controller B will reboot again. Controller B will boot up and be running the new system software.
8. The upgrade procedure is now complete.
To restore normal operation, log in to storage controller A, go to Configuration/Cluster, and execute a failback operation, returning the resources to their respective assigned controllers.

The following table describes the state of the cluster at the end of each of the steps above, during an update from version V to version V+1.

    Step   Controller A State   Controller A Version   Controller B State   Controller B Version
    1,2    CLUSTERED            V                      CLUSTERED            V
    3      STRIPPED             V                      OWNER                V
    4      STRIPPED             V+1                    OWNER                V
    5,6    OWNER                V+1                    STRIPPED             V
    7      OWNER                V+1                    STRIPPED             V+1
    8      CLUSTERED            V+1                    CLUSTERED            V+1

It is not advisable to make configuration changes to either storage controller while an upgrade is in progress. Most notably, while controllers are running different software versions, configuration changes made to one controller will not be propagated to its peer controller. For instance, suppose a change is made to controller A at step 4 above. That change will not be propagated to controller B until step 7, when both controllers are again running the same software version. Alternatively, if validation of controller A during step 6 uncovered a problem that necessitated controller A be rolled back to version V, then the changes made at step 4 would also be undone as part of the rollback. Accordingly, accessing the BUI or logging into the CLI while controllers are running different software versions will display a warning that configuration changes will not be propagated.

Similarly, the appliance can be configured to generate alerts when a cluster is composed of controllers running different software versions (events "Cluster rejoin mismatch" and "Cluster rejoin mismatch on peer").

If you upgrade, change the root password, and then roll back in a cluster, the nodes will not be able to re-join after the rollback.
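The state table above can also be written down as data, which makes it easy to see during which steps configuration changes are not propagated. The Python sketch below is illustrative only: `ClusterState`, `ROLLING_UPGRADE`, `config_changes_propagate`, and `mismatched_steps` are hypothetical names, and the propagation check is a simplification that encodes just the one rule stated in the text (changes are not propagated while the controllers run different software versions).

```python
from typing import NamedTuple


class ClusterState(NamedTuple):
    a_state: str
    a_version: str
    b_state: str
    b_version: str


# State at the end of each step of a rolling upgrade from V to V+1,
# copied from the table in this section.
ROLLING_UPGRADE = {
    1: ClusterState("CLUSTERED", "V", "CLUSTERED", "V"),
    2: ClusterState("CLUSTERED", "V", "CLUSTERED", "V"),
    3: ClusterState("STRIPPED", "V", "OWNER", "V"),
    4: ClusterState("STRIPPED", "V+1", "OWNER", "V"),
    5: ClusterState("OWNER", "V+1", "STRIPPED", "V"),
    6: ClusterState("OWNER", "V+1", "STRIPPED", "V"),
    7: ClusterState("OWNER", "V+1", "STRIPPED", "V+1"),
    8: ClusterState("CLUSTERED", "V+1", "CLUSTERED", "V+1"),
}


def config_changes_propagate(state):
    """Simplified rule: changes reach the peer only when the versions match."""
    return state.a_version == state.b_version


def mismatched_steps(table):
    """Return the steps after which the controllers run different versions."""
    return [step for step, state in sorted(table.items())
            if not config_changes_propagate(state)]


print(mismatched_steps(ROLLING_UPGRADE))
```

Under these assumptions the mismatch window covers steps 4 through 6, which agrees with the example in the text: a change made on controller A at step 4 does not reach controller B until step 7.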
Updating via the BUI

Click the add icon next to Available Updates and specify the pathname on your desktop or local client of the update media. During the upload, a progress bar is displayed indicating the progress of the upload.

Note that on some older browsers, the progress bar may not be updated continuously during the upload; if you see a "watch" cursor, just wait a minute. In the worst case, the upload will proceed all the way to completion and you may not see the progress bar.

Unpacking and Verifying Media

This step will happen automatically after the media is done uploading.

Beginning an Upgrade

After the update is uploaded, unpacked and verified, it will appear as an update.

Click the information icon to view the Release Notes for the software update. To begin the upgrade, click on the apply icon. During this process, an update health check will be performed to verify the appliance is ready to update. As the upgrade progresses, you will see the most recent message in the status field of the update. To cancel the update at any time (and without ill effect), click on the cancel icon.

Rolling Back

To roll back, locate a previous image and click on the rollback icon. You will be asked to confirm that you wish to execute a rollback, and then the system will reboot and execute the rollback. Unlike fail-safe rollback, you will not be asked for further confirmation when the system reboots.

Removing Update Media

To remove update media, highlight the corresponding row and click on the trash icon.

Applying Deferred Updates

Any deferred updates will be displayed below the list of available updates. If no deferred updates are available, no list will be displayed. The deferred updates will describe what effects they will have on the system. Clicking the 'Apply' button will apply all available deferred updates.
Deferred updates will apply to both nodes in a cluster, and the cluster peer must be up and available to apply any deferred updates.

Updating via the CLI

Because you log into the appliance to use the CLI, the upload as described above is actually a download. To download the media onto the appliance via the CLI, execute the download command in maintenance system updates:

    dory:maintenance system updates> download
    dory:maintenance system updates download (uncommitted)> get
                               url = (unset)
                              user = (unset)
                          password = (unset)

You must set the "url" property to be a valid URL for the download. This may be either local to your network or over the internet. The URL can be either HTTP (beginning with "http://") or FTP (beginning with "ftp://"). If user authentication is required, it may be a part of the URL (e.g. "ftp://myusername:mypasswd@myserver/export/foo"), or you may leave the username and password out of the URL and instead set the user and password properties.

    dory:maintenance system updates download (uncommitted)> set url=ftp://foo/update.pkg.gz
                               url = ftp://foo/update.pkg.gz
    dory:maintenance system updates download (uncommitted)> set user=bmc
                              user = bmc
    dory:maintenance system updates download (uncommitted)> set password
    Enter password:
                          password = ********
    dory:maintenance system updates download (uncommitted)> commit
    Transferred 157M of 484M (32.3%) ...

Unpacking and Verifying Media

After the file has been transferred, it will be automatically unpacked and verified:

    dory:maintenance system updates download (uncommitted)> commit
    Transferred 484M of 484M (100%) ... done
    Unpacking ... done
    dory:maintenance system updates> list
    UPDATE                           DATE                STATUS
    [email protected],1-0-nd         2009-10-14 08:45    AKUP_WAITING
    ...

Beginning an Upgrade

To begin an upgrade, select the update that constitutes the upgrade.
During this process, an update health check will be performed to verify the appliance is ready to update. From this context, you can set any properties specific to the update, including applying deferred updates. For more information on the set of properties available for the particular update, run the help properties command. User-controllable properties will begin with the update_ prefix:

    clownfish:maintenance system updates [email protected],1-0> help properties
    Properties that are valid in this context:

       version            => Update media version
       date               => Update release date
       status             => Update media status
       update_zfs_upgrade => Apply incompatible storage pool update

    clownfish:maintenance system updates [email protected],1-0> get
                           version = 2009.04.03,1-0
                              date = 2009-4-3 08:45:01
                            status = AKUP_WAITING
                update_zfs_upgrade = deferred
    clownfish:maintenance system updates [email protected],1-0> set update_zfs_upgrade=onreboot
                update_zfs_upgrade = onreboot
    clownfish:maintenance system updates [email protected],1-0>

After you set any properties, execute the upgrade command. You are prompted for confirmation and (assuming an affirmative) the upgrade begins:

    dory:maintenance system updates> select [email protected],1-0-nd
    dory:maintenance system updates [email protected],1-0-nd> upgrade
    The selected software update requires a system reboot in order to take effect.
    The system will automatically reboot at the end of the update process. The
    update will take several minutes. At any time during this process, you can
    cancel the update with [Control-C].

    Are you sure? (Y/N) y
    Updating from ... ak/[email protected],1-0
    Backing up smf(5) ... done.
    Loading media metadata ... done.
    Selecting alternate product ... SUNW,iwashi
    Installing Sun Storage 7110 2009.10.14,1-0
    pkg://sun.com/ak/SUNW,[email protected],1-0:20091014T084500Z
    Creating system/boot/ak-nas-2009.10.14_1-0 ... done.
    Creating system/root/ak-nas-2009.10.14_1-0 ... done.
    ...
As the upgrade proceeds, the latest message will be printed. You can cancel the upgrade at any time by pressing ^C, at which point you will be prompted for confirmation:

    Updating from ... ak/[email protected],1-0
    Backing up smf(5) ... done.
    Loading media metadata ... ^C
    This will cancel the current update. Are you sure? (Y/N) y
    error: interrupted by user
    dory:maintenance system updates [email protected],1-0-nd>

Rolling Back

To roll back to an earlier version, select the update that corresponds to that version and execute the rollback command. You will be asked to confirm that you wish to execute a rollback, and then the system will reboot and execute the rollback. Unlike fail-safe rollback, you will not be asked for further confirmation when the system reboots.

Removing Update Media

To remove update media, use the destroy command, specifying the update to be removed:

    dory:maintenance system updates> destroy [email protected],1-0-nd
    This will destroy the update "[email protected],1-0-nd". Are you sure? (Y/N) y
    dory:maintenance system updates>

Applying Deferred Updates

To see if there are any available deferred updates, run the show command. If deferred updates are available, you can use the apply command:

    clownfish:maintenance system updates> show
    Updates:

    UPDATE                           DATE                 STATUS
    [email protected],1-1.9          2009-4-1 04:18:48    AKUP_PREVIOUS
    [email protected],1-0            2009-4-3 08:45:01    AKUP_CURRENT

    Deferred updates:

    The following incompatible updates are available. Applying these updates will
    enable new software features as described below, but will prevent older
    versions of the software from accessing the underlying resources. You should
    apply deferred updates once you have verified that the current software update
    is functioning and a rollback is not required. Applying deferred updates in a
    cluster will also update any resources on the cluster peer.

    1. Support for the "passthrough-x" aclinherit property for shares.
    clownfish:maintenance system updates> apply
    Applying deferred updates will prevent rolling back to previous versions of
    software.

    Are you sure? (Y/N)
    clownfish:maintenance system updates>

Passthrough x

Passthrough-x Deferred Update

For filesystems, ACLs are inherited according to the "aclinherit" property on the filesystem (or inherited from the project). Previous versions of software allowed four options for this setting: "discard", "noallow", "restricted", and "passthrough". The 2009.Q2.0.0 release introduces a new option, "passthrough-x", with slightly different semantics as described in the product documentation:

Same as 'passthrough', except that the owner, group, and everyone ACL entries inherit the execute permission only if the file creation mode also requests the execute bit.

The "passthrough" mode is typically used to cause all "data" files to be created with an identical mode in a directory tree. An administrator sets up ACL inheritance so that all files are created with a mode, such as 0664 or 0666. This all works as expected for data files, but you might want to optionally include the execute bit from the file creation mode into the inherited ACL. One example is an output file that is generated from tools, such as "cc" or "gcc". If the inherited ACL doesn't include the execute bit, then the output executable from the compiler won't be executable until you use chmod(1) to change the file's permissions.

In order to use this new mode, the storage pool must be upgraded. If you choose not to upgrade the pool and attempt to use this new property, you will get an error indicating that the storage pool needs to be upgraded first. There is no other implication of applying this update, and it can be ignored if there is no need to use this new setting.
Applying this update is equivalent to upgrading the on-disk ZFS pool to version 14 (http://www.opensolaris.org/os/community/zfs/version/14/).

User Quotas

User Quotas Deferred Update

With the 2009.Q3 software release, the system now supports user and group quotas on a per-share basis. In order to make use of this feature, a deferred update must be applied to upgrade all shares in the system to support this feature. Applying this deferred update also allows the current usage (user or group) to be queried on a per-filesystem or per-project basis. To quote the product documentation:

Quotas can be set on a user or group at the filesystem level. These enforce physical data usage based on the POSIX or Windows identity of the owner or group of the file or directory. There are some significant differences between user and group quotas and filesystem and project data quotas.

Be sure to read the complete documentation under Space Management before attempting to use user or group quotas.

This update is applied in the background, and takes time proportional to the number of shares and amount of data on the system. Until this deferred update is finished, any attempt to apply user quotas will produce an error indicating that the update is still in progress.

COMSTAR

COMSTAR Deferred Update

The COMSTAR framework relies on a ZFS pool upgrade for complete support of persistent group reservations (PGRs). Before this upgrade has been applied, the number of reservations stored with each LUN is severely limited, and may even be zero.

Applying this update is equivalent to upgrading the on-disk ZFS pool to version 16.

Triple Parity RAID

Triple-Parity RAID Deferred Update

This update provides the ability to use the triple-parity RAID storage profile, RAID-Z with three parity sectors per stripe.
Triple-parity offers increased protection against drive failures and additional overall availability.

In order to use this new mode, the storage pool must be upgraded. If you choose not to upgrade the pool and attempt to use this new property, you will get an error indicating that the storage pool needs to be upgraded first. There is no other implication of applying this update, and it can be ignored if there is no need to use this new setting. Applying this update is equivalent to upgrading the on-disk ZFS pool to version 17.

Dedup

Data Deduplication Deferred Update

This update provides the ability to use data deduplication.

In order to use this new mode, the storage pool must be upgraded. If you choose not to upgrade the pool and attempt to use this new property, you will get an error indicating that the storage pool needs to be upgraded first. There is no other implication of applying this update, and it can be ignored if there is no need to use this new setting. Applying this update is equivalent to upgrading the on-disk ZFS pool to version 21.

Replication

Replication Deferred Update

The 2010.Q1 release stores replication configuration differently than 2009.Q3 and earlier releases. This update migrates existing target, action, and replica configuration created under an earlier release to the new form used by 2010.Q1 and later.

After upgrading to 2010.Q1 but before this update is applied, incoming replication updates for existing replicas will fail. Replicas received under earlier releases will not be manageable via the BUI or CLI, though they will occupy space in the storage pool. Additionally, the system will not send replication updates for actions configured on earlier releases.

After applying this update, incoming replication updates for replicas originally received on earlier releases will continue normally and without a full resync.
The system will also send incremental replication updates for actions configured under earlier releases.

Received Properties

Received Properties Deferred Update

The 2010.Q1 feature that enables administrators to customize properties on replicated shares relies on a ZFS pool upgrade. This upgrade provides support of persistent local changes to received properties. Before this upgrade has been applied, the system will not allow administrators to change properties on replicated shares.

Applying this update is equivalent to upgrading the on-disk ZFS pool to version 22.

Slim ZIL

Introduction

This deferred update changes the layout of ZFS intent log blocks to improve synchronous write performance. These improvements rely on a ZFS pool upgrade provided by this update. Before this update has been applied, log records will continue to be written in the old format and performance may be reduced.

Applying this update is equivalent to upgrading the on-disk ZFS pool to version 23.

Snapshot Deletion

Snapshot Deletion Deferred Update

This deferred update increases snapshot deletion parallelism and reduces the size of transaction groups associated with snapshot deletion to improve systemic responsiveness. These improvements rely on a ZFS pool upgrade provided by this update. Before this update has been applied, new snapshot data will be stored in the old format and deleted using the old algorithm. Note that any snapshots created before this update is applied will also be deleted using the old algorithm.

Applying this update is equivalent to upgrading the on-disk ZFS pool to version 26.

Recursive Snapshots

Recursive Snapshots Deferred Update

This deferred update allows recursive snapshots to be taken without suspending the ZFS intent log, which greatly improves snapshot creation performance especially on heavily loaded controllers.
These improvements rely on a ZFS pool upgrade provided by this update. Before this update has been applied, the system will be able to create snapshots but will do so using the old, much slower, algorithm. Applying this update is equivalent to upgrading the on-disk ZFS pool to version 27. Multi Replace Multi Replace Deferred Update This deferred update allows importing a pool with a missing log device and corrects the behavior of the system when a device that is being resilvered is itself removed or replaced. These fixes rely on a ZFS pool upgrade provided by this update. Before this update has been applied, the system will be unable to import pools with missing log devices and will not correctly handle replacement of resilvering devices (see CR 6782540). Applying this update is equivalent to upgrading the on-disk ZFS pool to version 28. RAIDZ Mirror RAIDZ/Mirror Deferred Update This deferred update improves both latency and throughput on several important workloads. These improvements rely on a ZFS pool upgrade provided by this update. Applying this update is equivalent to upgrading the on-disk ZFS pool to version 29. Chapter 3 • System Maintenance 147 Optional Child Dir Optional Child Dir Introduction This deferred update improves list retrieval performance and replication deletion performance by improving dataset rename speed. These improvements rely on a ZFS pool upgrade provided by this update. Before this update has been applied, the system will be able to retrieve lists and delete replications, but will do so using the old, much slower, recursive rename code. Applying this update is equivalent to upgrading the on-disk ZFS pool to version 31. ConfigurationBackup Configuration Backup The configuration backup function enables the administrator to: ■ ■ Backup the appliance configuration, consisting of system metadata only (such as the network configuration, local users and roles, service settings, and other appliance metadata). 
■ Restore a previously saved configuration from a backup.
■ Export a saved configuration, as a plain file, so that it may be stored on an external server, or included in a backup of a share on the appliance itself.
■ Import a saved configuration that was previously exported from this system or another system, making it available for a restore operation.

Backup Contents

A configuration backup does include:
■ Metadata associated with the system as a whole, such as settings for NTP, NIS, LDAP, and other services.
■ Network device, datalink, and interface configuration.
■ User accounts, roles and privileges, preferences, and encrypted passwords for local users (not directory users).
■ Alerts and thresholds and their associated rules.

A configuration backup does not include:
■ User data (shares and LUNs). Your user data must be backed up separately, using NDMP backup software, snapshots, and/or remote replication.
■ User passwords for directory users. These remain stored solely in your separate network directory service, such as LDAP or Active Directory, and will not be stored in the backup or restored.
■ Metadata directly associated with user data, such as snapshot schedules, user quotas, compression settings, and other attributes of shares and LUNs.
■ Analytics and logs. Events can be redirected to external SNMP trap receivers or e-mail destinations using Alerts rules.
■ System software. The system software is automatically backed up as part of the System Update capability.

Restore Impact

The restore operation takes a selected configuration backup and modifies all of the corresponding system settings to reflect those in the backup, including removing aspects of the configuration that were not present at the time of the backup.
Administrators should adhere to the following guidelines when planning a restore:
■ Scheduled downtime - The restore process takes several minutes to complete and will impact service to clients, as the active networking configuration and data protocols are reconfigured. Therefore, a configuration restore should only be used on a development system, or during a scheduled downtime.
■ Service interruption - Clients accessing data on the system through a data protocol such as NFS will see service interrupted, as the network is reconfigured and the NFS service restarted. If the selected backup copy was taken when a service was disabled by the administrator, that setting will be restored, and therefore client sessions will be terminated for that protocol.
■ Session interruption - If the restore is initiated from a web browser, that web browser session will also be disconnected during the restore process as the network is reconfigured. If the restored configuration does not include the same routing and network address settings used by the current browser connection, or if the browser is connected to a network address managed by DHCP, the browser session will be interrupted during the restore. The restore process will complete in the background, but you will need to reload or point the browser at a new, restored network address to continue. For this reason, it may be desirable to initiate a complex configuration restore from the service processor serial console using the CLI.
■ Un-cluster, restore, and re-cluster - Configuration backups may be initiated for appliances that are joined in a cluster, but a configuration restore may not be used while systems are actively clustered. While clustered, settings are synchronized between cluster peers, and each peer appliance also maintains private settings. For this reason, you must first use the Unconfiguring Clustering procedure to un-cluster the two systems.
Then, restore the configuration backup on a selected head, and then re-cluster the two systems, at which point the other system will automatically synchronize itself with the restored configuration.
■ Root privileges required - Configuration backups include all system metadata, and therefore require all possible privileges and authorizations to create or apply. Therefore, unlike other delegated administrative options, only the root user is authorized to perform a configuration backup or restore.
■ Verify settings for new features - It is permitted to restore a configuration that was saved before applying a system update to a new version of the appliance software. In some cases, services and properties that were present at the time of the backup may have different effects, and new services and properties may exist in the newer software that did not exist at the time of the backup. Similar to the system update process, the configuration restore process will make every effort to transfer applicable settings, and apply reasonable defaults to those properties that did not exist at the time of the backup. When restoring across software versions, administrators should manually verify settings for new features following the restore.
■ Password maintenance - The restore does not change the root password or revert it to the password in effect at the time of the backup. The current root password is maintained on the system across the restore. For more details about passwords, refer to the summary of Security Considerations.

Security Considerations

A configuration backup contains information that is normally only accessible to the root administrative user on the appliance. Therefore, any configuration backup that is exported to another system or into a filesystem share must apply security restrictions to the backup file to ensure that unauthorized users cannot read the backup file.
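On the host that stores an exported backup, one simple restriction is to limit the file to its owner. The following is a minimal sketch, assuming a POSIX system; the file name `appliance-config.backup` and both helper functions are hypothetical and are not part of the appliance software.

```python
import os
import stat
import tempfile


def restrict_to_owner(path):
    """Limit a file to owner read/write (mode 0600) and return the resulting mode bits."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(os.stat(path).st_mode)


def is_private(path):
    """Return True when neither group nor others have any access to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


if __name__ == "__main__":
    # A temporary file stands in for a hypothetical exported backup.
    with tempfile.TemporaryDirectory() as workdir:
        backup = os.path.join(workdir, "appliance-config.backup")
        with open(backup, "wb") as handle:
            handle.write(b"placeholder")
        os.chmod(backup, 0o644)  # simulate an upload that left the file world-readable
        print("private before:", is_private(backup))
        restrict_to_owner(backup)
        print("private after:", is_private(backup))
```

File permissions only protect the copy on that host; they do not replace restricting access to the share or server that holds the backup.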
Local user passwords are stored in the backup file in encrypted (hashed) format, not as clear text. However, on the system, access to these password hashes is restricted, as they could be used as input to dictionary attacks. Therefore, administrators must carefully protect configuration backups that are exported, either by restricting file access to the backup, or by applying an additional layer of encryption to the entire backup file, or both.

Directory user passwords are not stored in the appliance, and therefore are not stored in the configuration backup. If you have deployed a directory service such as LDAP or AD for administrative user access, there are no copies of directory service password hashes for directory users stored in the configuration backup. Only the user name, user ID, preferences, and authorization settings for directory users are stored in the backup and then restored.

Following a configuration restore, the local root administrative user password is not modified to the root password at the time of the backup. The root password is left as-is, unmodified, by the restore process, to ensure that the password used by the administrator who is executing the restore process (and thus has logged in, using that password) is retained. If the administrator's intent was to also change the root password at the time of configuration restore, that step must be executed manually following the restore, using the normal administrative password change procedure.

Managing Configuration Backups Using the BUI

The following section outlines how various Configuration Backup tasks can be accomplished using the Configuration Backup area near the bottom of the Maintenance > System screen in the BUI.

Create a Configuration Backup

To create a backup, simply click the "Backup" button above the list of saved configurations and follow the instructions.
You will be prompted to enter a descriptive comment for the backup.

Restore from a Saved Configuration

Click the rollback icon on any saved configuration to begin the process of reverting the system to that saved configuration. Review the Restore Impact guidance above, and confirm that it is ok to proceed.

Delete a Saved Configuration

To delete a saved configuration that is no longer required, simply click the trash can icon.

Export a Saved Configuration

To export a saved configuration, mouse over the configuration list entry you wish to export and click the download icon. Your browser will prompt you to save the file locally. The file is a compressed archive whose contents are versioned and may vary over time. You should not attempt to unpack or modify the content of the archive; doing so will prevent it from being imported back to the appliance successfully.

Import a Saved Configuration

To import a previously exported saved configuration, click the add icon at the top of the saved configurations list and then use your web browser's file selection dialog to locate the previously exported configuration. You should upload the single, compressed archive file previously saved using the export function.

Managing Configuration Backups Using the CLI

The following section outlines how various Configuration Backup tasks can be accomplished using the CLI in the maintenance system configs context.

Listing Configurations

host:maintenance system configs> list
CONFIG                                DATE                SYSTEM  VERSION
bfa614d7-1db5-655b-cba5-bd0bb0a1efc4  2009-8-5 17:14:28   host    2009.08.04,1-0
cb2f005f-cf2b-608f-90db-fc7a0503db2a  2009-8-24 17:56:53  host    2009.08.18,1-0

Create a Configuration Backup

The backup command saves a configuration backup. You will be prompted to enter a descriptive comment for the backup, and then enter done to execute the backup operation.

host:maintenance system configs> backup
Backup Configuration.
Enter a descriptive comment for this configuration, and click Commit to backup current appliance settings:
host:maintenance system configs conf_backup step0> set comment="pre-upgrade"
                       comment = pre-upgrade
host:maintenance system configs conf_backup step0> done
host:maintenance system configs>

Restore from a Saved Configuration

The restore command reverts the system to a saved configuration. You will be prompted to enter the universal unique identifier for the backup (see the output of list, above), and then enter done to execute the restore. Review the Restore Impact guidance above, and confirm that it is ok to proceed.

host:maintenance system configs> restore
Restore. Select the configuration to restore:
host:maintenance system configs conf_restore step0> set uuid=36756f96-b204-4911-8ed5-fefaf89cad6a
                          uuid = 36756f96-b204-4911-8ed5-fefaf89cad6a
host:maintenance system configs conf_restore step0> done

Note: Storage pools are not automatically unconfigured when you execute the restore command.

Delete a Saved Configuration

The destroy command deletes a saved configuration:

host:maintenance system configs> destroy cb2f005f-cf2b-608f-90db-fc7a0503db2a
Are you sure you want to delete the saved configuration "new"? y
host:maintenance system configs>

Export a Saved Configuration

The export command exports a saved configuration by executing an HTTP or FTP PUT operation against a remote HTTP or FTP server. You can also use the export function to export the file to a share on the appliance itself that has the HTTP or FTP protocol enabled for writing. You can enter a username and password for authentication to the remote server if one is required.

Import a Saved Configuration

The import command imports a saved configuration by executing an HTTP or FTP GET operation against a remote HTTP or FTP server.
You can also use the import function to import a configuration stored in a share on the appliance itself that has the HTTP or FTP protocol enabled for reading. You can enter a username and password for authentication to the remote server if one is required.

Problems

To aid serviceability, the appliance detects persistent hardware failures (faults) and software failures (defects, often included under faults) and reports them as active problems on this screen. If the phone home service is enabled, active problems are automatically reported to Oracle Support, where a support case may be opened depending on the service contract and the nature of the fault.

Active problems display

For each problem, the appliance reports what happened, when the problem was detected, the severity and type of the problem, and whether it has been phoned home. Below are some example faults as they would be displayed in the BUI:

Date | Description | Type | Phoned Home
2009-09-16 13:56:36 | SMART health-monitoring firmware reported that a disk failure is imminent. | Major Fault | Never
2009-09-05 17:42:55 | A disk of a different type (cache, log, or data) was inserted into a slot. The newly inserted device must be of the same type. | Minor Fault | Never
2009-08-21 16:40:37 | The ZFS pool has experienced currently unrecoverable I/O failures. | Major Error | Never
2009-07-16 22:03:22 | A memory module is experiencing excessive correctable errors affecting large numbers of pages. | Major Fault | Never

This information can also be viewed in the CLI:

gefilte:> maintenance problems show
Problems:

COMPONENT    DIAGNOSED           TYPE         DESCRIPTION
problem-000  2010-7-27 00:02:49  Major Fault  SMART health-monitoring firmware reported that a failure is imminent on disk 'HDD 17'.

Selecting any fault shows more information about the fault including the impact to the system, affected components, the system's automated response (if any), and the recommended action for the administrator (if any).
In the CLI, only the "uuid", "diagnosed", "severity", "type", and "status" fields are considered stable. Other property values may change from release to release.

For hardware faults, you may be able to select the affected hardware component to locate it on the Hardware screen.

Repairing problems

Problems can be repaired by performing the steps described in the suggested action section. This typically involves replacing the physical component (for hardware faults) or reconfiguring and restarting the affected service (for software defects). Repaired problems no longer appear on this screen.

While the system typically detects repairs automatically, in some cases manual intervention may be required. If a problem persists after the affected components have been repaired, contact support. You may be instructed to mark the problem repaired. This should only be done under the direction of service personnel or as part of a documented Oracle repair procedure.

Related features

■ A persistent log of all faults and defects is available under Logs as the Fault log.
■ Faults and defects are subcategories of Alerts. Filter rules can be configured to cause the appliance to email administrators or perform other actions when faults are detected.

Logs

Introduction

Alerts

This is the appliance alert log, recording key events of interest during appliance operation. The following are example alert log entries as they would appear in the BUI:

Time | Event ID | Description | Type
2009-9-16 13:01:56 | f18bbad1-8084-4cab-c950-82ef5b8228ea | An I/O path from slot 'PCIe 0' to chassis 'JBOD #1' has been removed. | Major alert
2009-9-16 13:01:51 | 8fb8688c-08f2-c994-a6a5-ac6e755e53bb | A disk has been inserted into slot 'HDD 4' of chassis 'JBOD #1'. | Minor alert
2009-9-16 13:01:51 | 446654fc-b898-6da5-e87e-8d23ff12d5d0 | A disk has been inserted into slot 'HDD 15' of chassis 'JBOD #1'. | Minor alert

An info icon next to the Event ID means that extended information is available. Click the icon and this information will be displayed below the list of alerts.

The appliance can also be configured to send email, raise an SNMP trap, or perform other actions when particular alerts occur. This is configured in the Alerts section. All alerts appear in this log, regardless of whether they have actions configured for them.

Faults

The fault log records hardware and software faults. This is a useful reference when troubleshooting hardware failure, as timestamps are available for these hardware fault events. The following are example fault log entries as they would appear in the BUI:

Date | Event ID | Description | Type
2009-9-5 17:42:35 | 9e46fc0b-b1a4-4e69-f10f-e7dbe80794fe | The device 'HDD 6' has failed or could not be opened. | Major Fault
2009-9-3 19:20:15 | d37cb5cd-88a8-6408-e82d-c05576c52279 | External sensors indicate that a fan is no longer operating correctly. | Minor Fault
2009-8-21 16:40:48 | c91c7b32-83ce-6da8-e51e-a553964bbdbc | The ZFS pool has experienced currently unrecoverable I/O failures. | Major Error

These faults will generate alert log entries, and so will use the alert reporting settings (such as sending email), if configured. Faults that require administrator attention will appear in Problems.

System

This is the operating system log, available to read via the appliance interfaces. This may be useful when troubleshooting complex issues, but should only be checked after first examining the alert and fault logs. The following are example system log entries as they would appear in the BUI:

Time | Module | Priority | Description
2009-10-11 14:13:38 | ntpdate | error | no server suitable for synchronization found
2009-10-11 14:03:52 | genunix | notice | ^MSunOS Release 5.11 Version ak/[email protected],1-0 64-bit
2009-10-11 14:02:04 | genunix | notice | done
2009-10-11 14:02:01 | genunix | notice | syncing file systems...
2009-10-11 13:52:16 | nxge | warning | WARNING: nxge : ==> nxge_rxdma_databuf_free: DDI

Audit

The audit log records user activity events, including login and logout to the BUI and CLI, and administrative actions. If session annotations are used (see Users), each audit entry should be noted with a reason. The following are example audit log entries as they would appear in the BUI:

Time | User | Host | Summary | Session Annotation
2009-10-12 05:20:24 | root | deimos | Disabled ftp service |
2009-10-12 03:17:05 | root | deimos | User logged in |
2009-10-11 22:38:56 | root | deimos | Browser session timed out |
2009-10-11 21:13:35 | root | <console> | Enabled ftp service |

Phone Home

If Phone Home is used, this log will show communication events with Oracle support. The following is an example phone home entry as it would appear in the BUI:

Time | Description | Result
2009-10-12 05:24:09 | Uploaded file 'cores/ak.45e5ddd1-ce92-c16e-b5eb-9cb2a8091f1c.tar.gz' to Oracle support | OK

BUI

Use the Maintenance > Logs screen to navigate logs using list controls, and switch between logs using the local navigation buttons.

CLI

Logs can be viewed under the maintenance logs section of the CLI.

Listing logs

Use the show command to list available logs, and the timestamp of the last log entry:

caji:> maintenance logs
caji:maintenance logs> show
Logs:

LOG     ENTRIES  LAST
alert   2        2009-10-16 02:44:04
audit   42       2009-10-16 18:19:53
fltlog  2        2009-10-16 02:44:04
scrk    0
system  100      2009-10-16 03:51:01

Up to 100 recent entries for each log are visible using the CLI.
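When the listing is captured to a file or a script, it can be turned into records for monitoring. The following is a minimal sketch, assuming the output is laid out with one log per line (name, entry count, then an optional timestamp or a dash); the `LogSummary` type and the function names are hypothetical, and the column layout may differ between releases.

```python
import re
from typing import NamedTuple, Optional

# Stated in the text above: the CLI shows at most 100 recent entries per log.
CLI_VISIBLE_LIMIT = 100


class LogSummary(NamedTuple):
    name: str
    entries: int
    last: Optional[str]  # timestamp of the newest entry, or None when absent


# One row: log name, entry count, then an optional timestamp or a lone dash.
ROW = re.compile(
    r"^\s*(\S+)\s+(\d+)"
    r"(?:\s+(?:(\d{4}-\d{1,2}-\d{1,2}\s+\d{2}:\d{2}:\d{2})|-))?\s*$"
)


def parse_log_listing(text):
    """Return a LogSummary for each row that matches the assumed layout."""
    summaries = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:  # prompts, the header row, and blank lines do not match
            name, count, last = match.groups()
            summaries.append(LogSummary(name, int(count), last))
    return summaries


def at_cli_limit(summary):
    """True when the count equals the CLI limit, so older entries may be hidden."""
    return summary.entries >= CLI_VISIBLE_LIMIT
```

A count that reaches the limit, such as the system log in the example listing, only says that at least that many entries exist; older entries may still be available through the BUI.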
Viewing a log

Logs may be selected for viewing with the show command:

caji:maintenance logs> select audit show
Entries:

ENTRY      TIME                 SUMMARY
entry-000  2009-10-15 00:59:37  root, <console>, Enabled datalink:nge0 service
entry-001  2009-10-15 00:59:39  root, <console>, Enabled interface:nge0 service
entry-002  2009-10-15 01:00:39  root, <console>, User logged in
entry-003  2009-10-15 01:41:44  root, <console>, Enabled nis service
entry-004  2009-10-15 01:42:01  root, <console>, Imported storage pool "pool-0"
entry-005  2009-10-15 17:56:30  root, <console>, User logged in
entry-006  2009-10-15 17:56:53  root, deimos.sf.fishworks.com, User logged in via CLI
entry-007  2009-10-15 18:00:21  root, deimos.sf.fishworks.com, User logged out of CLI
entry-008  2009-10-15 18:14:47  root, <console>, Browser session timed out
entry-009  2009-10-15 20:46:27  root, deimos.sf.fishworks.com, User logged in via CLI
entry-010  2009-10-15 21:51:46  root, <console>, Rebooted appliance
entry-011  2009-10-15 21:51:46  root, <console>, User logged out
entry-012  2009-10-15 21:56:44  root, deimos.sf.fishworks.com, User logged in via CLI
...

Most recent entries are displayed at the bottom of the list.

Entry details

All log entry details are available when selecting that entry and running show:

caji:maintenance logs> select audit
caji:maintenance logs audit> select entry-000 show
Properties:
                     timestamp = 2009-10-15 00:59:37
                          user = root
                       address = <console>
                       summary = Enabled datalink:nge0 service
                    annotation =

The "annotation" is the session annotation, which can be enabled when configuring users.
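Captured entry details can be read into a dictionary for further processing. The following is a minimal sketch, assuming each property is printed on its own line as `name = value`, as in the example above; `parse_properties` is a hypothetical helper, not an appliance command.

```python
def parse_properties(text):
    """Collect 'name = value' lines into a dict; a missing value becomes ''."""
    properties = {}
    for line in text.splitlines():
        name, separator, value = line.partition("=")
        if not separator:
            continue  # prompts, the 'Properties:' header, and blank lines
        name = name.strip()
        if not name or " " in name:
            continue  # not a single-word property name
        properties[name] = value.strip()
    return properties


if __name__ == "__main__":
    captured = """Properties:
                     timestamp = 2009-10-15 00:59:37
                          user = root
                       address = <console>
                       summary = Enabled datalink:nge0 service
                    annotation =
"""
    for key, value in parse_properties(captured).items():
        print(key, "->", repr(value))
```

An empty `annotation` value, as in this entry, indicates that no session annotation was recorded for the session.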
Glossary

7110 - Sun Storage 7110 Unified Storage System
7120 - Sun ZFS Storage 7120
7210 - Sun Storage 7210 Unified Storage System
7310 - Sun Storage 7310 Unified Storage System
7320 - Sun ZFS Storage 7320
7410 - Sun Storage 7410 Unified Storage System
7420 - Sun ZFS Storage 7420
Active Directory - Microsoft Active Directory server
Alerts - Configurable log, email or SNMP trap events
Analytics - appliance feature for graphing real-time and historic performance statistics
ARC - Adaptive Replacement Cache
BUI - Browser User Interface
CLI - Command Line Interface
Cluster - Multiple heads connected to shared storage
Controller - See "Storage Controller"
CPU - Central Processing Unit
CRU - Customer Replaceable Component
Dashboard - appliance summary display of system health and activity
Dataset - the in-memory and on-disk data for a statistic from Analytics
DIMM - dual in-line memory module
Disk Shelf - the expansion storage shelf that is connected to the head node or storage controller
DNS - Domain Name Service
DTrace - a comprehensive dynamic tracing framework for troubleshooting kernel and application problems on production systems in real-time
FC - Fibre Channel
FRU - Field Replaceable Component
FTP - File Transfer Protocol
GigE - Gigabit Ethernet
HBA - Host Bus Adapter
HCA - Host Channel Adapter
HDD - Hard Disk Drive
HTTP - HyperText Transfer Protocol
Hybrid Storage Pool - combines disk, flash, and DRAM into a single coherent and seamless data store.
Icons - icons visible in the BUI
IOM - I/O Module; similar to a SIM
iSCSI - Internet Small Computer System Interface
Kiosk - a restricted BUI mode where a user may only view one specific screen
L2ARC - Level 2 Adaptive Replacement Cache
LDAP - Lightweight Directory Access Protocol
LED - light-emitting diode
Logzilla - write IOPS accelerator
LUN - Logical Unit
Masthead - top section of BUI screen
Modal Dialog - a new screen element for a specific function
NFS - Network File System
NIC - Network Interface Card
NIS - Network Information Service
PCIe - Peripheral Component Interconnect Express
PCM - Power Cooling Module, consisting of a PSU and one or more fans
Pool - provides storage space that is shared across all filesystems and LUNs
Project - a collection of shares
PSU - Power Supply Unit, included with fans in a power cooling module (PCM)
QDR - quad data rate
Readzilla - read-optimized flash SSD for the L2ARC
Remote Replication - replicating shares to another appliance
Rollback - reverts all of the system software and all of the metadata settings of the system back to their state just prior to applying an update
SAS - Serial Attached SCSI
SAS-2 - Serial Attached SCSI 2.0
SATA - Serial ATA
Schema - configurable properties for shares
Scripting - automating CLI tasks
Service - appliance service software
Share - ZFS filesystem shared using data protocols
SIM - SAS Interface Module
Snapshot - an image of a share
SSD - Solid State Drive
SSH - Secure Shell
Statistic - a metric visible from Analytics
Storage Controller - the head node of the appliance
Support Bundle - auto-generated files containing system configuration information and core files for use by remote support in debugging system failures
Title Bar - local navigation and function section of BUI screen
Updates - software or firmware updates
WebDAV - Web based Distributed Authoring and Versioning
ZFS - on-disk data storage subsystem