HP ProLiant DL985 Troubleshooting Manual

HP ProLiant Servers

Troubleshooting Guide

Abstract
This document describes common procedures and solutions for the many levels of troubleshooting for an HP ProLiant server. This document is intended
for the person who installs, administers, and troubleshoots servers or server blades. HP assumes you are qualified in the servicing of computer
equipment and trained in recognizing hazards in products with hazardous energy levels.
Part Number: 375445-402
April 2011
Edition: 11

Summary of Contents for HP ProLiant DL985

  • Page 2 © Copyright 2004, 2011 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.
  • Page 3: Table Of Contents

    Contents: Introduction (page 8); What's new (page 8); Revision history (page 8); 375445-401 (January 2011) (page 8); 375445-xx9 (June 2010) (page 8); 375445-xx8 (July 2009) (page 9); 375445-xx7 (November 2008) (page 10); 375445-xx6 (September 2007) (page 10); 375445-xx5 (June 2006) (page 11); 375445-xx4 (May 2006) ...
  • Page 4 UPS problems (page 39); General hardware problems (page 40); Problems with new hardware (page 40); Unknown problem (page 41); Third-party device problems (page 41); Internal system problems (page 42); Battery pack problems (page 42); CD-ROM and DVD drive problems (page 42); Diskette drive problems ...
  • Page 5 Management tools (page 73); Automatic Server Recovery (page 73); ROMPaq utility (page 73); Remote Insight Lights-Out Edition II (page 74); iLO and iLO 2 technology (page 74); Integrated Lights-Out 3 technology (page 74); Erase Utility (page 75); Redundant ROM support (page 75); USB support ...
  • Page 6 Power capacity (page 89); Product configuration resources (page 89); Device driver information (page 89); DDR3 memory configuration (page 90); Operating System Version Support (page 90); Operating system installation and configuration information (for factory-installed operating systems) (page 90); Server configuration information (page 90); Installation and configuration information for the server setup software ...
  • Page 7 Port 85 codes and iLO messages (page 167); Troubleshooting the system using port 85 codes (page 167); Processor-related port 85 codes (page 167); Memory-related port 85 codes (page 168); Expansion board-related port 85 codes (page 169); Miscellaneous port 85 codes (page 170); Windows®...
  • Page 8: Introduction

    Introduction What's new The eleventh edition of the HP ProLiant Servers Troubleshooting Guide, part number 375445-402, includes the following additions and updates: • Updated the HP ProLiant 100 Series Server troubleshooting information (on page 12) section to provide troubleshooting information for the HP ProLiant ML110 G7 Server and HP ProLiant DL120 G7 Server. •...
  • Page 9: 375445-xx8 (July 2009)

    Breaking the server down to the minimum hardware configuration (on page 17) • Updated Diagnostic flowcharts (on page 23): General diagnosis flowchart (on page 25) Server power-on problems flowchart (on page 27) Server and p-Class server blade POST problems flowchart (on page 32) c-Class server blade POST problems flowchart (on page 33) Server and p-Class server blade fault indications flowchart (on page 35) •...
  • Page 10: 375445-xx7 (November 2008)

    HP Insight Remote Support software (on page 77) • Added new content to HP Resources for Troubleshooting (on page 87): HP Guided Troubleshooting website (on page 87) DDR3 memory configuration (on page 90) Power capacity (on page 89) • Added new error messages: ADU Error Messages (on page 92) POST error messages and beep codes (on page 116) 375445-xx7 (November 2008)
  • Page 11: 375445-xx5 (June 2006)

    Added new technology Expanded existing information Added new firmware update procedures for unsupported processor stepping 375445-xx5 (June 2006) The fifth edition of the HP ProLiant Servers Troubleshooting Guide, part number 375445-xx5, included the following additions: • Added three new c-Class server blade flowcharts: c-Class server blade power-on problems flowchart (on page 29) c-Class server blade POST problems flowchart (on page 33) c-Class server blade fault indications flowchart (on page 37)
  • Page 12: Getting Started

    Getting started HP ProLiant 100 Series Server troubleshooting information Use this guide for troubleshooting information on the HP ProLiant ML110 G7 Server and the HP ProLiant DL120 G7 Server. For troubleshooting information on HP ProLiant 100 Series Servers other than the HP ProLiant ML110 G7 Server and HP ProLiant DL120 G7 Server, see the respective server user guides.
  • Page 13: Pre-Diagnostic Steps

    When additional information becomes necessary, use this section to identify websites and supplemental documents that contain troubleshooting information. • Error messages (on page 92) Use this section for a complete list of the following messages: ADU error messages (on page 92) POST error messages and beep codes (on page 116) Event list error messages (on page 159) HP BladeSystem infrastructure error codes...
  • Page 14: Warnings And Cautions

    This symbol indicates the presence of electric shock hazards. The area contains no user or field serviceable parts. Do not open for any reason. WARNING: To reduce the risk of injury from electric shock hazards, do not open this enclosure. This symbol on an RJ-45 receptacle indicates a network interface connection.
  • Page 15: Electrostatic Discharge

    [Symbol: weight in kg, weight in lb] WARNING: To reduce the risk of personal injury or damage to the equipment: • Observe local occupational health and safety requirements and guidelines for manual handling. • Obtain adequate assistance to lift and stabilize the chassis during installation or ...
  • Page 16: Symptom Information

    Symptom information Before troubleshooting a server problem, collect the following information: • What events preceded the failure? After which steps does the problem occur? • What has been changed since the time the server was working? • Did you recently add or remove hardware or software? If so, did you remember to change the appropriate settings in the server setup utility, if necessary? •...
  • Page 17: Performing Processor Procedures In The Troubleshooting Process

    Performing processor procedures in the troubleshooting process Because this document supports multiple generations of HP ProLiant server models, it also covers processes that include troubleshooting of various models and types of processors. Before performing any troubleshooting steps that involve processors, review the following guidelines: •...
  • Page 18: Performing Processor Procedures In The Troubleshooting Process (On

    Always use the recommended minimum configuration above before removing any processors. If you are unable to isolate the issue with the configuration above, you will then remove all but one of the additional processors. CAUTION: Before removing or replacing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 17)."...
  • Page 19: Common Problem Resolution

    Common problem resolution Loose connections Action: • Be sure all power cords are securely connected. • Be sure all cables are properly aligned and securely connected for all external and internal components. • Remove and check all data and power cables for damage. Be sure no cables have bent pins or damaged connectors.
  • Page 20: DIMM Handling Guidelines

    DIMM handling guidelines CAUTION: Failure to properly handle DIMMs can cause damage to DIMM components and the system board connector. When handling a DIMM, observe the following guidelines: • Avoid electrostatic discharge (on page 15). • Always hold DIMMs by the side edges only. •...
  • Page 21: Hard Drive LED Combinations

    • Drives must be the same capacity to provide the greatest storage space efficiency when drives are grouped together into the same drive array.

    Hard drive LED combinations

    Hot-plug SCSI hard drive LED combinations

    Table columns: Activity LED (1) | Online LED (2) | Fault LED | Interpretation
    On or off Flashing...
  • Page 22: Server Updates With An HP Trusted Platform Module And BitLocker™ Enabled

    Online/activity LED (green) | Fault/UID LED (amber/blue) | Interpretation
    On, off, or flashing | Alternating amber and blue | The drive has failed, or a predictive failure alert has been received for this drive; it also has been selected by a management application.
    On, off, or flashing | Steadily blue | The drive is operating normally, and it has been selected by a management application.
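The two LED combinations listed above can be treated as a small lookup table. The following is a minimal sketch, not an HP tool: the names `LED_TABLE` and `interpret` are hypothetical, and the `"any"` wildcard is an assumed shorthand for "on, off, or flashing". Only the two rows visible in this excerpt are included.

```python
# Rows transcribed from the excerpt above.
# Key: (online/activity LED pattern, fault/UID LED state).
# "any" is an assumed shorthand for "on, off, or flashing".
LED_TABLE = {
    ("any", "alternating amber and blue"): (
        "The drive has failed, or a predictive failure alert has been "
        "received for this drive; it also has been selected by a "
        "management application."
    ),
    ("any", "steadily blue"): (
        "The drive is operating normally, and it has been selected by a "
        "management application."
    ),
}


def interpret(online_activity: str, fault_uid: str) -> str:
    """Return the interpretation for an LED combination, if it is listed."""
    activity = online_activity.strip().lower()
    fault = fault_uid.strip().lower()
    for (activity_pattern, fault_pattern), meaning in LED_TABLE.items():
        if fault_pattern == fault and activity_pattern in ("any", activity):
            return meaning
    # Combinations outside this excerpt are deliberately not guessed.
    return "Combination not listed in this excerpt; see the server documentation."


print(interpret("flashing", "Steadily blue"))
```

Unlisted combinations fall through to a pointer to the server documentation rather than a guessed meaning, because the full table is not reproduced here.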
  • Page 23: Diagnostic Flowcharts

    Diagnostic flowcharts Troubleshooting flowcharts To effectively troubleshoot a problem, HP recommends that you start with the first flowchart in this section, "Start diagnosis flowchart (on page 25)," and follow the appropriate diagnostic path. If the other flowcharts do not provide a troubleshooting solution, follow the diagnostic steps in "General diagnosis flowchart (on page 25)."...
  • Page 24 HP BladeSystem c-Class Technical Documentation (http://www.hp.com/go/bladesystem/documentation) Select Support, Drivers and Manuals, and then select the product. Select Manuals, and then locate the link for the maintenance and service guide. HP BladeSystem p-Class Support and Documents (http://www.hp.com/products/servers/proliant-bl/p-class/info) To locate the HP BladeSystem p-Class System Maintenance and Service Guide, select the product. Select Manuals (guides, supplements, addendums, etc).
  • Page 25: Start Diagnosis Flowchart

    Start diagnosis flowchart Use the following flowchart to start the diagnostic process. General diagnosis flowchart Diagnostic flowcharts 25...
  • Page 26 The General diagnosis flowchart provides a generic approach to troubleshooting. If you are unsure of the problem, or if the other flowcharts do not fix the problem, use the following flowchart.
  • Page 27: Power-On Problems Flowchart

    Power-on problems flowchart Server power-on problems flowchart Some servers have an internal health LED and an external health LED, while other servers have a single system health LED. The system health LED provides the same functionality as the two separate internal and external health LEDs.
  • Page 29 p-Class server blade power-on problems flowchart c-Class server blade power-on problems flowchart For the location of server LEDs and information on their statuses, see the server documentation on the HP website (http://www.hp.com/support).
  • Page 30 Symptoms: • The server does not power on. • The system power LED is off or amber. • The health LED is red or amber. Possible causes: • Improperly seated or faulty power supply • Loose or faulty power cord •...
  • Page 31: POST Problems Flowchart

    POST problems flowchart Symptoms: • Server does not complete POST NOTE: The server has completed POST when the system attempts to access the boot device. • Server completes POST with errors Possible problems: • Improperly seated or faulty internal component •...
  • Page 32 Server and p-Class server blade POST problems flowchart
  • Page 33: Operating System Boot Problems Flowchart

    c-Class server blade POST problems flowchart Operating system boot problems flowchart Symptoms: • Server does not boot a previously installed OS • Server does not boot SmartStart Possible causes: • Corrupted OS • Hard drive subsystem problem
  • Page 34 • Incorrect boot order setting in RBSU There are two ways to use SmartStart when diagnosing OS boot problems on a server blade: • Use iLO to remotely attach virtual devices to mount the SmartStart CD onto the server blade. •...
  • Page 35: Server Fault Indications Flowchart

    Server fault indications flowchart Symptoms: • Server boots, but a fault event is reported by Insight Management Agents • Server boots, but the internal health LED, external health LED, or component health LED is red or amber NOTE: For the location of server LEDs and information on their statuses, refer to the server documentation.
  • Page 36 For the location of server LEDs and information on their statuses, see the server documentation on the HP website (http://www.hp.com/support).
  • Page 37 c-Class server blade fault indications flowchart
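The symptom lists in the flowchart sections above suggest a simple routing rule from observed symptoms to a starting flowchart. The sketch below is illustrative only: the `Symptoms` fields, the LED value strings, and the order of the checks are assumptions, not taken from the guide's own Start diagnosis flowchart, which remains the authoritative path.

```python
from dataclasses import dataclass


@dataclass
class Symptoms:
    # Hypothetical field names; defaults describe a healthy server.
    powers_on: bool = True
    power_led: str = "green"       # assumed values: "off", "amber", "green"
    health_led: str = "green"      # assumed values: "red", "amber", "green"
    completes_post: bool = True
    post_errors: bool = False
    boots_os: bool = True          # previously installed OS or SmartStart
    agent_fault_event: bool = False  # reported by Insight Management Agents


def choose_flowchart(s: Symptoms) -> str:
    """Pick a starting flowchart from the symptom lists in this section."""
    if not s.powers_on or s.power_led in ("off", "amber"):
        return "Power-on problems flowchart"
    if not s.completes_post or s.post_errors:
        return "POST problems flowchart"
    if not s.boots_os:
        return "Operating system boot problems flowchart"
    if s.agent_fault_event or s.health_led in ("red", "amber"):
        return "Server fault indications flowchart"
    # Fallback named in the guide for unclear problems.
    return "General diagnosis flowchart"


print(choose_flowchart(Symptoms(health_led="amber")))
```

The checks run in boot order (power, POST, OS, then fault indications) so that the earliest failing stage wins; that ordering is a design choice of this sketch.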
  • Page 38: Hardware Problems

    Hardware problems Procedures for all ProLiant servers The procedures in this section are comprehensive and include steps about or references to hardware features that may not be supported by the server you are troubleshooting. CAUTION: Before removing or replacing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 17)."...
  • Page 39: UPS Problems

    For more information, see the HP Power Advisor on the HP website (http://www.hp.com/go/hppoweradvisor). If running a redundant configuration, be sure that all of the power supplies in the system are the same. For a list of supported power supplies, see the server documentation on the HP website (http://www.hp.com/support).
  • Page 40: General Hardware Problems

    General hardware problems Problems with new hardware Action: Be sure the hardware being installed is a supported option on the server. For information on supported hardware, see the server documentation. If necessary, remove unsupported hardware. To be sure the problem is not caused by a change to the hardware release, see the release notes included with the hardware.
  • Page 41: Unknown Problem

    Be sure all boards are properly installed in the server. To see if the utility recognizes and tests the device, run HP Insight Diagnostics (on page 76). Uninstall the new hardware. Unknown problem Action: Power down and disconnect power to the server. Following the guidelines and cautionary information in the server documentation, reduce the server to the minimum hardware configuration by removing all cards or devices that are not necessary to start the server.
  • Page 42: Internal System Problems

    If the device is the only device on a bus, be sure the bus works by installing a different device on the bus. Restarting the server each time to determine if the device is working, move the device: To a different slot on the same bus (not applicable for PCI Express) To a PCI, PCI-X, or PCI Express slot on a different bus To the same slot in another working server of the same or similar design If the board works in any of these slots, either the original slot is bad or the board was not properly...
  • Page 43: Diskette Drive Problems

    If attempting to boot from a USB CD-ROM drive: Refer to the operating system and server documentation to be sure both support booting from a USB CD-ROM drive. Be sure legacy support for a USB CD-ROM drive is enabled in RBSU. Data read from the drive is inconsistent, or drive cannot read data Action: Clean the drive and media.
  • Page 44: Drive Problems (Hard Drives And Solid State Drives)

    Drive is not found Action: Be sure no loose connections (on page 19) exist with the drive. Non-system disk message is displayed Action: Remove the non-system diskette from the drive. Check for and disconnect any non-bootable USB devices. Diskette drive cannot write to a diskette Action: If the diskette is not formatted, format the diskette.
  • Page 45 • If the drive fault LED is flashing, replace the hard drive. See the server maintenance and service guide. • If the drive fault LED is not flashing and the operating system supports HP Insight Diagnostics, version 7.40 or later ("HP Insight Diagnostics"...
  • Page 46: SD Card Problems

    Be sure no loose connections (on page 19) exist. Be sure the correct drive controller drivers are installed. Be sure the hard drive is configured properly: To determine the proper configuration, see the hard drive documentation. Remove the hard drive and be sure the configuration jumpers are set properly. For a non-hot-plug hard drive, be sure a conflict does not exist with another hard drive.
  • Page 47: USB Drive Key Problems

    Reseat the SD card. USB drive key problems System does not boot from the drive Action: Be sure that USB is enabled in RBSU. Be sure the drive boot order in RBSU is set so that the server boots from the USB drive key. Reseat the USB drive key.
  • Page 48: HP Trusted Platform Module Problems

    Be sure hot-plug fan requirements are being met. Refer to the server documentation. All fans in an HP BladeSystem c-Class enclosure are operating at a high speed ...while fans in the other enclosures are operating at normal speed. Action: If all fan LEDs are solid green, but the fans in this enclosure are operating at a higher speed than normal, then access more information from the Onboard Administrator or iLO 3.
  • Page 49: Memory Problems

    Memory problems General memory problems are occurring Action: • Isolate and minimize the memory configuration. Use care when handling DIMMs ("DIMM handling guidelines" on page 20). Be sure the memory meets the server requirements and is installed as required by the server. Some servers may require that memory banks be populated fully or that all memory within a memory bank must be the same size, type, and speed.
  • Page 50 Be sure a memory count error did not occur ("Memory count error exists" on page 49). See the message displaying memory count during POST. Server fails to recognize new memory Action: Be sure the memory is the correct type for the server and is installed according to the server requirements.
  • Page 51: PPM Problems

    PPM problems Action: If the PPMs are not integrated on the system board: CAUTION: Do not operate the server for long periods with the access panel open or removed. Operating the server in this manner results in improper airflow and improper cooling that can lead to thermal damage.
  • Page 52: Tape Drive Problems

    If the server includes PPMs that are not integrated on the system board, remove all PPMs from the server except for the PPM associated with the remaining processor. Replace the remaining processor with a known functional processor. If the problem is resolved after you restart the server, a fault exists with one or more of the original processors.
  • Page 53: Graphics And Video Adapter Problems

    CAUTION: Running the Drive Assessment Test overwrites the tape. If it is not possible to overwrite the tape, run the logs-based Device Analysis Test instead. Check the backup logs. Verify that a supported configuration is being used. Check for media damage: Incorrect label placement Broken, missing, or loose leader pin Damaged cartridge seam...
  • Page 54: System Open Circuits And Short Circuits

    • Be sure that the server has adequate power to support the video or graphic option. Some high-power adapters require specific cabling, fans, or power. For more information, see the documentation that ships with the option, or see the server documentation on the HP website (http://www.hp.com/support).
  • Page 55: Mouse And Keyboard Problems

    Press any key, or type the password, and wait a few moments for the screen to activate to be sure the energy saver feature is not in effect. Be sure the video driver is current. Refer to the third-party video adapter documentation for driver requirements.
  • Page 56: Audio Problems

    Be sure the device driver is not corrupted by replacing the driver. Restart the system and check whether the input device functions correctly after the server restarts. Replace the device with a known working equivalent device (another similar mouse or keyboard). If the problem still occurs with the new mouse or keyboard, the connector port on the system I/O board is defective.
  • Page 57: Modem Problems

    Modem problems No dial tone exists Action: Be sure the cables are plugged in as specified in the modem documentation. Connect a working telephone directly to the wall jack, and then test the line for a dial tone. If no dial tone is detected, the phone line is not working. Contact the local telephone company and arrange to correct the problem.
  • Page 58 Modem does not connect to another modem Action: Be sure a dial tone exists. Be sure the line is not in use at another extension before using it. Be sure you are dialing the correct telephone number. Be sure the modem on the other end is working. Modem disconnects while online Action: Be sure no loose connections (on page 19) exist.
  • Page 59: Network Controller Problems

    You are unable to connect at 56 Kbps Action: Find out the maximum baud rate at which the ISP connects, and change the settings to reflect this. Reattempt to connect at a lower baud rate. Be sure no line interference exists. Retry the connection by dialing the number several times. If conditions remain poor, contact the telephone company to have the line tested.
  • Page 60: Expansion Board Problems

    Network controller stopped working when an expansion board was added Action: Be sure no loose connections (on page 19) exist. Be sure the server and operating system support the controller. Refer to the server and operating system documentation. Be sure the new expansion board has not changed the server configuration, requiring reinstallation of the network driver.
  • Page 61: Software Problems

    Software problems The best sources of information for software problems are the operating system and application software documentation, which may also point to fault detection tools that report errors and preserve the system configuration. Other useful resources include HP Insight Diagnostics (on page 76) and HP SIM. Use either utility to gather critical system hardware and software information and to help with problem diagnosis.
  • Page 62: Operating System Updates

    Errors are displayed in the error log Action: Follow the information provided in the error log, and then refer to the operating system documentation. Problems occur after the installation of a service pack Action: Follow the instructions for updating the operating system ("Operating system updates"...
  • Page 63: Restoring To A Backed-Up Version

    If you apply the update and have problems, locate files to correct the problems on the HP website (http://www.hp.com/support). Restoring to a backed-up version If you recently upgraded the operating system or software and cannot resolve the problem, you can try restoring a previously saved version of the system.
  • Page 64: Linux Operating Systems

    Sun Solaris—Device Configuration Assistant boot diskette. Refer to the Solaris documentation for more information. IBM OS/2—Power up the server from the startup diskettes. Refer to the OS/2 documentation for more information. Linux—Refer to the operating system documentation for information. Linux operating systems For troubleshooting information specific to Linux operating systems, refer to the Linux for ProLiant website (http://h18000.www1.hp.com/products/servers/linux).
  • Page 65: ROM Problems

    ROM problems Remote ROM flash problems General remote ROM flash problems are occurring Action: Be sure you follow these requirements for using the Remote ROM flash utility: • A local administrative client system that is running the Microsoft® Windows NT® 4.0, Windows®...
  • Page 66: Boot Problems

    Failure occurs during ROM flash After the online flash preparation has been successfully completed, the system ROM is flashed offline. The flash cannot be interrupted during this process, or the ROM image is corrupted and the server does not start. The most likely reason for failure is a loss of power to the system during the flash process.
  • Page 67 Remove the access panel. Change positions 1, 5, and 6 of the system maintenance switch to on. Install the access panel. Install the server into the rack. Power up the server. After the system beeps, repeat steps 1 through 3. Change positions 1, 5, and 6 of system maintenance switch to off.
  • Page 68: Software Tools And Solutions

    Software tools and solutions Configuration tools SmartStart software SmartStart is a collection of software that optimizes single-server setup, providing a simple and consistent way to deploy server configuration. SmartStart has been tested on many ProLiant server products, resulting in proven, reliable configurations. SmartStart assists the deployment process by performing a wide range of configuration activities, including: •...
  • Page 69: Using RBSU

    • Displaying system information • Selecting the primary boot controller • Configuring memory options • Language selection For more information on RBSU, see the HP ROM-Based Setup Utility User Guide on the Documentation CD or the HP website (http://www.hp.com/support/smartstart/documentation). Using RBSU To use RBSU, use the following keys: •...
  • Page 70: Boot Options

    By default, the auto-configuration process configures the system for the English language. To change any default settings in the auto-configuration process (such as the settings for language, operating system, and primary boot controller), execute RBSU by pressing the F9 key when prompted. After the settings are selected, exit RBSU and allow the server to reboot automatically.
  • Page 71: Array Configuration Utility

    Array Configuration Utility ACU is a browser-based utility with the following features: • Runs as a local application or remote service • Supports online array capacity expansion, logical drive extension, assignment of online spares, and RAID or stripe size migration •...
  • Page 72: Option ROM Configuration For Arrays

    • Partition types, sizes, or layout • Software RAID information • Operating system device names or mount points Option ROM Configuration for Arrays Before installing an operating system, you can use the ORCA utility to create the first logical drive, assign RAID levels, and establish online spare configurations.
  • Page 73: Management Tools

    Select the Advanced Options menu. Select Service Options. Select Serial Number. The following warnings appear: WARNING! WARNING! WARNING! The serial number is loaded into the system during the manufacturing process and should NOT be modified. This option should only be used by qualified service personnel. This value should always match the serial number sticker located on the chassis.
  • Page 74: Remote Insight Lights-Out Edition Ii

    Remote Insight Lights-Out Edition II RILOE II enables browser access to servers through a hardware-based, OS-independent graphical remote console. Some of the features include virtual diskette drive and power button, server management through any standard browser, dedicated LAN connectivity, automatic network configuration, external power backup, group administration, and functions available with the Remote Insight Board.
  • Page 75: Erase Utility

    • Access advanced troubleshooting features through the iLO 3 interface. For more information about iLO 3 features (which may require an iLO Advanced Pack or iLO Advanced for BladeSystem license), see the iLO 3 documentation on the Documentation CD or on the HP website (http://www.hp.com/go/ilo).
  • Page 76: Diagnostic Tools

    • Operating environments which do not provide native USB support Diagnostic tools HP Insight Diagnostics HP Insight Diagnostics is a proactive server management tool, available in both offline and online versions, that provides diagnostics and troubleshooting capabilities to assist IT administrators who verify server installations, troubleshoot problems, and perform repair validation.
  • Page 77: Integrated Management Log

    Survey functionality is installed with every SmartStart-assisted HP Insight Diagnostics installation, or it can be installed through the HP PSP ("ProLiant Support Packs" on page 79). NOTE: The current version of SmartStart provides the memory spare part numbers for the server. To download the latest version, see the HP website (http://www.hp.com/support).
  • Page 78: Keeping The System Current

    service level. Notifications may be sent to your authorized HP Channel Partner for on-site service, if configured and available in your country. The software is available in two variants: • HP Insight Remote Support Standard: This software supports server and storage devices and is optimized for environments with 1–50 servers.
  • Page 79: ProLiant Support Packs

    • VCRM manages the repository for Windows and Linux PSPs as well as online firmware. Administrators can browse a graphical view of the PSPs or configure VCRM to automatically update the repository with Internet downloads of the latest software from HP. •...
  • Page 80: System Online ROM Flash Component Utility

    • Performs local or remote (one-to-many) online deployment • Deploys firmware and software together • Supports offline and online deployment • Deploys necessary component updates only (except Linux RPMs) • Downloads the latest components from Web (except Linux RPMs) • Enables direct update of BMC firmware (iLO and LO100i) For more information about HP Smart Update Manager and to access the HP Smart Update Manager User Guide, see the HP website (http://www.hp.com/go/foundation).
  • Page 81: Care Pack

    Care Pack HP Care Pack Services offer upgraded service levels to extend and expand bundled services with easy-to-buy, easy-to-use support packages that help you make the most of your server investments. For more information, see the HP website (http://www.hp.com/services/carepack). Firmware maintenance HP has developed technologies to help ensure that HP servers provide maximum uptime with minimal maintenance.
  • Page 82: Verifying Firmware Versions

    When you flash the system ROM, ROMPaq writes over the backup ROM and saves the current ROM as a backup, enabling you to switch easily to the alternate ROM version if the new ROM becomes corrupted for any reason. This feature protects the existing ROM version, even if you experience a power failure while flashing the ROM.
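The redundant ROM behavior described above can be pictured as two image slots, where a flash always writes to the slot that is not currently in use. This is a toy model under stated assumptions, not HP's implementation: the class `RedundantRom`, its methods, and the `interrupted` flag are all hypothetical.

```python
class RedundantRom:
    """Toy model of a two-slot ROM: one current image, one backup image."""

    def __init__(self, current: str, backup: str) -> None:
        self.images = [current, backup]
        self.active = 0  # index of the slot the system boots from

    @property
    def current(self):
        return self.images[self.active]

    @property
    def backup(self):
        return self.images[1 - self.active]

    def flash(self, new_image: str, interrupted: bool = False) -> None:
        """Write over the backup slot; the running image is never touched."""
        target = 1 - self.active
        if interrupted:
            # Assumed model of a power failure: the target slot is unusable,
            # but the existing ROM version is still intact and active.
            self.images[target] = None
            return
        self.images[target] = new_image
        self.active = target  # the previous current image is now the backup

    def switch_to_backup(self) -> None:
        """Fall back to the alternate ROM version."""
        if self.backup is None:
            raise RuntimeError("no usable backup image")
        self.active = 1 - self.active


rom = RedundantRom(current="2010.12", backup="2010.05")  # made-up version labels
rom.flash("2011.03")
print(rom.current, rom.backup)
```

After a successful flash the old current image becomes the backup, which is what makes switching back possible if the new ROM turns out to be corrupted.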
  • Page 83: Updating Firmware

    • Subscriber's Choice (on page 80) Updating firmware To update the firmware: Check the firmware version on the device ("Verifying firmware versions" on page 82). Determine the latest firmware version available. If a TPM is installed and enabled on the server, disable BitLocker™ before updating the firmware. For more information, see the operating system documentation.
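The update steps above (check the installed version, determine the latest version, and disable BitLocker first when a TPM is installed and enabled) can be sketched as a small planning function. This is a hedged illustration: `parse_version` and `plan_firmware_update` are hypothetical names, and the dotted numeric version format is an assumption, since real firmware version strings vary by device.

```python
def parse_version(text: str) -> tuple:
    """Turn an assumed dotted numeric version such as '1.10' into (1, 10)."""
    return tuple(int(part) for part in text.split("."))


def plan_firmware_update(installed: str, latest: str, tpm_enabled: bool) -> list:
    """Return the ordered steps for an update, or [] if already current."""
    if parse_version(installed) >= parse_version(latest):
        return []
    steps = []
    if tpm_enabled:
        # From the procedure above: a TPM that is installed and enabled
        # means BitLocker must be disabled before the firmware update.
        steps.append("Disable BitLocker before updating the firmware")
    steps.append(f"Update firmware from {installed} to {latest}")
    return steps


for step in plan_firmware_update("1.9", "1.10", tpm_enabled=True):
    print(step)
```

Comparing parsed tuples rather than raw strings avoids treating "1.9" as newer than "1.10".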
  • Page 84: System Rompaq Firmware Upgrade Utility

    System ROMPaq Firmware Upgrade Utility The Systems ROMPaq Firmware Upgrade Utility for ProLiant servers is available as a SoftPaq download from the HP website (http://www.hp.com/support). The Enhanced SoftPaq download contains utilities to restore or upgrade the System ROM on ProLiant servers: •...
  • Page 85: Unsupported Processor Stepping With Intel® Processors

    Online deployment To deploy components online, choose one of the following options: Insert the Firmware Maintenance CD or DVD, or Smart Update Firmware DVD. The firmware maintenance interface opens automatically. In Linux, if autostart is not enabled, you must manually start the CD or DVD. Browse to the contents, and then select hpsum.exe.
  • Page 86: Unsupported Processor Stepping With Amd Processors

    A new or replacement processor may be a newer stepping. At boot, the server indicates if the current system ROM does not support the new stepping processor. The following message is displayed: Unsupported Processor Detected System will ONLY boot ROMPAQ Utility. If this message is displayed, update the system ROM in one of the following ways: •...
  • Page 87: Hp Resources For Troubleshooting

    HP resources for troubleshooting Online resources HP Technical Support website Troubleshooting tools and information, as well as the latest drivers and flash ROM images, are available on the HP website (http://www.hp.com/support). HP Guided Troubleshooting website HP Guided Troubleshooting is available for many products and components on the HP website (http://www.hp.com/support/gts).
  • Page 88: Hp Care Pack Services

    To create a profile and select notifications, refer to the HP website (http://www.hp.com/go/subscriberschoice). Change control and proactive notification HP offers Change Control and Proactive Notification to notify customers 30 to 60 days in advance of upcoming hardware and software changes on HP commercial products. For more information, refer to the HP website (http://www.hp.com/go/pcn).
  • Page 89: Teardown Procedures, Part Numbers, Specifications

    Teardown procedures, part numbers, specifications Refer to the server maintenance and service guide, available in the following locations: • Documentation CD that ships with the server • HP Business Support Center website (http://www.hp.com/go/bizsupport) • HP Technical Documentation website (http://www.docs.hp.com) Technical topics Refer to white papers on one of the following: •...
  • Page 90: Ddr3 Memory Configuration

    DDR3 memory configuration See the DDR3 Memory Configuration Tool on the HP website (http://www.hp.com/go/ddr3memory-configurator). Operating System Version Support For information about specific versions of a supported operating system, refer to the operating system support matrix (http://www.hp.com/go/supportos). Operating system installation and configuration information (for factory-installed operating systems) Refer to the factory-installed operating system installation documentation that ships with the server.
  • Page 91: Installation And Configuration Information For The Server Management System

    Installation and configuration information for the server management system Refer to the HP Systems Insight Manager Installation and User Guide on the Management CD or the HP website (http://www.hp.com/go/hpsim). Fault tolerance, security, care and maintenance, configuration and setup Refer to the server documentation available in the following locations: •...
  • Page 92: Error Messages

    Error messages ADU error messages Introduction to ADU error messages This section contains a complete alphabetical list of all ADU ("Array diagnostic software" on page 77) error messages for ADU version 7.85.16.0 and earlier. IMPORTANT: This guide provides information for multiple servers. Some information may not apply to the server you are troubleshooting.
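This guide keeps two ADU message lists: one for ADU version 7.85.16.0 and earlier, and one for ADU version 8.0 through 8.28. The following is a minimal sketch of how a script could pick the matching list from a version string. It assumes versions are dotted numeric strings compared component by component, and that the 8.x range is decided by the first two components only; the function names are hypothetical and are not part of ADU.

```python
def parse_version(text):
    """Turn a dotted version string such as '7.85.16.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def pad(version, length=4):
    """Right-pad a version tuple with zeros so '7.85' compares like '7.85.0.0'."""
    return version + (0,) * (length - len(version))


# Bounds taken from the section headings of this guide.
LEGACY_MAX = pad(parse_version("7.85.16.0"))
NEWER_MIN = parse_version("8.0")
NEWER_MAX = parse_version("8.28")


def adu_message_list(version_text):
    """Return which message list applies, or None if the version is not covered."""
    version = pad(parse_version(version_text))
    if version <= LEGACY_MAX:
        return "7.85.16.0 and earlier"
    # Assumption: only major.minor decides membership in the 8.0-8.28 range.
    if NEWER_MIN <= version[:2] <= NEWER_MAX:
        return "8.0 through 8.28"
    return None
```

A version that falls between the two ranges, or above 8.28, returns `None` rather than guessing which list applies.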
  • Page 93 Accelerator Status: Cache was Automatically Configured During Last Controller Reset Description: Cache board was replaced with one of a different size. Action: No action is required. Accelerator Status: Data in the Cache was Lost... due to some reason other than the battery being discharged. Description: Data in cache was lost, but not because of the battery being discharged.
  • Page 94 Accelerator Status: Obsolete Data Detected Description: During reset initialization, obsolete data was found in the cache due to the drives being moved and written to by another controller. Action: No action is required. The controller either writes the data to the drives or discards the data completely.
  • Page 95: Array Accelerator Battery Pack X Not Fully Charged

    Accelerator Status: Warranty Alert Description: Catastrophic problem exists with array accelerator board. Refer to other messages on Diagnostics screen for exact meaning of this message. Action: Replace the array accelerator board. Adapter/NVRAM ID Mismatch Description: EISA NVRAM has an ID for a different controller from the one physically present in the slot. Action: Run the server setup utility.
  • Page 96: Configuration Signature Is Zero

    Configuration Signature is Zero Description: ADU ("Array diagnostic software" on page 77) detected that NVRAM contains a configuration signature of zero. Old versions of the server setup utility could cause this. Action: Run the latest version of server setup utility to configure the controller and NVRAM. Configuration Signature Mismatch Description: The array accelerator board is configured for a different array controller board.
  • Page 97: Controller Restarted With A Signature Of Zero

    Controller Reported POST Error. Error Code: X Description: The controller returned an error from its internal POST. Action: Replace the controller. Controller Restarted with a Signature of Zero Description: ADU ("Array diagnostic software" on page 77) did not find a valid configuration signature to use to get the data.
  • Page 98 Drive (Bay) X is a Replacement Drive Description: This drive has been replaced. This message is displayed if a drive is replaced in a fault-tolerant logical volume. Action: If the replacement was intentional, allow the drive to rebuild. Drive (Bay) X is a Replacement Drive Marked OK Description: The drive has been replaced and marked OK by the firmware, in one of three possible scenarios: the drive was replaced in a non-fault-tolerant configuration;...
  • Page 99: Drive Monitoring Features Are Unobtainable

    Drive Monitoring Features Are Unobtainable Description: ADU ("Array diagnostic software" on page 77) is unable to get monitor and performance data due to a fatal command problem (such as drive time-out), or is unable to get data due to these features not being supported on the controller.
  • Page 100: Identify Logical Drive Data Did Not Match With Nvram

    Identify Logical Drive Data did not Match with NVRAM Description: The identify unit data from the array controller does not match with the information stored in NVRAM. This can occur if new, previously configured drives have been placed in a system that has also been previously configured.
  • Page 101 Action: Check for drive failures, wrong drive replaced, or loose cable messages. If a drive failure occurred, replace the failed drive or drives, and then restore the data for this logical drive from the tape backup. Otherwise, follow the procedures for correcting problems when an incorrect drive is replaced or a loose cable is detected.
  • Page 102: Mirror Data Miscompare

    Logical Drive X Status = Wrong Drive Replaced Description: A physical drive in this logical drive has failed. The incorrect drive was replaced. Action: Power down the server. Replace the drive that was incorrectly replaced. Replace the original drive that failed with a new drive. CAUTION: Do not run the server setup utility and try to reconfigure, or data will be lost.
  • Page 103: Other Controller Indicates Different Hardware Model

    Other Controller Indicates Different Hardware Model Description: The other controller in the redundant controller configuration is a different hardware model. Action: Be sure both controllers are using the same hardware model. If they are, make sure the controllers are fully seated in their slots. Other Controller Indicates Different Firmware Version Description: The other controller in the redundant controller configuration is using a different firmware version.
  • Page 104: Ris Copies Between Drives Do Not Match

    RIS Copies Between Drives Do Not Match Description: The drives on this controller contain copies of the RIS that do not match. The hard drives in the array do not have matching configuration information. Action: Resolve all other errors encountered. Obtain the latest version of ADU, and then rerun ADU ("Array diagnostic software"...
  • Page 105: Set Configuration Command Issued

    Description: SMART is unable to communicate with the drive, because the cable is not securely connected, or the drive cage connection has failed. Action: Power down the system. Reconnect the cable securely. Restart the system. If the problem persists, replace the cables and connectors as needed. SCSI Port X, Drive ID Y RIS Copies Within This Drive Do Not Match Description: The copies of RIS on the drive do not match.
  • Page 106: Soft Firmware Upgrade Required

    Soft firmware upgrade required Description: ADU ("Array diagnostic software" on page 77) has determined that the controller is running firmware that has been soft upgraded by the Upgrade Utility. However, the firmware running is not present on all drives. This could be caused by the addition of new drives in the system. Action: Update all drives to the latest firmware version ("Firmware maintenance"...
  • Page 107: Storage Enclosure On Scsi Bus X Indicated That The Fan Is Degraded

    Description: The cooling fan located in the external storage unit has failed. Action: Replace the fan. Storage Enclosure on SCSI Bus X Indicated that the Fan is Degraded... SOLUTION: this condition usually occurs on enclosures with multiple fans and one of those fans has failed. Replace any fans not operating properly.
  • Page 108 Place the drives in their original locations. Restart the server, and then complete the expand operation. Move the drives to their new locations after the expand operation is completed. Swapped Cables or Configuration Error Detected. An Unsupported Drive Arrangement Was Attempted... SOLUTION: Power down system then move drives back to their original location.
  • Page 109: System Board Is Unable To Identify Which Slots The Controllers Are In

    Description: More logical drives were created than are supported on this controller, causing lost logical drive volumes. Action: Identify the drives containing lost volumes, and then move them to another controller so the lost volumes can be recreated. CAUTION: Removing a drive that contains valid volume data causes all valid data to be lost. System Board is Unable to Identify which Slots the Controllers are in Description: The slot indicator on the system board is not working correctly.
  • Page 110: Unknown Disable Code

    Unable to Communicate with Drive on SCSI Port X, Drive ID Y Description: The array controller cannot communicate with the drive. Action: If the hard drive amber LED is on, replace the drive. Unable to Retrieve Identify Controller Data. Controller May be Disabled or Failed... SOLUTION: Power down the system.
  • Page 111 WARNING - Drive Write Cache is Enabled on X Description: Drive has its internal write cache enabled. The drive may be a third-party drive, or the operating parameters of the drive may have been altered. Condition can cause data corruption if power to the drive is interrupted.
  • Page 112: Adu Version 8.0 Through 8.28 Error Messages

    Write Memory Error Description: Data cannot be written to the cache memory. This typically means that a parity error was detected while writing data to the cache. This can be caused by an incomplete connection between the cache and the controller. This is not a data loss circumstance. Action: Power down the system and be sure that the cache board is fully connected to the controller.
  • Page 113 Array Accelerator: This controller has been set up to be a part of a redundant pair of controllers... but the array accelerator cache sizes are different on the two controllers. Make certain that both controllers are using array accelerators with the same amount of cache memory installed. Action: Adjust the memory installed in the array accelerators to matching sizes.
  • Page 114 Drive Offline due to Erase Operation: The physical drive is offline and the erase process has completed... The drive may now be brought online through the re-enable erased drive command in ACU. Action: Re-enable the physical drive using the Array Configuration Utility (on page 71). Drive Offline due to Erase Operation: The physical drive is offline from having an erase in progress.
  • Page 115 Logical drive state: The logical drive is queued for expansion. Action: No action is required. Logical drive state: The logical drive is queued for rebuilding. Action: No action is required. Normal operations can occur; however, performance will be less than optimal during the rebuild process.
  • Page 116: Post Error Messages And Beep Codes

    Redundancy State: This controller has been setup to be part of a redundant pair of controllers... but redundancy is disabled. Redundancy is disabled for an unknown reason. Action: Contact HP support ("Contacting HP" on page 177). Redundant Path Failure: Multi-domain path failure Action: Check the storage device I/O module and cables to restore redundant paths.
  • Page 117: Non-Numeric Messages Or Beeps Only

    Non-numeric messages or beeps only Advanced Memory Protection mode: Advanced ECC Audible Beeps: None Possible Cause: Advanced ECC support is enabled. Action: None. Advanced Memory Protection mode: Advanced ECC with hot-add support Audible Beeps: None Possible Cause: Advanced ECC with Hot-Add support is enabled. Action: None.
  • Page 118: Fatal Dma Error

    Critical Error Occurred Prior to this Power-Up Audible Beeps: None Possible Cause: A catastrophic system error, which caused the server to crash, has been logged. Action: Run Insight Diagnostics ("HP Insight Diagnostics" on page 76) and replace failed components as indicated.
  • Page 119: Fatal Global Protocol Error

    Replace any failed processors or reseat any loose processors. Fatal Global Protocol Error Audible Beeps: None Possible Cause: The system experienced a critical error that caused an NMI. Action: Run Insight Diagnostics ("HP Insight Diagnostics" on page 76) and replace failed components as indicated.
  • Page 120: Ilo Generated Nmi

    iLO Generated NMI Audible Beeps: None Possible Cause: The iLO controller generated an NMI. Action: Check the iLO logs for details of the event. Internal CPU Check - Processor Audible Beeps: None Possible Cause: A processor has experienced an internal error. Action: Run Insight Diagnostics ("HP Insight...
  • Page 121: Network Server Mode Active And No Keyboard Attached

    CAUTION: Before installing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 17)." Failure to follow the recommended guidelines can cause damage to the system board requiring replacement of the system board.
  • Page 122: No Floppy Drive Present

    Node Interleaving disabled - Invalid memory configuration Description: Each node must have the same memory configuration to enable interleaving. Action: Populate each node with the same memory configuration and enable interleaving in RBSU. No Floppy Drive Present Audible Beeps: None Possible Cause: No diskette drive is installed or a diskette drive failure has occurred.
  • Page 123: Power Supply Solution Not Fully Redundant

    Power Supply Solution Not Fully Redundant Audible beeps: None Possible cause: The minimum power supply requirement is installed, but a redundant power supply is missing or failed. Action: Do one of the following: • Install a power supply. • Replace failed power supplies to complete redundancy. Processor X Unsupported Wattage.
  • Page 124: This Dimm Does Not Support Thermal Monitoring

    Audible Beeps: None Possible Cause: The primary system ROM is corrupt. The system is booting from the redundant ROM. Action: Run ROMPaq Utility to restore the system ROM to the correct version. Temperature violation detected - system Shutting Down in X seconds Audible Beeps: 1 long, 1 short Possible Cause: The system has reached a cautionary temperature level and is shutting down in X seconds.
  • Page 125: Unsupported Pci Card Detected Remove Pci Card From Slot

    Unsupported PCI Card Detected Remove PCI Card from Slot Audible beeps: 2 short Possible cause: The PCI card installed in the slot referenced in the message is strictly not supported on this system. Action: Remove the card from the slot reported in the message. Unsupported power supply detected in bay X Audible Beeps: 1 long, 1 short Possible Cause: The power supply in bay X is not supported by the server.
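The POST entries in this section describe beep patterns in a small vocabulary, such as "None", "2 short", or "1 long, 1 short". As an illustration only, the sketch below turns such a description into counts so that an observed pattern can be compared with a documented one. The function name `parse_beeps` is hypothetical, and the parser assumes the wording used in this guide.

```python
import re


def parse_beeps(text):
    """Return {'long': n, 'short': m} for strings like '1 long, 1 short'.

    'None' (no beeps) yields zero for both kinds.
    """
    counts = {"long": 0, "short": 0}
    # Each match is a (number, kind) pair, for example ('1', 'long').
    for number, kind in re.findall(r"(\d+)\s+(long|short)", text.lower()):
        counts[kind] += int(number)
    return counts
```

For example, `parse_beeps("2 short")` gives two short beeps and no long beeps, which matches the "Unsupported PCI Card Detected" entry above.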
  • Page 126 Remove power to USB drive and reboot. The following message should appear: OBDR is now enabled for the attached USB tape drive. WARNING: A Type 2 Header PCI Device Has Been Detected... The BIOS will not configure this card. It must be configured properly by the OS or driver. Audible Beeps: 2 short Possible Cause: Only Type 0 and Type 1 Header PCI Devices are configured by the system ROM.
  • Page 127: 100 Series

    WARNING: ProLiant Demand Based Power Management cannot be supported with the following processor configuration. The system will run in Full Performance mode. Audible Beeps: None Possible Cause: The system is configured for HP Static Low mode and the current processor cannot support this mode.
  • Page 128 Action: Replace the system board. Run the server setup utility. 102-System Board Failure, CMOS Test Failed. Audible Beeps: None Possible Cause: 8237 DMA controllers, 8254 timers, and similar devices. CAUTION: Only authorized technicians trained by HP should attempt to remove the system board.
  • Page 129: Series

    162-System Options Not Set Audible Beeps: 2 long Possible Cause: Configuration is incorrect. The system configuration has changed since the last boot (addition of a hard drive, for example) or a loss of power to the real-time clock has occurred. The real-time clock loses power if the onboard battery is not functioning correctly.
  • Page 130 207 - Invalid Memory Configuration Detected. DIMMs installed when no corresponding processor is detected. Description: Processor is required to be installed for memory to be used. Action: CAUTION: Before installing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 17)."...
  • Page 131 207-Invalid Memory Configuration - Mismatched DIMMs within DIMM Bank Audible Beeps: 1 long, 1 short Possible Cause: Installed DIMMs in the same bank are of different sizes. Action: Install correctly matched DIMMs. 207-Invalid Memory Configuration - Mismatched DIMMs within DIMM Bank... Memory in Bank X Not Utilized.
  • Page 132: Invalid Lockstep Memory Configuration

    Audible Beeps: 1 long, 1 short, or none Possible Cause: Installed DIMMs have a primary width of x8. Action: Install DIMMs that have a primary width of x4 if Advanced ECC memory support is required. 209-Online Spare Memory Configuration - No Valid Banks for Online Spare Audible Beeps: 1 long, 1 short Possible Cause: Two valid banks are not available to support an online spare memory configuration.
  • Page 133: 300 Series

    210-Memory Board Power Fault on board X Audible Beeps: 1 long, 1 short Possible Cause: A problem exists with a memory board powering up properly. Action: Exchange DIMMs and retest. Replace the memory board if problem persists. 210-Memory Board Failure on board X Audible Beeps: 1 long, 1 short Possible Cause: A problem exists with a memory board powering up properly.
  • Page 134: 400 Series

    301-Keyboard Error or Test Fixture Installed Audible Beeps: None Possible Cause: Keyboard failure occurred. Action: Power down the server, and then reconnect the keyboard. Be sure no keys are depressed or stuck. If the failure reoccurs, replace the keyboard. 303-Keyboard Controller Error Audible Beeps: None Possible Cause: System board, keyboard, or mouse controller failure occurred.
  • Page 135: 600 Series

    404-Parallel Port Address Conflict Detected... A hardware conflict in your system is keeping some system components from working correctly. If you have recently added new hardware remove it to see if it is the cause of the conflict. Alternatively, use Computer Setup or your operating system to insure that no conflicts exist.
  • Page 136: 1100 Series

    611-Primary Floppy Port Address Assignment Conflict Audible Beeps: 2 short Possible Cause: A hardware conflict in the system is preventing the diskette drive from operating properly. Action: Run the server setup utility to configure the diskette drive port address and manually resolve the conflict. Run Insight Diagnostics ("HP Insight Diagnostics"...
  • Page 137 Audible Beeps: None Possible Cause: The temperature measured by one of the system temperature sensors has exceeded acceptable levels. In many cases, this is due to the ambient inlet air temperature exceeding acceptable levels. Action: Be sure to follow all of the environmental requirements for the server. •...
  • Page 138 1611-Fan x Failure Detected (Fan Zone CPU) Audible Beeps: 2 short Possible Cause: Required fan is not installed or spinning. Action: Check the fans to be sure they are working. Be sure each fan cable is properly connected, if applicable, and each fan is properly seated. If the problem persists, replace the failed fans.
  • Page 139 Be sure the assembly is properly connected and each fan is properly seated. If the problem persists, replace the failed fans. If a known working replacement fan is not spinning, replace the assembly. 1611-Power Supply Zone Fan Assembly Failure Detected. Single fan... failure.
  • Page 140: 1700 Series

    1615-Power Supply Failure, Power Supply Unplugged, or Power Supply Fan Failure in Bay X Audible Beeps: None Possible Cause: The power supply has failed, or it is installed but not connected to the system board or AC power source. Action: Reseat the power supply firmly and check the power cable or replace power supply. 1616-Power Supply Configuration Failure - A working power supply must be installed in Bay 1 for proper cooling.
  • Page 141 1708 - Slot X Drive Array Controller - Bootstrap NVRAM restored from backup. System restart required Audible Beeps: None Possible Cause: The specified Smart Array controller Bootstrap NVRAM was restored in one of the following ways: • It was detected as corrupt, and the backup copy was restored. •...
  • Page 142 Audible Beeps: None Possible Cause: Flash ROM is failing. The controller detects a checksum failure, but is unable to reprogram the backup ROM. Action: Update the controller to the latest firmware version ("Firmware maintenance" on page 81). If the problem persists, replace the controller. 1714-Slot X Drive Array Controller - Redundant ROM Reprogramming Failure... Backup ROM has automatically been activated.
  • Page 143 Audible Beeps: None Possible Cause: The firmware does not support the number of devices currently attached to the controller. Action: • If release notes indicate that support for additional devices has been added, upgrade to the latest version of controller firmware. •...
  • Page 144 Audible Beeps: None Possible Cause: Drive parameter tracking reports a predictive-failure condition on the indicated drive. It may fail at some time in the future. Action: • If the drive is part of a non-fault-tolerant configuration, back up all data before replacing the drive and restore all data afterward.
  • Page 145 Audible Beeps: None Possible Cause: The controller has detected an additional array of drives that was connected when the power was off. The logical drive configuration information has been updated to add the new logical drives. The maximum number of logical drives supported is 32. Additional logical drives will not be added to the configuration.
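The description above states a limit of 32 logical drives, and that logical drives beyond the limit are not added to the configuration. The arithmetic can be sketched as follows, assuming the controller simply fills the remaining room and skips the rest. The function name is hypothetical and this is not controller firmware logic.

```python
MAX_LOGICAL_DRIVES = 32  # limit stated in the message description


def split_new_logical_drives(existing, detected):
    """Return (added, skipped) for newly detected logical drives.

    existing: logical drives already in the configuration
    detected: logical drives found on the newly attached array
    """
    if existing < 0 or detected < 0:
        raise ValueError("drive counts must be non-negative")
    room = max(MAX_LOGICAL_DRIVES - existing, 0)
    added = min(detected, room)
    return added, detected - added
```

With 30 logical drives already configured and 5 detected, only 2 fit under the limit and 3 are skipped.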
  • Page 146 Audible Beeps: None Possible Cause: An unsupported redundant cabling configuration for the Smart Array firmware version is installed. Action: Disconnect the redundant SAS cables, and then update the Smart Array firmware to the correct version. 1736-HP Trusted Platform Module Error Audible Beeps: 2 short Possible Cause: A TPM is installed, but the System ROM is unable to communicate with the TPM.
  • Page 147 Audible Beeps: None Possible Cause: A problem exists with the storage enclosure redundant cabling. A single path was found to drives that were previously connected redundantly. Action: Check the storage box I/O module and cable to restore redundant paths to the drives, then do one of the following: •...
  • Page 148 Audible Beeps: None Possible Cause: A drive erase operation was previously initiated by the user and is in progress or is scheduled for all drives in the list. Action: None required. 1745-Slot X Drive Array - Drive Erase Operation Completed... The following disk drive(s) have been erased and will remain offline until hot-replaced or re-enabled by the Array Configuration Utility: (followed by a list of drives)
  • Page 149 Action: Attach an Array Accelerator memory module to this controller, or move the drives back to the original controller. If Capacity Expansion operations are pending, be sure that the original Array Accelerator module is attached. 1748-Slot X Drive Array - Unsupported Array Accelerator Battery Attached... Please install battery pack(s) with the correct part number.
  • Page 150 Audible Beeps: None Possible Cause: The current controller firmware does not support the attached Array Accelerator module type. Action: Upgrade the controller firmware, or replace the Array Accelerator module. 1763-Array Accelerator Daughtercard is Detached; Please Reattach Audible Beeps: None Possible Cause: Array accelerator module is loose, missing, or defective. Action: Reseat array accelerator module.
  • Page 151 Action: • Press the F2 key to accept the data loss and re-enable the logical drives. • Restore data from backup. • Replace drive or array accelerator, as appropriate. 1770-Slot X Drive Array - SCSI Drive Firmware Update Recommended - ... Please upgrade firmware on the following drive(s) using ROM Flash Components (download from www.hp.com/support/proliantstorage): Model XYZ (minimum version = ####) Audible Beeps: None
  • Page 152 1776-Slot X Drive Array - SCSI Bus Termination Error... Internal and external drives cannot both be attached to the same SCSI port. SCSI port Y: Check cables Audible Beeps: None Possible Cause: External and internal connectors of the specified SCSI ports are connected to drives. The indicated SCSI bus is disabled until this problem is resolved.
  • Page 153: Drive Array Resuming Automatic Data Recovery Process

    SCSI Port Y: Side-Panel must be Closed to Prevent Overheating SCSI Port Y: Redundant Power Supply Malfunction Detected SCSI Port Y: Wide SCSI Transfer Failed SCSI Port Y: Interrupt Signal Inoperative SCSI Port Y: Unsupported ProLiant Storage System Detected Audible Beeps: None Possible Cause: Environment threshold was violated on the drive enclosure.
  • Page 154: Slot X Drive Array Controller Failure

    1783-Slot X Drive Array Controller Failure Audible Beeps: None Possible Cause: The controller failed. Action: Reseat the array accelerator module. Reseat the controller in the PCI slot. Update the controller to the latest firmware version ("Firmware maintenance" on page 81). If the problem persists, replace the controller.
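Numeric POST messages in this guide start with a three- or four-digit code followed by a hyphen, and are grouped into series by hundreds (for example, 1783 is listed in the 1700 series). The sketch below shows one way to pull the code and series out of a message line for lookup. The function name and the returned dictionary layout are assumptions made for illustration.

```python
import re

# A leading 3- or 4-digit code, an optional-space hyphen, then the message text.
_POST_CODE = re.compile(r"^\s*(\d{3,4})\s*-\s*(.+)$")


def parse_post_message(line):
    """Split a numeric POST message into code, series, and text.

    Returns None for non-numeric messages such as 'iLO Generated NMI'.
    """
    match = _POST_CODE.match(line)
    if match is None:
        return None
    code = int(match.group(1))
    return {
        "code": code,
        "series": code // 100 * 100,  # 1783 -> 1700, 207 -> 200
        "text": match.group(2).strip(),
    }
```

The pattern tolerates both spellings used in this guide, such as `1783-Slot X ...` and `207 - Invalid Memory Configuration Detected.`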
  • Page 155 1786-Disk 0 Software RAID Failure, Booting Disk 1 Audible Beeps: None Possible Cause: The operating system has marked the RAID 1 bootable partition on Disk 0 as bad or the hard drive has failed. Action: The system attempts to boot from Disk 1. Perform one of the following actions: •...
  • Page 156: Drive Array Operating In Interim Recovery Mode

    If the replacement drive failed, replace with another drive. If the rebuild was aborted due to a read error from another physical drive in the array, back up all readable data on the array, run ADU, and then restore the data. 1787-Drive Array Operating in Interim Recovery Mode...
  • Page 157 Possible Cause: Drives that were working when the system was last used are now missing or are not starting up. A possible drive problem or loose SCSI cable exists. Action: Power down the system. Be sure all cables are properly connected. Be sure all drives are fully seated.
  • Page 158 1795-Drive Array - Array Accelerator Configuration Error... Data does not correspond to this drive array. Array Accelerator is temporarily disabled. Audible Beeps: None Possible Cause: Power was interrupted while data was in the array accelerator memory, or the data stored in the array accelerator does not correspond to this drive array.
  • Page 159: Event List Error Messages

    Event list error messages Introduction to event list error messages This section contains event list error messages recorded in the IML ("Integrated Management Log" on page 77), which can be viewed through different tools. The format of the list is different when viewed through different tools. An example of the format of an event as displayed on the IMD follows: **001 of 010** ---caution---...
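The IMD example above begins with a header of the form `**001 of 010**` followed by a severity marker such as `---caution---`. As a rough sketch, the header could be parsed as shown below. Only the two fields visible in the example are handled, the rest of the event layout is not assumed, and the function name is hypothetical.

```python
import re

# Matches '**001 of 010** ---caution---' as shown in the IMD example.
_EVENT_HEADER = re.compile(r"\*\*(\d+) of (\d+)\*\*\s*---(\w+)---")


def parse_event_header(text):
    """Return (index, total, severity) from an IMD-style header, or None."""
    match = _EVENT_HEADER.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3)
```

Other viewing tools format the list differently, so this pattern applies only to the IMD-style header.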
  • Page 160: Automatic Operating System Shutdown Initiated Due To Fan Failure

    Automatic operating system shutdown initiated due to fan failure Event Type: Fan failure Action: Replace the fan. Automatic Operating System Shutdown Initiated Due to Overheat Condition..Fatal Exception (Number X, Cause) Event Type: Overheating condition Action: Check fans. Also, be sure the server is properly ventilated and the room temperature is set within the required range.
  • Page 161: Processor Correctable Error Threshold Passed (Slot X, Socket Y)

    Processor Correctable Error Threshold Passed (Slot X, Socket Y) Event Type: Correctable error threshold exceeded Action: CAUTION: Before removing or replacing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 17)." Failure to follow the recommended guidelines can cause damage to the system board, requiring replacement of the system board.
  • Page 162: System Fans Not Redundant

    System Fans Not Redundant Event Type: Fans not redundant Action: Add a fan or replace the failed fan. System Overheating (Zone X, Location) Event Type: Overheating condition Action: Check fans. System Power Supplies Not Redundant Event Type: Power supply not redundant Action: Add a power supply or replace the failed power supply.
  • Page 163: Server Blade Management Module Error Codes

    Access the diagnostics. For more information, refer to the HP BladeSystem Maintenance and Service Guide on the HP website (http://www.hp.com/products/servers/proliant-bl/p-class/info).
    Server blade management module error codes
    Server blade error codes
    LED codes       Location
    1-1 or 1-2      Server Blade - Slot 1
    2-1 or 2-2      Server Blade - Slot 2
    3-1 or 3-2...
  • Page 164 Press the server blade management module reset button. Replace the power backplane. For more information, refer to the HP BladeSystem Maintenance and Service Guide on the HP website (http://www.hp.com/products/servers/proliant-bl/p-class/info). Server blade management module power backplane B error codes LED code: 12-1, 12-2, 12-3, or 12-4 Location: Server blade power backplane B Action: Perform the following steps to resolve the problem.
  • Page 165 Location: Interconnect module - side A (10-connector) Action: Perform the following steps to resolve the problem. Stop when the problem is resolved. Press the server blade management module reset button. Reseat the interconnect module. For more information, refer to the HP BladeSystem Maintenance and Service Guide on the HP website (http://www.hp.com/products/servers/proliant-bl/p-class/info).
  • Page 166: Power Management Module Error Codes

    For more information, refer to the HP BladeSystem Maintenance and Service Guide on the HP website (http://www.hp.com/products/servers/proliant-bl/p-class/info). Replace the interconnect module. For more information, refer to the HP BladeSystem Maintenance and Service Guide on the HP website (http://www.hp.com/products/servers/proliant-bl/p-class/info). Unknown server blade management module error code LED code: 19-1 Location: Unknown Action: Perform the following steps to resolve the problem.
  • Page 167: Port 85 Codes And Ilo Messages

    Action: Perform the following steps to resolve the problem. Stop when the problem is resolved. Press the power management module reset button. Unknown power management module error code LED code: 19-1 Location: Unknown Action: Perform the following steps to resolve the problem. Stop when the problem is resolved. Press the power management module reset button.
  • Page 168: Memory-Related Port 85 Codes

    Hard drives Peripheral devices IMPORTANT: Processor socket 1 and PPM slot 1 must be populated at all times or the server does not function properly. CAUTION: Before removing or replacing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 17)." Failure to follow the recommended guidelines can cause damage to the system board, requiring replacement of the system board.
  • Page 169: Expansion Board-Related Port 85 Codes

    Bring the server to base configuration by removing all components that are not required by the server to complete POST. For more information, see "Breaking the server down to the minimum hardware configuration (on page 17)." This process can include removing all: Expansion boards DIMMs, except the first bank Hard drives...
  • Page 170: Miscellaneous Port 85 Codes

    CAUTION: Before removing or replacing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 17)." Failure to follow the recommended guidelines can cause damage to the system board, requiring replacement of the system board. Remove all processors and PPMs, except the processor installed in socket 1 and the corresponding PPM.
  • Page 171: Windows® Event Log Processor Error Codes

    Windows® Event Log processor error codes Message ID: 4137 Severity: Error Description: The processor in slot X, socket X has corrected an excessive number of internal errors. The system will continue to operate. Action: CAUTION: Before removing or replacing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 17)."...
  • Page 172: Insight Diagnostics Processor Error Codes

    Description: The system encountered an NMI prior to this boot. The NMI source was "Uncorrectable cache memory error." Action: CAUTION: Before removing or replacing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 17)." Failure to follow the recommended guidelines can cause damage to the system board, requiring replacement of the system board.
  • Page 173: Msg_Cpu_Rr_5

    • Ensure the processor heatsinks are attached correctly (do not remove them). • Check diagnostics and the Integrated Management Log (on page 77) for heat-related events. • Upgrade to the latest versions of system BIOS and HP Insight Diagnostics (on page 76). CAUTION: Before removing or replacing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 17)."...
  • Page 174: Msg_Cpu_Rr_9

    Action: Replace the board that CMOS is on. MSG_CPU_RR_9 Event type: MMX hardware is not present. Action: CAUTION: Before removing or replacing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 17)." Failure to follow the recommended guidelines can cause damage to the system board, requiring replacement of the system board.
  • Page 175: Msg_Cpu_Rr_13

    MSG_CPU_RR_13 Event type: MMX logical instruction has failed. Action: CAUTION: Before removing or replacing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 17)." Failure to follow the recommended guidelines can cause damage to the system board, requiring replacement of the system board.
  • Page 176: Msg_Cpu_Rr_17

    MSG_CPU_RR_17 Event type: Stress integer math test has failed. Action: • Ensure proper ventilation and cooling for the server. • Ensure the processor heatsinks are attached correctly (do not remove them). • Check diagnostics and the Integrated Management Log (on page 77) for heat-related events. •...
  • Page 177: Contacting Hp

    Contacting HP Contacting HP technical support or an authorized reseller Before contacting HP, always attempt to resolve problems by completing the procedures in this guide. IMPORTANT: Collect the appropriate server information ("Server information you need" on page 177) and operating system information ("Operating system information you need"...
  • Page 178: Operating System Information You Need

    • Explanation of the issue, the first occurrence, and frequency • Any changes in hardware or software configuration before the issue surfaced • Third-party hardware information: product name, model, and version; company name • Specific hardware configuration: product name, model, and serial number; number of processors and speed; number of DIMMs and their size and speed; list of controllers and NICs...
  • Page 179: Linux Operating Systems

    • An updated Emergency Repair Diskette • If HP drivers are installed: Version of the PSP used List of drivers from the PSP • The drive subsystem and file system information: Number and size of partitions and logical drives File system on each logical drive •...
  • Page 180: Novell Netware Operating Systems

    • A list of each third-party hardware component installed, with the firmware revisions • A list of each third-party software component installed, with the versions • A detailed description of the problem and any associated error messages Novell NetWare operating systems Collect the following information: •...
  • Page 181: Ibm Os/2 Operating Systems

    /etc/conf/cf.d/sdevice /etc/inittab /etc/conf/cf.d/stune /etc/conf/cf.d/config.h /etc/conf/cf.d/sdevice /var/adm/messages (if PANIC messages are displayed) • If HP drivers are installed: Version of the EFS used List of drivers from the EFS • If management agents are installed, version number of the agents • System dumps, if they can be obtained (in case of panics) •...
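
    Gathering the configuration and log files named on this page before calling support can be scripted. The following is a minimal sketch under stated assumptions: the helper `collect_files` and the destination directory are hypothetical, the path list contains only the files visible in this excerpt (the full guide may list more), and files that are absent or unreadable are reported instead of raising an error.

```python
import shutil
from pathlib import Path

# Paths copied from the excerpt above; the list is not guaranteed to be complete.
FILES_TO_COLLECT = [
    "/etc/conf/cf.d/sdevice",
    "/etc/inittab",
    "/etc/conf/cf.d/stune",
    "/etc/conf/cf.d/config.h",
    "/var/adm/messages",
]


def collect_files(paths, dest_dir):
    """Copy each existing file into dest_dir; return (copied, missing) path lists."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied, missing = [], []
    for p in paths:
        src = Path(p)
        # Flatten the source path into one file name so equal base names do not collide.
        target = dest / src.as_posix().lstrip("/").replace("/", "_")
        try:
            if not src.is_file():
                raise FileNotFoundError(p)
            shutil.copy2(src, target)
            copied.append(str(src))
        except OSError:
            # Missing or unreadable files are recorded rather than aborting the run.
            missing.append(str(src))
    return copied, missing


if __name__ == "__main__":
    done, skipped = collect_files(FILES_TO_COLLECT, "support_bundle")
    print("copied:", done)
    print("missing or unreadable:", skipped)
```

    The returned `missing` list can be mentioned in the support request so that it is clear which files were unavailable.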
  • Page 182: Sun Solaris Operating Systems

    Whether Entry, Advanced, Advanced with SMP, or e-Business All services running at the time the problem occurred • A list of each third-party hardware component installed, with the firmware revisions • A list of each third-party software component installed, with the versions •...
  • Page 183: Acronyms And Abbreviations

    Acronyms and abbreviations ABEND: abnormal end; ACPI: Advanced Configuration and Power Interface; ACU: Array Configuration Utility; ADG: Advanced Data Guarding (also known as RAID 6); ADU: Array Diagnostics Utility; AMP: Advanced Memory Protection; ASR: Automatic Server Recovery; BMC: baseboard management controller; CCITT: International Telegraph and Telephone Consultative Committee; CMOS: complementary metal-oxide semiconductor; CPU: central processing unit...
  • Page 184 DMA: direct memory access; DU: driver update; ECC: error checking and correcting; EFS: Extended Feature Supplement; EISA: Extended Industry Standard Architecture; ESD: electrostatic discharge; FBDIMM: fully buffered DIMM; FDT: Firmware Deployment Tool; HP SIM: HP Systems Insight Manager; IDE: integrated device electronics; iLO: Integrated Lights-Out; iLO 2: Integrated Lights-Out 2; iLO 3: Integrated Lights-Out 3; IMD: Integrated Management Display...
  • Page 185 IML: Integrated Management Log; IRQ: interrupt request; KVM: keyboard, video, and mouse; LVD: low-voltage differential; MMX: multimedia extensions; NMI: non-maskable interrupt; NVRAM: non-volatile memory; OBDR: One Button Disaster Recovery; ORCA: Option ROM Configuration for Arrays; PCI-X: peripheral component interconnect extended; POST: Power-On Self Test; PPM: processor power module; PSP: ProLiant Support Pack; PXE: Preboot Execution Environment
  • Page 186 RBSU: ROM-Based Setup Utility; RILOE: Remote Insight Lights-Out Edition; RILOE II: Remote Insight Lights-Out Edition II; RIS: reserve information sector; RPM: Red Hat Package Manager; SAN: storage area network; SAS: serial attached SCSI; SATA: serial ATA; SIM: Systems Insight Manager; SIMM: single inline memory module; SP1: Service Pack 1; SSD: support software diskette; TPM: trusted platform module...
  • Page 187 USB: universal serial bus; VCA: Version Control Agent; VCRM: Version Control Repository Manager; VGA: video graphics array
  • Page 188: Index

    Index 120PCI.HAM backplane, error codes backup issue, tape drive backup, restoring batteries, insufficient warning when low accelerator error log batteries, replacing accelerator status 92, 93, 94 battery 38, 41, 99, 111, 135 ACPI support battery pack, array accelerator ACU (Array Configuration Utility) beep codes adapters 94, 99...
  • Page 189 drive errors 42, 43, 55, 96, 97, 98, 101, 104, configuration errors 106, 107, 128, 157 configuration of system 67, 89 109, 121, 134, 135, 146 configuration signature drive failure 55, 103, 153 configuration tools drive failure, detecting drive LEDs 21, 42 connection errors 57, 58...
  • Page 190 fans 46, 47, 117, 118, 136, 137, 138, 159 HP Insight Diagnostics 75, 158, 171, 172, 173, fault-tolerance methods 174, 175 FBDIMMs 20, 48 HP Insight Diagnostics survey functionality features 8, 87 HP Insight Remote Support software HP ProLiant Essentials Foundation Pack Fibre Channel adapters firmware 65, 78, 88, 95, 105...
  • Page 191 LED combinations, SATA hard drive LED combinations, SCSI hard drive network connection problems LEDs network controller problems LEDs, hard drive network controllers 58, 59 LEDs, PPM failure network interconnect blades LEDs, processor failure new hardware LEDs, troubleshooting NMI event 117, 118, 119, 120 Linux 63, 178 no dial tone...
  • Page 192 processor problems 17, 50, 84, 85, 119, 120, physical drive state port 85 code, expansion board-related 132, 170 port 85 code, list processor stepping 84, 85, 125 port 85 code, memory-related processor uncorrectable internal error processor-related port 85 codes port 85 code, miscellaneous processors 17, 50, 88, 119, 120, 122, 123, 124, port 85 code, processor-related...
  • Page 193 software 60, 67, 88 ROM problems ROM redundancy 65, 74 software errors ROM update utility software failure ROM, types software problems software RAID failure ROM, updating 65, 79 software resources 67, 89 ROM, verifying ROM-Based Setup Utility (RBSU) 41, 67 software troubleshooting 60, 63 ROMPaq Disaster Recovery...
  • Page 194 technical topics websites, reference 23, 86 telephone numbers what's new temperature 118, 123, 135 when to reconfigure or reload software testing devices white papers 86, 88 third-party devices 40, 98 Windows Event Log processor error codes time and date, setting TPM (Trusted Platform Module) 22, 45, 47, 59, 65 troubleshooting...
