Enclosures & backplanes: Next-Generation chassis monitoring.
Until recently, a single technician or engineer maintained a few pieces of critical equipment. That model no longer holds, and the economics of equipment monitoring and maintenance have changed with it: companies, as well as the military, can no longer afford to assign dedicated personnel to monitor equipment. Moreover, with shrinking electronic packaging, entire systems can be tucked away in places that are inaccessible to humans or too costly for them to service, such as on a mountaintop, in an airplane wing or deep within a naval vessel.
In the past, redundancy has often been employed as a method of increasing system reliability, but this method further complicates system packaging and software development and essentially doubles deployment costs. Today's reduced parts count and greater package densities improve system reliability, possibly eliminating the need for a redundant system--provided that stable power and effective cooling can be guaranteed.
Additional factors influence the market for chassis monitoring solutions, such as the military's growing dependence on its "buy COTS" directive. Systems that were designed for commercial use, and therefore only for limited temperature ranges, are now being stretched to operate in much harsher environments and must operate reliably for three to five times the commercial product life expectancy. A product that monitors and controls the system's environmental conditions and reports anomalies is vital.
To keep the system operating within its defined limits, the product must be able to monitor and control the main system processor and report the results to an independent outside monitoring station, perhaps halfway around the world. The system monitor can prevent damage to the system by automating the controlled shutdown of sub-systems, such as power supplies, when voltages fall far out of specification or temperatures become dangerously high. It can also reset the system remotely in the event of a system hang. Should a system fail, an independently operating system monitor can be a valuable aid in failure analysis: by storing error conditions in a non-volatile failure log, it preserves a record of the events that led up to the failure.
Current Requirements for Systems Monitoring
A System Monitor Must ...
* Monitor temperature, voltages and fans. (Monitoring other functions such as airflow is a plus.)
* Present actual temperatures, voltages and fan speeds to the user.
* Control fan speeds and perform an action of the user's choosing upon severe overtemp or overvoltage.
* Permit user-adjustable action upon overtemp.
* Control the power to a system as well as the reset line, allowing for unattended shutdown, re-start, and a cold or warm boot.
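The "must" items above amount to comparing each reading against limits and running an action the user has chosen. The following Python sketch illustrates that idea only; the `Limit` structure, the sensor names, the numeric limits and the action names are all hypothetical and are not taken from any particular product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Limit:
    """Warning and critical bounds for one sensor (hypothetical structure)."""
    warn_low: float
    warn_high: float
    crit_low: float
    crit_high: float


def classify(value: float, limit: Limit) -> str:
    """Return "ok", "warning" or "critical" for a single reading."""
    if value <= limit.crit_low or value >= limit.crit_high:
        return "critical"
    if value <= limit.warn_low or value >= limit.warn_high:
        return "warning"
    return "ok"


def evaluate(
    readings: Dict[str, float],
    limits: Dict[str, Limit],
    actions: Dict[str, Callable[[str, float], None]],
) -> List[Tuple[str, str]]:
    """Classify every reading and run the user-chosen action for non-ok states."""
    results = []
    for name, value in readings.items():
        state = classify(value, limits[name])
        results.append((name, state))
        action = actions.get(state)
        if state != "ok" and action is not None:
            action(name, value)
    return results


# Example configuration; every number and name here is an assumption.
EXAMPLE_LIMITS = {
    "cpu_temp_c": Limit(warn_low=0.0, warn_high=70.0, crit_low=-10.0, crit_high=85.0),
    "rail_5v": Limit(warn_low=4.75, warn_high=5.25, crit_low=4.5, crit_high=5.5),
}
```

In use, the `actions` mapping is where the user's choice lives: a "warning" entry might drive the fans to full speed, while a "critical" entry might switch off a power supply.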
A System Monitor Should ...
* Log total power-on hours for maintenance purposes as well as log abnormal conditions in non-volatile memory.
* Withstand extremes of temperature and vibration.
* Communicate via Ethernet and RS-232.
* Be easily configurable by the user to work in a variety of applications.
* Maintain security through encryption and/or multiple-level passwords to prevent unauthorized control of a system.
* Not require special software.
* Be small enough to fit into nearly any chassis.
* Have an independent power source.
* Feature firmware that can be field upgraded.
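The first "should" item, logging power-on hours and abnormal conditions in non-volatile memory, can be sketched as a small bounded log. This Python illustration uses a JSON file as a stand-in for non-volatile memory; the class name, the field names and the file format are assumptions made for the example, not a description of any real monitor's storage layout.

```python
import json
from collections import deque
from pathlib import Path


class FailureLog:
    """Bounded event log plus a power-on-hours counter (hypothetical design).

    A JSON file stands in for the non-volatile memory of a real monitor.
    """

    def __init__(self, path, capacity=64):
        self.path = Path(path)
        self.power_on_hours = 0.0
        # A deque with maxlen drops the oldest entry when the log is full.
        self.events = deque(maxlen=capacity)
        if self.path.exists():
            saved = json.loads(self.path.read_text())
            self.power_on_hours = saved["power_on_hours"]
            self.events.extend(saved["events"])

    def tick(self, hours):
        """Add elapsed running time and persist the new total."""
        self.power_on_hours += hours
        self._save()

    def record(self, sensor, value, state):
        """Store one abnormal condition, stamped with the current hour count."""
        self.events.append(
            {"hours": self.power_on_hours, "sensor": sensor,
             "value": value, "state": state}
        )
        self._save()

    def _save(self):
        data = {"power_on_hours": self.power_on_hours, "events": list(self.events)}
        self.path.write_text(json.dumps(data))
```

Because every change is written out immediately, a fresh `FailureLog` opened on the same path after a restart sees the hour count and the most recent events, which is the property that makes such a log useful for failure analysis.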
Why Past Implementations Didn't Work
The need for monitoring has been understood for many years, and dozens of schemes have been designed to implement it. Particularly in the VME and CompactPCI[TM] industries, system monitoring has become a new focus of opportunity for electronic packaging suppliers. Perhaps the most ambitious scheme is the now ubiquitous Intelligent Platform Management Interface (IPMI). Championed primarily by Dell, Hewlett-Packard, Intel and NEC, IPMI has a complete and complex protocol that uses I2C as the basic physical interface. IPMI defines and supports dozens of sensor types, and its self-discovery feature allows a system to become aware of a new, previously unknown sensor that has been hot-inserted. And, because the SMBus I2C interface specification is already written into the PICMG specs, many telecommunications companies, especially those deploying CompactPCI, AdvancedTCA[TM] and CompactTCA systems, use a subset of IPMI as their means of monitoring.
Using IPMI makes sense if all the boards in the system utilize the full implementation, but this is seldom the case. For most users, IPMI is a very difficult protocol for what would otherwise be a relatively simple task. In addition, I2C (Inter-Integrated Circuit), the physical link of IPMI, was originally developed to communicate only across a single PCB (although extensions have been added to the specification). The design criteria place relatively severe restrictions on the allowed capacitance of the bus, which limits bus length unless translators or repeaters are used.
Other than IPMI, no true contender has emerged as the overall standard for system monitoring and control. However, due to the difficulty in implementing IPMI, many manufacturers have built proprietary systems. While these often work well on a specific system, they often lack flexibility and can be difficult to configure. Most efforts directed to system monitoring eventually fail because they are built as part of a specific project and therefore lack easy configurability. Constantly modifying the hardware and software for other purposes puts strains on already limited engineering resources. Attempts to make a "one-size-fits-all" monitor have not been completely successful, due either to "tunnel vision," a failure to understand the broad needs of the industry, or to "creeping elegance," in which the goals of the monitor grow to require more resources than the entire rest of the system. A generalized system monitor simply didn't exist--one that had a broad feature set, was easily configurable and would work in virtually any chassis, and thus could be sold as a stand-alone unit.
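To see how the capacitance restriction turns into a length limit, consider the 400 pF maximum bus capacitance that the I2C specification sets for Standard-mode and Fast-mode. The Python sketch below is rough arithmetic only: the per-device and per-metre capacitance figures in the example are assumed values chosen for illustration, and real wiring should be measured or taken from datasheets.

```python
# Maximum bus capacitance for Standard-mode and Fast-mode I2C, in picofarads.
I2C_MAX_BUS_CAPACITANCE_PF = 400.0


def max_bus_length_m(device_count, pf_per_device, pf_per_metre,
                     limit_pf=I2C_MAX_BUS_CAPACITANCE_PF):
    """Estimate how much wiring fits in the capacitance budget.

    The devices' pin capacitance is subtracted from the limit first;
    whatever remains is available for the wiring itself.
    """
    remaining_pf = limit_pf - device_count * pf_per_device
    if remaining_pf <= 0:
        return 0.0  # the devices alone already use up the budget
    return remaining_pf / pf_per_metre


# Assumed example: 8 devices at 10 pF each on wiring of 100 pF per metre.
example_length_m = max_bus_length_m(8, 10.0, 100.0)
```

With these assumed figures the devices consume 80 pF, leaving 320 pF, or about 3.2 m of wiring. That is ample for a single board but quickly becomes tight across a large chassis, which is why translators or repeaters enter the picture.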
To be small, rugged and easily configured, the monitor needed a microcontroller with a high degree of integration, a large code space, a powerful instruction set and plenty of I/O. Based on these criteria, Dawn implemented its new "RuSH[TM]" µP technology in its System Health Monitor, Model 426, using advanced hardware in conjunction with internally developed microcode written to support both the "musts" and "shoulds" stated above.
The code was written as a series of modules or "blocks" that perform specific tasks such as monitoring voltage and fan speed, displaying characters on the LCD, or sending data out over the Internet. This modularity allows for easy customization to suit the needs of virtually any customer with only minimal NRE charges. This building-block approach results in reduced test time and improved quality since each block has been thoroughly debugged and proven. Any additional code needed for customer-specific tasks can be written almost without regard to existing code. In addition, all system-control functions are password-protected to ensure that unauthorized personnel do not have the ability to change critical system parameters. Although the System Monitor does not support the full implementation of IPMI, it nonetheless can act as a capable IPMI node.
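The building-block idea can be illustrated in a few lines of Python. This is a generic sketch of the pattern, not Dawn's firmware: the registry, the block names, the sensor names and the limits are all hypothetical.

```python
from typing import Callable, Dict, List

# Registry of independent monitoring blocks, keyed by name (hypothetical).
BLOCKS: Dict[str, Callable[[Dict[str, float]], List[str]]] = {}


def block(name: str):
    """Decorator that registers a function as a named block."""
    def register(fn):
        BLOCKS[name] = fn
        return fn
    return register


@block("voltage")
def check_voltage(readings: Dict[str, float]) -> List[str]:
    volts = readings["rail_5v"]
    if 4.75 <= volts <= 5.25:  # assumed limits
        return []
    return [f"rail_5v out of range: {volts:.2f} V"]


@block("fan")
def check_fan(readings: Dict[str, float]) -> List[str]:
    rpm = readings["fan1_rpm"]
    if rpm >= 1000:  # assumed minimum speed
        return []
    return [f"fan1 slow: {rpm:.0f} rpm"]


def run_cycle(readings: Dict[str, float], enabled: List[str]) -> List[str]:
    """Run only the blocks enabled for this configuration, in order."""
    messages: List[str] = []
    for name in enabled:
        messages.extend(BLOCKS[name](readings))
    return messages
```

A customer-specific task becomes one more registered function plus an entry in the `enabled` list, and the existing blocks are left untouched. That is the property the paragraph above credits for reduced test time.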
by Charles Linquist, Dawn VME Products
Charles Linquist is Assistant Chief Technology Officer and heads New Product Engineering for Dawn VME Products in Fremont, CA. He is a graduate of Iowa State University's engineering school. He can be reached at (510) 657-4444 or email@example.com.
|Title Annotation:||Field Applications|
|Publication:||ECN-Electronic Component News|
|Date:||Oct 1, 2004|