Software failure: counting up the risks.


When Boeing's new 777 airliner first takes to the skies in a few years, computers will control such crucial functions as setting flaps and adjusting engine speed. Electrical circuits will relay a pilot's actions to these computers, where complicated programs will interpret the signals and send out the instructions necessary for carrying out the appropriate maneuvers. Pilots will no longer fly the aircraft via direct electrical and mechanical controls, except when using an emergency backup system.

Because of the disastrous consequences of even a single fault, the software for such a computer system must be extremely reliable. A new analysis, however, demonstrates that testing complex software to estimate the probability of failure cannot establish that a given computer program actually meets such high levels of reliability.

The analysis also affirms that using multiple programs, which independently arrive at an answer to a given problem, doesn't necessarily guarantee sufficiently high reliability.

"This leaves us in a terrible bind," say Ricky W. Butler and George B. Finelli of the NASA Langley Research Center in Hampton, Va., the computer scientists who performed the analysis. "We want to use digital processors in life-critical applications, but we have no feasible way of establishing that they meet their ultra-reliability requirements."

In a paper presented last week in New Orleans at the Association for Computing Machinery's conference on software for critical systems, they argue: "Without a major change in the design and verification methods used for life-critical systems, major disasters are almost certain to occur with increasing frequency."

Many military aircraft and the European-built A320 airliner already use computer-controlled "fly-by-wire" systems. Computers also play important roles in medical technology, transportation systems, industrial plants, nuclear power stations and telephone networks - realms in which a software failure can cause tragedy (SN: 2/16/91, p.104).

"I think this is ... an important paper," says David L. Parnas, a computer scientist at McMaster University in Hamilton, Ontario. "It's very convincing and provides a lot of insight."

The traditional method of determining the reliability of a light bulb or a piece of electronic equipment involves observing the frequency of failures among a sample of test specimens operated under realistic conditions for a predetermined period of time. Using these data, engineers can estimate failure probabilities of not only individual components but also entire systems.
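
As a rough illustration of this sampling approach, the sketch below estimates a failure rate from test data and combines component rates into a system figure. It assumes a constant failure rate and independent components connected in series, simplifications chosen here for illustration. The function names and sample numbers are hypothetical and do not come from the article.

```python
import math


def estimated_failure_rate(failures: int, units: int, hours_per_unit: float) -> float:
    """Point estimate of failures per hour: observed failures / total unit-hours."""
    if units <= 0 or hours_per_unit <= 0:
        raise ValueError("units and hours_per_unit must be positive")
    return failures / (units * hours_per_unit)


def series_system_failure_probability(rates_per_hour, mission_hours: float) -> float:
    """Probability that at least one component fails during the mission.

    Assumes independent components with constant failure rates, where any
    single component failure brings down the system (a series arrangement).
    """
    total_rate = sum(rates_per_hour)
    return 1.0 - math.exp(-total_rate * mission_hours)


if __name__ == "__main__":
    # Hypothetical test: 1,000 bulbs run for 1,000 hours each, 5 burn out.
    bulb_rate = estimated_failure_rate(failures=5, units=1000, hours_per_unit=1000.0)
    print(f"estimated bulb failure rate: {bulb_rate:.2e} per hour")

    # Hypothetical system built from three such bulbs, run for 100 hours.
    p_fail = series_system_failure_probability([bulb_rate] * 3, mission_hours=100.0)
    print(f"chance the three-bulb system fails within 100 hours: {p_fail:.4f}")
```

The estimate is only as good as the sample: the fewer failures observed, the less the test says about the true rate.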

Unlike hardware, however, software doesn't wear out or break. "Software errors are the product of improper human reasoning," Butler says.

Unless they are caught, software errors persist throughout a system's lifetime. That makes conventional methods of risk assessment difficult to apply.

The problem is further compounded by the high degree of reliability required for life-critical applications. Historically, manufacturers of aircraft and other systems in which faults could threaten human lives have accepted a reliability level that corresponds to a failure rate of about 1 in a billion for every hour of operation.

Butler and Finelli demonstrate that techniques often used by computer scientists and programmers to quantify software risk take too long to be practical when used to assess systems that require such high reliability. For example, software design often involves a repetitive cycle of testing and repair, in which the program is tested until it fails. Testing resumes after the cause of failure is determined and the fault repaired.

But it generally takes longer and longer to find and remove each successive fault. To establish that a complicated computer program presents minimal risk would require years, if not decades, of testing on the fastest computers available, Butler says.
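
The scale of the problem can be sketched with a back-of-the-envelope calculation. This is not the authors' analysis. It assumes failure-free testing and a constant failure rate, and the function names, the 95 percent confidence level and the number of parallel test rigs are illustrative choices.

```python
import math

HOURS_PER_YEAR = 24 * 365  # ignores leap years


def required_test_hours(target_rate_per_hour: float, confidence: float) -> float:
    """Failure-free test hours needed to support a failure-rate claim.

    Under a constant-rate (exponential) model, a system whose true rate equals
    the target survives T hours without failure with probability
    exp(-rate * T). Requiring that probability to be no more than
    1 - confidence gives T = -ln(1 - confidence) / rate.
    """
    if target_rate_per_hour <= 0:
        raise ValueError("target_rate_per_hour must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return -math.log(1.0 - confidence) / target_rate_per_hour


def wall_clock_years(test_hours: float, parallel_units: int = 1) -> float:
    """Calendar time needed if the test hours are split across identical rigs."""
    if parallel_units < 1:
        raise ValueError("parallel_units must be at least 1")
    return test_hours / parallel_units / HOURS_PER_YEAR


if __name__ == "__main__":
    # Target taken from the article: about 1 failure in a billion hours.
    hours = required_test_hours(target_rate_per_hour=1e-9, confidence=0.95)
    print(f"failure-free test hours needed: {hours:.3e}")
    for rigs in (1, 1_000, 1_000_000):  # hypothetical rig counts
        print(f"{rigs:>9} rigs: {wall_clock_years(hours, rigs):.2f} years")
```

Under these assumptions the requirement comes to roughly 3 billion failure-free test hours, and any failure during testing restarts the count after the repair. Only massive parallelism or much faster execution brings the calendar time down.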

In an attempt to reduce the risk of failure, computer-system designers sometimes use multiple versions of a program, written by different teams, to perform certain functions. The idea is that although each version may contain flaws, it's highly unlikely that all or even a majority of the programs would contain the same error. However, experiments have shown that computer programs independently written to do the same thing often contain surprisingly similar mistakes.
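
A simple probability sketch shows why shared mistakes undermine this approach. It assumes three versions with majority voting and a crude model in which, on any given input, all versions fail together with some small probability and otherwise fail independently. The model, the function names and the numbers are illustrative assumptions, not results from the experiments mentioned above.

```python
def majority_failure_independent(p: float) -> float:
    """Probability that at least 2 of 3 versions fail, given independent failures."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return 3 * p**2 * (1 - p) + p**3


def majority_failure_with_common_cause(p_independent: float, p_common: float) -> float:
    """Voter failure probability when versions can share a mistake.

    With probability p_common every version fails on the same input;
    otherwise each version fails independently with p_independent.
    """
    if not 0.0 <= p_common <= 1.0:
        raise ValueError("p_common must be a probability")
    return p_common + (1 - p_common) * majority_failure_independent(p_independent)


if __name__ == "__main__":
    p = 1e-3        # hypothetical per-version failure probability
    shared = 1e-4   # hypothetical chance of a mistake common to all versions
    print(f"independent versions: {majority_failure_independent(p):.3e}")
    print(f"with shared mistakes: {majority_failure_with_common_cause(p, shared):.3e}")
```

When failures are independent, voting squares the small per-version probability. Once a shared mistake is possible, its probability sets a floor on system failure that no amount of voting can lower.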

Many computer experts at last week's meeting pointed to these findings as evidence that limits should be placed on the complexity of computer programs that go into life-critical applications. "Do we want to run with systems that are not as demonstrably safe as we say they are ... when we cannot demonstrate ultra-reliability before development?" asks Martyn Thomas of Praxis plc, in Bath, England.

"We should build only those systems that rely on software to a degree that can be assessed," contends Bev Littlewood of City University in London, England. That means accepting a higher risk or building simpler computer systems.

A few remain optimistic. "Maybe we're being a lot more demanding than we need to be," says John D. Musa of AT&T Bell Laboratories in Murray Hill, N.J. "There are risks in everything we do in engineering."

He adds that software developers have a variety of tools and techniques that can help them deliver - if not assess - highly reliable systems.
COPYRIGHT 1991 Science Service, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 1991, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Author: Peterson, Ivars
Publication: Science News
Date: Dec 14, 1991
Words: 815