6 April 1999. Thanks to John Ganter.
Source: http://ganter.sandia.gov/orfac/NisError/ (full report, 50K).


SAND98-2737
Unlimited Release
Printed February 1999

Managing Errors to Reduce Accidents
in High Consequence
Networked Information Systems

John H. Ganter
Decision Support Systems Software Engineering
Sandia National Laboratories
P. O. Box 5800
Albuquerque, New Mexico 87185
jganter@sandia.gov, http://ganter.sandia.gov

This paper is based on a presentation at the Workshop on Information Assurance and Trustworthy Networks, held by the Cross Industry Working Team (XIWT) and Bellcore in Washington, D.C., 17-18 November 1998.

ABSTRACT

Computers have always helped to amplify and propagate errors made by people. The emergence of Networked Information Systems (NISs), which allow people and systems to quickly interact worldwide, has made understanding and minimizing human error more critical. This paper applies concepts from system safety to analyze how hazards (from hackers to power disruptions) penetrate NIS defenses (e.g., firewalls and operating systems) to cause accidents. Such events usually result from both active, easily identified failures and more subtle latent conditions that have resided in the system for long periods. Both active failures and latent conditions result from human errors. We classify these into several types (slips, lapses, mistakes, etc.) and provide NIS examples of how they occur. Next we examine error minimization throughout the NIS lifecycle, from design through operation to reengineering. At each stage, steps can be taken to minimize the occurrence and effects of human errors. These include defensive design philosophies, architectural patterns to guide developers, and collaborative design that incorporates operational experiences and surprises into design efforts. We conclude by looking at three aspects of NISs that will cause continuing challenges in error and accident management: immaturity of the industry, limited risk perception, and resource tradeoffs.

Contents

Introduction
Concepts for Describing Failures and Accidents in Systems
Some Terms for Describing Human Effects in Systems
System Defenses and Accident Trajectories
Paradoxical Defenses: Defenses that Have the Potential To Be Hazardous
Defenses Throughout the System Lifecycle
    - The design phase
    - The operations phase
    - Maintenance phase
Continuous Safety Management Challenges in NISs
Conclusions
References