
On-line books: an advanced technology electronic library system.

The United States Marine Corps has built a state-of-the-art information storage and retrieval system at its research center in Quantico, Virginia. The system, called On-Line Books, enables any marine with a computer and modem anywhere in the world to access an electronic catalog, and soon, the complete text of the Marine Corps University's war-fighting collections.

After using On-Line Books' hypertext capabilities to search and select the specific pages or even paragraphs desired, a marine can request a hard copy printout, or mail the information electronically to any mailbox in the Marine Corps' worldwide network.

"This electronic library is the only one of its kind in the world able to both view and mail full text," said Colonel William Pedersen, director of the Marine Corps Central Design and Programming Activity (MCDPA), which was responsible for development of On-Line Books.

Optical Disc Storage of On-Line Books Data

Central to On-Line Books is the Storage Machine, a mainframe-attached, optical disc-based network archive server developed by FileTek Inc., 6100 Executive Boulevard, Rockville, MD 20852 (301-984-1542; fax 301-770-2568).

FileTek specializes in storage management solutions for Fortune 1000 companies and governmental agencies. Current customers include Perot Systems Corporation, the National Aeronautics and Space Administration (NASA), Shearson Lehman Brothers, Liberty Mutual, United Airlines, Con Edison, and the Department of Defense.

The Storage Machine incorporates advanced hierarchical storage management software, magnetic disks, multiple terabytes of optical discs mounted in robotic library devices, and shelf storage. It connects to IBM or DEC host mainframes via direct channel interfaces or through a high-speed local area network. Storage capacities range from 234 gigabytes to over 4 terabytes for a single server configuration. Users retrieve and view data from standard 3270 terminals.

The Storage Machine allows for the optional storage of index data for user files on optical platters, greatly reducing potential host Direct Access Storage Device (DASD) requirements. Host-based applications that utilize IBM or DEC teleprocessing environments can retrieve, update, append, and delete record-level information during actual transactions.

Colonel Pedersen compared the cost of storage on optical disc with the cost of storage on conventional, mainframe-based disks. "Data stored on the Storage Machine optical disc costs ten cents per megabyte," said Pedersen. "This is a dramatic reduction over the $10 per megabyte to store the same information on DASD."
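To put the quoted per-megabyte figures side by side, here is a minimal sketch of the arithmetic. The one-gigabyte example size (taken as 1,000 megabytes) and the function name are assumptions for illustration; only the two prices come from the quotation above.

```python
def storage_cost_dollars(megabytes, cents_per_megabyte):
    """Return the cost in dollars of storing `megabytes` at a per-megabyte price in cents."""
    return megabytes * cents_per_megabyte / 100


# Prices quoted by Colonel Pedersen, expressed in cents per megabyte.
OPTICAL_CENTS_PER_MB = 10      # ten cents per megabyte on optical disc
DASD_CENTS_PER_MB = 1000       # $10 per megabyte on DASD

# Hypothetical example size: one gigabyte, taken as 1,000 megabytes.
ONE_GIGABYTE_MB = 1000

optical = storage_cost_dollars(ONE_GIGABYTE_MB, OPTICAL_CENTS_PER_MB)
dasd = storage_cost_dollars(ONE_GIGABYTE_MB, DASD_CENTS_PER_MB)
ratio = dasd / optical

print(f"Optical: ${optical:,.2f}  DASD: ${dasd:,.2f}  ratio: {ratio:.0f}x")
```

At the quoted prices, DASD costs one hundred times as much per megabyte as optical disc, whatever the volume stored.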

The Storage Machine is capable of handling scanned text, graphical images, audio, and video. According to Pedersen, "The Storage Machine is the mothership for EVERYTHING that defines warfighting: books, papers, maps, oral histories, speeches, and live film footage. From this data repository we can press individual selections into special collections of CD-ROM to enable marines to fight smart in any clime and place."

The Marine Corps University can prepare and produce custom multimedia CD-ROM discs in support of specific Marine Corps missions within a matter of hours.

The Storage Machine combines automated storage management software with a hierarchy of media including magnetic disk for high-speed buffering, "jukeboxes" with optical disc platters, and shelf storage. Robotic accessors in the jukeboxes transfer cartridges between slots and drives automatically as users request information.

Storage Machine software manages resource allocation and contention. It also controls backup and migration of files to the appropriate storage level, creation of record indices and file catalogs, directory and transfer management, and On-Line Books recovery.

The Storage Machine is an enterprise-wide total data integration solution to information management problems associated with the storage of all types of archival information including coded data, voice, image, and video. Among its many uses, the Storage Machine is an ideal alternative to computer output microfiche (COM) and printed system reports. It provides record-level support for applications and system products that generate COM and printed report data.

The Storage Machine at the Marine Corps Research Center contains ninety slots, each able to hold a 7-gigabyte (billion byte) optical disc platter, for a total capacity of 630 gigabytes. This easily satisfies the Marine Corps' current storage requirements and leaves room for future growth. It is the equivalent of 234,000 three hundred-page books in which 10 percent of the pages are images and 90 percent of the pages are text.
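The capacity figures can be checked with a few lines of arithmetic. This is a minimal sketch using only the numbers given above; the derived per-book and per-page averages are implied by those numbers and are not stated in the article.

```python
SLOTS = 90                 # platter slots in the Quantico library device
PLATTER_GB = 7             # gigabytes (billions of bytes) per platter
BOOKS = 234_000            # stated book-equivalent capacity
PAGES_PER_BOOK = 300

total_gb = SLOTS * PLATTER_GB
total_bytes = total_gb * 1_000_000_000

# Average storage implied per book and per page (mix of text and image pages).
mb_per_book = total_bytes / BOOKS / 1_000_000
kb_per_page = total_bytes / (BOOKS * PAGES_PER_BOOK) / 1_000

print(f"Total capacity: {total_gb} GB")
print(f"Implied average per book: {mb_per_book:.2f} MB")
print(f"Implied average per page: {kb_per_page:.1f} KB")
```

The stated equivalence works out to an average of roughly 2.7 megabytes per book, or about 9 kilobytes per page across the text and image pages combined.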

FileTek and the Marine Corps worked together to develop a software interface between the Storage Machine and MCCAT (Marine Corps Catalog), an electronic catalog. The interface allows users to browse through the full text of the Marine Corps library's holdings and to mail text and graphical images electronically to any location within the Corps' worldwide network.

The Storage Machine: Mainframe-Attached Storage Archive Server

The Storage Machine's main architectural components include a channel network interface, a Storage and Transfer Processor, and a layered storage hierarchy composed of magnetic disk, multiple terabytes of optical discs mounted in a robotic library, and shelf storage. The Storage Machine provides direct channel bus and tag connectivity from IBM MVS processors using FileTek's direct channel adapter. It also provides connectivity through a high-speed local area network (LAN).

A LAN gives users a number of advantages: support for different host environments, easy attachment to multiple hosts, and installation flexibility resulting from support of long cable lengths. The system has an aggregate performance rate of 750K per second from the host channel to actual storage on magnetic and optical media, provided there are enough optical drives.

The Storage Machine is an intelligent subsystem that manages storage resources automatically. Using a hierarchy of optical and magnetic devices, it determines optimal storage locations for full-text documents based on access frequency, document size, user-specified parameters, and media capacity. By automating migrations and back-ups, the Storage Machine eliminates or minimizes the need for operator intervention and reduces the chances of losing or misplacing documents.

Back-up is automatic. There is no need for users to execute special utilities to copy volumes or files or to worry that back-ups are incomplete due to files spanning platters that were inadvertently omitted from the copy procedure. The software supports a second or alternative back-up that can be removed from the Storage Machine for off-site storage. In the unlikely event that primary volumes become corrupted, the system accepts back-up copies with minimal operational intervention and no modifications to current applications.

Managing storage on most mainframe-based systems is an expensive task. Industry experts estimate that without system managed storage, it takes one person to manage every 10 gigabytes of data.

The Storage Machine system software dramatically reduces this ratio because it controls gigabytes and terabytes of information automatically for the user. Software, rather than additional personnel, controls the migration, or movement, of files and volumes throughout the storage hierarchy.

FileTek's Virtual Records Access Manager (VRAM) software is an example of the Storage Machine's integrated hardware and software design. VRAM capitalizes on the direct record-level access capability of optical technology by providing users with critical data access and management functions.

For indexed files, VRAM automatically creates and stores indices with data on optical platters. This feature gives the Storage Machine considerable price performance advantages over alternative solutions that use expensive host DASD resources for archived file index information.

With the Storage Machine, file and index growth is not limited by the amount of host resources. Also, the VRAM software automatically manages file index information. Data and index management are in sync because both are controlled automatically and updated simultaneously by the same system. The software rather than the user is responsible for maintaining index integrity.

FileTek's Volume Storage Allocation and Control (VSAC) software manages data and storage allocation automatically according to user-specified attributes. VSAC keeps track of both the logical and physical placement of data within the Storage Machine. Users are freed from many responsibilities, including the need to correlate file names to special optical platter volume serial numbers.

The bottom line is that users no longer have to manage the placement of data on volumes to ensure meeting performance requirements, because VSAC provides this capability automatically.

System software is designed to ensure optimum performance automatically. For example, the software minimizes platter mounts through look-ahead queuing. All requests for a mounted optical disc are satisfied before the platter is dismounted.
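The effect of look-ahead queuing on platter mounts can be illustrated with a small sketch. It is not FileTek's implementation: the single-drive model, the function names, and the sample request list are all assumptions chosen to show the idea of serving every queued request for a platter before moving on to the next one.

```python
from collections import OrderedDict


def count_mounts(requests):
    """Count platter mounts when requests are served in the given order on one drive."""
    mounts = 0
    mounted = None
    for platter, _record in requests:
        if platter != mounted:
            mounts += 1
            mounted = platter
    return mounts


def look_ahead_order(requests):
    """Reorder the queue so all requests for a platter are served together.

    Platters are visited in order of their first appearance in the queue.
    """
    by_platter = OrderedDict()
    for platter, record in requests:
        by_platter.setdefault(platter, []).append(record)
    return [(p, r) for p, records in by_platter.items() for r in records]


# Hypothetical queue of (platter, record) requests in arrival order.
queue = [("P1", "a"), ("P2", "b"), ("P1", "c"), ("P3", "d"), ("P2", "e")]

fifo_mounts = count_mounts(queue)
look_ahead_mounts = count_mounts(look_ahead_order(queue))
print(fifo_mounts, look_ahead_mounts)
```

In this example strict arrival order needs five mounts, while grouping by platter needs three, one per distinct platter.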

The software also reduces access time for response-time sensitive requests through preemptive priority processing. There is no wait time for users in need of timely access because, if necessary, the software interrupts lower priority tasks to process higher priority ones.
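Preemptive priority processing can be sketched with a priority queue. The model below is an assumption for illustration only: work is divided into one-tick units, a lower priority number means a more urgent task, and the task names are invented. At every tick the scheduler runs the most urgent ready task, so a newly arrived urgent request interrupts lower priority work already in progress.

```python
import heapq


def run(arrivals):
    """Simulate preemptive priority scheduling.

    arrivals: list of (arrival_tick, priority, name, work_units);
    a lower priority number is more urgent.
    Returns a list of (name, finish_tick) in completion order.
    """
    pending = sorted(arrivals)
    ready = []          # heap of (priority, sequence, name, remaining_units)
    finished = []
    tick = 0
    index = 0
    sequence = 0
    while index < len(pending) or ready:
        # Admit every task that has arrived by this tick.
        while index < len(pending) and pending[index][0] <= tick:
            _, priority, name, work = pending[index]
            heapq.heappush(ready, (priority, sequence, name, work))
            sequence += 1
            index += 1
        if ready:
            # Run the most urgent task for one tick, then requeue it if unfinished.
            priority, seq, name, remaining = heapq.heappop(ready)
            remaining -= 1
            if remaining == 0:
                finished.append((name, tick + 1))
            else:
                heapq.heappush(ready, (priority, seq, name, remaining))
        tick += 1
    return finished


# Hypothetical workload: a long background migration, then an urgent retrieval.
result = run([(0, 5, "batch-migration", 4), (1, 1, "user-retrieval", 2)])
print(result)
```

Here the retrieval arrives one tick after the migration starts, interrupts it, and finishes at tick 3; the migration resumes afterward and finishes at tick 6.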

A cycling feature enables users to fill up the "A" side of all platters before writing to the "B" side, ensuring that a greater percentage of related data is mounted in the disc drives and available for retrieval at the same time.
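The write order that the cycling feature produces can be shown in a few lines. This is a minimal sketch, assuming two-sided platters whose sides are labeled "A" and "B" as in the text; the function name and the platter identifiers are hypothetical.

```python
def cycling_write_order(platters):
    """Return (platter, side) pairs: every "A" side first, then every "B" side."""
    return [(platter, side) for side in ("A", "B") for platter in platters]


order = cycling_write_order(["P1", "P2", "P3"])
print(order)
```

With three platters, all three "A" sides are filled before any "B" side is written, so data written around the same time sits on sides that can be mounted together.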

Storage Hierarchy

The Storage Machine architecture consists of a layered approach to storage management. Layers are composed of magnetic disk, optical discs in library devices, and shelf storage. The Storage Machine automatically manages file and volume movement between storage layers.

Magnetic Disk Layer. Standard configurations contain from 3 to 6 gigabytes of magnetic disk, which can be expanded to meet user requirements. Magnetic disks contain system files and directories, and the most frequently accessed user data. Architecturally, the magnetic layer serves two major purposes:

* Enhanced Performance. FileTek designed the Storage Machine so that users could choose whether to write data directly to optical. By using the magnetic layer in the storage hierarchy to buffer data from the host at faster speeds, the write transfer rate for optical does not limit system performance. Users also can write concurrently to optical from the magnetic layer, and magnetic drives can be used as a staging area for higher frequency data retrievals.

* Ease of Recovery. Two magnetic disks come standard with every Storage Machine. System directories are shadowed on these separate physical devices to ensure data integrity and ease of recovery.

Optical Disc and Library Device Layer. In the optical layer of the storage hierarchy, the Storage Machine uses 12-inch WORM (write once read many) optical discs with a capacity of 2.6 or 7 gigabytes per platter. Optical discs reside in optical disc library devices, or jukeboxes, that can hold up to 288 platters. This provides a total library device capacity of over 2 terabytes.

Data Integrity. Media life is superior to tape and exceeds thirty years. Furthermore, optical read-after-write checks ensure data integrity.

Enhanced Performance. Optical platters are not married to slots within the jukebox. Movement and assignment of cartridges to slots is automatic. The most frequently accessed cartridges tend to reside directly over disk drives for fastest access.

Shelf Storage. The Storage Machine maintains directory information for optical volumes stored off-line on shelves as part of an integrated library management capability. Much like a magnetic tape management system, the system operator receives a message to load a specific off-line volume into the library device when a user requests information on that platter. The library device may be kept fully loaded. The least recently used volume in the jukebox is moved out to shelf storage to make room for an accessed shelf storage volume.
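The least-recently-used replacement described above can be sketched as follows. This is an illustration rather than the product's software: the class name, its methods, and the volume labels are assumptions, and the operator's manual load step is reduced to a return value naming the volume to move to the shelf.

```python
from collections import OrderedDict


class Jukebox:
    """Toy model of a fully loaded library device with off-line shelf storage."""

    def __init__(self, slots, loaded, shelf):
        self.slots = slots
        # Ordered from least recently used (first) to most recently used (last).
        self.loaded = OrderedDict((volume, True) for volume in loaded)
        self.shelf = set(shelf)

    def access(self, volume):
        """Access a volume; return the volume moved to the shelf, or None."""
        if volume in self.loaded:
            self.loaded.move_to_end(volume)      # mark as most recently used
            return None
        evicted = None
        if len(self.loaded) >= self.slots:
            # Library is full: move the least recently used volume to the shelf.
            evicted, _ = self.loaded.popitem(last=False)
            self.shelf.add(evicted)
        self.shelf.discard(volume)               # operator loads the shelf volume
        self.loaded[volume] = True
        return evicted


jukebox = Jukebox(slots=3, loaded=["V1", "V2", "V3"], shelf=["V4"])
first = jukebox.access("V1")     # already loaded, nothing moves
second = jukebox.access("V4")    # on the shelf, so the oldest loaded volume leaves
print(first, second, list(jukebox.loaded), sorted(jukebox.shelf))
```

After "V1" is touched, "V2" becomes the least recently used volume, so it is the one that goes to the shelf when "V4" is requested.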

Global Access to the Marine


The network includes six IBM 3090 mainframe computers and twenty-five thousand IBM PC/AT or faster desktop computers in Banyan VINES local area networks at Marine Corps installations throughout the world. The LANs are linked with Quantico via a combination of leased telephone lines and satellite transmission.

According to Colonel Pedersen, many Marines may not return to Quantico for formal schooling after their basic training and thus may not be able to use the Marine Corps University Library in person. "Since they can't come here, our task is to bring all the resources of a world-class research facility to them. That is a challenge that every educational institution will appreciate. In fact, during conversations with publishers about our endeavor, they referred to our effort as the Adult Montessori."

"We have an enormous amount of information here that is relevant to our brand of warfighting, including battlefield interviews, video, and computer correspondence from the Persian Gulf War," Colonel Pedersen explained. "The electronic information repository, and the library On-Line Books and worldwide network that make it accessible, will make it feasible to sift through that tremendous mass of data quickly enough to make it of tactical value," he said.

"In less urgent applications, we see On-Line Books as having tremendous educational benefits for Marines and friends of the Corps worldwide. It can also serve as a prototype for non-military information systems," Pedersen added. "The Marines, always first to fight, take great pride in spearheading the educational revolution at the edge of the twenty-first century."

Storage Capabilities Suit Other


The banking, insurance, and non-banking financial service industries are also logical targets for multimedia, multigigabyte storage technology. One of the more interesting applications in government and science is NASA's COBE project.

The Cosmic Background Explorer (COBE) space project, familiar to viewers of PBS's acclaimed "The Astronomers" series, collects cosmic background radiation data that is helping scientists learn more about the origins of the universe.

In June of 1989, the COBE satellite began transmitting data to Earth via the Tracking and Data Relay Satellite System (TDRSS). Data is sent via the Domestic Satellite (DOMSAT) from the TDRSS receiving station in White Sands, New Mexico to NASA's Goddard Space Flight Center (GSFC) in Greenbelt, Maryland.

A series of processors known as the COBE Data Capture and Editing System processes the data daily. Processed data is stored on the Storage Machine, ready to be analyzed by scientists at the COBE Science Data Room. The Storage Machine receives and stores 50 megabytes of data from the COBE satellite daily. Scientists have on-line access to this processed telemetry data from several different computer systems for the life of the mission, thus eliminating slow access time, incompatibilities between systems, and the need to reformat tapes.

Historically, the many millions spent on advanced technology to produce and collect data from outer space far outpaced the tools scientists used to store and manage the data. Data was stored on over 73,000 magnetic tapes kept in NASA's Records Center, a building spanning eighteen football fields. Thousands more tapes filled other laboratories across the country.

Scientists at the Data Center issued a request for tapes and then waited one or two days for delivery of the remotely stored tapes. With such slow turnaround time, scientists were unable to study most of the data. In actuality, they examined approximately 10 percent and analyzed only 1 percent carefully.

NASA also uses many different computer systems. Often tapes generated on one system for a specific project were incompatible with computer systems used by other projects. In some instances, research was significantly slowed down because scientists had to wait months for tapes to be reformatted. FileTek's Storage Machine eliminates the need to reformat and store data on thousands of magnetic tapes and reduces data inquiry time to seconds rather than days.

Several other NASA space missions are also benefiting from the Storage Machine, including the International Solar Terrestrial Physics Program (ISTP) and the Upper Atmospheric Research Satellite (UARS). ISTP provides information about the sun and its effects on the Earth to a group of international scientists. UARS provides the first comprehensive data on the Earth's upper atmosphere, enabling scientists to study the interactions between the atmosphere and man-made elements, and to gauge the amount of ozone layer depletion in the atmosphere.

Bruce Flanders is the director of technology at the Kansas State Library in Topeka, Kansas.
COPYRIGHT 1992 Information Today, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 1992 Gale, Cengage Learning. All rights reserved.

Author:Flanders, Bruce
Publication:Computers in Libraries
Date:Jan 1, 1992