File systems and storage.

Structured data storage gets most of the press: this is the realm of storage area networks, virtualization and automated provisioning. But this kind of well-behaved data makes up only 20% of enterprise data, an important 20% but still a distinct minority. That leaves 80% of enterprise data as semi-structured (e-mail, combined database and file systems) and unstructured (word processing, spreadsheets, presentations, images). These files take up a large amount of storage capacity and can be difficult to manage, but they contain a lot of business-critical information.

Jeff Erramouspe, president and CEO of Deepfile, said, "There's a lot of attention paid to data center-centric data. There's very little information paid to the file system. That tends to be spreadsheets and PowerPoint files and Word files, or Access databases that are used by individuals or groups or departments.
It could be creative data, things like source code if you looked at a software development company. This is a highly under-managed area."

File system-based storage management is evolving to meet the needs of unstructured data. Dave Howard, president of Colorado Software Architects, decided to base the company's storage management development efforts on file systems because of their potential and opportunities. "If the file system is intelligent enough, any application will work with it. Frankly, it was fertile ground because not a lot had been done in that area."

Sun Microsystems has been firmly in this camp since its acquisition of SAM-FS, which archives files onto secondary media but keeps them online and immediately accessible. Suzan Szollar, Sun's product line manager in marketing, said, "In terms of direction, I think we're looking at an environment where we're distributing more and more of the file system across the SAN and the network, and also being able to work in heterogeneous environments."

Opportunities for file-based storage management exist throughout vertical markets. Some of the sectors with the highest file management needs include:

* Broadcast and video on demand
* Medical/healthcare
* Government/military/aerospace
* Education
* Oil and gas
* Manufacturing
* Life sciences
* Telco

Erramouspe said, "You would never think of treating your structured data the way you treat your unstructured data. There are five times as much unstructured as structured, but we rarely manage it. Everyone has DBAs, but who has file system managers?"

According to Marty Ward, director of product marketing, high availability and storage management at Veritas, file-based storage management development continues to build on traditional volume management and file system management services. During the 1990s, developers worked on cluster-based file storage management, such as integrating structured and unstructured data through clustered volume management services. This work flowed into today's file-based storage resource management (SRM) research, the integration of file and block-based backup and recovery, and tiered storage.

SRM

Erramouspe said that even the largest companies haven't managed their files and file systems well. "The reason they haven't gotten it right is they're trying to manage the wrong thing. They're trying to manage disk and volumes and partitions. Our view is they need to manage their files, and if they manage their files well, everything else will take care of itself."

SRM is one way of simplifying and extending management resources to file systems. SRM sounds simple enough: a way of managing storage resources. However, the specifics differ wildly and range across physical and logical layers. All SRM packages monitor and report, some add automation and control features, and most are increasingly tied to application-specific service requirements.

Physical layer SRM: Works at the fabric and/or device level.
Example: Monitors and reports on port connections and alerts the storage administrator to network congestion or failure.

Logical layer SRM: Manages stored data at the file level, covering files, file systems, volumes and volume groups (for example, it tracks and reports on the amount of space that an application's data has consumed).

Some logical level SRM packages already report on volume and disk usage and other broad categories, but file-based SRM adds file level reporting. Sample queries might include the one hundred fastest growing files on a network, the one hundred oldest files on an array, or one hundred files that haven't been accessed for over a month but aren't part of a critical dataset. This kind of detailed query allows administrators to make intelligent decisions about file archiving and retirement, and to identify orphaned files.

Backup and Recovery

Backup and recovery has traditionally been file-based, but has become more challenging because file volumes have grown so much larger. Chris Van Wagoner, director of product marketing at CommVault, said, "The backup world in general, and CommVault in particular, has looked to block level technologies to provide better performance."

Basing backup even on incremental file changes can be time consuming: backup applications must check each individual file for modifications since the last backup. At an average of a sixtieth of a second per file this does not take long on a small file system, but when the backup application must consider 60 million files, the checks alone take about a million seconds, or more than 11 days, which makes a reasonable backup window impossible to maintain. Instead, backup applications can speed up (and shrink their backup windows) by looking at changed blocks.

However, block-based backup loses individual file definitions. This is a problem when a restore application must locate individual files on a backup. The restore application needs to know what format the data is in. Is it in machine readable format? Has it been written by a backup product? If so, was it a block level or file level transfer? Has the data been copied through snapshots, replication or mirroring? If the backup is block-based, the short answer is: Who knows? There is no file structure index to tell the restore application.

This also impacts semi-structured applications such as CAD/CAM, which store image information in databases. Backup applications must preserve and synchronize file-based images with their database index entries, or they can't be restored.

One fix would be to add block to file mapping to backup products, similar to what a file system would have. Data protection would run at the block level, but would also preserve file intelligence. Van Wagoner added, "Those are the kinds of things that are coming along that will give the end user the best of both worlds: the performance and speed of block by block, and the logical recovery based on the files."

Ward also believes that file-based and block-based data will converge. "We're seeing the structured data stored in the same place as the unstructured data.
This gives easier allocation and easier management of the data out to the application. The ease of management comes into play on the backend, by backing up data onto the same storage network."

Tiered Storage

Data lifecycle management, which is the process of managing data from creation to retirement, doesn't exist yet in a full-blown version. Existing tools, including tiered storage, do help to manage the cycle. Tiered storage, which operates in both file- and block-based environments, uses policies to classify data by its characteristics. Depending on those characteristics, it then assigns data protection levels and matches archiving procedures to policy. For example, IT might direct its tiered storage management software to archive an HR application onto on-line ATA disk where it is transparently available to end-users. The same software can migrate year-old Word documents from a senior management or legal group onto long-term tape storage. This allows IT to optimize storage by the value of the file data, and to save time by efficiently migrating data to secondary storage systems.

File-based tiered storage particularly benefits verticals with petabytes of file storage, such as geological exploration, medical MRI files, and massive genome studies in the life sciences. Tiered storage is also helpful when dealing with compliance issues, where companies must be able to locate and quickly restore their archived data.

Nor does it stop there. Bob Bingham, chief marketing officer of TeraCloud, said about compliance, "It's bigger than just regulatory issues, every company has their own criteria in which they want to manage their infrastructure. It's a balance between quality of service and cost. There are some companies that never want to delete a file. And to maintain that quality of service and best practices they're going to have a much higher cost."

Ultimately, data management may come together under a utility computing model. Utility storage spans the world of storage software and hardware: working across physical and logical layers; SAN, NAS and DAS; and block- and file-based data. By tying together unstructured, semi-structured and structured data management, it's possible to virtualize and automate provisioning across all enterprise storage. Bingham said, "At the end of the day, applications and systems come and go, but the data ends up being the company's most important asset. The infrastructure is the means to an end, the data is the end to itself."

www.commvault.com
www.deepfile.com
www.sun.com
www.veritas.com
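
As a rough illustration of the sample file-level SRM queries mentioned above (the oldest files, files not accessed for over a month, the fastest growing files), the following Python sketch walks a directory tree and ranks files by their metadata. It is not taken from any product named in this article: the `FileInfo` record and the functions `scan`, `oldest_files`, `stale_files` and `fastest_growing` are hypothetical names chosen for the example. It also assumes that the file system records access times, which is not true of volumes mounted with access-time updates disabled, and it measures growth by comparing against sizes saved from an earlier scan.

```python
import heapq
import os
import stat
import time
from typing import Dict, Iterable, Iterator, List, NamedTuple, Optional, Tuple


class FileInfo(NamedTuple):
    """Hypothetical record holding the metadata the queries need."""
    path: str
    size: int      # bytes
    mtime: float   # last modification, seconds since the epoch
    atime: float   # last access, seconds since the epoch


def scan(root: str) -> Iterator[FileInfo]:
    """Walk the tree under root and yield metadata for each regular file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                yield FileInfo(path, st.st_size, st.st_mtime, st.st_atime)


def oldest_files(files: Iterable[FileInfo], limit: int = 100) -> List[FileInfo]:
    """Return up to `limit` files with the earliest modification times."""
    return heapq.nsmallest(limit, files, key=lambda f: f.mtime)


def stale_files(files: Iterable[FileInfo], days: int = 30, limit: int = 100,
                now: Optional[float] = None) -> List[FileInfo]:
    """Return the largest files that have not been accessed in `days` days."""
    cutoff = (time.time() if now is None else now) - days * 86400
    stale = (f for f in files if f.atime < cutoff)
    return heapq.nlargest(limit, stale, key=lambda f: f.size)


def fastest_growing(previous: Dict[str, int], current: Iterable[FileInfo],
                    limit: int = 100) -> List[Tuple[str, int]]:
    """Compare against sizes from an earlier scan; return (path, bytes grown)."""
    growth = ((f.size - previous[f.path], f.path)
              for f in current if f.path in previous)
    top = heapq.nlargest(limit, growth, key=lambda pair: pair[0])
    return [(path, delta) for delta, path in top]
```

A typical use would be to call `files = list(scan("/some/share"))` once, pass the list to each query, and save `{f.path: f.size for f in files}` so the next run can feed it to `fastest_growing`. Excluding files that belong to a critical dataset, as the sample query suggests, would need an extra filter on the path or on some tagging scheme, which this sketch leaves out.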