Managing Distributed Data in the Enterprise
Key Considerations for Managing Remote Data
Effective remote data management requires consideration of a variety of factors, including:
Central Policy-Based Control:
Efficient control requires the ability to implement central policies: set a rule once and have it work company-wide, rather than managing activities individually at different sites. However, many products claiming "central control" in fact require administrators to set policy through a unique connection to each remote node whenever business requirements change.
Wide Area Network (WAN) Bandwidth Utilization:
Because they move data between locations, remote data solutions must accommodate bandwidth restrictions. They should have features that enable efficient bandwidth use, such as byte-level data transfer, bandwidth throttling, multi-streaming, and compression. Also, the less overhead information a product adds to transmitted data, the better. Since remote connections may become impaired, companies need the ability to restart at the point of failure and re-route information flow to alternate networks.
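To illustrate, a WAN-friendly transfer loop might combine compression, bandwidth throttling, and restart at the point of failure. This is a minimal Python sketch, not a product implementation: the `send` callable, chunk size, and rate limit are assumptions standing in for a real solution's network layer.

```python
import time
import zlib

CHUNK = 64 * 1024          # bytes read per iteration
RATE_LIMIT = 128 * 1024    # target bytes/second on the WAN link

def send_file(path, send, offset=0):
    """Stream a file from `offset`, compressing and throttling each chunk.

    `send` is a stand-in for the real network write. Persisting the
    returned offset lets a failed transfer restart at the point of
    failure instead of resending from byte zero.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        while chunk := f.read(CHUNK):
            send(zlib.compress(chunk))           # shrink data before it crosses the WAN
            time.sleep(len(chunk) / RATE_LIMIT)  # crude bandwidth throttle
            offset += len(chunk)
    return offset  # checkpoint for restart-at-failure
```

In a real product the checkpoint would be persisted on both ends and the throttle coordinated with other traffic, but the shape of the loop is the same.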
Security and Data Integrity:
Effective solutions should authenticate all sending and receiving nodes prior to data transfer, encrypt data during transmission, use a single firewall port, and minimize firewall rules. 100% data integrity is also essential, especially for remote backup using tape; corrupted data on the tape is one of the primary causes of recovery failure.
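The integrity requirement can be illustrated with a checksum comparison performed after transfer. This sketch assumes SHA-256 digests and uses local file paths as stand-ins for the sending and receiving nodes.

```python
import hashlib

def sha256_of(path, chunk=1 << 20):
    """Digest a file in chunks so large backups need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def verify_transfer(source_path, replica_path):
    """Confirm the replica is byte-identical to the source before the
    retransmission window closes or the backup is written to tape."""
    return sha256_of(source_path) == sha256_of(replica_path)
```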
Remote Process Automation and Application Interfacing:
Automating processes and interfacing with remote applications can eliminate manual intervention at remote locations. The chosen solution must integrate with applications that have a native backup package and ensure integrity by accessing data through the applications rather than at the database, filesystem, or disk level. Additionally, remote solutions should automate any custom or script-based processes needed before, during, or after data transmission.
Heterogeneous System Support:
Since most companies have a variety of computing platforms and applications, it is important to choose a solution that supports a heterogeneous environment.
Additional Requirements for Remote Data Backup
Remote office backup requires more than just writing data to tape. It must address data integrity and accuracy, automatic operation, offsite storage, and of course, restoration. The best solutions eliminate manual effort at remote sites.
Since restoration quality depends on the integrity of the original data, integrity must be preserved when files are stored for backup. To ensure this, the disk-backup mechanism must offer options for handling open files: skip them, transfer them anyway, or record them in an error log.
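Those open-file options might look like the following sketch. Here `is_open` and `copy` are hypothetical placeholders for platform-specific open-file detection and the actual disk-backup transfer; the policy names are illustrative.

```python
import logging

def back_up(files, copy, is_open, policy="skip"):
    """Apply an open-file policy while copying files to backup disk.

    policy is one of: "skip" (omit open files), "transfer" (copy them
    anyway), or "log" (omit them but record an error for the operator).
    Returns the list of files that were not backed up.
    """
    skipped = []
    for path in files:
        if is_open(path) and policy != "transfer":
            skipped.append(path)
            if policy == "log":
                logging.error("open file not backed up: %s", path)
            continue
        copy(path)
    return skipped
```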
The Case for Archiving
An often-overlooked but critical component of remote data management and protection is archiving. User files and emails are seldom re-opened after the first three days following creation or receipt. Statistics show that files that haven't been accessed for 90 days have a 90%+ probability of never being accessed again. But since it's impossible to predict which files will fall in the remaining 10%, companies need to keep it all, consuming valuable storage resources. The cost-effective method is to move unused data to lower-cost secondary storage (archive), with easy retrieval capabilities, for long-term retention. Archiving also helps businesses ensure compliance with federal document-retention regulations. As with remote backup, data integrity is essential to any archival solution.
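The 90-day rule could be implemented as a periodic sweep like this sketch. It assumes last-access timestamps are trustworthy (some systems update them lazily) and uses a simple move into an archive directory as a stand-in for lower-cost secondary storage.

```python
import os
import shutil
import time
from pathlib import Path

STALE_DAYS = 90  # matches the 90-day access statistic above

def archive_stale_files(primary: Path, archive: Path, now=None):
    """Move files untouched for STALE_DAYS to archive storage,
    preserving relative paths so retrieval is a simple reverse copy."""
    now = now or time.time()
    cutoff = now - STALE_DAYS * 86400
    moved = []
    for path in list(primary.rglob("*")):   # snapshot before moving files
        if path.is_file() and path.stat().st_atime < cutoff:
            dest = archive / path.relative_to(primary)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), dest)
            moved.append(dest)
    return moved
```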
The Central Policy/Consolidated Approach to Managing Remote Data
The most effective solution for remote data is to allow central IT staff to control remote data management and backup. The staff should be able to set data policies, automate processes to execute those policies on remote servers, and move data between remote servers and central systems. Centrally controlled automated processes can decrease backup costs by as much as 75%.
Individual remote backup and archiving processes are replaced with a consolidated process that moves remote data to a hub site for backup or archive. This requires moving the data over available networks in an efficient, secure, timely fashion and requires technology that can handle the issues associated with controlling and moving data among sites and network connections.
Disk-to-Disk Consolidated Backup
The use of disk-to-disk backup is increasing due to the falling cost of disk storage, the elimination of tape's physical handling limits, the relative unreliability of manual tape programs, and the need for ready access to data.
A best practices model of disk-to-disk backup for remote data moves the data to be backed up over a network to a different location. Disk-to-disk backup performed at the same site does not provide appropriate protection for site disaster events (fire, flood, etc.). For businesses with multiple locations, consolidating disk-to-disk backup delivers operational and cost efficiencies as well as enhanced data security.
In the first approach, remote data is periodically analyzed to determine what has changed since the last backup, and a copy of the changed data is moved to a central site and stored on disk. Some technologies minimize the size of the transfer by discerning and moving just the byte-level file modifications. At the central site, the incremental data packets are best reconstructed into full, up-to-date copies of files that are instantly accessible if a remote file is lost or destroyed.
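Changed-data detection and hub-side reconstruction might be sketched as follows. For simplicity this works at the block level rather than the byte level, with fixed 4 KB blocks and in-memory byte strings as simplifying assumptions; real products also handle insertions that shift block boundaries.

```python
import hashlib

BLOCK = 4096

def block_digests(data: bytes):
    """Hash fixed-size blocks of a file's prior (backed-up) contents."""
    return [hashlib.sha256(data[i:i + BLOCK]).digest()
            for i in range(0, len(data), BLOCK)]

def changed_blocks(old: bytes, new: bytes):
    """Return (index, bytes) for blocks that differ from the last backup.
    Only these deltas need to cross the WAN."""
    old_hashes = block_digests(old)
    deltas = []
    for i in range(0, len(new), BLOCK):
        block = new[i:i + BLOCK]
        j = i // BLOCK
        if j >= len(old_hashes) or hashlib.sha256(block).digest() != old_hashes[j]:
            deltas.append((j, block))
    return deltas

def apply_deltas(old: bytes, deltas, new_len):
    """Hub-side reconstruction: patch the prior copy with received blocks
    to yield a full, up-to-date copy of the file."""
    buf = bytearray(old[:new_len].ljust(new_len, b"\0"))
    for j, block in deltas:
        buf[j * BLOCK:j * BLOCK + len(block)] = block
    return bytes(buf)
```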
The second approach enables output from backups performed on remote servers to be stored on local disk. The resulting backup image is transferred to disk at the central site. This works well for applications with native backup or snapshot features that can be utilized in the consolidated backup process.
These approaches can also be used together. For example, backing up user files may be best performed with changed data transfer, while backing up Microsoft Exchange data may be best performed using the consolidated backup image approach.
In either approach, the backup data on disk at the central location can be sent to tape if desired.
Consolidated Archive
Consolidated archive moves data that meets corporate archival policy from remote systems to central disk. Archival policy determines what data is archived and when, based on parameters such as last access date, file type, content, location, ownership, and file size. Another essential element is a mechanism by which end users can retrieve data from the archive, preferably without involving IT.
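A policy engine of this kind might evaluate rules like the sketch below. `ArchiveRule` and its fields are hypothetical names chosen to illustrate parameter-driven selection; a real product would express the same idea in its own policy language.

```python
import fnmatch
from dataclasses import dataclass

@dataclass
class ArchiveRule:
    """One line of corporate archival policy. Each field is a criterion;
    a file matching every criterion of some rule is archived."""
    min_idle_days: int = 0     # last-access age threshold
    name_pattern: str = "*"    # file type, e.g. "*.pst"
    min_size: int = 0          # bytes
    location: str = "*"        # site or share pattern

def should_archive(name, idle_days, size, location, rules):
    """A file is archived if any policy rule matches it in full."""
    return any(
        idle_days >= r.min_idle_days
        and fnmatch.fnmatch(name, r.name_pattern)
        and size >= r.min_size
        and fnmatch.fnmatch(location, r.location)
        for r in rules
    )
```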
Central policy-based processes, such as consolidated backup and archive, can significantly lower costs, eliminate risk, improve data consistency, and ensure better backup/retention policy compliance.
A Best Practices Guide to Managing Remote Data
Best practices for managing and protecting remote data involve both understanding and implementing technology that supports the automated processes. There are five primary steps toward implementing a company-wide solution:
* Identify and understand remote data and the network environment
* Select a remote data management solution
* Create a remote data management policy
* Implement the policy through centrally controlled automated processes
* Monitor and adjust as business conditions change
The questions below will help companies implement the first two steps of this process.
Assess the current system:
* Which data transfer applications are used? (e.g. FTP, xcopy, robocopy, DFS/FRS, public folder replication)
* What level of manual intervention is required?
* What are the failure rates and costs?
* How easily does it adjust to new business requirements?
Determine data movement goals:
Knowing the company's data movement goals helps in developing cost-recovery models that justify necessary purchases. Examples include:
* Reduce backup failure rates/increase data protection
* Reduce meantime to remote office restore
* Automate data transfers to and from remote sites
Determine the types of data that need to be moved:
* How much data is stored remotely?
* What are its characteristics?
* Which characteristics must be maintained?
* What applications are running at the remote locations?
* Should the data transfer system integrate with particular applications in 'real time'?
* What data are users not currently backing up?
* What types of data need to be sent to the remote office? How compressible is it?
Determine the data movement volume:
* What is the rate of data change on a day-to-day basis?
* How many sites need to be supported?
Assess your current network:
* What bandwidth is available to each location?
* How much of that bandwidth do other applications already consume?
* Can traffic be segregated using quality of service (QoS) applications?
* Is the network traffic prone to bursts?
* How secure is the network?
Choose your solution:
Evaluate vendors and solutions against required remote data capabilities:
* Can the solution solve multiple remote data problems, such as backup, archive and distribution?
* Does the vendor have expertise with remote data applications and integration? Can they help assess specific company requirements, data change rates, and growth rates?
Managing remote data effectively requires dealing with network and platform variability, security, and data integrity, and implementing process automation. While many vendors offer remote data solutions, companies need to choose a solution that fits the needs identified in the best practices section of this article. With the latest remote data management and movement technologies, companies can cost-effectively solve the challenges of managing remote data with a single unified approach.
Title Annotation: Disaster Recovery & Backup/Restore
Publication: Computer Technology Review
Date: Sep 1, 2006