Rainbow backups

Written by  on May 19, 2014 

 

Rainbow backups refer to the design of multiple, distinct backup protocols, each suited to a different data type, to the backup requirements of that particular data and to the frequency of storage, all of which are unique to each enterprise.

A common misconception holds that the ‘one’ backup solution will cover all core data at any given site.  This is simply not true, and for several reasons.

Data File Types

Most data on a computer is contained in ordinary files, many of them in a ‘human readable format’ such as plain text.  Even files created by common products such as Microsoft Office (Word/Excel) can be backed up with a simple file copy and paste. These files are visible to the user, even if specific software is required to create and edit the content.
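The simple file copy described above could be sketched in Python roughly as follows. This is only an illustration: the function name copy_tree_backup and its parameters are made up for this example, and it assumes the backup folder lies outside the folder being copied.

```python
import shutil
from pathlib import Path


def copy_tree_backup(source_dir: str, backup_dir: str) -> list[str]:
    """Copy every file under source_dir into backup_dir, keeping the layout.

    Hypothetical helper for illustration. Returns the relative paths that
    were copied, sorted so the result is easy to check.
    """
    source = Path(source_dir)
    target = Path(backup_dir)
    copied = []
    for path in source.rglob("*"):
        if path.is_file():
            relative = path.relative_to(source)
            destination = target / relative
            # Recreate the sub-folder structure on the backup side
            destination.parent.mkdir(parents=True, exist_ok=True)
            # copy2 also tries to preserve timestamps and other metadata
            shutil.copy2(path, destination)
            copied.append(relative.as_posix())
    return sorted(copied)
```

A call such as copy_tree_backup("documents", "backup_drive/documents") would mirror the folder; both paths here are placeholders.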

Some data types are not only hidden from the human user, but can only be accessed using dedicated software.  SQL databases, such as the MySQL database on which WordPress, this website and many other applications run, are normally backed up using a tool supplied with the database software.  The data is often stored in a format which cannot be safely backed up with a simple file copy and paste while the database server is running.  Examples of such SQL databases include Microsoft SQL Server, MySQL and PostgreSQL.
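For MySQL, the tool in question is typically the mysqldump command line program. The sketch below shows one way a script might call it. The database name, user and output file are placeholders, the helper names are invented for this example, and it assumes that mysqldump is installed and that the password is supplied through a MySQL option file rather than on the command line.

```python
import subprocess


def build_mysqldump_command(database: str, output_file: str, user: str) -> list[str]:
    """Build the argument list for a mysqldump run (hypothetical helper)."""
    return [
        "mysqldump",
        f"--user={user}",
        # Take a consistent snapshot of transactional (InnoDB) tables
        "--single-transaction",
        # Write the SQL dump straight to a file
        f"--result-file={output_file}",
        database,
    ]


def dump_database(database: str, output_file: str, user: str) -> None:
    """Run mysqldump; raises CalledProcessError if the dump fails."""
    subprocess.run(build_mysqldump_command(database, output_file, user), check=True)
```

Keeping the command construction separate from the call makes it easy to inspect what would be run before letting it loose on a live server.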

Other data types, such as the SQL database files used by Firebird and SQLite, require exclusive access or a ‘file lock’ in order to safely copy the actual on-disk file.  Attempting to copy the file while it is in use by other users can result in corrupt data and failed backups.
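For SQLite specifically, one alternative to copying the raw file is the online backup facility that Python's standard sqlite3 module exposes as Connection.backup(). The sketch below assumes Python 3.7 or later; the function name backup_sqlite and the file paths are illustrative only, and it says nothing about Firebird, which has its own tools.

```python
import sqlite3


def backup_sqlite(source_path: str, backup_path: str) -> None:
    """Copy a SQLite database through SQLite's own backup mechanism.

    Hypothetical helper: the library handles the locking, so the copy
    is consistent even if the source file is open elsewhere.
    """
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(backup_path)
    try:
        # Copies all pages of the "main" database into the target connection
        source.backup(target)
    finally:
        target.close()
        source.close()
```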

 

Frequency of use and backup

Depending on the nature of the business, a variety of application software solutions may be in use.  Each of these systems will be populated by client data.  Some data can be described as highly dynamic, changing almost continuously, while other data may be highly static.  Consider a financial system: for the ‘current’ set of books, all transactions on each individual account must be available, while for historic data only summary data is required, such as month-end balances for previous years.  It may not be necessary to back up the static data frequently, while a frequent backup of the dynamic data is imperative.
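The distinction between dynamic and static data can be expressed as a simple scheduling rule. In the sketch below the intervals (hourly for dynamic data, every thirty days for static data) are arbitrary example values, not recommendations, and the names BACKUP_INTERVALS and is_backup_due are invented for illustration.

```python
from datetime import datetime, timedelta

# Assumed example intervals; each enterprise would choose its own
BACKUP_INTERVALS = {
    "dynamic": timedelta(hours=1),
    "static": timedelta(days=30),
}


def is_backup_due(kind: str, last_backup: datetime, now: datetime) -> bool:
    """Return True when the time since the last backup reaches the interval."""
    return now - last_backup >= BACKUP_INTERVALS[kind]
```

With these example values, data last backed up two hours ago would be due again if it is classed as dynamic, but not if it is classed as static.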

 

Backup media

Today, only two basic types of backup media exist: physical or Internet based.  Physical media cover familiar objects such as CDs, DVDs, Blu-ray Discs, external hard drives, flash drives and even a copy to a Network Attached Storage (NAS) device.  Internet or Cloud based backups refer to backups where the data is transferred offsite to a remote computer.

Both these types of backup have pros and cons.

The most typical hassle with a physical media backup is that ‘somebody’ must remember to make the backup, insert the media into the PC, make sure the correct data is harvested and remember to take the media offsite.  The advantage is that both the person responsible and the media are readily apparent, and physical control of the media helps to keep the data secure.  This human element of making a backup is nevertheless often where backup protocols fail: data is mis-identified, the backup media is used incorrectly, or the media containing the backup is not removed from the site.

The most common hassles with Internet based backups include lack of security and high cost.  Slow connections can also hamper the transfer of large data sets between the local site and the remote data storage server.  Popular Internet based drag & drop file transfer services are often not designed for secure data storage.  This means that putting sensitive data into the Cloud may expose the data to snooping hackers.  Without a guarantee as to remote server security, data should never be placed on a Cloud server.

 

Consider a multi-media approach where the computer hosting the data is subject to alternating backups such as:

  • file system backup (files)
  • software data export (data dump)
  • incremental backup (RAID/cluster)

Low level data cluster backup tools may fail. Often minor changes to a bigger data set go undetected by the backup tool. This means that the updated data cluster is not backed up, leaving the live data set and the backup data out of sync. Typical examples of this type of backup are RAID systems and well known products such as Microsoft Volume Shadow Copy. While these systems rely on detecting changes at a cluster level and may fail, a file copy or a forced data dump will include all current data.
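One way to notice that a live file and its backup have drifted out of sync is to compare checksums of the two. The sketch below uses SHA-256 from Python's standard hashlib module; the function names and the chunk size are assumptions made for this example, and the check only works on files that can be read consistently while the comparison runs.

```python
import hashlib


def file_digest(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(live_path: str, backup_path: str) -> bool:
    """True when the live file and its backup have identical content."""
    return file_digest(live_path) == file_digest(backup_path)
```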

If we assign a colour to each backup, such as red for the file system backup, yellow for the data dump and blue for the incremental backup, we can think of the various types of backup as the different colours of a rainbow backup.

 

Use a best-fit approach to determine backup requirements, with a backup mechanism that suits each data type: a file copy can work well for files and folders, a data dump tool for SQL databases, and a ‘shadow copy’ tool to back up individual data clusters for a bare metal recovery.
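The best-fit idea can be written down as a small lookup table that pairs each data type with a colour and a mechanism, using the red, yellow and blue assignment from this article. The data type labels ("files", "sql_database", "disk_volume") and the names RAINBOW_PLAN and choose_backup are invented for this sketch.

```python
# Colour and mechanism per data type; the keys are illustrative labels
RAINBOW_PLAN = {
    "files": ("red", "file system backup"),
    "sql_database": ("yellow", "data dump"),
    "disk_volume": ("blue", "incremental backup"),
}


def choose_backup(data_type: str) -> tuple[str, str]:
    """Return the (colour, mechanism) pair that fits the given data type."""
    try:
        return RAINBOW_PLAN[data_type]
    except KeyError:
        # Fail loudly instead of silently skipping unknown data
        raise ValueError(f"no backup mechanism defined for {data_type!r}") from None
```

Raising an error for an unknown data type is a deliberate choice here: data that matches no rule should be noticed rather than quietly left out of every backup.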

In addition, once a backup is made, it still needs to be secured. The process of moving computer data off-site is known as ‘vaulting’ and may be as simple as having the media driven off the site in a pick-up truck (‘the pick-up truck method’) or as sophisticated as an Internet based solution.

 

For these and other reasons it should be apparent that a one-size-fits-all backup solution is probably an illusion.  An effective backup protocol will include multiple backup solutions and rely on a combination of backup media.
