G Saunders' Home Page


Overview of Storage

Storage is one of the essential components of information technology, right up there with Computers and Networks in today's infrastructure. Dell's site prominently categorizes Servers, Storage, and Networking. IBM's Marketplace has an IT Infrastructure product page that highlights Servers and Networks, with six links dedicated to Storage: Tape, Cloud, SAN, Flash, SDN, & Disk.

The term 'storage' used alone usually applies to 'secondary storage', not to 'primary storage' aka RAM or memory.

Storage usually refers to HDD-Hard Disk Drives and the SSD-Solid State Drives that are rapidly finding their way into systems today. Magnetic tape, Optical CDs and DVDs, and Flash Memory are also classed as 'secondary storage'. Magnetic tape remains a key component of enterprise and business transaction logging, backup, audit, and recovery procedures.

Storage devices and media can be managed to store data with nearly 100% reliability and availability, much more reliable than storing records on paper. Storage may also be mis-managed and lose data in a disaster, or with an errant or malicious keystroke or mouse-click. Best practices for storage include transaction logging off-site as business is conducted, periodic backups stored off-site, lots of redundancy, and well-practiced procedures for recovery back to the point of failure.

Attachment of Storage: Storage is attached to computer systems in several ways: DAS-Direct Attached Storage, NAS-Network Attached Storage, and SAN-Storage Area Networks. Modern web-scale and hyperconverged storage systems can manage redundancy on a global scale. Data is also stored in 'the cloud', whether private or public.

Disk Geometry & Physics introduces an alphabet soup of abbreviations for concepts required for management of storage on disks: IDE, CHS, LBA, Clusters, Slack Space & Fragmentation, ZDR, &c...
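Two of those abbreviations reduce to simple arithmetic. The sketch below, in Python, converts a CHS (cylinder, head, sector) address to an LBA (logical block address) with the conventional formula, and computes the slack space left over when a file does not fill its last cluster. The function names and the example geometry (16 heads, 63 sectors per track, 4096-byte clusters) are assumptions chosen for illustration, not values from any particular drive.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder=16, sectors_per_track=63):
    """Conventional CHS -> LBA mapping; sectors are numbered from 1."""
    if sector < 1 or sector > sectors_per_track:
        raise ValueError("sector must be in 1..sectors_per_track")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


def slack_bytes(file_size, cluster_size=4096):
    """Bytes allocated to a file but unused, because space is handed out in whole clusters."""
    if file_size == 0:
        return 0
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size - file_size


print(chs_to_lba(0, 0, 1))   # first sector on the disk
print(chs_to_lba(1, 0, 1))   # first sector of the second cylinder
print(slack_bytes(5000))     # a 5000-byte file stored in 4096-byte clusters
```

Many small files on a volume with large clusters can waste a surprising amount of space this way, which is one reason cluster size is a tuning decision.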

HDD and SSD in a server environment are usually deployed in RAIDs - Redundant Arrays of Independent Disks, not one at a time. RAIDs increase the reliability of HDD and SSD and can also boost read/write performance relative to stand-alone disks.
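The redundancy in parity-based RAID levels such as RAID 5 rests on XOR: the parity block is the XOR of the data blocks in a stripe, so any one missing block can be rebuilt from the survivors. Here is a minimal sketch of that idea in Python. The three tiny 'disks' and the function name are hypothetical, and a real controller also handles striping, parity rotation, and rebuild scheduling.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must be the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe spread over three hypothetical data disks.
disk0 = b"ORDER-01"
disk1 = b"ORDER-02"
disk2 = b"ORDER-03"
parity = xor_blocks([disk0, disk1, disk2])

# Simulate losing disk1, then rebuild it from the survivors plus parity.
rebuilt = xor_blocks([disk0, disk2, parity])
print(rebuilt == disk1)
```

Losing two disks from the same stripe leaves too little information to rebuild, which is why RAID reduces the need for recovery but does not replace backups.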

Enterprise doesn't run on cheap disks! The ordinary, $79/Terabyte HDD would be beaten to death in a busy database server where the disks are exercised hard 24x7x365. SAS - Serial Attached SCSI drives cost a few times more, spin nearly 2X faster, are engineered for 15 years MTBF and heavy use, and with controllers that can handle 256 HDD or SSD they are born to be RAIDed.

SSD and HDD are struggling for price advantage in 2017, and we're sure SSD will eclipse HDD technology and make HDDs obsolete in the near future. Meanwhile, SSDs are already cheap enough and offer significant advantages like 10X or more faster access with less power-consumption, so they're being used more and more even though they're a few times more expensive TeraByte to TeraByte than HDD.

Off-site Transaction Logging and Backup can ensure as close to 100% durability of data as is possible, beyond 99.9999%, so that a server can be restored to the point of failure following a data disaster.

Modern techniques can provide 'seamless' recovery in the wake of equipment failure or a network room or regional disaster, so quickly that system users may not be aware that there was a failure.

Without transaction logging and backup of data, a system may not be recoverable after a disaster or mistake causes data loss. The cost of such a failure can be so great that, in most cases, a business without adequate backup fails.

Read on for more stuff about management of storage technologies...

Storage Technologies →

Note: HDD and SSD Firmware make 'supply chain' a vector for malware! Yet another area where vigilance is required!

Data Backup Systems

Why do we back up data?

The easy answer is 'to continue or recover business after a system disaster'. More than half of businesses that lose their computer system without a good backup fail.

The real answer must include 'and, to continuously prove the integrity of data'. If an organization is lucky there will never be a system disaster. But, backup sets and transaction logs will be used every day to audit and prove the integrity of data and investigate irregularities.

Backup sets and transaction logs support the 'I' in the classic Information Security triad CIA: Confidentiality, Integrity, and Availability. Without regularly examining backup sets and transaction logs and comparing them to the on-line records it can be impossible to detect or prevent loss or theft of data and impossible to get it back.
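One way to make that comparison routine is to fingerprint each record with a cryptographic hash and compare the fingerprints from a backup set against the on-line copy. This is a minimal sketch in Python using the standard hashlib module. The record layout (a dict of record id to text) and the function names are assumptions made for illustration, not a description of any particular backup product.

```python
import hashlib


def fingerprint(records):
    """Map each record id to the SHA-256 digest of its content."""
    return {rid: hashlib.sha256(text.encode("utf-8")).hexdigest()
            for rid, text in records.items()}


def compare(backup, online):
    """Report records that were changed, deleted, or added since the backup."""
    old, new = fingerprint(backup), fingerprint(online)
    return {
        "changed": sorted(r for r in old if r in new and old[r] != new[r]),
        "missing": sorted(r for r in old if r not in new),
        "added": sorted(r for r in new if r not in old),
    }


backup = {"inv-1": "widget,10,9.99", "inv-2": "gadget,2,19.50", "inv-3": "gizmo,1,5.00"}
online = {"inv-1": "widget,10,9.99", "inv-2": "gadget,2,1.95", "inv-4": "doohickey,3,7.25"}

report = compare(backup, online)
print(report)
```

Every difference in the report should be explained by an entry in the transaction log; one that is not is an irregularity worth investigating.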

No organization wants a customer, employee, supplier, or the tax man to show them records produced by their system that they can't explain. That would demonstrate a lack of integrity and cast doubt and suspicion on all past and future dealings.

Hardware failure and local or regional disasters are _not_ the reason for most data disasters requiring recovery from backup media and transaction logs. Human error, ineptitude, or malice are much more likely. Here are some situations the instructor has observed:

  • Maybe somebody working on a system puts a semi-colon where a comma should be in an SQL update statement and accidentally wipes out the table holding all the transaction data for the past few hours, or days, or a year.
  • The network administrator, in a lapse of attention, typed cd / followed by rm -rf *
  • An eight- or nine-year-old system crashed at 5:00 on a Friday evening following a busy day selling, and the last backup was from the prior weekend.
  • Or, a consultant demonstrates 'SQL Injection' thinking he's working on a development system and wipes out the production database.
  • Or, a cracker finds his way into a system, has his way with it, and wipes it clean when he's done.
  • Or, there is nobody watching files that grow or a spooler and one grows so large it eats the entire file system, corrupting it...
  • An employee knows that nobody checks the reports from the credit card clearing house or the bank statement, so she posts credits onto a dozen credit cards she controls whenever sales are heavy and it won't be noticed.
  • The bookkeeper knows that nobody ever checks balances on anything, so she ships expensive items to crooks she knows, then deletes the orders after shipping.
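
The first situation in the list above is easy to reproduce safely. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table, columns, and values are made up for illustration. The intended statement updates one row, but a semicolon typed where a comma belongs ends the UPDATE early, before its WHERE clause, so every row is changed before the database reports a syntax error on what follows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, note TEXT);
    INSERT INTO orders (status, note) VALUES ('open', ''), ('open', ''), ('open', '');
""")

# Intended: UPDATE orders SET status = 'void', note = 'dup' WHERE id = 3
typo = "UPDATE orders SET status = 'void'; note = 'dup' WHERE id = 3"

try:
    conn.executescript(typo)
except sqlite3.Error as exc:
    # The fragment after the semicolon is not valid SQL on its own.
    print("error:", exc)

voided = conn.execute("SELECT COUNT(*) FROM orders WHERE status = 'void'").fetchone()[0]
print(voided, "of 3 rows were voided")
```

The error message arrives after the damage is done. Without a backup and transaction log from before the mistake, there is no record of what those rows used to say.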

Hardware failure and disasters in a network room, building, locality or region must be considered. Even if they're not as likely as human malice or failure they do occur. Here are some good reasons Why We Backup Stuff, negative examples of how to mitigate the risk of data center disasters. Here's another good look at Database Disasters. Tom's IT Pro is an excellent resource for real-world tech, including backup.

Components of backup:

  • Transaction logs are transmitted off-site, real-time, as records are modified. These are key to recovering to the point of failure, otherwise all data entered since the last full or incremental backup will be lost! They are also key to continuously proving integrity of data!
  • Full backups are taken as often as practical when the system is quiet, are verified as readable, and are taken or transmitted off-site asap. Tape picked up and carried to a vault is the traditional way of getting backups off-site; sending them via secure means to a remote site is a more modern option.
  • Incremental backups when the system is quiet save everything that has changed since the last full backup.
  • These are features of good backups: Depth of backup, multiple copies of everything kept for a long time, regular, systematic examination of backup sets and transaction logs to ensure integrity of data.
  • More and more, we're seeing remote 'hot sites', 'parallel systems', 'grids', or 'clouds' that are synchronized so that fail-over after a data disaster is quick or even 'seamless'. None of these replace the need for backups and transaction logging onto sequential media off-site.
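
The incremental backup described in the list above comes down to comparing each file's modification time against a cutoff. Here is a minimal sketch in Python, assuming a hypothetical catalog that maps file names to last-modified times (expressed as hours since the start of the week to keep the numbers readable). The list above uses the last full backup as the cutoff; some tools instead use the most recent backup of any kind, which gives smaller backup sets but a longer chain to restore.

```python
def changed_since(catalog, cutoff):
    """Names of files whose modification time is later than the cutoff."""
    return sorted(name for name, mtime in catalog.items() if mtime > cutoff)


# Hypothetical catalog: file name -> last-modified time, in hours since the week began.
catalog = {"customers.db": 2, "prices.db": 10, "orders.db": 30, "invoices.db": 52}

last_full = 24     # full backup taken at hour 24
last_backup = 48   # most recent backup of any kind, taken at hour 48

since_full = changed_since(catalog, last_full)
since_last = changed_since(catalog, last_backup)

print(since_full)
print(since_last)
```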

In the event a recovery is needed: the hardware is prepared and the operating system is restored; data from the last full backup is restored; data from incremental backups is restored; data on transaction logs brings the system back to the point of failure.
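
The order of those steps matters, and it can be sketched in a few lines of Python. Everything here is a toy model under stated assumptions: the data set is a dict, each incremental backup is a dict of changed records, and the transaction log is a list of hypothetical ("put" or "delete", key, value) entries. Real systems restore files and database pages, not dict entries.

```python
def restore(full_backup, incrementals, transaction_log):
    """Rebuild the data set: full backup, then incrementals in order, then log replay."""
    data = dict(full_backup)                 # 1. start from the last full backup
    for increment in incrementals:           # 2. apply incrementals, oldest first
        data.update(increment)
    for op, key, value in transaction_log:   # 3. replay logged transactions in order
        if op == "put":
            data[key] = value
        elif op == "delete":
            data.pop(key, None)
        else:
            raise ValueError(f"unknown log operation: {op}")
    return data


full_backup = {"acct-1": 100, "acct-2": 250}
incrementals = [{"acct-2": 300}, {"acct-3": 75}]
transaction_log = [("put", "acct-1", 90), ("delete", "acct-3", None), ("put", "acct-4", 10)]

recovered = restore(full_backup, incrementals, transaction_log)
print(recovered)
```

Dropping the log replay in step 3 would leave the system at the state of the last incremental backup, losing every transaction entered after it.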

Modern 'de-duplication' techniques as engineered by companies like IBM or Barracuda can provide reliable copies of every version of every record without duplicating all the un-changed records, too. Outsourcing tape backup to a company that uses tape-storage jukeboxes or robots with or without de-duping is a good option as companies turn to IaaS-Infrastructure as a Service.
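
The core idea behind de-duplication can be shown with a toy content-addressed store: each distinct piece of content is stored once under its hash, and every backup version is just a small map from record id to hash. This Python sketch is an illustration only. The class and its methods are hypothetical, it works on whole records, and it is not how IBM, Barracuda, or any other vendor implements the technique; commercial products typically de-duplicate at the block or chunk level.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical record contents are kept once."""

    def __init__(self):
        self.blobs = {}      # digest -> content, stored once
        self.versions = []   # one {record id -> digest} map per backup

    def backup(self, records):
        snapshot = {}
        for rid, content in records.items():
            digest = hashlib.sha256(content).hexdigest()
            self.blobs.setdefault(digest, content)   # store only unseen content
            snapshot[rid] = digest
        self.versions.append(snapshot)
        return len(self.versions) - 1                # version number

    def restore(self, version):
        return {rid: self.blobs[d] for rid, d in self.versions[version].items()}


store = DedupStore()
monday = {"r1": b"alpha", "r2": b"bravo", "r3": b"charlie"}
tuesday = {"r1": b"alpha", "r2": b"BRAVO", "r3": b"charlie"}   # only r2 changed

v0 = store.backup(monday)
v1 = store.backup(tuesday)
print(len(store.blobs))   # distinct contents stored across both backups
```

Two backups of three records each cost four stored contents instead of six, and both versions can still be restored in full.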

The sun never sets on some enterprises, so the system may never be quiet for a backup. One use of virtual servers is so that a multi-national organization can run a system with the clock set for each time zone. Companies like IBM, Oracle, or Barracuda can engineer a solution so that backups can be taken while the system is not quiet and can be used with transaction logs to recover a system to any point in time.

More RAM, Faster Response, More Risk to Mitigate

Access to data in RAM is thousands of times faster than access to data on Disk or flash memory! Hard disk access is measured in milliseconds and flash access in microseconds, where RAM access is measured in nanoseconds. If data is kept in RAM a server can handle hundreds or thousands more processes. But, fast RAM is 'volatile' and a power-failure or other glitch would wipe out the data for all the processes currently using it!

RAM is the key issue in these 64-bit days where a server-class machine can reference as much RAM as midrange and mainframes could at the turn of the millennium. We already have server-class machines that can handle a TeraByte of RAM and 24 or 48 Cores. Access to data in RAM is thousands of times faster than access to Disk! RAM speed is expressed in NanoSeconds, DISK's relative sloth in MilliSeconds.
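
The gap between nanoseconds and milliseconds is easy to underestimate, so here is the unit arithmetic as a short Python sketch. The access times are rough, assumed round numbers picked only to show the orders of magnitude; real hardware varies widely, and the names are illustrative.

```python
# Rough, assumed access times in nanoseconds; real hardware varies widely.
ACCESS_NS = {
    "RAM": 100,                # about 100 nanoseconds
    "flash SSD": 100_000,      # about 100 microseconds
    "hard disk": 10_000_000,   # about 10 milliseconds
}


def slowdown_vs_ram(device):
    """How many RAM accesses fit in the time of one access to the device."""
    return ACCESS_NS[device] // ACCESS_NS["RAM"]


for device in ("flash SSD", "hard disk"):
    print(device, "is about", slowdown_vs_ram(device), "times slower than RAM")
```

With these assumed figures flash lands in the 'thousands of times' range and a spinning disk is slower again by roughly two more orders of magnitude.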

A problem with having huge RAM 'all in one basket' like a server-class machine is that if the machine fails all the data in memory is lost, possibly affecting hundreds or thousands of customers' or employees' orders or work.

Big machines, midrange and mainframe (but not servers yet), can hold _really huge_ RAMs of a few or several TeraBytes in their big chassis, so they gain a huge speed advantage by keeping users' active data and programs cached in RAM.

To mitigate the risk of a memory unit failing, midrange and mainframe machines have mirrored or 'RAIDed' RAMs that allow them to continue working with one RAM if the other develops errors. In midrange and mainframe computers RAM modules can fail, or generate soft errors, and be replaced without taking the system down.

There are several types of RAM, from the fastest, volatile, static RAM on a mainboard to much slower, non-volatile memory on a USB-Drive, flash memory array, or SD card. Here's a good Fast Guide to RAM that discusses the differences.

G Saunders,
Dept of Information Systems
VCU School of Business


Content © 1999 - Today
By G Saunders
Images are Available on the Web