Disaster Recovery, the Workforce and the Swing of the Pendulum

DRI ANZ

To view this article in its original location, please click here.

Business started with people. Then came machines, followed by information technology. With IT running business (so to speak), disaster recovery was focused on IT. In fact, the one thing that was often conspicuous by its absence in DR planning and management was people. Now, with declarations like ‘our people are our greatest asset’, there’s a swing back towards emphasising the need to ensure the workforce is just as well-prepared for recovery as the IT systems and infrastructure it uses. Here’s our alphabetical list of items to check now to be even better prepared for any future IT incident or disruption.

  • Authority. Recovery is underpinned by employees having the authority to get things done, and knowing they have that authority. Disaster recovery plans must clearly state who does what and when, and who should step in if the first person in authority is cut off from the rest of the workforce.
  • Breadth of skills. Cross-training already offers the benefit of varying work activities and opening up new career possibilities. It may also be crucial in a time of crisis in order to start up strategically important IT activities again. Make sure employees regularly exercise additional skills they have learnt.
  • Connectivity. The workforce needs to be able to use essential IT systems, which means being able to connect to them from alternative locations or from home if required.
  • Discussion. Employees engage better in recovery procedures when they can voice suggestions and opinions during the planning process. Likewise, two-way communication for exchanging the right information at the right time between employees and management is crucial for optimal recovery.
  • Empathy. Disruption that hits IT, especially in the case of natural disasters, may also affect employees and their families at home. Counselling and psychological support may therefore be of vital importance.
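
The "Authority" item above can be made concrete as an ordered chain of people per recovery task, with the first reachable person acting. The following is only a minimal sketch: the `Contact` class, the `acting_authority` function and the role names are hypothetical, and in practice reachability would come from a call tree or check-in procedure rather than a hard-coded flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Contact:
    name: str
    reachable: bool  # assumption: set by a call tree or check-in, not hard-coded


def acting_authority(chain: list[Contact]) -> Optional[Contact]:
    """Return the first reachable person in an ordered chain of authority."""
    for contact in chain:
        if contact.reachable:
            return contact
    return None  # nobody reachable: the plan needs a wider escalation path


# Hypothetical chain for one recovery task, primary owner first.
restore_chain = [
    Contact("IT manager", reachable=False),
    Contact("Deputy IT manager", reachable=True),
    Contact("Senior systems administrator", reachable=True),
]

owner = acting_authority(restore_chain)
print(owner.name if owner else "No one reachable: escalate")
```

Keeping the chain as plain ordered data makes it easy to print, review and rehearse with the people named in it.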

Many of these points must be addressed when disaster recovery planning is being done. You won’t have time to figure them out when an IT disaster hits, so you’ll need to make sure they’re properly in place beforehand.

5 Things that Can Go Wrong with a Disaster Recovery Plan

DRI ANZ

The biggest problem with a disaster recovery plan is when there isn’t one. If nothing has been prepared, planned or backed up, then that’s what you can expect to salvage in the case of a serious incident – nothing. But even when the plan exists, too many organisations leave gaping holes. If you’re starting in a new position as disaster recovery manager, you have the advantage of bringing a fresh pair of eyes and seeing things that your colleagues have missed or dismissed as unimportant. Here’s a checklist to help you spot what might need to be fixed, along with the underlying causes of the problems.

  1. The disaster recovery plan is non-existent. If there is no plan, it’s possible that senior management is unaware or doesn’t care. You’ll have to use your DR management expertise to convince all concerned that DR planning is both vital and positive for the company.
  2. It’s incomplete. Disaster recovery goes beyond daily data backups. Backup sites for high-priority operations such as sales, home-working arrangements and communication plans for employees, and appropriate insurance policies are all part of the deal too.
  3. It’s too long. Often DR plans become bloated because the focus is on trying to provide a solution for every possible cause, instead of focusing on possible outcomes and what to do about them. You need to know what to do if your application servers are out, rather than how to react if a meteorite strikes your systems room.
  4. It hasn’t been tested. That means more than meeting-room ‘thought experiments’. You have to try restoring an entire server with all its applications and data and check it all really works, for instance. And you have to test regularly thereafter too.
  5. No backup for the DR plan actors. If key members of your organisation become unavailable in a disaster, your plan must define the backup contacts who will act in their place. Otherwise your recovery will stall for lack of decisive action.
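
As an illustration of point 4, one simple way to check that a test restore "really works" at the file level is to compare checksums of the restored files against a reference copy. This is a sketch under stated assumptions, not a complete restore test: the function names are hypothetical, the reference directory is assumed to be unchanged since the backup was taken, and applications still need to be started and exercised separately.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path relative to root onto its checksum."""
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_restore(reference: Path, restored: Path) -> dict[str, list[str]]:
    """Report files that are missing, changed or unexpected after a restore."""
    ref = build_manifest(reference)
    res = build_manifest(restored)
    return {
        "missing": sorted(set(ref) - set(res)),
        "changed": sorted(k for k in ref.keys() & res.keys() if ref[k] != res[k]),
        "unexpected": sorted(set(res) - set(ref)),
    }
```

A restore would count as clean at the file level when all three lists come back empty. Running the comparison on a schedule is one way to act on the advice to test regularly.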

You may discover other shortcomings too. Remember – it’s often the thing you didn’t check that breaks down on you just when you need it!

How Simple Can a Disaster Recovery Plan Be?

DRI ANZ

Sometimes it’s difficult to see the forest for the trees. Disaster recovery plans can rapidly grow in complexity, as organisations get larger and IT systems more intricate. The use of templates can sometimes help DR planners to focus on essentials, but even templates don’t always do the trick. As with many challenges, the way forward may be to break the problem down into component parts or to initially simplify it and build in any additional, necessary complexity afterwards. For example, larger entities might start with a small business approach to ensure that each department or business unit at least has the following items under control.

As a pragmatic approach to disaster recovery at the SMB level, three preparatory actions can be undertaken. Firstly, make sure that employees know where to go and whom to contact if their office becomes inaccessible or unavailable. Secondly, make sure that relevant insurance policies are current. And thirdly, maintain secure, current, replicated and above all tested data backups, including everything you need for regulatory compliance or that is of strategic importance to your business. While disaster recovery is by definition an IT-centric function, it may take an insurer to finance restarting IT. It may also take more than getting IT back on its feet to have everyone using it productively again.

Although this is a good start for individual small businesses, simply bundling a collection of SMB disaster recovery plans together for the larger corporation may be wasteful or ineffective. Group insurance policies are usually more cost-effective than individual ones for each unit, and cloud backup services to cover data backups encourage better control and coordination – not to mention assurance that each department is sufficiently well covered. But if initial attempts to draw up a global disaster recovery plan stall or get bogged down in detail, putting simple, basic recovery mechanisms in place first can be a company lifesaver while the finer points are being worked out.

How Can You Best Evaluate a Disaster Recovery Solution?

DRI ANZ

Figuring out which disaster recovery solution is best for you is likely to involve several different criteria. Hard metrics that are typically quoted are RTO (recovery time objective) and RPO (recovery point objective). You’ll often see them used in service level agreements for data recovery, for instance. However, while they are a good start, these two well-known parameters may not be sufficient. For example, to recover just one crucial piece of data, you may need to recover all of your data, which may take a long time indeed. Additional metrics may therefore provide a more accurate picture of whether or not a solution will suit you or your organisation.
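
The point about having to restore everything to get one item back can be shown with simple arithmetic. The sketch below is illustrative only: the `Solution` class, its field names and all figures (data volumes, throughput, overhead, objectives) are assumptions made up for the example, not measurements of any real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Solution:
    name: str
    backup_interval_h: float        # hours between backups (worst-case data loss)
    restore_throughput_gb_h: float  # assumed sustained restore speed, GB per hour
    item_level_restore: bool        # can one object be restored on its own?
    fixed_overhead_h: float = 0.0   # assumed setup time before data starts flowing


def restore_time_h(s: Solution, needed_gb: float, total_gb: float) -> float:
    """Hours to get the needed data back; a full restore if item-level is unavailable."""
    restored_gb = needed_gb if s.item_level_restore else total_gb
    return s.fixed_overhead_h + restored_gb / s.restore_throughput_gb_h


def meets_objectives(s: Solution, rto_h: float, rpo_h: float,
                     needed_gb: float, total_gb: float) -> bool:
    """True if estimated restore time fits the RTO and the backup interval fits the RPO."""
    return (restore_time_h(s, needed_gb, total_gb) <= rto_h
            and s.backup_interval_h <= rpo_h)


# Illustrative figures: 2,000 GB protected, 5 GB actually needed, RTO 4 h, RPO 24 h.
full_only = Solution("full-restore only", 24, 250, False, 0.5)
granular = Solution("item-level restore", 24, 250, True, 0.5)

for s in (full_only, granular):
    hours = restore_time_h(s, needed_gb=5, total_gb=2000)
    ok = meets_objectives(s, rto_h=4, rpo_h=24, needed_gb=5, total_gb=2000)
    print(f"{s.name}: {hours:.2f} h, meets objectives: {ok}")
```

With these assumed figures, both solutions share the same backup interval and throughput, yet only the one that can restore a single item stays inside the 4-hour RTO. That is the kind of difference RTO and RPO alone do not reveal.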

In terms of hard metrics, i.e. those that yield quantifiable, directly comparable information, it is possible to extend to three groups: recovery time characteristics, recovered data characteristics and recovery scalability characteristics. RTO is an old friend in the first group, now accompanied by RTG (recovery time granularity). RTG defines recovery point options with regard to logical failures. This is the subtle difference between RTG and RPO (the recovery point prior to a physical failure). RPO is then in the second group, together with ROG (recovery object granularity, the level of individual objects that can be recovered) and REG (recovery event granularity, for recovery from specific events).

Further criteria cover usability of data by an application, geographical scope within which protected data must be held ready, scalability over different numbers of applications, resiliency of the recovery solution itself and cost efficiency. This last metric takes into account how much system administrator effort is required to use the solution, as well as the amount of IT resources needed to implement it. It underlines another important point. The disaster recovery solution you choose must also be one with which you feel comfortable and that you can easily apply even when panic is all around you – a key point to remember at the same time as making measurements and comparing numbers.