Posts Tagged ‘Business Continuity Testing’

Sydney’s F3 Traffic Debacle has lessons for us in Business Continuity Management

Thursday, April 15th, 2010

This week, motorists were stranded for up to 9 hours on Sydney’s F3 Motorway due to a traffic incident. Emergency plans to implement ‘contra-flow’ arrangements to get the traffic moving again were not put into effect until many hours into the incident. In the meantime, people endured hours waiting in their cars with no water distributed to them and no way out.

While the facts of the matter are yet to fully emerge, and the reasons behind this failure to execute the traffic emergency plan have not yet been published, we can consider how this type of scenario can happen to any organization. Even an organization with business continuity plans in place is exposed if those plans are not thoroughly tested.

Often an organization will have a plan outlined on paper about how a given scenario will be handled. The reality, with all of the real life complications and human factors, is often quite different. This is why we exercise and test the plans.

Real life complexities are difficult to capture in your paper plans because you cannot always envisage the multiple factors that may impact on your recovery processes.

Consider factors that may affect how your recovery plan is executed and how your organization would handle it:

1.      An evolving status report

Initially you are told that the incident is not too severe and will be rectified within the hour. As time progresses, however, it worsens in severity and the time frame estimates keep creeping out.

Do you know what your ‘drop dead point’ is, that is, how long can elapse before you must invoke your plan?

What is your ‘maximum tolerable outage’? How long does the ‘estimated incident recovery time’ need to be before it is worthwhile to invoke the plan?

Are you getting your updates from a well informed primary source? Do they understand the need for an accurate estimate?
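
The questions above amount to a timing rule, which can be sketched in code. This is a minimal illustration rather than a standard formula: the function name, the parameter names, and the idea of deriving the drop dead point by subtracting the time needed to recover at the alternate site from the maximum tolerable outage are all assumptions made for the example.

```python
from datetime import timedelta


def should_invoke_plan(elapsed, estimated_time_to_fix,
                       max_tolerable_outage, time_to_recover_at_alternate_site):
    """Return True once waiting for a fix risks breaching the maximum tolerable outage.

    All arguments are timedelta values; every name here is hypothetical.
    """
    # Assumed definition: the latest moment at which invoking the plan
    # would still restore service within the maximum tolerable outage.
    drop_dead_point = max_tolerable_outage - time_to_recover_at_alternate_site

    # If we keep waiting, this is how long the outage is expected to last.
    projected_outage = elapsed + estimated_time_to_fix

    return elapsed >= drop_dead_point or projected_outage > max_tolerable_outage


mto = timedelta(hours=4)
recovery = timedelta(hours=2)

# 30 minutes in, the estimate is "fixed within the hour": keep waiting.
early = should_invoke_plan(timedelta(minutes=30), timedelta(hours=1), mto, recovery)

# 90 minutes in, the estimate has crept out to 3 more hours: invoke.
later = should_invoke_plan(timedelta(minutes=90), timedelta(hours=3), mto, recovery)
```

Agreeing on a rule like this in advance means that a creeping estimate triggers a decision instead of another hour of waiting.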

2.       Delegation of authority

What if the CEO or the appointed Business Continuity Command Team is uncontactable during the incident?

 Is there a backup person nominated who is definitely going to be available in their place?

Does the backup person have complete authority to make decisions that may involve major ramifications and expenditure?

 Has this backup person been trained in how to co-ordinate the communication and oversight of recovery from an incident?
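
One way to make these questions concrete is to record the chain of delegation and check it against a scenario. The sketch below is illustrative only: the `Delegate` fields, the `find_decision_maker` function and the example roles and dollar limits are hypothetical, not taken from any particular plan.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Delegate:
    name: str
    contactable: bool          # can they be reached during the incident?
    spending_authority: float  # largest expenditure they may approve
    trained: bool              # trained to co-ordinate incident recovery?


def find_decision_maker(chain: Sequence[Delegate],
                        required_spend: float) -> Optional[Delegate]:
    """Return the first person in the chain who can actually make the call."""
    for person in chain:
        if (person.contactable and person.trained
                and person.spending_authority >= required_spend):
            return person
    return None  # nobody suitable: a gap the plan needs to close


# Hypothetical chain: the CEO cannot be reached, and the first backup
# lacks the authority to approve the expenditure involved.
chain = [
    Delegate("CEO", contactable=False, spending_authority=1_000_000, trained=True),
    Delegate("Operations Director", contactable=True, spending_authority=50_000, trained=True),
    Delegate("Chief Financial Officer", contactable=True, spending_authority=500_000, trained=True),
]
decision_maker = find_decision_maker(chain, required_spend=200_000)
```

If the function returns `None` for a realistic scenario, the delegation of authority has a hole that is better found in a test than in an incident.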

3.      Communication Protocols

Imagine the chaos created if various staff members were contacted by different media outlets. If they have not been given clear guidelines that only the ‘Communications Manager’ may issue statements to external parties, these well-meaning staff members may offer their own understanding of the current situation. Conflicting or incorrect information is then released to the public.

How will staff react in an incident if their expectations have not been set about who will communicate what to them?

In a state of confusion, people will try to contact their supervisor, their co-workers, or whomever they can get hold of to find out what they should be doing. Just like Chinese Whispers, various accounts of what is going on and what should be done spread throughout the organization.

Consider the alternative. All staff have been trained in your business continuity protocols and understand how communication will occur in an incident. There are clear roles for who will co-ordinate recovery efforts and known backup persons should the nominated person be unavailable.

All staff know that there is a communication tree whereby status updates and requirements will be communicated to them by their business continuity team leader. They know there is a hotline number and an intranet site they can log on to, where the ‘Communications Manager’ will post regular updates of information that staff need to know.
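
A communication tree of this kind can be written down as data and checked for gaps before it is needed. The following is a rough sketch under assumed names: the tree layout, the role titles, the staff names and both functions are hypothetical examples, not a prescribed format.

```python
from collections import deque


def cascade_order(tree, root):
    """Return (caller, recipient) pairs in the order the calls would be made."""
    calls = []
    queue = deque([root])
    seen = {root}
    while queue:
        caller = queue.popleft()
        for recipient in tree.get(caller, []):
            if recipient not in seen:  # never contact the same person twice
                seen.add(recipient)
                calls.append((caller, recipient))
                queue.append(recipient)
    return calls


def unreached(tree, root, all_staff):
    """List staff members that the tree never reaches."""
    reached = {root} | {recipient for _, recipient in cascade_order(tree, root)}
    return sorted(set(all_staff) - reached)


# Hypothetical tree: each person is mapped to the people they must contact.
tree = {
    "BC Team Leader": ["Unit Leader A", "Unit Leader B"],
    "Unit Leader A": ["Ana", "Ben"],
    "Unit Leader B": ["Cho"],
}
calls = cascade_order(tree, "BC Team Leader")
missing = unreached(tree, "BC Team Leader", ["Ana", "Ben", "Cho", "Dee"])
```

Here `missing` contains "Dee", a staff member nobody has been assigned to contact. That is exactly the kind of shortcoming a test is meant to expose.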

Testing your plans thrashes out the finer details, highlights shortcomings and also gets all of the parties involved familiar with the plan and their role.

It is during this process that issues with the chain of communication and authority can be uncovered and resolved, before the plan needs to be enacted in real life.

Business Continuity Test Scenarios

Monday, March 29th, 2010

Testing business continuity (BC) and disaster recovery (DR) planning is an essential component of any “healthy” continuity management program and, as such, should be undertaken on a regular basis. While this is generally “good practice,” organisations are often also under internal and external compliance and governance pressures to complete additional, more complex or more mature testing regimes.

A broad range of testing options is available, depending upon the maturity of the organisation’s planning. If this is the first time that a test has been undertaken (“green fields”), planning can start with a plan walk-through (Table Top or White Board) test. These are paper-based scenario workshops attended by business and/or technical personnel. The test generally runs for a few hours and should question the information and the logical sequence of priorities contained in the planning documentation.

At OpsCentre, we are often engaged by a client to assist with an upcoming test and use our experience to add complexity and interest to the activity. Typically the client organisation has successfully completed testing (often the same test) on a number of occasions and would like us to bring more in-depth rigour to the process in general.

Short of experiencing a full-blown disaster (which, by the way, is the best form of test, although not recommended on an annual basis), we have orchestrated the following testing workshops to assist our clients:

1. Applications Functional Testing

  • Technical failover of applications or services from the primary production facility to the alternate recovery site
  • Ensure that the test is isolated from production and that no “cheating” occurs whereby test attendees liaise with production resources or documentation that would not be available in a disaster

2. BC and DR End-to-End Process Flow Testing

  • Complete testing of the recovery facilities by business and technical units, including upstream and downstream application restoration in the disaster recovery environment
  • This can be an expensive and resource-intensive exercise, but the results are extensive and it is recommended for establishing detailed baselines for all aspects of BC/DR planning

3. Denial of Access Testing

  • Business site-wide tests for recovery personnel to perform a normal day’s work from their alternate recovery sites with applications/systems pointed to normal production services
  • Try this test at 3:00 AM, convening disaster personnel and timing their response. Disasters can occur at any time; if it is not possible to physically attempt this type of test, the process flow should at least logically include “out of hours” scenarios.

4. Facility Power Downs

  • Conducted during essential mechanical and electrical maintenance activities at key facilities, with contingency plans executed and tested concurrently
  • If the production infrastructure is going to be off-line for predetermined maintenance, use this opportunity to test your planning and response mechanisms to their fullest.

Completing any of the scenarios described will take a fair amount of project planning and management buy-in. Considerations should be thought out well in advance of the test, audit, governance or compliance schedule so that the test exercises run as smoothly as possible and the best results are achieved.

7 Habits of Highly Effective Business Continuity

Friday, January 29th, 2010

1. The Senior Executive actively supports Business Continuity

A CEO, Director or General Manager who believes in and wants a functional Business Continuity program in place is a critical success factor.

Having the senior Executive who is responsible for setting the organisation’s priorities and vision stand behind BCP, and communicate this to the staff, is a powerful change motivator.

2. A Whole of Business Approach

A business continuity program that prioritises the organisation from the Executive’s bird’s-eye perspective, and that analyses business impacts across all business functions in a consistent manner, will lead to a better-informed business continuity strategy being proposed. It allows the Executive to see the requirements of the business in a single snapshot and make a cost-benefit justified decision on the level of continuity required.

3. A Single Point of Business Continuity Management

Someone needs to be responsible for BCP at an organisational level. It needs to be in their job description and a priority for them, otherwise it runs the risk of falling between the cracks. With one person accountable for co-ordinating, aggregating and monitoring the overall Business Continuity program and reporting to the Executive, the program is more likely to stay visible and maintain momentum.

4. Testing, Testing, Testing

Business Continuity should be viewed as an ongoing continuous improvement program, and as such testing is vital. It highlights flaws and validates assumptions in your business continuity plans, giving you the opportunity to improve them. Testing builds confidence and competence within the business continuity team, as it brings home how the strategy would actually work in a variety of scenarios and how the roles will interrelate. An untested Business Continuity Plan cannot be considered viable.

5. Embedding BCP into job descriptions and procedures

The various BCP roles, such as BCP Manager, Command Team Leader, Business Unit Leader, etc., should be written into position descriptions so that it is very clear that they are part of the responsibilities of the staff members concerned. Procedures for new projects, business changes and IT changes should include provision for ensuring the change has BCP/IT Disaster Recovery aspects taken into account. All changes should have an impact analysis conducted that includes the impact on BCP/IT Disaster Recovery procedures.

6. Starting on the right foot

An induction training package that briefs new employees on the Business Continuity and Emergency Management strategies and plans in place is a great way to start them off on the right foot, highlighting the importance of this to the organisation.

7. Maintenance

The person responsible as BCP Manager should be tasked with ensuring that maintenance of the documentation occurs on a regular basis. Outputs from changes and testing sessions all need to be fed into the plans. Periodically the Business Impact Analysis (BIA) should be revisited and the organisation’s priorities and maximum tolerable outages reviewed.