Testing and Maintenance: 7 Frequently Asked Questions
by Jonathan Bronson, Brian Zawada, Brett Williams, and Alberto Jimenez
- What are the prevailing practices regarding the storage of
business continuity planning documentation?
Plan storage techniques range from printed plans stored off-site to
electronic plans stored off-site and accessible via the Internet. In
smaller organizations, up-to-date plans are printed, numbered (for control
purposes, given the sensitive nature of plan content) and distributed
to personnel named in the business continuity plan. Employees in general
may receive copies of emergency response procedures, including evacuation
procedures, first aid reminders and bomb handling procedures. When a
plan becomes outdated, it is exchanged for an updated version. When
an employee leaves the company, the plan is returned as part of the
exit process. Key members of business continuity teams typically keep
copies of the plan at home and at work, with additional copies prepositioned
at recovery locations.
Business continuity planning software, knowledge management platforms
and off-site file servers, as well as PDAs, have resulted in fewer hard
copies and the growing use of electronic planning material. Technology
has made it difficult to control plan dissemination and duplication,
although the plans are much easier to store and update. Technology has
also made plans easier to segment, with team members receiving
only those components of the plan that apply to them. Of note, for
those organizations that do use electronic plans, management still stores
hard copy documentation (typically in off-site storage) in the event
of network downtime or a sustained power outage.
- How often should business continuity related documentation
be updated?
In general, business continuity documentation should be reviewed and
updated at least annually. However, more frequent reviews and updates
may be needed when changes to the organization warrant them. The
Business Continuity Team should stay abreast of changes that may affect
the current plan. Key review and update
activities to consider include the following:
• Business unit and associated function listing and validation of criticalities
as determined in the BIA
• Risks / threats that may impact key business operations
• Business unit/function dependencies/inter-dependencies (IT and non-IT)
• Opening/closure of key office locations
• Key employee/vendor contact information (e.g., call tree)
- How should the organization keep the plans current?
Given the decentralized nature of most business continuity programs,
a cross-functional team should be responsible for maintaining the crisis
management and crisis communications plans, as well as updating risk
assessments and business impact analyses. Business function and technology
owners should be responsible for their individual resumption plans.
Internal Audit should enforce a defined review and plan maintenance
schedule, as defined in the BCM policy.
Regardless of the process used to maintain the business continuity plans,
maintenance should be based first on a defined schedule. Business continuity
plans should be reviewed annually, or when significant changes to the
business occur, whichever is sooner. If an organizational change management
process is in place, BCM should be integrated into this program.
Human resource information, including contact information, should be
reviewed quarterly.
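The schedule above (annual plan review or a significant business change, whichever comes sooner, plus quarterly contact reviews) can be sketched in a few lines of Python. This is only an illustration: the function name `next_review_due`, the day counts used for "annual" and "quarterly", and the sample dates are all assumptions, not part of any BCM standard or tool.

```python
from datetime import date, timedelta

# Intervals from the text; the exact day counts are an assumption.
ANNUAL = timedelta(days=365)
QUARTERLY = timedelta(days=91)


def next_review_due(last_review, interval, significant_change=None):
    """Return the date on which the next review falls due.

    The scheduled date is last_review + interval. If a significant
    business change happened after the last review, the review is due
    on the change date when that is sooner.
    """
    scheduled = last_review + interval
    if significant_change is not None and significant_change > last_review:
        return min(scheduled, significant_change)
    return scheduled


# Hypothetical example: a plan last reviewed in January, with an office
# move in June, is due for review at the time of the move.
plan_due = next_review_due(date(2024, 1, 15), ANNUAL,
                           significant_change=date(2024, 6, 1))
contacts_due = next_review_due(date(2024, 1, 15), QUARTERLY)
print(plan_due, contacts_due)
```

A date-driven check like this fits naturally into whatever reminder or audit process Internal Audit already uses to enforce the maintenance schedule.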
- How often should the business continuity strategy be tested?
As often as possible. Management expectations, test objectives, the
maturity of the planning process and system/process criticality are
all factors when deciding how often to test. The majority of organizations
test business continuity processes once or twice a year; however,
this frequency can increase due to factors such as:
• Changes in business processes
• Changes in technology
• Change in BCP team membership
• Anticipated events which may result in a potential business interruption
Organizations may also choose to conduct more tests or exercises if
operations are decentralized across multiple locations. Additionally,
some business continuity coordinators choose to conduct testing in stages
given the size of their IT infrastructure, the size of the business,
or their relative inexperience with BCM testing. Others want to rotate
as many people as possible through the training experience given the
valuable benefits. Regulatory requirements may also influence the number
of tests performed annually. No matter how many tests are conducted
each year, be sure to schedule them in advance to ensure maximum participation.
Develop a progressive, incremental schedule that includes a timetable
of events. Note: The information printed in this answer was authored
by Protiviti and originally published on the ISACA website.
- What are available test options?
Conducting the same test twice a year will quickly lead to stagnant
outcomes and bored participants. It’s important to mix it up. This section
highlights the types of tests available to your organization, as
well as the implications associated with each.
| Test Type | Description and Implications |
|---|---|
| Desk Check (a.k.a. Board Room Style or Table Top Testing) | Assemble recovery team members and walk through the plan using test scenarios and a series of test scripts. Tabletop testing is the safest to do but least useful because recovery strategies are not really tested or operationalized. Visualizing the BCP in action is part of the development process but the value is limited. A more in-depth simulation will provide a stronger understanding of how the response teams work together, as well as a sense of the time needed for recovery and restoration activities. |
| Simulation (a.k.a. Full Scale Interdependency Testing and Walkthroughs) | Simulate a disaster and determine how well the plan responds to the specific event in the operational environment. This method may be the most costly testing method and also the most dangerous to the business if not isolated properly. |
| Procedure Verification Test (a.k.a. Business Function Testing) | Limited in scope to a specific process or business unit, procedure verification testing evaluates the logic of a specific procedure to determine if a deficiency exists through a combination of desk checks and simulations. This approach is useful following an isolated business continuity test failure. |
| Communications (Call Tree Testing) | Communications is a key component of a BCM process. Test the accuracy and completeness of the organization’s employee call tree, customer contact information channels and critical supplier/vendor/business partner contact information as part of a table top exercise or simulation, or potentially as a stand-alone activity. |
| IT Environment (Systems and Application) Walk Through | Conduct an announced or unannounced disaster simulation and execute documented system recovery procedures. The primary objective: Verify critical systems and backup data can be recovered based on a specific timeline and documented application interdependencies. This scenario exercises “active-active” and “active-backup” IT continuity models. |
| Alternate Site Test | A test of all restoration/recovery components at an alternate site. This should include a test of the organization’s ability to relocate staff to the alternate site, as well as a validation that recovery processes and IT assets operate. |
| End to End Testing | A test of alternate site facilities, to include both business and IT. An end to end test differs from an alternate site test in that critical suppliers/business partners and customers – internal or external – are included within the scope. This test typically validates connectivity to the businesses’ production site. |
Regardless of the type of test employed, incorporate actual data and
simulate real-world conditions whenever possible. Additionally, develop
the test scenario based on the results from the risk assessment. Choose
a likely risk to which the organization may be vulnerable. And if you
or your organization is new to business continuity plan testing, start
small. As your business continuity process matures, increase the size
and complexity of the test.
Business continuity coordinators also have the responsibility to capture
the interest of test participants and be original. We have observed
one coordinator who operates his tests like a “Monopoly” game, using
chance cards to insert anticipated variables into the test process.
Others insert a bit of realism by randomly selecting personnel to “sit
out” and observe tests to see how the rest of the team reacts. These
are just a few ideas to add realism and keep exercises interesting.
Note: The information printed in this answer was authored by Protiviti
and originally published on the ISACA website.
- Should the organization expand testing beyond IS?
Absolutely. There are several options available to expand testing throughout
the organization, although a necessary first step is to involve end
users in IT disaster recovery tests. We recommend the creation of a
testing policy that dictates standards and guidelines for exercise participants
and a schedule to include crisis management, business resumption and
IT disaster recovery.
Use a variety of methods for exercising plans:
• Incident-specific exercises on unannounced dates with unexpected
scenarios
• Walkthroughs of existing plans with recovery teams
• Cooperative exercises with key partners and customers
• Industry-wide exercises administered by local industry organizations
or service bureaus
• Local response procedures to a regional crisis
• Simultaneous exercises of interdependent recovery plans
Require Internal Audit (IA) participation for each test. IA and BC personnel
should perform the following tasks:
• Document observations
• Perform follow-up (which should be performed within 3 months when
major deficiencies are noted)
• Repeat until noted issues are satisfactorily resolved, as per Internal
Audit’s observations.
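The observe, follow up, and repeat cycle above can be tracked with a very small record structure. The following Python sketch is one hypothetical way to do so: the `Observation` class, the `overdue` helper, the approximation of "3 months" as 90 days, and the sample entries are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

# "Within 3 months", approximated here as 90 days (an assumption).
FOLLOW_UP_WINDOW = timedelta(days=90)


@dataclass
class Observation:
    """One finding documented by IA or BC personnel during a test."""
    description: str
    major: bool
    noted_on: date
    resolved: bool = False

    def follow_up_due(self) -> Optional[date]:
        # The text only sets a deadline for major deficiencies.
        return self.noted_on + FOLLOW_UP_WINDOW if self.major else None


def overdue(observations: List[Observation], today: date) -> List[Observation]:
    """Return unresolved major observations whose follow-up date has passed."""
    result = []
    for obs in observations:
        due = obs.follow_up_due()
        if not obs.resolved and due is not None and due < today:
            result.append(obs)
    return result


# Hypothetical findings from a single exercise.
findings = [
    Observation("Call tree numbers out of date", major=True,
                noted_on=date(2024, 3, 1)),
    Observation("Typo in evacuation map legend", major=False,
                noted_on=date(2024, 3, 1)),
]
late = overdue(findings, today=date(2024, 6, 15))
print([obs.description for obs in late])
```

Items stay on the overdue list until they are marked resolved, which mirrors the "repeat until noted issues are satisfactorily resolved" step.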
- Why does it take so long to adequately plan for an IT disaster
recovery test?
Often, one of the best ways to measure the adequacy and completeness
of IT disaster recovery strategies and documentation is the ease of
test planning. Although coordination with business functions potentially
impacted by a test is critically important (and when a third party is
utilized, coordination is needed there as well), planning for available
technical resources should be eased due to defined, up-to-date recovery
strategies, recovery teams and recovery/verification/restoration procedures.
Some organizations struggle with test planning because they “reinvent”
the IT disaster recovery strategy for each test – going so far as writing
recovery procedures from scratch to meet the scope of the test – as
opposed to defining and maintaining an “everyday” IT DR process.
If the organization is investing six months of an FTE for each test,
our recommendation would be to reallocate that time into developing
repeatable procedures with necessary resources that could enable recovery
strategies designed to meet business-validated RTOs.
Lastly, the organization should ensure that each test focuses on validation
of business continuity strategies and supporting plans: “Test the plan,
not the people.”
About the Authors: Jonathan Bronson, Brian Zawada, Brett Williams and Alberto Jimenez are Senior Managers for Protiviti Inc. Protiviti is a leading international provider of independent internal audit and business and technology risk consulting services. For more information, visit www.protiviti.com.