Continuity at the Core: Continuity Strategies

In the previous article in this series, we introduced the Business Impact Analysis (BIA), a valuable tool for identifying and categorizing business functions (a.k.a. activities) based on the impact a disruption would have on business operations. The categories range from “critical” to “non-essential.” Using the BIA, we can measure the impact in quantitative (dollars) and qualitative (high, medium, low) terms and establish key recovery objectives for each function: the recovery time objective (RTO), recovery point objective (RPO), and maximum tolerable downtime (MTD). A key component of the BIA is a thorough inventory of the resources required to perform each function. Resources include technology, equipment, people, processes, and information.

To complement the BIA, a Threat/Risk Assessment is performed to identify likely events and incidents that threaten to disrupt business operations, measuring both the probability and impact based on existing controls and residual risk.

In this article, we will discuss the process of selecting and implementing cost-effective strategies that minimize the risk of downtime and enable the organization to continue performing the functions determined to be most critical, or to recover them within a time frame that meets their recovery time objectives (RTO).

For the purpose of this article, we will divide the strategies into four groups or perspectives:

  • Facilities
  • Technology
  • Processes
  • People

Facilities

As a CUSO or credit union, you need workspace for your staff to perform both front-office and back-office activities, secure space for technology, and space for engaging with clients and members. If all of these areas are contained within a single facility, threats such as a fire or a severe weather event could jeopardize your ability to maintain critical business functions.

There are several strategies available to mitigate the risk, including:

  • Adding a permanent second facility (branch) in a geographically separate location
  • Arranging for a temporary alternate facility (reciprocal agreement, hot site, mobile site, etc.)
  • Strengthening the resilience of the facility (multiple power sources, fire suppression systems, etc.)

Technology

Now more than ever, the availability of secure and reliable technology is crucial to business operations. More interaction with members is occurring over online and mobile channels. There is more integration with third-party vendors. The expectation of 24/7 access is the new normal. Identifying single points of failure and implementing redundancy for critical components is necessary to ensure uninterrupted service. Wherever redundancy is used, regular failover testing is recommended to confirm the capabilities of the technology and validate procedures.
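As a rough illustration of how a resource inventory can be screened for single points of failure, the minimal Python sketch below flags any component that has fewer than two independent instances. The component names and counts are hypothetical placeholders, not figures from an actual BIA, and a real review would also consider shared dependencies (for example, two firewalls on the same power circuit).

```python
# Hypothetical inventory: component name -> number of independent instances.
# These names and counts are illustrative assumptions only.
inventory = {
    "firewall": 2,
    "internet_provider": 1,
    "core_server": 2,
    "utility_power_feed": 1,
}


def single_points_of_failure(components):
    """Return the sorted names of components with fewer than two instances."""
    return sorted(name for name, count in components.items() if count < 2)


# Components listed here have no redundancy and are candidates for mitigation.
print(single_points_of_failure(inventory))
```

A list like this is only a starting point; each flagged item still needs to be weighed against the criticality of the functions that depend on it.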

Examples of strategies to mitigate risk of disruptions to technology include:

  • A backup generator along with a UPS to provide an alternate source of electricity in the event of a utility power outage
  • Redundant firewalls with auto-failover capabilities
  • Redundant data communications providers (ISPs)
  • Server high availability with data replication

The point at which technology and facilities overlap is addressed in the data center strategy. Often, the servers that store data and run the applications used by staff and members are housed in a secure room (data center) at a single location. While this is effective for cost control and security, an interruption at that site affects the entire network.

Depending on the tolerance for downtime and the organization’s RTO and RPO objectives, alternate data center recovery strategies include:

Cold site: A backup facility with basic utilities (power, HVAC, etc.) and space for IT equipment. This strategy requires the acquisition of hardware and the installation and configuration of software and data, so recovery takes a significant amount of time. The cold site is the least expensive solution (in upfront costs) for an alternate data center.

Warm site: A backup facility with basic utilities, data communications, and IT equipment, but no data. This strategy is more costly than a cold site and requires the installation and configuration of software and data, but requires less time for recovery than a cold site solution.

Hot site: A backup facility with utilities similar to those at the primary site, along with data communications and IT equipment, including a copy of the data from servers at the primary data center. Recovery time is significantly reduced. The hot site is typically the most expensive solution for an alternate data center.

Mobile site: A backup facility that provides the features and recovery capabilities of a warm site but requires delivery and setup. Often managed by a vendor in a subscription-based business model, the mobile site can be a cost-effective solution for an alternate data center.

Colocation site: A production or backup facility that is hosted by a vendor or technology service provider. IT equipment may be provided and managed by the client (leasing rack space from the vendor) or provided and managed by the vendor.

DRaaS (Disaster Recovery as a Service): A cloud-based solution for replicating IT equipment (servers) and data.

The appropriate data center strategy should be selected carefully, weighing its initial and ongoing costs against the value it provides.
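One simple way to frame this comparison is to choose the least expensive option whose expected recovery time still fits within the RTO. The Python sketch below shows the idea; the recovery hours and annual costs are placeholder assumptions for illustration only, and the names (`SiteOption`, `cheapest_option_meeting_rto`) are hypothetical. Real figures should come from vendor quotes and recovery test results.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SiteOption:
    name: str
    recovery_hours: float  # assumed time to restore service at this site
    annual_cost: float     # assumed ongoing cost in dollars per year


# Placeholder figures for illustration only.
OPTIONS = [
    SiteOption("cold site", 168.0, 20_000.0),
    SiteOption("warm site", 48.0, 60_000.0),
    SiteOption("hot site", 4.0, 150_000.0),
]


def cheapest_option_meeting_rto(
    options: List[SiteOption], rto_hours: float
) -> Optional[SiteOption]:
    """Return the lowest-cost option that recovers within the RTO, or None."""
    candidates = [o for o in options if o.recovery_hours <= rto_hours]
    return min(candidates, key=lambda o: o.annual_cost, default=None)


choice = cheapest_option_meeting_rto(OPTIONS, rto_hours=72.0)
print(choice.name if choice else "no option meets the RTO")
```

When no option meets the RTO, the function returns `None`, which is a signal either to look at faster (and usually costlier) strategies or to revisit the recovery objective itself.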

Processes

While the facilities provide the space for staff to work and the technology provides the data, applications, and connectivity to vendors and clients, it’s the processes that enable the organization to conduct business through the products and services it offers.

Strategies to enhance the availability of processes include:

  • Developing manual or workaround procedures for critical business functions and processes
  • Implementing a distributed workforce to minimize the risk of an interruption at any one site
  • Outsourcing select business activities such as call center, collections, and bookkeeping
  • Participating in a regional or national shared branching/ATM network for basic member functions

People

While many tasks can be automated using technology, the accumulated skills, knowledge, and experience of the staff are required to execute the processes, interact with clients, and respond to incidents that threaten to disrupt operations.

Ensuring adequate staff size, well-documented procedures, and effective cross-training is instrumental in mitigating the risk of interruption to critical business functions and in enabling a prompt and effective response.

Additional strategies include:

  • Clearly defined roles and responsibilities
  • Skills gap analysis to identify opportunities for increased training or outsourcing
  • Succession planning for all key positions
  • Regular testing and exercising to validate procedures and provide hands-on experience

Each new strategy considered should address a weakness or gap in the organization’s ability to meet its recovery objectives and/or provide value that is equal to or greater than the cost.
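For the quantitative side of that comparison, one common approach is to estimate the annualized loss (the loss per event multiplied by the expected number of events per year) with and without the strategy, and compare the reduction to the strategy's annual cost. The Python sketch below assumes that approach; the function names and all input values are hypothetical, and qualitative value (such as member confidence) is not captured.

```python
def annualized_loss(loss_per_event: float, events_per_year: float) -> float:
    """Expected yearly loss: loss per event times expected events per year."""
    return loss_per_event * events_per_year


def strategy_is_justified(
    loss_per_event: float,
    events_per_year_before: float,
    events_per_year_after: float,
    annual_cost: float,
) -> bool:
    """True when the reduction in annualized loss covers the strategy's cost."""
    reduction = annualized_loss(loss_per_event, events_per_year_before) - (
        annualized_loss(loss_per_event, events_per_year_after)
    )
    return reduction >= annual_cost


# Hypothetical inputs: a $100,000 outage expected once every two years,
# reduced to once every four years by a strategy costing $20,000 per year.
print(strategy_is_justified(100_000.0, 0.5, 0.25, 20_000.0))
```

The same inputs with a higher annual cost would fail the check, which is the point at which qualitative benefits need to carry the decision.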

Once the continuity and recovery strategies have been evaluated, selected, and implemented, it’s time to develop the processes and procedures required to respond to the disruptive threats identified in the Threat/Risk Assessment and BIA. In the next article of this series, we’ll look at how to document these processes and procedures in the form of the Business Continuity Plan.

If you have any questions about building a Business Continuity Management program at your organization, I can be reached at my contact information below.
