A challenge for any organization is dealing with single points of failure (SPOFs). By its classic definition, a single point of failure is a part of a system with no backup; in other words, an operational vulnerability that will cause the entire system to stop working if that part fails.
This may look like a single server with no backup server. If it goes down, all access to that application goes down with it.
But there is also a people side to single points of failure. From a staff perspective, a SPOF might stem from the size of the organization (e.g., a small one where one individual wears many hats with no backup), from specialty tasks performed by one individual, or simply from an organization’s inability or failure to cross-train.
Let’s break this down a bit further. If you think hard enough about your own organization, you may find there are more SPOFs than you would care to admit. The key, however, is criticality, the driving force behind determining if the SPOF requires additional consideration or not. Whether a single point of failure should be considered a vulnerability does not depend solely on whether one individual in your organization performs a task that no one else can do. How it would impact the business should it fail is more important.
Let’s take, for example, a credit union with only one real estate loan officer. Assuming the organization depends on this line of business, it would very likely need a plan to address the SPOF. This is where mitigating controls should be implemented. In our example, this might look like a contract to have underwriting performed externally, cross-training of another loan officer, and, perhaps most importantly, a requirement that the procedures used in the process are well documented. Compare that to a staff member who performs a task that does not require special skills. In this case, the staff member could be replaced easily after an unexpected absence or resignation, without impact to the core business.
When critical SPOFs are identified, there must be a call to action. Considerations should revolve around how badly the system would fail, including promises made to members or clients that will go unfulfilled, loss of income from the specific business line the individual was running, reputational risk from marketplace perception, and the effect on those in the organization who depend on the line of business for their livelihood.
Identifying single points of failure is not difficult; in many cases they are well known but simply not addressed appropriately. The first thing to do is to bake the identification of SPOFs into the culture of the organization. It is tempting to ignore SPOFs. It requires no additional effort to say, “I trust that the process will not break.” But it does require extra work to build controls around a SPOF to mitigate its effect should the worst happen.
In some cases, managers may not want to admit that a SPOF exists in their department, either for fear of retribution or because they understand that additional resources would have to be added as a loss mitigation control. Baking it into the culture requires a cooperative tone and a documented outline that managers can use to address these situations.
So what does the plan entail? First, a question set regarding the criticality of the SPOF to determine if a call to action is required. For example, how will our organization be affected if the task cannot be completed for a day, a week, a month? Will it affect income or member needs? Could the task be easily picked up by somebody untrained?
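The question set above could be captured as a short triage sketch. This is only an illustration: the field names, the `needs_action` function, and the seven-day default threshold are assumptions chosen for the example, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class SpofAssessment:
    """Answers to the criticality questions for one task (hypothetical fields)."""
    task: str
    affects_income: bool         # Will a failure affect income?
    affects_members: bool        # Will a failure affect member needs?
    easily_picked_up: bool       # Could somebody untrained pick up the task?
    tolerable_outage_days: int   # How long the task can go undone before harm


def needs_action(a: SpofAssessment, max_tolerable_days: int = 7) -> bool:
    """Return True when the SPOF looks critical enough to require a plan.

    The 7-day default is an assumed threshold; each organization would
    choose its own.
    """
    if a.easily_picked_up:
        return False
    has_business_impact = a.affects_income or a.affects_members
    return has_business_impact and a.tolerable_outage_days <= max_tolerable_days


loan_officer = SpofAssessment(
    task="real estate loan underwriting",
    affects_income=True,
    affects_members=True,
    easily_picked_up=False,
    tolerable_outage_days=1,
)
general_task = SpofAssessment(
    task="task requiring no special skills",
    affects_income=False,
    affects_members=False,
    easily_picked_up=True,
    tolerable_outage_days=30,
)
```

Here `needs_action(loan_officer)` flags the sole loan officer for a call to action, while `needs_action(general_task)` does not, mirroring the two examples discussed earlier.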
Once you have determined the criticality of the task, you can move on to creating a list of loss mitigation controls. Assuming the task is indeed one that would be dire should it fail, these loss mitigation controls might include:
- Cross-training plans
- Ability to outsource
- Documentation of processes necessary to complete the tasks associated with the individual
- Requiring individuals who are SPOFs to document procedures, promises, pipeline sales, new initiatives, and business plans
- Having a replacement plan and understanding the potential resource pool available
- Understanding what systems and subsystems will need to be retooled and the length of time it will take to perform that function
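One way to keep the list above actionable is to track, for each critical SPOF, which controls are already in place and which are still missing. The sketch below is a minimal illustration; the control identifiers and the `missing_controls` helper are hypothetical names that simply mirror the bullets above.

```python
# Hypothetical identifiers, one per loss mitigation control listed above.
CONTROLS = [
    "cross_training_plan",
    "outsourcing_option",
    "process_documentation",
    "individual_documentation",
    "replacement_plan",
    "systems_retooling_estimate",
]


def missing_controls(in_place: set[str]) -> list[str]:
    """Return the controls not yet in place, in checklist order."""
    unknown = in_place - set(CONTROLS)
    if unknown:
        # Guard against typos so a gap is never hidden by a misspelled name.
        raise ValueError(f"unknown controls: {sorted(unknown)}")
    return [c for c in CONTROLS if c not in in_place]


# Example: only cross-training and process documentation exist so far.
gaps = missing_controls({"cross_training_plan", "process_documentation"})
```

The resulting `gaps` list gives a manager a concrete set of items to raise when asking for resources.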
Single points of failure live in every organization. Identifying them and developing a management culture to address them proactively is the key to making sure no one person could dramatically affect the organization’s existence, reputation, or long-term strategic goals.