We live in a world where data breaches and destructive cyber attacks have become a daily headline. By now, everyone has received a breach notification letter or an email apology from a company impacted by cyber bad guys. Boards are asking, customers are asking, employees are asking, the whole world seems to be asking one simple question: “why?”
The Old Talk Track
I’m not going to kid you: there are a lot of reasons. In fact, the commonly accepted narrative explaining why has almost become a cliché.
“The volume and sophistication of cyber attacks have increased while companies’ ability to defend against the attacks has decreased. Security budgets are insufficient and the talent pool for cybersecurity professionals is in deficit. It’s not a matter of ‘if’ we will be hacked, it’s ‘when.’ And in all likelihood, we are being breached right now and don’t even know it.”
This talk track is naturally recited by CISOs and CIOs, and is usually:
…followed by a security framework, assessment, and benchmark, which…
…aligns to industry leading practices and standards, and is…
…accompanied by a huge budget request for a portfolio of roadmap initiatives which will save the company from impending doom.
Don’t get me wrong, this pattern for pitching cybersecurity is normal – and even necessary – for most companies seeking to stay competitive and secure in a digital world. In fact, recently, Boards have been asking, “Are we doing enough?” In most cases, the answer is clear: “No.” But it is a difficult question to definitively answer with any level of specificity. After all, how much security is ever enough? Like all investments, establishing the “right” level of cybersecurity is a constant balance between value, cost and risk. Perhaps a better question to ask is:
“What are we doing to make sure we can bounce back quickly when our company is attacked?”
In other words…
“Are we resilient to cyber attacks?”
A New Definition
Cyber Resilience is defined as “an organization’s ability to prepare for, respond to, and recover from cyber-triggered business disasters.” This definition has a few key components:
- Preparation – the proactive steps an organization takes to ready itself for an adverse event.
- Response – the reactive measures implemented to counter the impacts of an adverse event.
- Recovery – restoring service during and after an adverse event.
The last critical component is the idea of a cyber-triggered business disaster (synonymous with the “adverse event” terminology above).
In recent years we have seen cyber attacks shift from experimentation, fraud, extortion, blackmail, and data exfiltration to more damaging impacts such as system destruction, data eradication, and data manipulation. The Petya and NotPetya malware showed the world how quickly computer viruses can spread and how damaging they can be to a company’s core mission and operations. The damages caused by NotPetya reached an estimated $10 billion, exceeding the estimated $4-8 billion in losses caused by the WannaCry outbreak one month earlier. That is why we have shifted away from labeling these events “incidents,” “cyberattacks,” or “hacks”: those labels do not capture the severity of their impact on the business. We must recognize that these events are really attacks on the business itself, and that they can have disastrous effects that fundamentally threaten a company’s ability to continue as a going concern.
Today’s Resiliency Function
There is much debate about what constitutes an effective and holistic set of resilience functions, but four disciplines within a company are commonly associated with resilience.
1. Security Incident Response (SIR)
Security incident response (“SIR,” sometimes called incident response or “IR”; for the purposes of this article we differentiate SIR from general IT incident response) typically exists within any high-performing Security Operations Center (SOC). As events are logged, correlated, and analyzed, SOC analysts escalate suspicious activity for investigation. Events may turn into security incidents, which are formally dealt with by trained responders. It is critical to have an effective SIR program in place, as seemingly insignificant events may quickly escalate to massive breaches and have destructive consequences for a company. SOC analysts and security incident responders are the frontline troops in the battle against malware and hackers.
2. Business Continuity Management (BCM)
Business Continuity focuses on keeping the business operating. It is the process of developing and documenting arrangements and procedures that enable the organization to respond to an event that lasts for an unacceptable period of time and to resume critical functions after an interruption. Effective BCM results in the creation and practice of a business continuity plan, which outlines a company’s critical business processes and designs plans for overcoming events and scenarios (such as natural disasters, epidemics, supply chain disruptions, and potential geopolitical risks, just to name a few). While BCM may sound similar to Cyber Resilience, BCM’s mission includes a wider array of business disruptions. By contrast, Cyber Resilience is acutely focused on cyber-triggered business disasters. I believe that high-impact cybersecurity events have become an ever-growing chapter in the book of business continuity plans, and what was once a chapter now deserves a book of its own.
3. Disaster Recovery (DR)
Disaster recovery focuses on getting the technical infrastructure up and running in the event of a disaster. It is the technical (e.g. application, network, platform, and storage) component of business continuity planning to recover a data center, service, or application. Disaster recovery can be at odds with security incident response functions. While DR personnel’s objective is high availability and their mission is to restore service as quickly and seamlessly as possible, Security Incident Responders care more about threats to system/data confidentiality and integrity. Though restoring system availability is critical, Security Incident Responders work to understand the root cause and source of the attack in order to implement the appropriate countermeasures, which can include quarantining and isolating portions of the network or infected systems.
4. Crisis Management (CM)
Crisis management focuses on responding to extreme disruptions that threaten the financial, operational or reputational assets of a company. It is a coordinated plan for responding to, and managing through, damaging events. Crisis management helps companies respond to widespread, rapidly escalating, high-impact events not traditionally covered by BCM.
There are also a number of supporting functions that contribute to a company’s resilience agenda, such as Enterprise Risk Management (ERM), Internal Audit (IA), Legal, Public Relations (PR) (when crisis hits), and Fraud/Investigations. While I described the capabilities above in silos, more advanced companies are fusing together resilience-related functions across the multiple “lines of defense.” This convergence has helped organizations move from detective to preventive and from reactive to predictive, but it takes significant and deliberate effort to get there.
Until recently, this level of integration between resilience functions was difficult, to the point of being impossible in large organizations. Advances in technology (e.g. artificial intelligence and machine learning), system monitoring/sensors, analytics and reporting, and platform integrations/APIs have enabled us to bring together data faster to make smarter decisions with less effort.
Yet even as the people, processes and technology supporting these resilience functions are converging, there still remains a critically missing piece of the puzzle that I believe is one of the biggest culprits for companies failing to effectively respond to cyber-triggered business disasters.
The Missing Piece
The SIR-BCM-DR-CM model has remained relatively unchanged since its constituent disciplines were formalized into corporate functions. Yet I believe there is a critically missing piece in this model: a function that is absent from many companies and whose absence is one of the biggest (but not the only) contributing factors to the sharp increase in cyber-triggered business disasters.
It is the discipline of Cyber Crisis Management.
This discipline is the missing link between security incident response and business continuity management. It serves as the coordinating function when a cyber-triggered business disaster occurs that exceeds the severity, impact, duration, or organizational reach of traditional security incident response functions. It allows SOC operators and security incident responders to focus on defending against the attack while yielding command and control and, perhaps most importantly, coordination to another organization empowered to resolve the cyber crisis end-to-end.
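The hand-off described above can be pictured as a simple routing rule. The following is a minimal sketch, not a prescribed method: the `Incident` fields, the `SirLimits` thresholds, and the function names are all hypothetical, chosen only to illustrate the four dimensions named in the text (severity, impact, duration, and organizational reach). Real thresholds would come from a company's own risk appetite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    """Hypothetical snapshot of an ongoing security incident."""
    severity: int            # 1 (low) to 5 (critical); illustrative scale
    services_down: int       # critical business services currently unavailable
    hours_open: float        # hours since the incident was declared
    functions_involved: int  # corporate functions that must coordinate


@dataclass(frozen=True)
class SirLimits:
    """Hypothetical limits of what routine SIR is expected to handle."""
    max_severity: int = 3
    max_services_down: int = 0
    max_hours_open: float = 24.0
    max_functions_involved: int = 2


def exceeded_dimensions(incident: Incident, limits: SirLimits = SirLimits()) -> list[str]:
    """Return the dimensions on which the incident exceeds the SIR limits."""
    checks = {
        "severity": incident.severity > limits.max_severity,
        "impact": incident.services_down > limits.max_services_down,
        "duration": incident.hours_open > limits.max_hours_open,
        "organizational reach": incident.functions_involved > limits.max_functions_involved,
    }
    return [name for name, over in checks.items() if over]


def route(incident: Incident, limits: SirLimits = SirLimits()) -> str:
    """Escalate when any single dimension exceeds what SIR can handle."""
    if exceeded_dimensions(incident, limits):
        return "cyber crisis management"
    return "security incident response"


routine = Incident(severity=2, services_down=0, hours_open=3.0, functions_involved=1)
outbreak = Incident(severity=5, services_down=4, hours_open=30.0, functions_involved=6)
```

In this sketch a single exceeded dimension is enough to escalate, which reflects the "or" in the criteria above. The responders keep working the incident either way; only command, control, and coordination move.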
Many security operations centers do a fine job of triaging security incidents that arise in the daily course of monitoring the corporate environment. However, when large-scale cyber attacks occur that result in loss of critical business services – and require coordination of many corporate functions – the SOC does not perform well in resolving the problem. Nor should it! It is beyond the SOC’s remit for two reasons: skill and scale.
On the skill front, to use a medical analogy: it’s like asking your family doctor to perform heart bypass surgery. Though he or she may be a doctor, one needs a specialist with a different set of skills to ensure a more successful outcome. Likewise, a cyber-triggered business disaster requires specialists with a different set of skills.
Sticking with the medical analogy, on the scale aspect: it is akin to one’s local emergency room (ER) handling a widespread outbreak of Ebola. Though ER physicians may initially be on the front line, it won’t be long before the epidemic is escalated to the Centers for Disease Control and Prevention (CDC), which has the additional resources and broad reach needed to manage and quarantine a major health crisis. Cyber-triggered business disasters operate at a scale of epidemic proportions, yet we often respond to them like ordinary visits to the ER.
Integrating Cyber Crisis Management
Traditionally, large-scale cyberattacks may have resulted in activation of a contingency plan or scenario described in the business continuity plan. The issue with this is: cyber attacks are unpredictable, indicators of co