Why disaster recovery plans fail in geopolitical crises

Geopolitical tensions have risen around the globe. While the Russia-Ukraine conflict has done little to change business continuity and disaster recovery practices in "safe" locations, the war in Iran has served as a wake-up call, causing more CIOs to rethink the scope of their disaster recovery plans.

According to Kapish Vanvaria, risk consulting leader at EY Global and Americas, traditional disaster recovery planning often underestimates rare and extreme events because such plans are built around "known knowns," such as hazards, fixed scenarios and predictable timelines, whereas geopolitical disruption is rarely contained. A disaster originating in one domain can rapidly cascade into supply chain constraints, regulatory changes, vendor limitations and connectivity disruption that traditional disaster recovery plans tend not to anticipate.

"CIOs are also recognizing that the boundary between nation-state dynamics and private enterprise is increasingly blurred," Vanvaria said in an email interview. "Organizations of all sizes can be directly impacted; challenging assumptions about who is exposed and how quickly impacts can materialize."

For example, in March and April, AWS data centers in the United Arab Emirates and Bahrain were hit by Iranian kamikaze drones, disrupting service. An Oracle office in Dubai also suffered minor damage from Iranian missile debris in early April.

While the Big Tech companies are obvious targets, no organization should consider itself safe from disruption or supply chain woes caused by geopolitical unrest. The problem is that IT departments have been so focused on cybersecurity and insider threats that they've overlooked other threats that could cause serious disruption.

"[The] Russia-Ukraine [war] didn't wake them up, nor did the 12-day war last year [between Iran and Israel].
It literally took the entire Middle East to be in flames, to [realize], maybe we should have thought about this," said Stefano Ritondale, chief intelligence officer at AI-driven intelligence firm Artorias.

What organizations need to do is anticipate the possibilities before disaster strikes, such as making sure cybersecurity teams are aware of foreign hacking techniques and the IP addresses they tend to use for attacks. More fundamentally, when disaster strikes, a common reaction is to invest aggressively in technology, whereas organizations should start with the in-depth conversations necessary for risk mitigation. Vanvaria said organizations are increasingly breaking down silos among technology, legal, risk, cyber and compliance functions to proactively map complex scenarios, making business resilience a cross-functional business capability rather than just an IT playbook.

The impact of cloud, internet and telecom disruption

Most organizations have hybrid cloud infrastructure in part to ensure resiliency. While the major cloud providers have data centers around the world and are continuing to build more, the density of those data centers varies by region, making some areas more vulnerable to service disruption. More broadly, cloud, internet and telecom infrastructure are primary targets because disabling them can hamper an enemy's ability to communicate and operate effectively. This is a threat not only to companies with a presence in war zones, but one that all CIOs should consider today.

Bernard Brantley, CISO of network detection and response solution provider Corelight, would not have considered the obliteration of an AWS data center in a risk model a year ago, but times have changed. "Have we updated our mental model to include the disaster risk, which now, in the current geopolitical climate, includes large [numbers] of services going offline, potential full-scale disruption of infrastructure due to military action? We would not have [included that in the] calculus and assessment [previously]. I think it's very important for us to do now."

According to Vanvaria, organizations need the ability to fail over operations by taking advantage of geographic redundancy, alternative connectivity routes and architectures that allow critical functions to continue running under constrained network conditions.

While enterprises have redundancy for failover, an important point has been overlooked: What would happen if everything were down simultaneously? Some organizations may have considered this, but they haven't necessarily backed it up with an exercise that simulates such a scenario.

"People plan for scenarios that may take a few days or a week, but that's not reality. If you're running from a scratch backup, it can take six months to a year to get up and running," said Kim Larsen, group CISO at cloud-native data protection and recovery solution provider Keepit and a former member of the Danish police force.

In today's geopolitical climate, CIOs are wise to start thinking more like military strategists when it comes to business continuity and DR.

For example, in 2025, cloud connectivity company Cloudflare experienced a few outages, the worst of which was a complete six-hour outage that directly affected many organizations, including Canva, ChatGPT, Uber and X. The incident also served as a wake-up call for Chris Campbell, CIO at DeVry University.

"A Cloudflare outage may not affect my website, but it could take down three or four applications that rely on it," Campbell said. "If you don't have a well-documented understanding of your technology stack and its interaction with your customers and your internal processes, you're probably operating in a high-risk scenario."

There is also the question of what to do with employees who live and work in a war-torn region. For example, Corelight has offered relocation services or stipends to employees who need to get to a safe place. After witnessing a similar scenario at Amazon previously, Brantley said it's important to understand what employees will need to stay safe, prioritizing their health and building programs around that.

Vanvaria said continuity depends on the ability to redistribute work rapidly if a location becomes unavailable. This includes remote work readiness, cross-training, clear authority handovers and, where appropriate, prearranged third-party capacity to temporarily absorb key functions, with employee safety and well-being as a priority.

Other surprises in 2026

The current world situation demands that CIOs adapt to it, but they have plenty of other demands on their attention. For example, data analytics company FICO has been running into infrastructure limits because of the GPUs it uses for AI.

"We've gone from a world where cloud capacity felt unlimited to one where silicon and power are rationed," said Mike Trkay, CIO at FICO. "Energy procurement and grid stability are now two of our top three operational priorities."

Previously, FICO addressed model scaling with specialized hardware. Now, it is using underutilized CPUs and rethinking model efficiency as a core part of how the company deploys its solutions. By squeezing high-performance execution out of standard x86 and ARM architectures, FICO has been able to bypass the GPU bottleneck for a significant portion of its predictive modeling.

It isn't just a technical fix; it's also a strategic hedge against hardware volatility.
FICO treats GPUs as a fluid, precious resource, using AI-driven agents to dynamically provision spot instances and shift workloads across providers in real time based on cost.

For disaster recovery, FICO designed operational models using a multi-geo approach for cost optimization and GPU resource availability. This ensures that disaster recovery requirements become inherent to the operating model.

"Inference needs to become geo-agnostic. We need to move toward a model where workloads are stateless relative to the data center, and free to migrate wherever compute resources are most available, stable and cost-effective," Trkay said. "If we stay within the appropriate regulatory jurisdiction to satisfy data sovereignty requirements, where the 'thinking' happens shouldn't matter. IT Ops and workload distribution are now as much about resource optimization as they are about resiliency and DR."

Darren Cassidy, CIO at AI-enabled digital experience software provider Sitecore, said his organization has had to accelerate the maturity of its resilience at a pace that was previously uncomfortable.

"It has forced us to change how we treat risks and manage that," he said. "So, for example, it's not enough now just to have the piece of paper saying your audit process passed and you've got certification. You've got to go and robustly challenge those processes, do tabletop exercises and try to break things."
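
One way to start "trying to break things" on paper is to write down which applications depend on which providers and ask what a single outage would take with it, the situation Campbell describes with Cloudflare. The Python sketch below is only an illustration under stated assumptions: the dependency map, every component name in it and the `blast_radius` helper are hypothetical, not drawn from any organization quoted here.

```python
from collections import deque

# Hypothetical dependency map: each key depends directly on the listed items.
# Every name here is illustrative, not taken from any real inventory.
DEPENDS_ON = {
    "student-portal": ["sso", "cdn-provider"],
    "sso": ["identity-db"],
    "payments": ["sso", "payment-gateway"],
    "public-website": ["origin-hosting"],
    "payment-gateway": ["cdn-provider"],
}


def blast_radius(failed, depends_on):
    """Return every component that directly or indirectly depends on `failed`."""
    # Invert the map so we can walk from a failed item to its dependents.
    dependents = {}
    for component, requirements in depends_on.items():
        for requirement in requirements:
            dependents.setdefault(requirement, set()).add(component)

    # Breadth-first walk; the `affected` set also guards against cycles.
    affected = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for component in dependents.get(current, ()):
            if component not in affected:
                affected.add(component)
                queue.append(component)
    return affected


if __name__ == "__main__":
    for provider in ("cdn-provider", "identity-db"):
        print(provider, "->", sorted(blast_radius(provider, DEPENDS_ON)))
```

In this made-up inventory, losing the CDN provider leaves the public website untouched but takes down the portal, the payment gateway and, through the gateway, payments. A real exercise would populate the map from an actual asset inventory and would likely include people, sites and vendors as well as software.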
