In the digital age, where both businesses and consumers thrive on seamless connectivity and uninterrupted service, recent major outages have raised alarm bells. From ChatGPT outages to other tech giants facing unforeseen downtime, the financial repercussions of these outages can be staggering and extend beyond simple monetary loss. According to Dun & Bradstreet, 59% of Fortune 500 companies endure a minimum of 1.6 hours of downtime each week, with an average weekly cost ranging from $643,200 to $1,056,000.
Companies have also seen their reputations suffer in these costly moments. Beyond the immediate losses, a new concern has emerged: how can companies effectively protect themselves against the impact of future outages? Downtime, the period in which systems are inaccessible or not functioning optimally, severely disrupts user access to online services, halts employee productivity, and prevents customers from interacting with an organization.
Because the Internet is an intricate web of interconnected systems, networks, and applications, these outages can escalate quickly and significantly damage an organization's reputation. The statistics paint a grim picture. Forrester's 2023 Opportunities Summary found that:
1/ 37% estimated their businesses lost between $100,000 and $499,000, and 39% lost between $500,000 and $999,999 due to internet outages.
2/ Disruptions also hurt companies internally by increasing employee turnover (55%) and reducing workforce productivity (49%).
3/ Without adequate visibility, companies are experiencing an average of 76 interruptions per month.
4/ 75% of respondents said that Internet Performance Monitoring (IPM) would have a significant or large positive impact on their business.
The US AI market is worth an estimated $87.18 billion to $167.3 billion, and its growth is causing the digital landscape to evolve at breakneck speed. The growing reliance on AI-powered applications is highlighting the need for proactive monitoring against downtime. The ChatGPT outage on February 14 affected both the ChatGPT service and its clients running GPT-based chatbots through an API. Monitoring AI dependencies will be critical for all businesses, from startups to large enterprises.
Co-founder and CEO of Catchpoint.
Case in point
In December 2023, Adobe's extensive customer base was impacted by a series of outages to Adobe Experience Cloud that lasted 18 hours. While AI was not yet part of the platform, many companies are increasingly relying on the technology, and this disruption serves as an example of what could happen once it is more deeply integrated. In fact, the overall disruption of Adobe Experience Cloud highlights the vulnerabilities inherent in relying on third-party services within digital infrastructure. The outage, caused by a failure in Adobe's cloud infrastructure, resulted in significant service disruptions, impacting critical functions across multiple platforms.
According to Adobe, data collection (segment publishing), data processing (cross-device analytics, analytics data processing), and reporting applications (analytics workspace, legacy report builder, data connectors, data feeds, data warehouse, web services API) were all affected by the outage. During the outage, users experienced lags and slow performance across various Adobe services. Post-mortem investigation traced the root cause to issues within Adobe's cloud infrastructure, which produced latency spikes and long loading times for users.
The failure within Adobe's infrastructure had far-reaching consequences, impacting businesses and users who rely on Adobe services for daily operations. On top of that, Adobe was at risk of incurring Service Level Agreement (SLA) violations for millions of customers. An SLA establishes a defined deadline within which tickets, chats, and calls must be answered; failing to respond within that time frame constitutes an SLA violation. Penalty payments often follow, and customer loyalty can be tested.
Adobe's outage was more than a disruption: it served as a wake-up call for companies using its services to reevaluate their broader approach to digital resilience. The magnitude of the outage, affecting so many Adobe services, serves as a valuable reminder of the need for businesses to always make contingency plans and take proactive measures to protect against future disruptions.
So how can companies better navigate the risks and create a solid path to Internet resilience? It fundamentally requires a shift toward real-time visibility into application performance, enabling teams to identify potential bottlenecks or other weak points before they become full-blown crises. By monitoring AI (and other) dependencies with precision, organizations can preemptively address vulnerabilities, strengthen their digital infrastructure, and mitigate the consequences of unforeseen disruptions.
Protect against downtime in the age of AI
It is undeniable that in today's fiercely competitive landscape, even the briefest service interruption poses a significant risk to consumer confidence and brand trust. To counter these risks, organizations must take a proactive stance toward performance monitoring, particularly those related to AI-powered applications, which are quickly becoming part of everyday business. Unlike traditional applications, AI-powered systems often operate autonomously and make split-second decisions based on large amounts of data.
Any disruption to these systems can lead to a cascade of errors and delays, resulting in a disruption of user interactions and ultimately a loss of brand trust. Real-time visibility into application performance allows businesses to quickly detect anomalies, optimize functionality, and maintain seamless interactions with users. The ability to quickly identify and address issues as they arise allows IT teams to maintain operational continuity and mitigate potential damage.
Predictive analytics and AI-powered anomaly detection play a critical role in preemptively identifying potential issues before they disrupt the end-user experience. As reliance on AI technologies continues to grow, uninterrupted service will become an increasingly critical business imperative. However, achieving early detection can be challenging.
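As a rough illustration of what anomaly detection on performance data looks like, the sketch below flags latency samples that jump far above their recent baseline using a rolling z-score. The window size, threshold, and sample data are assumptions for illustration; production IPM tools use far richer statistical and ML models:

```python
import statistics

def detect_latency_anomalies(latencies_ms, window=20, threshold=3.0):
    """Flag samples more than `threshold` standard deviations above
    the rolling mean of the previous `window` samples."""
    anomalies = []
    for i in range(window, len(latencies_ms)):
        baseline = latencies_ms[i - window:i]
        mean = statistics.fmean(baseline)
        stdev = statistics.pstdev(baseline)
        if stdev > 0 and (latencies_ms[i] - mean) / stdev > threshold:
            anomalies.append(i)
    return anomalies

# Steady ~100 ms latency with one spike (450 ms) at index 25:
samples = [100.0 + (i % 3) for i in range(25)] + [450.0] + [100.0] * 5
print(detect_latency_anomalies(samples))  # [25]
```

The point of the rolling baseline is that "normal" is defined by recent behavior, so the same rule adapts as traffic patterns drift through the day.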
Many businesses still rely on basic uptime monitoring, often restricted to just their home page, leaving them vulnerable to intermittent or partial site failures when an AI-dependent service breaks. To defend against AI-induced downtime, organizations must implement holistic monitoring strategies such as Internet Performance Monitoring (IPM), which spans the entire spectrum of AI-powered applications, from frontend interfaces to backend data processing pipelines.
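A minimal sketch of what monitoring "beyond the home page" means in practice: probe each critical dependency separately, so a failing AI backend is caught even while the home page still loads. The endpoint names and URLs below are illustrative placeholders, not real services:

```python
import urllib.request
from urllib.error import URLError

# Hypothetical dependency map: every critical service gets its own probe,
# not just the home page. These URLs are placeholders for illustration.
ENDPOINTS = {
    "home":            "https://example.com/",
    "search_api":      "https://example.com/api/search?q=ping",
    "ai_chat_backend": "https://example.com/api/chat/health",
}

def probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers with an HTTP 2xx within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (URLError, TimeoutError, OSError):
        return False

def site_status(endpoints: dict[str, str]) -> dict[str, bool]:
    """Probe every dependency and report per-service health."""
    return {name: probe(url) for name, url in endpoints.items()}
```

Even this toy version surfaces the partial-failure case the article describes: `site_status` can report the home page healthy while the AI backend probe fails, which single-URL uptime checks would miss entirely.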
By proactively monitoring AI dependencies and implementing robust performance management frameworks, companies can mitigate the risks of costly downtime and maintain operational continuity in an increasingly AI-driven landscape. Consider this a call to action to think ahead and better protect the business by anticipating these challenges and equipping operations teams to better manage them.
This article was produced as part of TechRadarPro's Expert Insights channel, where we feature the best and brightest minds in today's tech industry. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing, find out more here.