On February 22, more than 73,000 AT&T customers in the US reported a network outage that lasted more than eight hours. AT&T responded quickly, suggesting customers use Wi-Fi calling, AP reported. The same day, AT&T assured its customers that the outage was not the result of a cyberattack but rather a technical error.
The company's response to this widespread disruption offers lessons for organizations on how to communicate with internal and external stakeholders during and after a crisis and how to be prepared for potential technical issues that could become major obstacles to the business.
What caused the AT&T outage?
On February 22, AT&T wrote: “Based on our initial review, we believe that today's outage was caused by the application and execution of an incorrect process used while we were expanding our network, not by a cyber attack.”
AT&T contacted the Cybersecurity and Infrastructure Security Agency (CISA), the Federal Communications Commission, the Department of Homeland Security and the Federal Bureau of Investigation regarding the outage, fueling some rumors of a possible cyberattack. CISA defines communications as a critical function.
SEE: CISA and IBM collaborated on a new cybersecurity certification course. (TechRepublic)
AT&T's response to outage shows effective communication
On February 22, AT&T quickly informed customers what had happened and why through social media, its mobile app, its website, and its virtual assistant. Once information became available, AT&T told all interested parties that the outage was not caused by a malicious actor. AT&T communicated with its individual customers, business customers and employees at the same time in a public letter dated February 25 from CEO John Stankey.
Individual and small business customers affected by the outage are eligible for a $5 credit, likely in the next billing cycle. Business customers are invited to discuss the situation: “We are also working closely with our mid-market and enterprise customers and will address their concerns as those conversations occur,” according to Stankey's letter.
Stankey explained the reason for the exact credit amount (“For that reason, I think crediting those customers for essentially a full day of service is the right thing to do”) and apologized for the inconvenience. This transparency can help limit the loss of customer trust that can follow an organization-wide incident.
AT&T's communications, marketing, product and operations teams worked closely to coordinate data sharing and updates, AT&T told TechRepublic. Those teams also kept retail and customer service teams updated in case of customer calls and store visits related to the outage.
“In crises, speed is everything,” AT&T spokesman Jim Greer said in an email to TechRepublic. “We sought to put the customer first and acted quickly to provide them, along with employees, investors and regulators, with answers to what was a rapidly developing situation.”
What can IT in particular learn from the AT&T outage?
Human error happens to all of us. There's a reason why PEBCAK – “a problem exists between the chair and the keyboard” – is an established acronym. Whatever went wrong with the network upgrade, it appears to have been part of the normal course of business.
The AT&T outage emphasizes the importance of testing backups, redundant systems, and emergency preparedness plans. For cellular carriers, alternative channels such as Wi-Fi calling, satellite service, or a carrier-independent SIM card could be good backups in case of emergency. Pointing customers to these options reassures them and gives them a practical workaround. Additionally, the AT&T outage is a good reminder to report incidents to the correct agencies as appropriate.
SEE: Carrier-neutral SIM cards are among this year's highlights at Mobile World Congress. (TechRepublic)
It's important to keep software up to date and to modernize technology generally to support an organization's resilience and security, but disruptions like this show that IT leaders and CISOs likely also play a role in communicating well with external and internal stakeholders during an unexpected event. IT and cybersecurity leaders must be sure their software supply chain practices are up to date in case of cascading issues or vulnerabilities in the chain, even when no malicious intent is involved.
Even if IT leaders don't communicate directly with customers, they should have well-established channels of accountability within their department to respond to and potentially publicize issues that affect many customers.