Google's withdrawal of Internet archiving tool draws the ire of China researchers


Taipei, Taiwan – For China researchers, keeping up with the country's politics or economy is difficult enough due to its opaque leadership and widespread censorship.

Now they face a challenge from an unexpected source: Google.

Late last year, Google quietly began removing links to cached pages from its search results, a feature that had allowed Internet users to view old versions of web pages.

Danny Sullivan, Google's public search liaison, confirmed earlier this month that the feature had been discontinued.

“It was intended to help people access pages when a long time ago you couldn't depend on a page loading. These days things have improved a lot. So, it was decided to retire it,” Sullivan said in a post on X.

Although originally introduced to improve Internet performance, Google's caching feature had the unintended effect of increasing transparency and became an invaluable resource for researchers.

Academics, journalists and others used cached pages to view past incarnations of websites and removed content, a particularly useful tool for China's Internet, which Beijing carefully edits to avoid embarrassment and guard against potential dissent.

“The loss of Google's cache feature will be a blow to China researchers who have long relied on this feature to preserve access to information that can later be deleted, particularly in research citations,” Kendra Schaefer, head of technology policy research at Trivium China, told Al Jazeera.

A Google spokesperson confirmed the change to Al Jazeera.

“Google's Cached Pages feature was born more than two decades ago, at a time when pages might not be reliably available. The web (and web service as a whole) has improved tremendously since then, making the need for cached pages less necessary,” the spokesperson said via email.

China's “Great Firewall” means popular sites from Wikipedia to Facebook are inaccessible without a virtual private network, while its government censors scour the web for sensitive content to remove.

Taboo topics

In addition to taboo topics such as the 1989 Tiananmen Square crackdown and criticism of Chinese President Xi Jinping, censors have gone after material ranging from the socially conscious Chinese rock band Slap to comments made by the late Premier Li Keqiang on strengthening HIV/AIDS prevention work.

During the COVID-19 pandemic, Beijing closely monitored and removed undesirable content and has since been trying to rewrite the post-pandemic narrative by suppressing scientific studies and politically inconvenient international news reports.

There are alternatives to Google's cached pages, most notably the nonprofit Internet Archive's Wayback Machine.

But Google's removal of cached links makes it harder to know what's missing in the first place, said Dakota Cary, a nonresident fellow at the Atlantic Council's Global China Hub.

“We're not going to know how much we're missing because we can't measure what was lost, because it's not something we can see anymore,” Cary told Al Jazeera.

Even dead links in Google search results could give researchers tips or show how a website has been modified, he said.

“Now you have to expand the ways in which you might think about doing or searching for certain items and maybe ask people who specialize in a particular place if they have access to or have a backup of a particular document. The way the investigation is conducted is going to be much more difficult,” Cary added.

The Internet in China is subject to heavy government censorship [Andy Wong/AP]

Graham Webster, editor-in-chief of Stanford University's DigiChina Project, said he was less worried about the impact, mainly because Western sites like Google and the Wayback Machine had not been as thorough in tracking the Chinese Internet as other domains.

“Cached pages have sometimes been a resource for China researchers to access deleted pages, usually for a short period after they go down. [The Internet Archive] Archive.org generally didn't crawl the web as thoroughly and sometimes didn't capture key parts of a page, but it's still a resource if you know the URL you're looking for,” Webster told Al Jazeera.

Cary said Google's decision to stop “backing up the Internet” raises questions about who should be responsible for keeping a record in the future.

“Archiving is an incredibly useful feature, and given the way so much of our lives have been transformed into this digital medium, I don't know if we've really taken steps to preserve the information that's posted on the Internet.”

Cary said inspiration could be drawn from the U.S. government, which does extensive work archiving online content produced by foreign governments and other sources.

“There's a whole system for that and it seems like maybe this is a place where our systems could adapt to the era we live in now.”


