How to stop the government from deleting itself

How the next wave of technology is upending the global economy and its power structures
Aug 14, 2024
 
POLITICO's Digital Future Daily

By Steven Overly

With help from Mohar Chatterjee

Internet archivists leapt into action to preserve websites, documents, and other digital content before a new Congress could disband the House Select Committee on January 6. | Sarah Silbiger/Getty Images

Think back to early January 2023. Republicans were preparing to take control of the House and disband a committee that Democrats set up to investigate the Jan. 6 insurrection. In the process, they were expected to scrub the committee’s website and all the evidence that had been collected there, including an interactive timeline of the day’s events.

A team of internet archivists had other plans.

In the days before the handover, they logged every website, video and document the committee had published online before any of it could disappear forever. They worked against the clock to save the records, like a scene in the kind of nerdy political thriller that only captivates Washington.

That moment was a bit of unique drama in a longer, and very serious, effort to preserve digital records vital to our democracy. As more information is shared exclusively online, saving the political corners of the internet, particularly government websites, is crucial to capturing our collective history for future generations. And as the Jan. 6 committee example shows, archiving can also protect that history from tampering in a hyper-partisan political climate.

Now, that same group of archivists — a coalition from government, academia and nonprofits — has begun capturing the Biden administration’s digital footprint.

The monthslong undertaking is called the End of Term Archive, and it has occurred every four years since the George W. Bush administration. Archivists first amass a sprawling list of public government URLs. They then catalog all of those websites (and the websites within those websites) and capture a snapshot of their content. In the end, it’s as much as 300 terabytes’ worth of material.

“Think of it like a spider,” Mark Graham, the project’s archivist-in-chief, said on the POLITICO Tech podcast. “You start at one place and then you begin crawling out as far as you can see on these different websites. And that's how you try to get a pretty good overview.”
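The "spider" Graham describes can be sketched as a breadth-first traversal of links. What follows is a minimal illustration, not the End of Term Archive's actual tooling: the `crawl` function, the `FAKE_WEB` dictionary and the `example.gov` addresses are all made up for the example, and an in-memory link map stands in for real page fetches so the sketch runs without a network.

```python
from collections import deque
from urllib.parse import urljoin, urlparse

# Hypothetical stand-in for the live web: each page maps to the links found on it.
FAKE_WEB = {
    "https://example.gov/": ["/about", "/reports/", "https://other.example.com/"],
    "https://example.gov/about": ["/"],
    "https://example.gov/reports/": ["2024.pdf", "/about"],
    "https://example.gov/reports/2024.pdf": [],
}


def crawl(seeds, fetch_links, max_pages=1000):
    """Breadth-first crawl from the seed URLs, staying on the seeds' hosts."""
    allowed_hosts = {urlparse(seed).netloc for seed in seeds}
    queue = deque(seeds)
    seen = set(seeds)
    order = []  # pages in the order they were visited
    while queue and len(order) < max_pages:
        url = queue.popleft()
        order.append(url)
        for href in fetch_links(url):
            link = urljoin(url, href)  # resolve relative links against the page
            if urlparse(link).netloc in allowed_hosts and link not in seen:
                seen.add(link)
                queue.append(link)
    return order


captured = crawl(["https://example.gov/"], lambda url: FAKE_WEB.get(url, []))
for page in captured:
    print(page)
```

Starting from a single seed, the sketch reaches the pages "within" the site and skips the off-site link. A production crawler would also respect rate limits, handle failures and store each page's content rather than only its address.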

Graham works for the Internet Archive, a nonprofit that aims to preserve digital history, perhaps best known for its long-running project called the Wayback Machine, which allows you to look up old versions of websites. He also leads the End of Term Archive, which is a joint effort with the Library of Congress, University of North Texas Libraries, Stanford University Libraries, the U.S. Government Publishing Office and the National Archives and Records Administration.

Right now the End of Term Archive is preparing for its initial “crawl” of government websites next month, and will then do another around the inauguration in January, Graham said. And a digital copy of those websites will be available almost immediately to the public via the Wayback Machine.

But this particular data is also offered in bulk to academics and computer scientists for research projects. That’s different from most websites archived through the Wayback Machine, Graham said, which can only be accessed through individual URLs. Government websites are considered “publicly accessible” and therefore can be downloaded in bulk.

In recent years, huge datasets like these archives have been drafted for another use: training artificial intelligence. Similar data has already been used to create AI systems that can explain the U.S. legislative process or talk through the content of particular bills, for example.

“This is a relatively new area, but it's certainly something that everyone is paying attention to,” Graham said.

The most traditional use — turning ephemeral websites into a firm historical record — is still the most important for the future, in Graham’s view.

For historians, government watchdogs and journalists, the End of Term Archive has become a tool for seeing how government websites change over time and across administrations. Many of those changes are innocuous, Graham said. Some websites simply go dormant or get replaced.
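One simple way to see how a page changed between two archived snapshots is a line-by-line text comparison. The sketch below uses Python's standard `difflib` module. The `changed_lines` helper and the two sample snapshots are invented for illustration and are not drawn from any real government site or from the archive's own tools.

```python
import difflib


def changed_lines(old_text, new_text):
    """Return (removed, added) lines between two snapshots of a page's text."""
    removed, added = [], []
    for line in difflib.ndiff(old_text.splitlines(), new_text.splitlines()):
        if line.startswith("- "):
            removed.append(line[2:])
        elif line.startswith("+ "):
            added.append(line[2:])
        # Lines starting with "  " are unchanged; "? " lines are diff hints.
    return removed, added


# Invented sample text standing in for two captures of the same page.
snapshot_before = "Example Agency\nOur mission\nProgram A overview"
snapshot_after = "Example Agency\nOur mission\nContact the press office"

removed, added = changed_lines(snapshot_before, snapshot_after)
print("Removed:", removed)
print("Added:", added)
```

A comparison like this shows only that text changed, not why. Telling a routine redesign from a deliberate removal still takes a human reader.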

In other instances, however, politics are at play. During the Trump administration, for instance, the Environmental Protection Agency made headlines for eliminating a website focused on fighting climate change, and for removing references to climate change across other websites and government documents.

But for Graham, and his band of archivists, the objective is simpler.

“We're just doing our part to try to preserve the digital artifacts of our time, to help preserve and make available and make useful our cultural heritage,” Graham said. “And if that has the effect of causing people to think twice when they may try to change that record, so be it.”

 

DEFENSE TECH COMPANIES GOBBLE EACH OTHER UP

Emerging tech companies in business with Washington seem to be going through a maturation cycle: Multiple start-ups involved in the DOD’s push to advance autonomous systems are now acquiring other companies that help them consolidate their operations.

Hadrian, an autonomous producer of parts for space and defense companies, said Wednesday that it bought software firm Datum Source, founded by SpaceX alumni (h/t TechCrunch). This comes after venture-backed Anduril Industries bought Blue Force Technologies, which makes the Fury unmanned fighter jet, last September. Earlier this year, Shield AI bought the Australian firm Sentient Vision Systems.

The American defense tech sector is sustained almost definitionally by the Pentagon, which has traditionally struggled to foster start-ups and tends to contract from a handful of big companies. But the DOD is aiming to change that partly through the Replicator initiative, a program to field thousands of cheap autonomous drones over two years.

Doug Beck, director of the Defense Innovation Unit, said last week that the department is handing out awards for systems it picked for the second stage of the program, according to Defense News.

The DOD, meanwhile, is chafing at how much oversight Congress wants over the program. Deputy Defense Secretary Kathleen Hicks said the demand by lawmakers for an average of a briefing a week on Replicator “isn't scalable for Congress across the breadth of what we're trying to accomplish.”

— Mohar Chatterjee

TWEET OF THE DAY

Screenshot of a tweet that reads: The latest upgrade to LinkedIn's email spambot is now delivering sick burns.


Stay in touch with the whole team: Derek Robertson (drobertson@politico.com); Mohar Chatterjee (mchatterjee@politico.com); Steve Heuser (sheuser@politico.com); Nate Robson (nrobson@politico.com); Daniella Cheslow (dcheslow@politico.com); and Christine Mui (cmui@politico.com).

If you’ve had this newsletter forwarded to you, you can sign up and read our mission statement at the links provided.

 


 
