Accurate, Focused Research on Law, Technology and Knowledge Discovery Since 2002

Update on the 2024/2025 End of Term Web Archive

“Every four years, before and after the U.S. presidential election, a team of libraries and research organizations, including the Internet Archive, work together to preserve material from U.S. government websites during the transition of administrations. These “End of Term” (EOT) Web Archive projects have been completed for term transitions in 2004, 2008, 2012, 2016, and 2020, with 2024 well underway. The effort preserves a record of the U.S. government as it changes over time for historical and research purposes. With two-thirds of the process complete, the 2024/2025 EOT crawl has collected more than 500 terabytes of material, including more than 100 million unique web pages. All this information, produced by the U.S. government—the largest publisher in the world—is preserved and available for public access at the Internet Archive. “Access by the people to the records and output of the government is critical,” said Mark Graham, director of the Internet Archive’s Wayback Machine and a participant in the EOT Web Archive project. “Much of the material published by the government has health, safety, security and education benefits for us all.” The EOT Web Archive project is part of the Internet Archive’s daily routine of recording what’s happening on the web. For more than 25 years, the Internet Archive has worked to preserve material from web-based social media platforms, news sources, governments, and elsewhere across the web. Access to these preserved web pages is provided by the Wayback Machine. To support the EOT Web Archive project, the Internet Archive devotes staff and technical infrastructure to focus on preserving U.S. government sites. The web archives are based on seed lists of government websites and nominations from the general public. Coverage includes websites in the .gov and .mil web domains, as well as government websites hosted on .org, .edu, and other top level domains.  The Internet Archive provides a variety of discovery and access interfaces to help the public search and understand the material, including APIs and a full text index of the collection. Researchers, journalists, students, and citizens from across the political spectrum rely on these archives to help understand changes on policy, regulations, staffing and other dimensions of the U.S. government.  As an added layer of preservation, the 2024/2025 EOT Web Archive will be uploaded to the Filecoin network for long-term storage, where previous term archives are already stored. While separate from the EOT collaboration, this effort is part of the Internet Archive’s Democracy’s Library project..” [h/t Pete Weiss]

Sorry, comments are closed for this post.