The Library of Congress uses open-source and custom-developed software to manage the different stages of the overall web archiving workflow. The Library has developed and implemented an in-house workflow tool called Digiboard, which enables staff to select websites for archiving, manage and track required permissions and notices, and perform quality review, among other tasks. To harvest (download) web content, the Library primarily uses the Heritrix archival web crawler. For replay of archived content, the Library has deployed a version of OpenWayback so that researchers can view the archives. Additionally, the program uses Library-wide digital library services to transfer, manage, and store digital content. Institutions and others interested in learning more about Digiboard and other tools the Library uses can contact the Web Archiving Section. The Library continually evaluates available open-source tools that might be helpful for preserving web content.
Last Updated: May 05, 2025