Forbes: The Internet Archive Turns 20: A Behind The Scenes Look At Archiving The Web - Vereinigung Österreichischer Bibliothekarinnen und Bibliothekare

To most of the web surfing public, the Internet Archive’s Wayback Machine is the face of the Archive’s web archiving activities. Via a simple interface, anyone can type in a URL and see how it has changed over the last 20 years. Yet, behind that simple search box lies an exquisitely complex assemblage of datasets and partners that make possible the Archive’s vast repository of the web. How does the Archive really work, what does its crawl workflow look like, how does it handle issues like robots.txt, and what can all of this teach us about the future of web archiving? …

See the full article at http://www.forbes.com/sites/kalevleetaru/2016/01/18/the-internet-archive-turns-20-a-behind-the-scenes-look-at-archiving-the-web/#2715e4857a0b22403ae67800
