Talk:Community Tech/Migrate dead external links to archives

Latest comment: 8 years ago by DannyH (WMF) in topic WikiCache

WikiCache


Martix made a similar proposal at WikiCache. Perhaps the ideas they drafted there could be used to guide the development of this project. Blue Rasberry (talk) 15:45, 16 May 2016 (UTC)

Oh, thanks for pointing that out. I'll go and reply to him. -- DannyH (WMF) (talk) 20:42, 17 May 2016 (UTC)

Adding query against alternative archive if not present in Wayback Machine due to robots.txt or other reasons


Websites with robots.txt restrictions are not captured by the Internet Archive's global Wayback crawls, and even content previously captured from a given host is no longer displayed once robots.txt restrictions are added. For this project, how often do dead links lack a corresponding version in the Wayback Machine? If this happens in a non-trivial number of cases, it could be worth subsequently checking Memento (http://timetravel.mementoweb.org/) and/or, more narrowly, Archive-It (wayback.archive-it.org); these archives may contain captures irrespective of robots.txt restrictions.
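The fallback suggested above could be sketched roughly as follows. This is only an illustration, not how the project's bot works: the function names are hypothetical, and the endpoint paths and JSON field names (the Wayback availability API at archive.org/wayback/available and the Memento Time Travel JSON API) are assumptions that should be checked against each service's current documentation. Archive-It is left out because no query interface for it is described here.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoints; verify against each service's documentation.
WAYBACK_API = "https://archive.org/wayback/available"
MEMENTO_API = "http://timetravel.mementoweb.org/api/json"


def fetch_json(api_url, timeout=10):
    """Return the parsed JSON body, or None on a network, HTTP, or parse error."""
    try:
        with urllib.request.urlopen(api_url, timeout=timeout) as resp:
            return json.load(resp)
    except (OSError, ValueError):
        return None


def wayback_snapshot(url, timestamp, fetch=fetch_json):
    """Ask the Wayback Machine for the capture closest to timestamp (YYYYMMDD)."""
    query = urllib.parse.urlencode({"url": url, "timestamp": timestamp})
    data = fetch(f"{WAYBACK_API}?{query}") or {}
    closest = data.get("archived_snapshots", {}).get("closest") or {}
    return closest.get("url") if closest.get("available") else None


def memento_snapshot(url, timestamp, fetch=fetch_json):
    """Ask the Memento aggregator for the capture closest to timestamp."""
    data = fetch(f"{MEMENTO_API}/{timestamp}/{url}") or {}
    closest = data.get("mementos", {}).get("closest") or {}
    uris = closest.get("uri") or []
    if isinstance(uris, str):  # tolerate a single URI instead of a list
        return uris
    return uris[0] if uris else None


def find_archived_copy(url, timestamp, fetch=fetch_json):
    """Try the Wayback Machine first; fall back to Memento only if it has nothing."""
    return (wayback_snapshot(url, timestamp, fetch)
            or memento_snapshot(url, timestamp, fetch))
```

Passing `fetch` as a parameter keeps the lookup order testable without network access; in real use, `find_archived_copy("http://example.com/page", "20160517")` would query the live services, and the timestamp would presumably be the date the link was added or last known to work.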
