Hacker News

Wow, all the US Department of State files have just gone missing from archive.org. The servers hosting those files are conveniently down.


Check out the Internet Archive FAQ on how to remove a document from their archives. https://archive.org/about/exclude.php

It looks like they used robots.txt to do that.


Huh, so the wildcard user-agent will block not just search bots, but also archive bots. I wonder how OP managed to get screenshots of archive.org having archives available for those documents.
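To illustrate the wildcard point, here is a minimal sketch using Python's standard `urllib.robotparser`. The `robots.txt` contents, the `example.com` host, and the path are made up for illustration and are not the State Department's actual rules; `ia_archiver` is used only as an example crawler name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a single wildcard group with one Disallow rule.
ROBOTS_TXT = """\
User-agent: *
Disallow: /documents/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

blocked_url = "http://example.com/documents/report.pdf"  # placeholder URL
open_url = "http://example.com/index.html"               # placeholder URL

# The wildcard group applies to every crawler that has no group of its own,
# so a search bot and an archive bot get the same answer.
for agent in ("Googlebot", "ia_archiver", "SomeOtherBot"):
    print(agent, parser.can_fetch(agent, blocked_url), parser.can_fetch(agent, open_url))
```

Under these assumptions, every agent is refused the `/documents/` URL and allowed the other one, since none of them has a more specific `User-agent` group.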


They're there, at least the two I looked at.

https://web.archive.org/web/20130413152316/http://www.state....

Each line in the `robots.txt` snippet is missing `/documents`.
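A rough sketch of why a missing prefix would matter: `Disallow` rules are matched against the start of the URL path, so a rule without `/documents` does not cover a URL that lives under `/documents`. The helper `is_blocked`, the host, and the paths below are hypothetical, chosen only to show the prefix behaviour with Python's `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(rule_path: str, url: str, agent: str = "ia_archiver") -> bool:
    """Return True if a wildcard Disallow rule for rule_path blocks url."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {rule_path}"])
    return not parser.can_fetch(agent, url)


# Placeholder URL, not one of the real archived documents.
url = "http://example.com/documents/organization/file.pdf"

# Rule written without the leading /documents segment.
print(is_blocked("/organization/", url))

# Same rule with the /documents prefix included.
print(is_blocked("/documents/organization/", url))
```

In this sketch only the second rule blocks the URL, which would be consistent with the documents still being viewable if the rules really were written without the prefix.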


I have been able to view multiple PDFs, as well as the page the author screenshotted.



