It looks like they used robots.txt to do that.
https://web.archive.org/web/20130413152316/http://www.state....
Each line in the `robots.txt` snippet is missing the `/documents` prefix.
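
For what it's worth, here is a rough sketch of why the missing prefix matters, using Python's standard `urllib.robotparser`. The paths and host below are made up for illustration and are not taken from the archived file; the point is only that `Disallow` rules are matched as path prefixes, so a rule without `/documents` would not cover a URL that lives under `/documents`.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines, url, agent="*"):
    """Return True if `agent` may fetch `url` under the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse() accepts an iterable of lines
    return parser.can_fetch(agent, url)


# Hypothetical rules: the first mimics a snippet with the prefix dropped,
# the second is what the full rule would presumably look like.
snippet_rules = ["User-agent: *", "Disallow: /example/report.pdf"]
full_rules = ["User-agent: *", "Disallow: /documents/example/report.pdf"]

# Hypothetical URL under /documents.
url = "http://example.com/documents/example/report.pdf"

blocked_by_snippet = not is_allowed(snippet_rules, url)
blocked_by_full = not is_allowed(full_rules, url)

print("blocked by snippet rule:", blocked_by_snippet)
print("blocked by full rule:", blocked_by_full)
```

This only reflects how Python's parser matches rules; individual crawlers may interpret `robots.txt` somewhat differently.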