This happened a few weeks ago, but I am writing this blog entry as a reminder.
A friend contacted me. He had been warned that someone had found personal information about one of his clients.
I tried several search engines with the client's name. Most engines returned nothing, but Yahoo did. Yahoo pointed me to a text file on the website. So I searched for that filename in combination with the website's domain. Again, only Yahoo gave a result.
So I scanned all HTML and PHP files. Only a few PHP files contained a reference to this text file, and always from within PHP code. In other words, there was no link to that text file anywhere in the (static or generated) HTML code.
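For anyone who wants to repeat this kind of scan on a local copy of a site, here is a minimal sketch in Python. It is not the tool I used at the time; the function name find_references, the list of suffixes and the case-insensitive match are assumptions made for illustration.

```python
from pathlib import Path


def find_references(root, needle, suffixes=(".html", ".htm", ".php")):
    """Return (path, line_number, line) for every line that mentions needle.

    Hypothetical helper: only files whose suffix is in `suffixes` are read,
    and the comparison ignores case.
    """
    hits = []
    needle_lower = needle.lower()
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        # errors="replace" keeps the scan going on files with odd encodings.
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle_lower in line.lower():
                hits.append((path, number, line.strip()))
    return hits
```

Calling find_references("path/to/site-copy", "name-of-the-file.txt") gives a list of matching lines. Each hit still has to be checked by hand to see whether it sits inside PHP code or ends up in the HTML that is sent to the browser.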
I did find a reference to the text file in a WS_FTP.LOG file. The webmaster had uploaded this file (by mistake, I assume). Again I searched for WS_FTP.LOG together with the domain name, and again only Yahoo reported a link. To be sure, I also scanned all HTML and PHP pages for the name WS_FTP.LOG. None of them contained it.
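Since the log file ended up on the server by accident, it may be worth checking a local copy of the site for such leftovers before uploading. A minimal sketch, assuming the site lives in a local folder; the function name find_stray_files and the default list of names are my own choices, not a complete list of files to worry about.

```python
import os


def find_stray_files(root, names=("ws_ftp.log",)):
    """Return the paths of all files under root whose name is in `names`.

    Hypothetical helper: names are compared case-insensitively, so
    WS_FTP.LOG and ws_ftp.log are both reported.
    """
    wanted = {name.lower() for name in names}
    found = []
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            if name.lower() in wanted:
                found.append(os.path.join(folder, name))
    return sorted(found)
```

Anything this reports can be deleted or excluded from the upload.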
My conclusion: Yahoo's crawler tries several standard file names in the root and in each folder it discovers, including a few which imho are a bit questionable.