Is Google Reading Your Robots.txt File?
Posted by Mitch Mitchell on Mar 16, 2009
Last week, I happened upon a post by our buddy Caleb of The Market Secrets Blog titled How Search Engine Robots Read Your Site. Once I got past the image of the stupid spider (had to be a spider, didn’t it, Caleb?), I found that the article has a box for something called Search Engine Spider Simulator. You put in your URL (that’s the link to your blog or website, for the uninitiated) and click “submit”, and it shows you how the search engines go through your site.
I put in the URL for this blog, and I got nothing. I knew that couldn’t be right, because I have the WordPress Google XML Sitemap plugin on the blog. But there it was, not a single thing. I then put in the URLs of my two business sites, and all sorts of pages came up.
At that point, I decided to come back to my blog and go into the settings for the plugin. Truthfully, I’d never been in there before; I’d just accepted whatever its default settings gave me. Near the top, under the Basic Options area, there’s a box that was unchecked for me that states “Add sitemap URL to the virtual robots.txt file.” I checked that, saved it, and waited a day. I then went back to Caleb’s article (forgetting about that stupid spider) and put in my blog’s URL once again. This time, all worked perfectly; wow!
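For the curious, here is a rough idea of what that checkbox changes. A robots.txt file can carry a “Sitemap:” line that tells crawlers where the sitemap lives, and a few lines of Python can pull those lines out. This is only a sketch: the function name find_sitemaps and the example.com addresses are made up for illustration, and it says nothing about how the plugin or the simulator actually work inside.

```python
def find_sitemaps(robots_txt):
    """Return the URLs found on 'Sitemap:' lines of a robots.txt body."""
    sitemaps = []
    for raw_line in robots_txt.splitlines():
        # Drop anything after a '#' comment marker, then trim whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Split only on the first colon so the URL's own colons survive.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap" and value.strip():
            sitemaps.append(value.strip())
    return sitemaps


# Hypothetical robots.txt content, not taken from any real site.
example = """\
User-agent: *
Disallow: /wp-admin/

Sitemap: https://example.com/sitemap.xml
"""

print(find_sitemaps(example))
```

If the list comes back empty for your own robots.txt, the sitemap probably isn’t being advertised there.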
It makes me wonder now if, later on, I might actually start attaining some kind of page rank on some of my internal pages, now that I know for sure that the search engines will be going through them more often than just when I first wrote them. I don’t know for sure, but I guess we’ll find out.
If you want to learn more about robots.txt files, you can check out the Web Robots Pages site, but I’ll tell you the truth: I don’t have a robots.txt file for most of my sites. Instead, I use a site called XML Sitemaps.org, where you can create many different types of files, then upload them to your site to help the search engines “spider” it. It works just fine for my regular sites, but it doesn’t seem to work well for my blogs, probably because I’m uploading that file to the wrong place. No matter, now that I know better how to use the WordPress Google XML Sitemap plugin.
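If you’ve never looked inside one of those sitemap files, there isn’t much to it. Below is a minimal sketch in Python that builds one following the sitemaps.org format: a “urlset” element holding one “url” entry with a “loc” for each page. The function name build_sitemap and the example.com pages are assumptions for illustration, and this is not what XML Sitemaps.org or the plugin actually runs.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap, as a string, for the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, not taken from any real site.
pages = ["https://example.com/", "https://example.com/about/"]
print(build_sitemap(pages))
```

You would save the output as something like sitemap.xml and upload it, typically to the top level of the site so that it can cover every page beneath it.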