Just as you need to know who is reading your blog, you need to know who is indexing it. The Google search engine, which feeds 2/3 of the known major search engines, provides Google Webmaster Tools, so we can monitor how well our blogs are being indexed.
If you're going to use Google Webmaster Tools / Search Console effectively, you need to know how to interpret the reports.
Specifically, you need to know which reports show problems needing attention, and which reports advise you of problems avoided.
Representing the latter, we have a message in the list of URLs restricted by robots.txt, which can be found from a link in "Overview", or from "Diagnostics" - "Web Crawl". Recently, we also see this in email from the Search Console team.
Indexed, though blocked by robots.txt
Note the carefully worded advice.
"This means that Index coverage may be negatively affected in Google search results."
With a Blogger blog, "may be" should read "actually, is not".
http://blogging.nitecruzr.net/search/label/Search Engines URL restricted by robots.txt
This list, when run against a blog which has a lot of labels, will be rather long. This will cause panic in the hearts of many. Fortunately, this is unnecessary panic.
The keyword here is
/search/label/
This shows us a label search that is blocked, intentionally, by Blogger, so the search engines won't follow it.
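To see the block for yourself, here is a minimal Python sketch using the standard library's robots.txt parser. The rules in ASSUMED_ROBOTS_TXT are an assumption about what a default Blogger robots.txt looks like, and the post URL is hypothetical. Check your own blog's /robots.txt for the real contents.

```python
from urllib.robotparser import RobotFileParser

# Assumption: rules resembling a default Blogger robots.txt.
# Your blog's actual file may differ.
ASSUMED_ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /
"""


def is_crawlable(url, user_agent="Googlebot"):
    """Return True if the assumed rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ASSUMED_ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# The label search URL from the report above.
label_url = "http://blogging.nitecruzr.net/search/label/Search Engines"
# Hypothetical post URL, for illustration only.
post_url = "http://blogging.nitecruzr.net/2009/01/example-post.html"

print(is_crawlable(label_url))  # False: blocked by "Disallow: /search"
print(is_crawlable(post_url))   # True: covered by "Allow: /"
```

Under these assumed rules, only the label search is blocked. The post itself stays open to the crawler.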
Were the search engines to index a label search, they would be indexing the same posts in the blog that they find from the sitemap. They would then penalise both the sitemap indexed content and the label search indexed content, for "duplicate content".
With "duplicate content" penalties applied, your posts appear lower in the search results - and your blog gets fewer new visitors.
So, when you see "... /search/label/ ... URL restricted by robots.txt" in your access report, don't panic. This restriction is to your advantage.
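If the list is long, a short script can separate the entries you can ignore from any that deserve a second look. This is only a sketch: triage_restricted_urls is a hypothetical helper, the "/p/private-page.html" entry is made up for illustration, and it assumes that you have copied the restricted URLs out of the report by hand.

```python
from urllib.parse import urlparse


def triage_restricted_urls(urls):
    """Split restricted URLs into expected /search blocks and everything else."""
    expected, needs_review = [], []
    for url in urls:
        path = urlparse(url).path
        # Blogger blocks label and other searches under /search on purpose.
        if path.startswith("/search"):
            expected.append(url)
        else:
            needs_review.append(url)
    return expected, needs_review


restricted = [
    "http://blogging.nitecruzr.net/search/label/Search Engines",
    # Hypothetical entry, for illustration only.
    "http://blogging.nitecruzr.net/p/private-page.html",
]

expected, needs_review = triage_restricted_urls(restricted)
print("No action needed:", expected)
print("Worth a look:", needs_review)
```

Anything in the second list is not explained by the label search restriction, and is the part of the report that may need attention.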