Web Development Information

Effectively Using Robots Meta Tags


The "robots" meta tag, when used properly, tells the search engine spiders whether to index a particular page and whether to follow the links on it. For the purposes of this article, we will be using the "( )" symbols to represent the "< >" in HTML code.

Some examples of robots tag usage are as follows:

(meta name="robots" content="index,follow")

(meta name="robots" content="noindex,follow")

(meta name="robots" content="index,nofollow")

(meta name="robots" content="noindex,nofollow")

Let us first examine what these terms mean before we explain the usage for each one:

"index"- This directive tells the search engine robots (or spiders) that it is okay to index the page. In other words, you are allowing the search engine to include your page within their search directory.

"noindex"- Using this tag, you are informing the robots that this page should not be indexed. Simply put, this page will not appear in their search directory.

"follow"- When you use this tag, you are telling the search engines that you want their robot to follow any links that are found on that page.

"nofollow"- The opposite of the above definition, this directive will tell the robots not to follow any links on your page.

Putting It All Together:

With the robots tags explained, let's examine the usage for each one.

1. (meta name="robots" content="index,follow")

This tag is used when you want the search engine spiders to index the page and follow its links to other pages. Most search engines treat this as their default behavior, so you may not even need the tag if indexing and following is what you want. However, an article at Search Engine World (searchengineworld.com/metatag/robots.htm) suggests that Inktomi does not use this as its default setting. Instead, it defaults to "index, nofollow".

Better safe than sorry!

There has been much debate over whether or not it is necessary to use this tag. If there is even a slight possibility that some search engines do not use this as the default setting, then it would only make sense to include this tag if you want your page included in their search directory AND your links to be followed. Do the research and decide for yourself.

2. (meta name="robots" content="noindex,follow")

This tag can be used to tell the search engines that you do not want the page included in their directory, but you DO want them to follow the links that lead to other pages. A good example of its usage would be your disclaimer or privacy policy pages. You may not want these pages to show up in the search engines if they are only important to your actual visitors. However, if the links on these pages point to other pages that you want the search engines to find, then you would still want the spiders to "follow" those links.

3. (meta name="robots" content="index,nofollow")

This tag will allow your page to be indexed in the search engines, but any links on that page will not be followed.

4. (meta name="robots" content="noindex,nofollow")

When using this tag, the search engine spiders will not include this page in their directory and will not follow any links on the page either.
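
To make the four combinations above concrete, here is a minimal Python sketch of how a spider might read the content value. The function name parse_robots is hypothetical, and the sketch assumes that a missing directive means "index" and "follow", which, as noted above, may not hold for every search engine.

```python
def parse_robots(content=None):
    """Return (index, follow) booleans for a robots meta content value.

    Assumption: a directive that is absent defaults to "index" / "follow".
    """
    tokens = set()
    if content:
        # Directives are comma separated; ignore case and stray spaces.
        tokens = {token.strip().lower() for token in content.split(",")}
    return "noindex" not in tokens, "nofollow" not in tokens


for value in ("index,follow", "noindex,follow",
              "index,nofollow", "noindex,nofollow"):
    index, follow = parse_robots(value)
    print(value, "-> index:", index, "follow:", follow)
```

Because only the "no" directives change the result, a page with no robots tag at all is treated the same as "index,follow" in this sketch.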

Where does the "robots" tag belong?

The "robots" meta tag should be placed between the (head) and (/head) tags of your page. These tags are located at the top of the HTML code. It will look something like this:

(html)
(head)
(title)Title of your page goes here(/title)
(meta name="keywords" content="word1,word2,word3,word4")
(meta name="description" content="A brief description of the content of this page.")
(meta name="robots" content="index,follow")
(/head)
(body)
Your webpage information here.
(/body)
(/html)
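
If you want to check a page yourself, the following is a rough Python sketch that looks for a "robots" meta tag inside the head section, using the standard library's html.parser module. The class name RobotsMetaFinder, the helper find_robots, and the sample PAGE are hypothetical, and the page is written with real "< >" characters because the parser needs them. Ignoring tags outside the head is a choice made for this sketch, not a statement about how any particular search engine behaves.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Remember the content of a robots meta tag found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.robots = None

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() == "robots":
                self.robots = attributes.get("content")

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_robots(html):
    """Return the robots content value from the head, or None."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    finder.close()
    return finder.robots


PAGE = """<html>
<head>
<title>Title of your page goes here</title>
<meta name="robots" content="index,follow">
</head>
<body>
<meta name="robots" content="noindex,nofollow">
Your webpage information here.
</body>
</html>"""

# The misplaced tag in the body is skipped by this sketch.
print(find_robots(PAGE))
```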

More Robots Tags

Google automatically archives a page as it crawls it. This is called a "cached" version of the page. Visitors can retrieve the archived version of the page by clicking on the "cached" link within Google's search results. If you do not want your content to be archived, you can use the following tag:

(meta name="robots" content="noarchive")

*This will only prevent your page from being "cached". If you do not want your page to be indexed at all, you will still need to include the "noindex" tag.

Another alternative to the above tag is the tag that specifically addresses Google only. If you want other search engine robots to archive your site, but you would like to prevent Google from doing so, then you can use the following tag:

(meta name="googlebot" content="noarchive")
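
The difference between the general "robots" tag and the "googlebot" tag can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function may_archive and the crawler name "otherbot" are made up, and the sketch assumes a crawler checks both the general "robots" tag and the tag carrying its own name, refusing to archive if either one says "noarchive".

```python
def may_archive(meta_tags, crawler):
    """Decide whether a crawler may keep a cached copy of the page.

    meta_tags maps a meta name (for example "robots") to its content value.
    Assumption: the crawler honors "noarchive" in either the general tag
    or the tag addressed to it by name.
    """
    for name in ("robots", crawler.lower()):
        content = meta_tags.get(name, "")
        tokens = {token.strip().lower() for token in content.split(",")}
        if "noarchive" in tokens:
            return False
    return True


google_only = {"googlebot": "noarchive"}
everyone = {"robots": "noarchive"}

print(may_archive(google_only, "googlebot"))  # Google is told not to cache
print(may_archive(google_only, "otherbot"))   # other robots are unaffected
print(may_archive(everyone, "otherbot"))      # the general tag covers all
```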

The Misuse of Robots Tags

Something that has been popping up on websites everywhere is the Google indexing tag. This is a silly little tag that is not necessary. The tag looks like this: (meta name="googlebot" content="index,follow"). Some website owners believe that by specifying "googlebot" their site gains the advantage of being spidered faster and listed by Google, but this simply isn't true. According to Google's web crawler information, you may use the noindex, nofollow, or noarchive tags when you DO NOT want Google to index, follow, or cache that page. Google's default setting is to index the page and follow its links, so this so-called googlebot index/follow tag that some site owner made up one day is completely unnecessary.

Another Silly Little Tag: The "Revisit-After" Tag

(meta name="revisit-after" content="90 days")
(meta name="revisit-after" content="15 days")

I'm not sure where this myth started. Today, you will find this tag all over the Internet. Webmasters have even promoted it, claiming that it actually works. Are we so naive as to believe the search engine spiders need to be told when to come back? I have never used this tag, and my site has no problem being crawled on a regular basis. Even some SEO (search engine optimization) sites claim it has value. This comes back to the importance of researching the topic thoroughly first. My research came from some interesting information on WebmasterWorld.

It is important to examine the correct usage of the "robots" tag before applying it to your website. Incorrect usage of tags could result in spidering errors that cause robots to ignore the page altogether. If you are interested in learning more about web robots, this great little site will provide you with the information you need to use them effectively: www.robotstxt.org/wc/robots.html

Stephani Richardson is a work at home mother of 4 who has been actively involved with affiliate marketing and home business opportunities since December 1999. She owns and operates several business related websites including 1 Work At Home Dot Com.

