Search Engine Facts

Google's new web page spider

Search engines use automated software programs that crawl the web. These programs, called "crawlers" or "spiders", follow links from page to page and store the text and keywords from the pages in a database. "Googlebot" is the name of Google's spider software.

Many webmasters have noticed that there are now two different Google spiders that index their web pages. At least one of them is performing a complete site scan:

The normal Google spider: - "GET /robots.txt HTTP/1.0" 404 1227 "-" "Googlebot/2.1 (+"

The additional Google spider: - "GET / HTTP/1.1" 200 38358 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +"
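If you want to see whether both spiders visit your own site, you can scan your server's access log for the two user agent strings. The following Python sketch assumes an Apache-style "combined" log format; the file name access.log is a placeholder for your own log file. It simply counts the requests from each Googlebot variant:

    import re
    from collections import Counter

    # Placeholder file name; point this at your server's access log.
    LOG_FILE = "access.log"

    # In an Apache "combined" log line, the user agent is the last quoted field.
    UA_PATTERN = re.compile(r'"([^"]*)"\s*$')

    counts = Counter()

    with open(LOG_FILE, encoding="utf-8", errors="replace") as log:
        for line in log:
            match = UA_PATTERN.search(line)
            if not match:
                continue
            user_agent = match.group(1)
            if "Googlebot" not in user_agent:
                continue
            # The new spider identifies itself as "Mozilla/5.0 (compatible; Googlebot/2.1; ...)";
            # the old spider's user agent starts with "Googlebot/2.1".
            if user_agent.startswith("Mozilla/5.0"):
                counts["new spider"] += 1
            else:
                counts["old spider"] += 1

    for spider, hits in counts.items():
        print(spider, ":", hits, "requests")

The script only distinguishes the spiders by their user agent string, which is exactly the difference visible in the log excerpts above.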

What is the difference between these two Google spiders?

The new Google spider uses a slightly different user agent: "Mozilla/5.0 (compatible; Googlebot/2.1; +".

This means that Googlebot now also supports the HTTP/1.1 protocol. The new spider might be able to understand more content formats, including compressed (gzip-encoded) HTML.
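"Compressed HTML" simply means that the server delivers the page body gzip-encoded when the client announces support for it in the Accept-Encoding request header. The following Python sketch illustrates that negotiation in a generic way; it is an illustration of the mechanism, not Google's or any particular web server's implementation:

    import gzip

    def build_response(html, accept_encoding):
        """Return (headers, body) for an HTML page, gzip-compressing the body
        when the client's Accept-Encoding header advertises gzip support."""
        headers = {"Content-Type": "text/html; charset=utf-8"}
        body = html.encode("utf-8")
        if "gzip" in accept_encoding.lower():
            body = gzip.compress(body)
            headers["Content-Encoding"] = "gzip"
        headers["Content-Length"] = str(len(body))
        return headers, body

    # A client that sends "Accept-Encoding: gzip" (as an HTTP/1.1 spider could)
    # receives a smaller, compressed body.
    headers, body = build_response("<html><body>Hello</body></html>", "gzip, deflate")
    print(headers)

A spider that speaks HTTP/1.1 and accepts gzip-encoded bodies can download pages with less bandwidth, which matters when crawling billions of documents.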

Why does Google do this?

Google hasn't revealed the reason for it yet. There are two main theories:

The first theory is that Google uses the new spider to spot web sites that use cloaking, JavaScript redirects and other dubious web site optimization techniques. The new spider seems to be more capable than the old one and requests pages more like a regular browser does, so it is better placed to see the content that such techniques try to hide from search engines. This makes the theory plausible.

The second theory is that Google's extensive crawling is a hurried reaction to the need to rebuild the index from the ground up in a short period of time, possibly because the old index contains too many spam pages.

What does this mean for your web site?

If you use questionable techniques such as cloaking or JavaScript redirects, you might get into trouble. If Google really uses the new spider to detect spamming web sites, it's likely that these sites will be banned from the index.
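If you want to verify that your own server does not accidentally serve different content to Googlebot than to regular visitors, you can request the same URL with both user agents and compare the responses. A rough Python sketch; the URL and user agent strings below are placeholders, not an official test:

    import hashlib
    import urllib.request

    URL = "http://www.example.com/"  # placeholder; use one of your own pages

    USER_AGENTS = {
        "browser": "Mozilla/5.0 (Windows; U; Windows NT 5.1)",
        "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1)",
    }

    def fetch_digest(url, user_agent):
        """Fetch a URL with the given user agent and return a digest of the body."""
        request = urllib.request.Request(url, headers={"User-Agent": user_agent})
        with urllib.request.urlopen(request) as response:
            return hashlib.md5(response.read()).hexdigest()

    digests = {name: fetch_digest(URL, ua) for name, ua in USER_AGENTS.items()}

    if len(set(digests.values())) == 1:
        print("Same content served to all user agents.")
    else:
        print("Different content per user agent - check for unintentional cloaking:", digests)

Note that a test like this only reveals cloaking that is keyed to the user agent string; cloaking based on the spider's IP address would not show up.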

To obtain long-term results on search engines, it's better to use ethical search engine optimization methods.

It's likely that the new spider signals a major Google update. We'll have to wait and see what this means in detail.
