Posted by admin on 04 3rd, 2010


Could The New Google Spider Be Causing Issues With Websites?

Around the time Google announced “Big Daddy,” there was a new Googlebot roaming the web. Since then I’ve heard stories from clients of websites and servers going down and previously unindexed content getting indexed.

I started digging into this and you’d be shocked at what I found out.

First, let’s look at the timeline of events:

In late September some sharp-eyed spider watchers over at WebmasterWorld spotted unusual Googlebot activity. In fact, it was in this thread: http://www.webmasterworld.com/forum3/25897-9-10.htm that the bot was first reported. It alarmed some posters, who thought that perhaps these could be regular users masquerading as the famous bot.
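One commonly described way to tell a real Googlebot from a visitor merely borrowing its name is a reverse DNS lookup followed by a forward lookup. The sketch below is only an illustration of that idea, not something the posters in that thread used. The function name `is_verified_googlebot` and the injectable lookup parameters are assumptions added here so the logic can be tried without touching the network.

```python
import socket


def is_verified_googlebot(ip,
                          reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Return True if ip reverse-resolves to a Google hostname that
    resolves forward to the same ip. Function name is hypothetical."""
    try:
        # gethostbyaddr returns (hostname, aliases, ip_addresses)
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False

    # Assumption: genuine crawler hosts end in one of these domains.
    if not hostname.endswith((".googlebot.com", ".google.com")):
        return False

    try:
        # gethostbyname_ex returns (hostname, aliases, ip_addresses)
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False

    # The forward lookup must point back at the visiting address.
    return ip in addresses
```

Calling `is_verified_googlebot(visitor_ip)` with the defaults performs real DNS queries, so in practice the result would probably be cached per address.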

Early on it also appeared that the new bot wasn’t obeying the robots.txt file. This is the protocol that allows or denies crawling of parts of a website.
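For anyone who wants to check what their own robots.txt actually permits, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules in `ROBOTS_TXT`, the `example.com` URLs, and the helper names are made up for illustration. They are not taken from any site discussed here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /checkout/
"""


def build_parser(robots_txt):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser, user_agent, url):
    """Ask whether the given user agent may crawl the given URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
```

Comparing the answers from `is_allowed` against the URLs a crawler actually requested in the logs is one way to see whether it is respecting the file.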

Speculation grew about what the new crawler was until Matt Cutts mentioned a new Google test data center http://www.mattcutts.com/blog/good-magazines/#state-5293. For those who don’t know, Matt Cutts is a senior engineer with Google and one of the few Google employees talking to us “regular folk.” This mention happened in November.

There wasn’t much mention of Big Daddy until early January of this year, when Matt again blogged about it asking for feedback. http://www.mattcutts.com/blog/bigdaddy/

Much feedback was given on the accuracy of the results. There were also those who asked if the Mozilla Googlebot (known as “Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)” in your visitor logs) and Big Daddy were related, but no response was made.
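To spot this bot in your own logs, something like the following sketch could work. It assumes the server writes the Apache/NCSA "combined" log format, with the user agent as the last quoted field. The pattern, the function name, and the sample lines (including their IP addresses and timestamps) are invented for illustration.

```python
import re

# The user agent string quoted above.
NEW_GOOGLEBOT_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)

# Assumption: "combined" log format, user agent in the final quoted field.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def is_new_googlebot(log_line):
    """True if the line parses and its user agent is the Mozilla Googlebot."""
    match = LOG_PATTERN.match(log_line)
    return bool(match) and match.group("agent") == NEW_GOOGLEBOT_UA


# Made-up sample lines using documentation-only IP addresses.
SAMPLE_NEW = (
    '192.0.2.10 - - [15/Jan/2006:10:12:01 -0800] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "' + NEW_GOOGLEBOT_UA + '"'
)
SAMPLE_OLD = (
    '192.0.2.11 - - [15/Jan/2006:10:12:05 -0800] "GET /index.html HTTP/1.0" '
    '200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"'
)
```

Running `is_new_googlebot` over each line of a log file separates visits by the Mozilla-style bot from visits by the older `Googlebot/2.1` user agent.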

Now I’m going to start some of my own speculation:

I do in fact believe the two are related. In fact, I think this new crawler will eventually replace the old crawlers, just as Big Daddy will replace the current data infrastructure. http://www.textbooklinkbrokers.com/bwood/states/310_0_1_0_C/

Why is this important?

Based on my observations, this crawler may be able to do so much more than the old crawler.

For one, it emulates a newer browser. The old bot was based on the Lynx text-based browser. While I’m sure Google added features as time went on, the basic Lynx browser is just that: basic.

Which explains why Google couldn’t deal with things like JavaScript, CSS and Flash.

However, with the new spider, built on the Mozilla engine, there are so many possibilities.

Just look at what your Mozilla or Firefox browser can do itself: render CSS, read and execute JavaScript and other scripting languages, even emulate other browsers.

But that’s not all.

I’ve talked to a few of my clients and their sites are getting hammered by this new spider. It has gotten so bad that some of their servers have gone down because of the amount of traffic from this one spider!

On the plus side, I have clients who went from a few hundred thousand indexed pages to over 10 million in just a few weeks! Just since December 2005 there’s been a 3500% increase in indexed pages over an 8 week period! Just so you know, this is also the client’s site that went down because of the huge amount of crawling activity.
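To get a feel for how hard a spider is hitting a server, one option is to bucket its requests by minute. This is a rough sketch only. It assumes you have already pulled (timestamp, user agent) pairs out of the log, with timestamps in the Apache style shown in the sample data. The function name and the sample entries are invented.

```python
from collections import Counter
from datetime import datetime

# Assumption: Apache-style timestamps such as "15/Jan/2006:10:12:01 -0800".
TIMESTAMP_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def requests_per_minute(entries, agent_substring="Googlebot"):
    """Count requests per minute for agents containing agent_substring.

    entries is an iterable of (timestamp_text, user_agent) pairs.
    """
    counts = Counter()
    for timestamp, agent in entries:
        if agent_substring in agent:
            when = datetime.strptime(timestamp, TIMESTAMP_FORMAT)
            counts[when.strftime("%Y-%m-%d %H:%M")] += 1
    return counts


# Made-up entries for illustration.
SAMPLE_ENTRIES = [
    ("15/Jan/2006:10:12:01 -0800", "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"),
    ("15/Jan/2006:10:12:30 -0800", "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"),
    ("15/Jan/2006:10:12:45 -0800", "Mozilla/5.0 (Windows; U) Firefox/1.5"),
    ("15/Jan/2006:10:13:05 -0800", "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"),
]

per_minute = requests_per_minute(SAMPLE_ENTRIES)
```

A sudden jump in the per-minute counts around the time a server went down would at least be consistent with the crawler being the cause.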

But that’s still not all.

I have another client that uses IP recognition to serve content based on a person’s geographic location. If you live in the US you get American content and pricing; if you live in the UK you get UK content and pricing. As you may imagine, the UK, US, Canadian and Australian content is all very similar. In fact, about the only thing noticeably different is the pricing section.

This is my concern: if the duplicate content gets indexed by Google, what will they do? There’s a good chance that the site would be penalized or even banned for violation of the webmaster quality guidelines set forth by Google here: http://www.google.com/webmasters/guidelines.html#worth

This is why we implemented IP recognition, so that Googlebot, which crawls from US IP addresses, only sees one version of the site.

However, a review of the server logs shows that this new Googlebot has been visiting not only the US content but also the content of the other sections of the site. Naturally, I wanted to verify that the IP recognition was working. It is. This leads me to wonder, then: can this browser spoof its location and/or use a proxy?

Imagine that the browser is smart enough to do some of its own testing by viewing the site from multiple IP addresses. If that’s the case then those who cloak sites are going to have problems.

In any case, from the limited observations I’ve made, this new Google, both the data center and the spider, is going to change the way we do things.
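For readers unfamiliar with the idea, IP recognition boils down to mapping the visitor's address to a region and picking the matching content. The sketch below is not the client's actual setup. The network ranges are reserved documentation ranges standing in for real geographic data (a production system would likely use a GeoIP database), and the region table, paths, and function names are all hypothetical.

```python
import ipaddress

# Hypothetical region table. These are documentation-only ranges,
# used here as stand-ins for real geographic IP data.
REGION_NETWORKS = {
    "US": [ipaddress.ip_network("192.0.2.0/24")],
    "UK": [ipaddress.ip_network("198.51.100.0/24")],
    "AU": [ipaddress.ip_network("203.0.113.0/24")],
}

# Assumption: unknown visitors fall back to the US version.
DEFAULT_REGION = "US"


def region_for_ip(ip_text):
    """Return the region code whose networks contain the address."""
    address = ipaddress.ip_address(ip_text)
    for region, networks in REGION_NETWORKS.items():
        if any(address in network for network in networks):
            return region
    return DEFAULT_REGION


def content_path(ip_text):
    """Pick the regional version of a page (paths are made up)."""
    return f"/content/{region_for_ip(ip_text).lower()}/index.html"
```

With a table like this, a crawler arriving only from US addresses would always be handed the US version, which is the behavior the setup relies on.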
