Having problems with Google indexing?

April 28th, 2006

I’m going to try to cover Google’s recent indexing activity, a topic I’ve been asked about literally dozens of times over the past few weeks. If you haven’t had Google drop the number of indexed pages on your site, you are certainly one of the lucky ones. I have seen a lot of people experiencing issues similar to the ones I have encountered myself.

This is an image from yourcache.com, a neat tool I blogged about which lets you see the number of pages you have indexed across different Google datacenters. It stores the results from each enquiry you make and lets you view the data over time. The screenshot is from one of the smaller sites we manage, which has been live for a good while and has never had any problems with SERPs. Recently we have dropped from page 1 to page 15 for our main search term, with a lot of other keywords and phrases suffering too. We are currently at rock bottom with 2 or 3 indexed pages, compared to the previous count of around 180. Normally this would be very worrying, but if you’re reading this, the chances are you are in the same boat as a lot of other people.
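If you want to keep your own record of this kind of data, here is a minimal sketch in Python of logging indexed-page counts per datacenter and working out how far you have fallen from your peak. To be clear, this is not how yourcache.com works internally; the `IndexHistory` class, the datacenter label "dc-a", the dates and the middle reading of 40 are all made up for illustration. Only the 180 and 3 come from the site described above, and you would enter the counts by hand from your own site: queries.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class IndexHistory:
    """Hypothetical log of indexed-page counts, per datacenter and per day."""

    # Layout: {datacenter label: {date recorded: indexed-page count}}
    samples: dict = field(default_factory=dict)

    def record(self, datacenter, day, count):
        """Store one reading; a later reading for the same day overwrites it."""
        self.samples.setdefault(datacenter, {})[day] = count

    def latest(self, datacenter):
        """Return the count from the most recent recorded day."""
        history = self.samples[datacenter]
        return history[max(history)]

    def peak(self, datacenter):
        """Return the highest count ever recorded for this datacenter."""
        return max(self.samples[datacenter].values())

    def drop_from_peak(self, datacenter):
        """Fraction of the peak count missing from the latest reading."""
        peak = self.peak(datacenter)
        if peak == 0:
            return 0.0
        return 1 - self.latest(datacenter) / peak


# Illustrative readings: 180 and 3 are the counts mentioned in the post,
# the dates and the reading of 40 are invented for the example.
history = IndexHistory()
history.record("dc-a", date(2006, 3, 20), 180)
history.record("dc-a", date(2006, 4, 10), 40)
history.record("dc-a", date(2006, 4, 24), 3)

print(f"Drop from peak: {history.drop_from_peak('dc-a'):.0%}")
```

Keeping one log per datacenter makes it easier to tell a genuine drop from the counts simply differing between datacenters on the same day.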

So here’s the million-dollar question: why is it happening? Well, before we can start theorising about the cause we need to look at the problem in a bit more detail. Most people started to report problems around the end of March or the beginning of April, around the time the Big Daddy upgrade was rolling out, so it would be logical to start our enquiries there. I think it is fair to assume that not everybody was affected at the same time, because the new bots reached their websites at different times. Looking at the sites we manage, the ones that have been affected followed a progression over 4-5 weeks: they generally hit rock bottom at week 3-4, then were re-indexed over the next week or so, but only regained around 50% of their previously indexed pages.

I have read three very interesting, if somewhat speculative, theories on what may be causing the underlying problem.

New Googlebot: The Big Daddy (BD) upgrade brought with it a Googlebot which indexes sites completely differently to the older versions. Such a large change could bring a whole spectrum of problems and bugs. An interesting thought: if the bots are gathering different information on the sites they crawl, perhaps it is Google’s practice to wipe the old indexed pages clean first?

Google Cache: Many different bots, such as the AdSense bot, finance bot, news bot and Googlebot, will be visiting your site. Google has recently changed how they work. Rather than each bot visiting your pages separately, the bots can now read the cache collected by other bots. For example, if your site is visited by the AdSense bot at 9am, it will also cache a copy of the pages it visits. Now, rather than Googlebot coming along at 10am and crawling your site again, it can check the cached copy that the AdSense bot previously stored, greatly reducing bandwidth usage. A new system like this could have introduced problems; I’ve seen regularly updated sites for which Google is holding a cached copy that is 6 months old.
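To make the idea concrete, here is a rough sketch in Python of a cache shared between crawlers, following the 9am/10am example above. This is purely an illustration of the theory and not Google’s actual system; the `SharedCrawlCache` class, the `max_age_hours` setting and the `fetch` function are all hypothetical names I have invented for the example.

```python
class SharedCrawlCache:
    """Hypothetical page cache shared by several crawlers."""

    def __init__(self, max_age_hours):
        self.max_age_hours = max_age_hours
        self.pages = {}   # url -> (hour the copy was fetched, page content)
        self.fetches = 0  # how many times the live site was actually hit

    def get(self, url, now_hour, fetch):
        """Return a fresh-enough cached copy, otherwise fetch and store one."""
        cached = self.pages.get(url)
        if cached is not None and now_hour - cached[0] <= self.max_age_hours:
            return cached[1]
        self.fetches += 1
        content = fetch(url)
        self.pages[url] = (now_hour, content)
        return content


# A stand-in for the live website; the page content is invented.
site = {"/index.html": "version 1"}


def fetch(url):
    """Pretend to download a page from the live site."""
    return site[url]


cache = SharedCrawlCache(max_age_hours=24)

# 9am: the ads crawler visits and the page is stored in the shared cache.
ads_copy = cache.get("/index.html", now_hour=9, fetch=fetch)

# The site owner updates the page between the two visits.
site["/index.html"] = "version 2"

# 10am: the search crawler reuses the cached copy instead of crawling again.
search_copy = cache.get("/index.html", now_hour=10, fetch=fetch)

print(cache.fetches, search_copy)  # prints: 1 version 1
```

The site is only hit once, which is the bandwidth saving, but the search crawler ends up with the old version of the page. If the freshness limit in a system like this were set too generously or went wrong, you would expect exactly the sort of stale cached copies mentioned above.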

64-bit Datacenters: The Big Daddy datacenters are believed to be running on a 64-bit architecture, which in itself could introduce many bugs. Converting systems to 64-bit certainly does not fall into my area of expertise, so I won’t comment on this - but if you have something to add, please leave me a comment.

Some people are very sceptical that Google is having any problems and believe this is merely an update running its course. I would point out that GoogleGuy has offered an e-mail address to owners of sites which are losing indexed pages, in the hope of finding some correlation between affected sites. That is very helpful indeed, but hardly normal practice for Google. Matt Cutts has also made numerous comments and answered questions saying that the new datacenters are experiencing indexing problems, especially with supplemental results.

Recent activity like this can make it very hard for SEO agencies to explain to clients the problems they are experiencing, and even harder for those who run successful e-commerce websites which rely on their Google traffic for income. While we are on the subject of communication, I’d like to write a little about Google Sitemaps. I must say I am very impressed with the lengths Googlers are now going to in order to communicate with webmasters. The new Sitemaps console gives you information on your website in relation to Google’s quality guidelines and lets you know if the bots encountered any trouble getting around your site. This is a remarkably useful tool which takes some of the legwork out of scrutinising webpages for potential crawling problems. The effort of individually telling webmasters why their site is penalised is nothing short of a godsend (although I feel sorry for Google given the number of ‘please look at my site’ requests they are destined to receive).

Back on topic, I am very interested in collecting anybody else’s index data, experiences or thoughts on this indexing topic, so please leave me a comment or question and we will try to gather some data.

5 Responses to “Having problems with Google indexing?”

  1. Ookami Snow says:

    I had problems with this around the start of April too. I was getting about 30-40 hits a day from Google, and then it just stopped, not completely but it dropped to about 4-5 a day with the majority of the traffic coming from Google Images.
    I have noticed a pick up in traffic recently, but still maybe only 9-10 a day.

  2. Harmony says:

    My fingers are crossed so tight waiting to come back. We fell from page one to page six on several vital keywords. Thanks for the news and advice!

  3. Mich says:

    Our site has seen about a 50% decrease in traffic over the last month. AND we don’t find our main domain as the source when we do come up. We have a Yahoo store, and if there’s ever a bad link, Yahoo serves… http://store.yahoo.com/waterproofcases/index.html instead of just our domain http://www.waterproofcases.net.

    SO, all I can figure, is at some point we had a bad link, and the googlebot followed to the yahoo domain YUK!

    When we search for our main terms, like waterproof cases - which has ranked in the top 10 for the last 2 updates… we were gone for 3 weeks. Now we’re back… but not as www.waterproofcases.net … but as the store.yahoo.com … AND when we site: command it, our site was primarily supplemental results. That seems to have gone away, but I don’t find our main domain when I look for us the way we used to rank.

    Open to thoughts and opinions.

    THANK YOU!

    I don’t know what’s going on, but I’d sure love to fix it.

  4. MarkZZ says:

    That’s a weird problem, Mich. The bots seem to have gone into overdrive over the weekend, with a lot of people reporting a slight ease in their situation. I’ve had a few sites jump back onto their feet; my gut tells me they’ve cracked it.

  5. LB says:

    i have another problem,can some1 explain?
    when i do the site command in Google it gives me the amount of indexed pages returned(duh) but there is something weird going on…one day i have 20 pages indexed,the next day i have 225 …day after that i have 35….
    and today…weird as it may sound:
    i search 4 indexed pages…and i get 18 results ..when i go to the second page ..suddenly it changes to 125 results (without researching)…20 minutes after that..he gives me 12 pages
    I know its only doing an estimate but geez…how can one be sure of indexed pages like this?
    i could videotape the activity on screen if needed…
