Having problems with Google indexing?

April 28th, 2006

I’m going to try to cover the subject of Google’s recent indexing activity, a topic I’ve been asked about literally dozens of times over the past few weeks. If you haven’t experienced problems with Google dropping the number of indexed pages on your site, you are certainly one of the lucky ones. I have seen a lot of people experiencing issues similar to the ones I have encountered myself.

This is an image from yourcache.com, a neat tool I blogged about which lets you see the number of pages you have indexed across different Google datacenters. It stores the results from each enquiry you make and lets you view the data over time. The screenshot is from one of the smaller sites we manage, which has been live for a good while and has never had any problems in the SERPs. Recently we have dropped from page 1 to page 15 for our main search term, with a lot of other keywords and phrases suffering too. We are currently at rock bottom with 2 or 3 indexed pages, compared to the previous count of around 180. Normally this would be very worrying, but if you’re reading this, the chances are you are in the same boat as a lot of other people.
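To put a number on a drop like this, here is a minimal Python sketch of the kind of record such a tool might keep. It is not how yourcache.com actually works: the `IndexSample` type, the `percent_drop` helper and the datacenter label `dc-example` are all made up for illustration, and the counts are simply the rough figures quoted above (around 180 before, 3 now).

```python
from dataclasses import dataclass


@dataclass
class IndexSample:
    """One enquiry result: how many pages a datacenter reports as indexed."""
    label: str          # when the sample was taken (hypothetical labels)
    datacenter: str     # hypothetical datacenter name
    indexed_pages: int


def percent_drop(before: int, after: int) -> float:
    """Percentage of indexed pages lost between two samples."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Rough figures from the post: around 180 pages before, 3 at rock bottom.
samples = [
    IndexSample("before", "dc-example", 180),
    IndexSample("now", "dc-example", 3),
]

drop = percent_drop(samples[0].indexed_pages, samples[-1].indexed_pages)
print(f"{drop:.1f}% of indexed pages lost")
```

Keeping one sample per enquiry, rather than only the latest count, is what makes it possible to compare your own timeline with other people's.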

So here’s the million-dollar question: why is it happening? Well, before we can start theorising about the cause, we need to look at the problem in a bit more detail. Most people started to report problems around the end of March or the beginning of April, around the time the Big Daddy upgrade was rolling out, so it would be logical to start our enquiries there. I think it is fair to assume that not everybody was affected at the same time, because the new bots reached their websites at different times. Of the sites we manage, the ones that have been affected have followed a progression over 4-5 weeks: they generally hit rock bottom at week 3-4 and were finally re-indexed over the next week or so, but regained only around 50% of their previously indexed pages.

I have read three very interesting, if somewhat speculative, theories on what may be causing the underlying problem.

New Googlebot: The BD upgrade brought with it a Googlebot which indexes sites completely differently to the older versions. Such a large change could be bringing a whole spectrum of problems and bugs. An interesting thought: if the bots are gathering different information on the sites they are crawling, perhaps Google considers it best practice to wipe the old indexed pages clean first?

Google Cache: Many different bots, such as the adsense-bot, finance-bot, news-bot and google-bot, will be visiting your site. Google has recently changed how they work. Rather than each bot visiting your page separately, the bots can now read the cache collected by other bots. For example, if your site is visited by the adsense-bot at 9am, it will also cache a copy of the pages it visits. Now, rather than the google-bot coming along at 10am and crawling your site again, it can check the cached copy that the adsense-bot previously stored, greatly reducing bandwidth usage. A new system like this could have introduced problems; I’ve seen regularly updated sites for which Google is holding a cached copy that is 6 months old.
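The shared-cache idea is easier to picture in code. The Python sketch below is purely illustrative and makes no claim about how Google actually implements it: the `SharedCrawlCache` class, its `max_age_seconds` setting and the example URL are all assumptions of mine. It only shows the general pattern of one bot reusing a copy that another bot fetched earlier.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CachedPage:
    body: str
    fetched_by: str    # which bot actually downloaded the page
    fetched_at: float  # timestamp in seconds


class SharedCrawlCache:
    """Hypothetical cache shared by several bots (illustration only)."""

    def __init__(self, fetch: Callable[[str], str], max_age_seconds: float):
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._pages: Dict[str, CachedPage] = {}
        self.fetch_count = 0  # how many real downloads have happened

    def get(self, url: str, bot: str, now: Optional[float] = None) -> CachedPage:
        now = time.time() if now is None else now
        cached = self._pages.get(url)
        # Reuse another bot's copy while it is still fresh enough.
        if cached is not None and now - cached.fetched_at <= self._max_age:
            return cached
        # Otherwise download the page and store it for every bot.
        body = self._fetch(url)
        self.fetch_count += 1
        page = CachedPage(body, bot, now)
        self._pages[url] = page
        return page


# Stand-in for a real HTTP download, so the sketch runs offline.
cache = SharedCrawlCache(fetch=lambda url: f"<html>{url}</html>",
                         max_age_seconds=24 * 3600)

nine_am, ten_am = 9 * 3600, 10 * 3600
first = cache.get("http://example.com/", "adsense-bot", now=nine_am)
second = cache.get("http://example.com/", "google-bot", now=ten_am)

print(cache.fetch_count)   # only one real download
print(second.fetched_by)   # the google-bot reused the adsense-bot's copy
```

In a design like this, everything hinges on the freshness check. If the maximum age were set far too generously, or the check failed, every bot would keep reading the same old copy, which is at least consistent with the six-month-old cache mentioned above.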

64-bit Datacenters: The Big Daddy Datacenters are believed to be running on a 64-bit architecture which in itself could bring in many bugs. Converting systems to 64-bit certainly does not fall into my area of expertise, so I won’t comment on this - but if you have something to add, please leave me a comment.

Some people are very sceptical that Google is having any problems at all and believe this is merely an update running its course. I would point out that GoogleGuy has offered an e-mail address to owners of sites which are losing indexed pages, in the hope of finding some correlation between affected sites. That is very helpful indeed, but hardly normal practice for Google. Matt Cutts has also made numerous comments and answered questions saying that the new datacenters are experiencing indexing problems, especially with supplemental results.

Recent activity like this can make it very hard for SEO agencies to explain to clients the problems that they are experiencing, and even harder for those who run successful e-commerce websites which rely on their Google traffic for income. While we are on the subject of communication, I’d like to write a little about Google Sitemaps. I must say I am very impressed with the efforts Googlers are now making to communicate with webmasters. The new Sitemaps console gives you information on your website in relation to Google’s quality guidelines and lets you know if the bots encountered any trouble getting around your site. This is a remarkably useful tool which takes some of the legwork out of scrutinizing webpages for potential crawling problems. The effort of individually communicating to webmasters why their site is penalised is nothing short of a godsend (although I feel sorry for Google, given the number of ‘please look at my site’ requests they are destined to receive).

Back on topic, I am very interested in collecting anybody else’s index data, experiences or thoughts they have on this indexing topic so please leave me a comment or question and we will try and collect some data.

Google testing new feature: Expandable results

April 25th, 2006


Interesting news about Google search: bloggers are noticing that Google seems to be testing a new search interface on a couple of datacenters at random. The new ‘expandable’ search results allow searchers to view more text from the listed site, and also to click through to related pages within that site. Underneath the expanded result is a ‘search this site’ box.

So what does this mean in terms of SEO and getting visitors? Well, more than ever, if this test goes into use, having a site with many relevant, informative pages will mean more options for people to find and explore your site. It also hints at Google trying to move away from single sites dominating the SERPs for certain key terms, by grouping relevant pages into one expandable listing. I would expect good internal linking techniques and keyword-relevant site navigation to boost performance in these results, on top of the benefits these methods already provide.

Creative PPC

April 20th, 2006

With the Pay Per Click market reaching maturity, we are starting to see more creative ads being written and more creative keywords being bid on. There is currently research going on into the value of PPC ads which are just viewed and not clicked on, which implies a reversal of the usual copywriting technique: how to get your message across without attracting a click. Personally I’m skeptical as to the value of a “view” of a paid advert, at least in the SERPs - can you remember the last PPC ad you saw?

Honda’s “Element” PPC campaign bids on terms such as “Platypus”, which are obviously much cheaper than their mainstream keywords. The marketing campaign carries the tagline “see a Platypus in its element” and they have constructed an entire game around this idea. The game involves driving the Honda around an island and stopping to talk to several different animals. For instance, stopping to talk to the platypus leads to a comparison between the diversity of a platypus and the diversity of the Element. Although it may not appeal to all, it intrigues you enough to follow the course and be spoon-fed information about the product, which sticks in your mind. Rather than being viral, it is simply a more interactive method of delivering information to the consumer. I expect to see many offshoots of these kinds of campaigns.

SEM Presentation

April 13th, 2006

I attended a Norfolk Chamber of Commerce networking event last night, at which I was asked to speak. The event offered food, drinks and a short speech from me to 55 delegates from a wide cross-section of Norfolk companies. I had prepared a twenty-minute presentation on Search Engine Marketing.

I wasn’t sure what level of knowledge my listeners would have so I prepared a presentation that tried to have something for everyone, which included:

  • Introduction to Search Engine Marketing
  • Search Engine Marketing Techniques
  • Why choose SEM
  • How can your business use SEM
  • Choosing an SEM Agency

I also included a healthy smattering of useful statistics. I got the feeling the delegates were more interested in the commercial application of SEM than in the technical details, so I changed the direction of my talk on the fly. I was definitely pleased with the response I received afterwards, having piqued several companies’ interest. I look forward to talking at another Chamber event.

April Fools

April 4th, 2006

In time-honoured tradition, the web was alive with April Fools on Saturday. The search engines got in on the fun too, with Google announcing Google Romance, Yahoo buying all Web 2.0 companies and, of course, Matt Cutts leaving to join Yahoo. We only wish the ongoing problems with Google’s datacenters were a joke. Still no quick fix in sight, with old indexes popping up everywhere.
