Search Engine Optimisation (SEO) is a phrase / acronym that gets a lot of attention in web development. Most people involved in the industry will have heard of it at some point. It’s undoubtedly important, and this blog post is going to assume you have a basic idea of what it is (if not, give this great article from SEOmoz a read).

But even then, SEO can often be an overwhelming topic for a novice. Sometimes it’s just useful to know that your site is doing OK; not all site owners are going to be concerned with squeezing every last drop of link juice from their pages, and tweaking keywords on hundreds of articles. You might just want to check that your latest update or redesign isn’t having a negative effect on your PageRank, or that the extra content you’re adding is gaining some traction and being well received by your audience.

This is the first part of my guide to doing a simple SEO health check. It’s a broad look at some of the most common indicators that there’s a problem, and how to fix them. It will also touch on a few key principles for knowing more about SEO, what’s important (and why), and positive signs that show you’re on the right track.

As you’ll probably have gathered from the title, this guide is going to assume you’re using Google Analytics and Webmaster Tools. If you’re not, you should be; they are incredibly powerful free tools developed by Google, and they go far beyond SEO. I’d highly recommend getting them both working with your site, and there are useful guides on both setting up Google Analytics and setting up Webmaster Tools.

If you are setting up these services on your site for the first time, you may want to wait a few weeks before diving into the stats, as you won’t have any to start with.

I’ll start by giving a basic run-down of how sitemaps work with your site and how Webmaster Tools can help you manage your sitemap, followed by a list of warning signs and what to do next.

 

Part 1: Sitemaps

You’ve got your site, registered it in Webmaster Tools, and you’re running Google Analytics tracking. Great! There’s a lot to see.

Let’s start with something quite straightforward. As this is Part 1, and I’ve already rambled on a bit in my introduction, we’ll just be looking at sitemaps. Sitemaps are XML files that list the pages of your site in a simple, organised way that search engines can crawl and index. They should contain the URLs for all of the pages you want search engines to index, but you can also include other information such as the date each page was last modified, how often it changes, and its importance relative to your other pages. They won’t affect your search ranking, but they’re vital for making sure that Google and other search engines are indexing all of your pages properly.

Most content management systems (CMS) will generate a sitemap automatically, though you may need to install a plugin / module first. Sitemaps are generally found at yourdomain.com/sitemap.xml, but this isn’t always the case. Locate your sitemap (or enable it through your CMS first), and take a look in your browser. You should see an XML document that lists your site’s URLs, one entry per page.
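If you’re curious what is inside that file, here is a minimal Python sketch that builds a tiny sitemap by hand. It isn’t how any particular CMS does it; the function name `build_sitemap`, the example URLs and the date are all made up for illustration. The only parts taken from the sitemap format itself are the `urlset`, `url`, `loc` and `lastmod` elements and the namespace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string.

    `pages` is a list of (url, lastmod) tuples; lastmod may be None.
    This helper is hypothetical, purely to show the file's structure.
    """
    # Serialise the namespace as the default one (xmlns="...").
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        if lastmod:
            ET.SubElement(entry, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Example pages (invented for this sketch).
example_pages = [
    ("https://yourdomain.com/", "2024-01-15"),
    ("https://yourdomain.com/about", None),
]
sitemap_xml = build_sitemap(example_pages)
print(sitemap_xml)
```

Each page gets its own `url` entry with a `loc` (the address), and the optional extras such as `lastmod` sit alongside it. A real sitemap from your CMS will usually be longer, but it should follow the same shape.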

 

Adding your Sitemap in Webmaster Tools

In Webmaster Tools, you can navigate to the Sitemaps page via the menu on the left (Fig. 1) or directly from the main dashboard (Fig. 2). If you already have a sitemap added, the dashboard will show you a small graph of its statistics.

Links to Sitemaps in Webmaster Tools

If you don’t already have a sitemap submitted, you can click the “Add/Test Sitemap” button near the top right corner of the page. Add the correct URL, and your sitemap will be submitted to Google. It may take a few minutes or more for any results to show up in Webmaster Tools.

 

Keeping an eye on your Indexed URLs

The dashboard graph and the main Sitemaps page will both show the URLs submitted and URLs indexed. The URLs submitted are those that are included in your sitemap. The URLs indexed are those that Google has accepted as worth indexing, which won’t be all of your submitted URLs at first. Depending on how many URLs are in your sitemap, it may take up to a few days (especially with large sites) for Google to finish indexing.

You may find that Google stops indexing your submitted URLs before it reaches your submitted total. The numbers don’t have to be identical, but a large discrepancy between submitted and indexed would suggest that there’s an issue preventing some pages from being indexed. This can happen if Google’s algorithms don’t consider pages “valuable” enough; brief pages without much content, or duplicate / very similar content are likely to be excluded. For some pages this is perfectly acceptable, but if you notice that a large percentage of your submitted URLs aren’t being indexed then you might be submitting a lot of content-thin pages that you could possibly exclude from the sitemap.
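If you’d rather put a number on that discrepancy, the arithmetic is simple enough to sketch in a few lines of Python. The figures you feed in are the “submitted” and “indexed” counts from Webmaster Tools. The 80% cut-off below is purely my own assumption for the example; Google doesn’t publish a “healthy” ratio, so pick a threshold that makes sense for your site.

```python
# Hypothetical threshold: flag the sitemap if fewer than 80% of the
# submitted URLs are indexed. This is not an official Google figure.
WARN_BELOW = 0.8


def index_coverage(submitted, indexed):
    """Return the fraction of submitted URLs that have been indexed."""
    if submitted <= 0:
        raise ValueError("submitted must be a positive number of URLs")
    return indexed / submitted


def needs_attention(submitted, indexed, warn_below=WARN_BELOW):
    """True when the indexed share falls below the chosen threshold."""
    return index_coverage(submitted, indexed) < warn_below


# Example with made-up numbers: 250 URLs submitted, 140 indexed.
coverage = index_coverage(250, 140)
print(f"{coverage:.0%} of submitted URLs are indexed")
print("Worth investigating:", needs_attention(250, 140))
```

With those made-up numbers just over half of the submitted URLs are indexed, which is the kind of gap that suggests a lot of thin or duplicate pages in the sitemap.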

 

Other SEO warning signs (and what to do about them)

Each sitemap will also list any errors under the “Issues” column. Any errors should be corrected, until the “Issues” column is clear.

The listed Sitemaps in Webmaster Tools

Errors here will usually relate to the sitemap itself, rather than your content. For example, you may get a warning if the XML of your sitemap isn’t formatted correctly. Fortunately, if your CMS is producing the sitemap, you probably won’t see any of these issues. If you do, check that your module / plugin is up to date, or contact the author.
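If you want to rule out broken XML yourself before contacting a plugin author, a quick local check is possible. This is only a sketch using Python’s built-in XML parser: it tells you whether the file is well-formed XML, not whether it satisfies everything Google expects of a sitemap. The function name `sitemap_parse_error` and the sample snippets are my own.

```python
import xml.etree.ElementTree as ET


def sitemap_parse_error(xml_text):
    """Return None if the text is well-formed XML, else the parser's message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None


# Two invented samples: the second is missing its closing </urlset> tag.
good = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://yourdomain.com/</loc></url>"
    "</urlset>"
)
broken = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://yourdomain.com/</loc></url>"
)

print(sitemap_parse_error(good))    # None means the XML parsed cleanly
print(sitemap_parse_error(broken))  # a message pointing at the problem
```

To check your real sitemap, save it from your browser and pass the file’s contents to the function instead of the samples.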

There are a few more common issues you may encounter:

  • Your sitemap lists URLs that don’t exist
    If you’ve deleted a page or changed a URL, your sitemap may not have been rebuilt since the change and is still listing the invalid path. Rebuild and resubmit your sitemap, and consider regenerating your sitemap more often.
  • Content is inaccessible to Googlebot when it crawls your site
    Some of your content might only be accessible to registered users. If a login is required to view a page, and anonymous users get some kind of “Access Denied” message, Googlebot won’t be able to view it properly either. These kinds of restricted pages should ideally be excluded from the sitemap.
  • Your sitemap is missing (or missing some of its pages)
    If you experience any site downtime or server problems, your sitemap may be unavailable when Google tries to re-index your URLs. Large sitemaps often get split into multiple pages, with the main sitemap.xml pointing to the sub pages. If you’re adjusting your sitemap by removing some URLs or refactoring content, your main sitemap.xml may still point to a page that doesn’t exist when Google crawls it. Usually you can just resubmit your sitemap once your site is back online, or once you’ve finished editing the sitemap. Google will often wait a while before trying to crawl your sitemap again by itself, so it’s best to clear these errors by resubmitting manually as soon as you can.
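You can catch most of these three problems yourself before Google does. The sketch below is one way to do it in Python, under a few assumptions of mine: the helper names (`listed_locations`, `http_status`, `problem_urls`) are invented, anything other than a plain 200 response is treated as a problem, and the demo at the bottom uses made-up URLs and status codes instead of real network requests.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def listed_locations(xml_text):
    """Return (kind, urls) for a sitemap document.

    kind is "index" when the file points at sub-sitemaps and "urlset"
    when it lists pages directly.
    """
    root = ET.fromstring(xml_text)
    kind = "index" if root.tag.endswith("sitemapindex") else "urlset"
    path = "sm:sitemap/sm:loc" if kind == "index" else "sm:url/sm:loc"
    urls = [loc.text.strip() for loc in root.findall(path, NS) if loc.text]
    return kind, urls


def http_status(url, timeout=10):
    """Fetch a URL as an anonymous visitor and return its HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # e.g. 404 for a deleted page, 403 for a restricted one


def problem_urls(xml_text, fetch_status=http_status):
    """Return {url: status} for every listed URL that doesn't answer 200."""
    _, urls = listed_locations(xml_text)
    statuses = {url: fetch_status(url) for url in urls}
    return {url: code for url, code in statuses.items() if code != 200}


# Demo with invented data, so nothing is actually requested.
sample_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://yourdomain.com/</loc></url>"
    "<url><loc>https://yourdomain.com/deleted-page</loc></url>"
    "<url><loc>https://yourdomain.com/members-only</loc></url>"
    "</urlset>"
)
pretend_statuses = {
    "https://yourdomain.com/": 200,
    "https://yourdomain.com/deleted-page": 404,
    "https://yourdomain.com/members-only": 403,
}
print(problem_urls(sample_sitemap, pretend_statuses.get))
```

To run it against your real site, pass the contents of your sitemap and leave out the second argument so that `http_status` makes the requests. If your main sitemap.xml is an index, `listed_locations` returns the sub-sitemap addresses, and each of those needs checking in turn. Two caveats: a restricted page that redirects to a login form can still come back as 200, and a site that is completely unreachable will raise an error rather than return a status, so treat this as a rough first pass.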

These three points cover the most common issues you’re likely to come across. As this is just a simple guide for a relative newcomer, I won’t go into any of the more advanced issues. Chances are if you encounter any, you’re probably building a site at the level where you know what to do anyway!

Once any issues with your sitemap have been resolved, always be sure to resubmit your sitemap. Just check the box next to it in the list and click the “Resubmit” button.

 

All Done!

And that’s it for sitemaps! They’re incredibly useful, and relatively straightforward to implement and maintain. Between monitoring the URLs you’re submitting, and making sure your sitemap is being crawled successfully by Google, keeping your site in good SEO health is nice and simple.

The Webmaster Tools service makes managing your sitemap really easy. By empowering site owners, Google gets more reliable data and we get a great way to track our content and how it’s indexed. Google themselves offer plenty of documentation on Webmaster Tools, if you’re interested in learning more.

Continue to my Simple SEO Health Check (Part 2): Search Queries.