When Canonicals Attack
Anyone who’s spent a bit of time around the SEO game will know that duplicate content is a no-no, but how do you eliminate it, especially if the way your site’s been constructed means that you’ve got several versions of the same page? In my opinion, the best way to fix this is to use a 301 redirect to send these duplicated pages to the ‘main’ one, making it impossible for a user or search engine spider to land on the duplicate and index it. But what if you can’t do that for whatever reason? That’s where the canonical tag comes in.
As with everything in the search engine optimisation world, there are always other ways to do things, but the canonical is the most common option if a redirect’s off the table. The downside is that it sometimes gets used badly wrong and causes much worse problems than a bit of internal duplication.
How Should A Canonical Tag Be Used?
When you put a canonical tag into your page, the idea behind it is that you use it to tell search engines which URL they should be indexing in place of this one. For example, say I have https://www.ben-johnston.co.uk/2011/03/when-canonical-tags-attack/ and https://www.ben-johnston.co.uk/2011/03/when-canonical-tags-attack800269 or something like that on this site. Obviously, I don’t want the search engines to index the one with the query string on the end of it because it’s duplicated. Typically, I’d break out my shiny 301 redirect plugin or pretend I know how .htaccess works, but if neither of those is an option, I could put a canonical tag in place telling search engine spiders that the only version of this page on my site that matters is the ‘canonised’ one (in this case, I’d set it to the one without a query string on the end of it).
If I’ve done that right, a search engine spider will come to the query-stringed version, see the tag and think ‘Hey, this is a duplicated page; the webmaster’s aware of it and has told me to forget about it’. Correctly implemented canonical tags basically stop indexation of a duplicated page.
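To make that a bit more concrete, here’s a rough Python sketch of how the tag for a query-stringed page could be generated. It assumes the ‘main’ version is simply the same URL with the query string chopped off, which won’t be true of every site, and the function names and the ?ref=800269 parameter are made up for illustration.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Return the URL with its query string and fragment removed.

    Assumption: the clean URL is the version that should be indexed.
    """
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'


# Hypothetical query-stringed duplicate of this post.
duplicate = "https://www.ben-johnston.co.uk/2011/03/when-canonical-tags-attack/?ref=800269"
print(canonical_tag(duplicate))
```

The resulting link element goes in the head of the duplicated page, and its href is the version you actually want indexed.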
What Happens If You Do It Wrong?
Unfortunately, this is where some people seem to fall down. I’ve seen a lot of sites where the canonical tag has been used incorrectly and has caused all kinds of SEO problems.
To use a common example, say we have a site with a lot of pages and a lot of query-stringed versions of these pages. Obviously, we don’t want these query string versions to come up in SERPs when we’ve got nice, user-friendly versions out there too, especially when it comes to the homepage. In these cases, sometimes people set all the canonical tags to the homepage. See where we could have a problem?
In these cases (and there have been a lot of them), the canonical tag will tell the search engines that the only page they should index is the homepage. Everything else on that site is a copy of said homepage. This means that you could potentially have an absolutely enormous site with lots and lots of interesting pages that you would just love people to land on, but they won’t because those pages won’t be indexed at all.
How Do I Use The Canonical Tag Correctly?
Essentially, you need to make sure that the tag points to the right URL. Not the homepage, not the wrong page, not one with a typo, but the exact page that the URL in question is a duplicate of. Do it wrong and you can cause serious indexation issues. Do it right and you should be able to effectively eliminate onsite duplication without the work involved in a 301 redirect, although the 301 would still be my preferred choice.
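If you want to sanity-check your own pages, something along these lines could catch the mistakes described above. It’s a minimal sketch rather than a proper audit tool: it assumes the right canonical is the page’s own URL without the query string, that canonicals are written as absolute URLs, and that an exact string match is good enough. The class and function names and the example.com URLs are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, urlunsplit


class CanonicalFinder(HTMLParser):
    """Collect the href of the first <link rel="canonical"> in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.href is None:
            self.href = attr.get("href")


def strip_query(url: str) -> str:
    """Drop the query string and fragment from a URL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def check_canonical(page_url: str, html: str) -> str:
    """Classify a page's canonical tag as ok, missing, homepage or mismatch."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.href is None:
        return "missing"
    if finder.href == strip_query(page_url):
        return "ok"
    if urlsplit(finder.href).path in ("", "/"):
        # The classic mistake: an inner page canonised to the homepage.
        return "points to homepage"
    return "mismatch"


page = "https://www.example.com/widgets/?sort=price"
bad_html = '<head><link rel="canonical" href="https://www.example.com/" /></head>'
print(check_canonical(page, bad_html))
```

Run over a crawl of the site, anything that doesn’t come back as ok is worth a look by hand, since a mismatch may be a typo or may be a deliberate choice that this simple check doesn’t understand.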
This might be my last post for a little while since I’m moving in a couple of hours. Hopefully you enjoyed this one.