Canonicalization

In computing, canonicalization is the process of converting data that has more than one possible representation into a single standard (canonical) form. It is an important practice in search engine optimization (SEO) of websites.

If a website has several pages with similar content, search engines cannot easily tell which of them is the most important. This happens often, because search engines treat every unique URL as a separate page. A canonical tag declares one preferred page among several candidates, so that duplicate content is consolidated under it. A simple example is a set of pages for the same product in various colours. In such a case, the website owner declares one of the colour options as canonical. As a general rule, search engines index this canonical page and show it in search results.
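The colour example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `colour` query parameter, the choice of black as the canonical colour, and the helper name `canonical_for` are all hypothetical, and the sketch shows how a site owner might map variant URLs to one preferred URL, not how any search engine works internally.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical choice by the site owner: the black variant is the canonical one.
CANONICAL_COLOUR = "black"


def canonical_for(url):
    """Return the preferred URL for a colour variant of a product page."""
    parts = urlsplit(url)
    # Rewrite only the (assumed) "colour" parameter; keep all others as they are.
    params = [
        (key, CANONICAL_COLOUR if key == "colour" else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(params), parts.fragment)
    )


variant_urls = [
    "https://exampleshop.com/product-page/?colour=black",
    "https://exampleshop.com/product-page/?colour=red",
    "https://exampleshop.com/product-page/?colour=blue",
]

# Three unique URLs, but only one preferred page.
preferred = {canonical_for(url) for url in variant_urls}
print(preferred)
```

Each of the three variant pages would then carry a canonical tag pointing at that single preferred URL.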

What do canonical tags look like?

A canonical tag is set within the <head> section of a page:

<link rel="canonical" href="https://exampleshop.com/product-page/" />

In this example, rel="canonical" marks the link as pointing to the canonical version of the page, and href="https://exampleshop.com/product-page/" gives the URL where that canonical version is located.
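To check which canonical URL a page declares, the tag can be read back out of the HTML. The following is a minimal sketch using Python's standard html.parser module. The class and function names (`CanonicalFinder`, `find_canonical`) are illustrative, the sample page is made up, and real crawlers apply many additional rules that this sketch ignores.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are also routed here by default.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, or be missing entirely.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the declared canonical URL, or None if the page has none.

    Simplification: this does not check that the tag sits inside <head>.
    """
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


page = """
<html>
  <head>
    <title>Product page</title>
    <link rel="canonical" href="https://exampleshop.com/product-page/" />
  </head>
  <body></body>
</html>
"""

print(find_canonical(page))  # prints https://exampleshop.com/product-page/
```

A result of None means the page declares no canonical URL, which leaves the choice to the search engine.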

Why is canonicalization important for SEO?

Because search engines aim to show users only relevant results, they try to avoid duplicate content and generally index only one version of a page. When a site has a lot of similar or duplicate content, the search engine spiders may miss its most important pages. Moreover, if several versions of the same page exist without canonicalization, the search engines pick the “master” page on their own. That, however, might not be the page the website owner would like to have indexed.

Canonicalization helps search engines make sense of duplicate content and reduces the risk of their picking the wrong URL as canonical. Additionally, it helps save crawl budget, that is, the number of pages the search engine spiders crawl within a certain timeframe. This makes it more likely that the important parts of the website are crawled and indexed first.