Faceted search is a recurring issue on sites with a large number of pages that include product listings. If implemented correctly, it can be very beneficial for a site: creating new, more specific pages makes it possible to respond to more search queries and so increases visibility in search results.
In addition to providing a logical site architecture and optimized internal linking, faceted navigation also allows users to quickly find the product(s) they are looking for.
The implementation of faceted search must follow certain rules. Otherwise, it can lead to major problems such as the mass creation of unnecessary / duplicated pages or the appearance of spider traps.
What is a facet?
Faceted search can generally be found on listing pages of e-commerce or real estate sites: this type of search refers to the different combinations of characteristics that a user can select to refine a search.
Example of faceted navigation for men’s t-shirts on Zalando
Among the available combinations, it is important to distinguish between a facet and a filter.
Facet: This is a filtered category page that should be crawl-friendly and indexable. It corresponds to queries from users with a certain volume of search, and its creation brings value and potential traffic to the site.
Filter: This is a category page filtered only for the user. It can’t be matched to queries with monthly search volume; it only allows users to make a category page more accurate and to navigate through the different attributes of a product.
Why create facets?
As mentioned above, faceted navigation is beneficial for sites with a large number of pages that have product / property listings. An optimally managed facet strategy will have 3 main advantages:
- Target generic or long-tail keywords: facets make it possible to target specific queries and offer a matching list of products or properties. For example:
- t shirt: 74,000 monthly search volume
- men’s t-shirt: 9,900 monthly search volume
- men’s black t-shirt: 590 monthly search volume
- Automate the creation of pages according to certain rules: since the applicable sites generally have a large number of pages, automating the creation of pages is an advantage;
- Automate the internal linking of these pages through their automatic creation.
How to choose which facets to create?
To choose the most beneficial facets to create, it is important to follow 3 steps:
Semantic study: Classical semantic research to collect the keywords related to the site;
Categorization: Categorization of keywords according to the usual method that takes into account the different relevant ways to break down facets (e.g. price, size, brand, gender, material, etc.);
Analysis of results: Analysis of semantic research results with pivot tables that highlight the different categories and possible combinations. The idea is to determine the search volume associated with each possible crossover.
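The pivot-table step can be sketched in a few lines of Python. This is only an illustration, assuming the keywords have already been categorized by hand: the three keywords and their volumes are the ones quoted earlier in this article, while the `MIN_MONTHLY_VOLUME` threshold and the function names are hypothetical.

```python
from collections import defaultdict

# Keywords already categorized; volumes are the ones quoted earlier in this article.
KEYWORDS = [
    {"keyword": "t shirt", "gender": None, "color": None, "volume": 74_000},
    {"keyword": "men's t-shirt", "gender": "men's", "color": None, "volume": 9_900},
    {"keyword": "men's black t-shirt", "gender": "men's", "color": "black", "volume": 590},
]

# Hypothetical cut-off: the right value depends on the site and its market.
MIN_MONTHLY_VOLUME = 500


def volume_by_combination(keywords, attributes=("gender", "color")):
    """Sum monthly search volume per combination of attributes (the pivot)."""
    totals = defaultdict(int)
    for row in keywords:
        # The combination is the set of attributes the keyword actually specifies.
        combo = tuple(attr for attr in attributes if row.get(attr))
        totals[combo] += row["volume"]
    return dict(totals)


def combinations_worth_creating(keywords, min_volume=MIN_MONTHLY_VOLUME):
    """Keep non-empty combinations whose total volume reaches the threshold."""
    totals = volume_by_combination(keywords)
    return sorted(c for c, volume in totals.items() if c and volume >= min_volume)
```

With this sample, the empty combination (the plain category page) carries the generic volume, and both `("gender",)` and `("gender", "color")` pass the hypothetical threshold.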
For example, it would be beneficial to create facets for some colors in the T-shirt category.
Crawl and indexing: Why is it necessary to control the creation of facets?
If faceted navigation is implemented correctly, it will increase the number of qualified pages for users and bots, but if it is not, it can lead to several types of problems:
- Risk of spider traps:
A spider trap is a very large or unlimited number of generated URLs that prevents a site from being crawled correctly. As faceted navigation can generate a very large number of combinations, it can easily lead to spider traps if not managed properly.
- Crawl waste:
A large number of non-indexable links in a site structure will necessarily lead to crawl waste (even if, in the long run, these links will be crawled less).
- Dilution of internal popularity:
A large number of non-crawlable links within a site structure can be harmful to the distribution of internal popularity.
- Creation of duplicate or near-duplicate content:
Some of the pages created automatically by faceted search have the same or very similar content. This should be avoided so as not to create internal duplicate content.
- Creation of empty pages:
Like pages with similar content, those without content should not be generated.
The rules to follow to control the creation of facets
Managing multiple facets
First of all, you will need to define whether a facet should be created when several variables are selected simultaneously (whether within the same category or not).
Example: Create gender + color facets
Example: Do not create gender facets when men’s + children’s are selected
Example: Do not create gender + pattern facets
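These three examples can be expressed as a small rule check. The sketch below is one possible way to encode them, not a reference implementation: the `ALLOWED_COMBINATIONS` set and the function name are hypothetical, and opening single-attribute facets (gender alone, color alone) is an assumption that the examples do not state.

```python
# Attribute combinations that are opened as facets.
# Gender + color comes from the examples above; the single-attribute
# entries are an assumption made for this sketch.
ALLOWED_COMBINATIONS = {
    frozenset({"gender"}),
    frozenset({"color"}),
    frozenset({"gender", "color"}),
}


def should_create_facet(selection):
    """Decide whether a selection maps to a facet.

    `selection` maps an attribute name to the list of values the user ticked.
    """
    # Several values within the same attribute (men's + children's): no facet.
    if any(len(values) > 1 for values in selection.values()):
        return False
    # Only whitelisted attribute combinations become facets.
    active = frozenset(attr for attr, values in selection.items() if values)
    return active in ALLOWED_COMBINATIONS
```

A whitelist is used rather than a blacklist so that any new attribute added to the catalogue stays a simple filter until someone decides to open it.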
Defining the minimum number of products/goods
A facet should only be created automatically when the number of products/goods is sufficient.
Example: Create gender (men’s or women’s) facets when there are at least 3 t-shirts for sale
[Figure: from the category page, the men’s facet is created because there are at least 3 men’s t-shirts; the women’s facet is not created because there are fewer than 3 women’s t-shirts.]
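The minimum-stock rule is easy to automate. Here is a minimal sketch, assuming the catalogue is available as a list of dictionaries: the threshold of 3 comes from the example above, whereas the sample products and the function name are made up for illustration.

```python
from collections import Counter

MIN_PRODUCTS = 3  # threshold taken from the example above


def facets_to_create(products, attribute, min_products=MIN_PRODUCTS):
    """Return the attribute values that have enough products to deserve a facet."""
    counts = Counter(p[attribute] for p in products if p.get(attribute))
    return sorted(value for value, count in counts.items() if count >= min_products)


# Hypothetical catalogue: 3 men's t-shirts, 2 women's t-shirts.
SAMPLE_PRODUCTS = [
    {"name": "t-shirt A", "gender": "men's"},
    {"name": "t-shirt B", "gender": "men's"},
    {"name": "t-shirt C", "gender": "men's"},
    {"name": "t-shirt D", "gender": "women's"},
    {"name": "t-shirt E", "gender": "women's"},
]
```

Because stock changes over time, a check like this would normally run each time the listing is regenerated, so that a facet closes again when it falls below the threshold.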
Setting up SEO tagging
Created facets must contain classic SEO-optimized tagging, so it is necessary to define automatic tagging rules.
[Figure: two example selections. Men’s + Red: Gender (Men’s ☑, Women’s, Children’s), Colors (Blue, Green, Red ☑). Men’s + M: Gender (Men’s ☑, Women’s, Children’s), Size (XS, S, M ☑).]
Facets | H1 | Title Rules | Description Rules |
Gender + Color | [Gender] [Color] T-shirts | [Gender] [Color] T-shirts – My Brand | Discover all of our ➤ [Gender] [Color] T-Shirts on Mysite.com! ✅ Free delivery ✚ 1 500 styles! |
Gender + Size | [Gender] [Size] T-shirts | [Gender] [Size] T-shirts – My Brand | Discover all of our ➤ [Gender] [Size] T-Shirts on Mysite.com! ✅ Free delivery ✚ 1 500 styles! |
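The tagging rules in the table translate directly into templates. The following sketch reuses the exact wording of the table; the `TAG_RULES` dictionary and the `build_tags` function are hypothetical names, and `value` stands for the second attribute (a color or a size), since both rows share the same pattern.

```python
# Templates copied from the table above; {value} is the color or the size.
TAG_RULES = {
    "h1": "{gender} {value} T-shirts",
    "title": "{gender} {value} T-shirts – My Brand",
    "description": (
        "Discover all of our ➤ {gender} {value} T-Shirts on Mysite.com! "
        "✅ Free delivery ✚ 1 500 styles!"
    ),
}


def build_tags(gender, value):
    """Fill every template for one facet, e.g. Men's + Red or Men's + M."""
    return {
        name: template.format(gender=gender, value=value)
        for name, template in TAG_RULES.items()
    }
```

For the Men’s + Red facet, `build_tags("Men's", "Red")` gives the H1 "Men's Red T-shirts" and the title "Men's Red T-shirts – My Brand".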
Set up URL rewriting
Since the facets are initially filters that you want to index, “ugly” URLs will be created when they are opened to indexing. These URLs must then be rewritten in order to obtain “clean” URLs (i.e. without special characters such as %, ? or &).
Example: I’m looking for a black t-shirt by Nike
These “clean” URLs are optimized for crawling and indexing
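As an illustration of the rewriting itself, the sketch below turns a query-string URL into a path-based one. It makes several assumptions: the parameter names (`color`, `brand`), the `mysite.com` URLs and the function names are hypothetical, and in practice the rewriting is usually configured in the CMS or the web server rather than in application code. Parameter order is taken as-is here.

```python
import re
from urllib.parse import parse_qsl, urlsplit


def slugify(value):
    """Lower-case a value and replace anything non-alphanumeric with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-")


def rewrite_to_clean_url(ugly_url):
    """Turn /t-shirts?color=black&brand=nike into /t-shirts/black/nike."""
    parts = urlsplit(ugly_url)
    # parse_qsl decodes percent-encoding, so %20 and friends disappear here.
    segments = [slugify(value) for _, value in parse_qsl(parts.query)]
    path = parts.path.rstrip("/")
    return f"{parts.scheme}://{parts.netloc}" + "/".join([path, *segments])
```

For the black Nike t-shirt example, `rewrite_to_clean_url("https://mysite.com/t-shirts?color=black&brand=nike")` returns `https://mysite.com/t-shirts/black/nike`.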
Managing URL stability
The URL structure must not change depending on the path followed by the user.
Example: Two people are looking for a black Nike t-shirt but select the filters in a different order.
It is, therefore, necessary to define a default order, for example: [Clothing category] > [Color] > [Brand] and keep this order regardless of the user’s pathway.
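A fixed order can be enforced when the URL is built. This is a minimal sketch under the default order given above ([Clothing category] > [Color] > [Brand]); the `FACET_ORDER` constant and the function name are hypothetical.

```python
# Default order from the text: [Clothing category] > [Color] > [Brand].
FACET_ORDER = ("color", "brand")


def stable_facet_url(category, selection):
    """Build the facet path in the default order, whatever order the user clicked in."""
    segments = [selection[attr] for attr in FACET_ORDER if attr in selection]
    return "/" + "/".join([category, *segments])


# One user picks the brand first, the other picks the color first.
brand_first = stable_facet_url("t-shirts", {"brand": "nike", "color": "black"})
color_first = stable_facet_url("t-shirts", {"color": "black", "brand": "nike"})
```

Both users end up on `/t-shirts/black/nike`, so only one URL exists for this combination.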
Optimizing internal linking
As with a traditional site structure, for an open facet to be crawlable and indexable, the pages of the site must contain a permanent link to it. This link must be present in the DOM and accessible even if JavaScript and CSS are disabled.
Example: Facets for Men’s + Color T-shirts have been created
I have a “static” link <a href="https://mysite.com/t-shirts/mens/blue">Men’s blue t-shirts</a> from my men’s red T-shirts page to my men’s blue T-shirts page.
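Such static links can be produced on the server at the same time as the facets, so that they appear in the HTML without any JavaScript. The sketch below is only an illustration: the function names, the `mens` URL segment and the list of colors are assumptions based on the example link above.

```python
from html import escape


def facet_link(url, label):
    """Render a plain <a href> link that is present in the HTML source."""
    return f'<a href="{escape(url, quote=True)}">{escape(label, quote=False)}</a>'


def sibling_color_links(base_url, colors, current_color):
    """Link from one men's color facet to the other open color facets."""
    return [
        facet_link(f"{base_url}/t-shirts/mens/{color}", f"Men's {color} t-shirts")
        for color in colors
        if color != current_color  # no link from a page to itself
    ]
```

Called from the men’s red page with the colors blue, green and red, `sibling_color_links` returns links to the blue and green facets only.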
Several ways to make facets inaccessible
Now that we’ve discussed the rules to follow regarding the creation of facets, we need to define a way to make the facets that should not be created non-crawlable/non-indexable.
Generally, it is possible to block unwanted facets in several ways, each of which has its advantages and disadvantages.
- Adding nofollow on unwanted facet links + meta robots noindex
This solution limits crawl waste on unwanted pages and ensures that closed pages are not indexed (if they are known to search engines by other means). However, it does not solve the problem of internal popularity dilution, because a large number of non-crawlable links are still present on the page.
- Adding a meta robots noindex on unwanted pages
With this approach, only indexing and duplicate content problems are solved. In fact, the crawl waste and the dilution of internal popularity will still be present on the site.
- Blocking facets with robots.txt
A simple-to-set-up solution through blocking the pattern of the unwanted facets with robots.txt. Although this option makes it possible not to waste crawl budget on useless pages, it does not provide solutions where indexation, duplicated content and dilution of internal popularity are concerned.
- JS / Ajax
Using JavaScript / Ajax to block facets allows us to solve all issues efficiently. In fact, links to unwanted facets are accessible only to users and are not present in the page’s source code, so they are inaccessible to robots. Note that Google executes JavaScript and that an ideal implementation of this solution is done on the client side: the filtering of the category page should occur directly in the browser and no new pages are created.
- PRG (Post-Redirect-Get)
Just like the use of JS / Ajax, this method makes it possible to solve all problems efficiently. As a reminder, GET requests allow information to be transmitted in the URL and are executable by Google. On the other hand, for POST requests, the information is included in a form and is not executable by Google.
The purpose of the PRG method is therefore to use a form in POST mode for unwanted facets so that Google does not execute them. This would yield:
Step 1 POST: the user clicks on a filter of an unwanted facet and the request is sent with the POST method.
Step 2 REDIRECT: the server responds to the request with a redirection to the filtered URL.
Step 3 GET: the redirection is followed and the filtered URL is returned with the GET method. The user sees the filtered results.
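The three PRG steps can be sketched as a small request handler. This is a simplified simulation rather than a real web server: the `/t-shirts/filter` endpoint, the parameter names and the function name are hypothetical, and the 303 See Other status is one common choice for the redirect step, not something prescribed by this article.

```python
from urllib.parse import urlencode


def handle_request(method, path, form=None):
    """Return (status, headers, body) for a minimal Post-Redirect-Get flow."""
    if method == "POST" and path == "/t-shirts/filter":
        # Steps 1 and 2: the filter form is posted and the server answers
        # with a redirect to the filtered URL (parameters sorted for stability).
        query = urlencode(sorted((form or {}).items()))
        return 303, {"Location": f"/t-shirts?{query}"}, ""
    if method == "GET" and path.startswith("/t-shirts"):
        # Step 3: the browser follows the redirect and gets the filtered results.
        return 200, {}, f"Filtered listing for {path}"
    return 404, {}, "Not found"


# A user ticks two unwanted-facet filters; the form is sent with POST.
status, headers, _ = handle_request(
    "POST", "/t-shirts/filter", {"price": "10-20", "pattern": "striped"}
)
# The browser then requests the redirect target with GET.
final_status, _, body = handle_request("GET", headers["Location"])
```

The filtered URL is never exposed as a link in the page: it is only reached through the form submission, which is the point of the method.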
In conclusion
In order for facet creation to be carried out smoothly, it is necessary to follow several rules and to plan for all possible cases in a pre-production setting. It is also important to note that facet management is specific to the CMS used on a site and that there are different solutions to manage creating and restricting facets, each with advantages and disadvantages.