Sitecore Cross-Language Contextual Search

Sitecore 7’s new Content Search API provides a great way to use an underlying search provider like Lucene or Solr to search content items, with native filters such as the content language. One thing to consider is how search works on a multilingual site, and which language visitors type their search terms in relative to the context language.

Take, for example, a multilingual site that has English and Spanish content. Say the language switcher on the site defaults to English and all content exists in both English and Spanish. If the visitor switches the language to Spanish, all of the content renders in Spanish. This is a great way to target the multilingual visitor, with 1-to-1 mappings of content across languages. Now say the visitor searches for “productos” while English is the context language. Depending on how your search code is written, it may search only within the context language. For example, if you filter by the context language in your query, you won’t get a result for the “productos” term:

[csharp]
query.Filter(i => i.Language == Sitecore.Context.Language.Name);
[/csharp]

Also, as Mike Robbins points out, you can search using a CultureExecutionContext so that the query dynamically uses the right analyzer for the selected language.
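
As a rough illustration of that idea, here is a minimal sketch that passes a CultureExecutionContext to GetQueryable. The index name "sitecore_web_index", the CultureAwareSearch class and its Search method are assumptions made for this example, and the namespace of CultureExecutionContext should be checked against your Sitecore version.

[csharp]
using System.Collections.Generic;
using System.Linq;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.Linq.Common;
using Sitecore.ContentSearch.SearchTypes;

// Hypothetical helper class, for illustration only.
public static class CultureAwareSearch
{
    public static List<SearchResultItem> Search(string term)
    {
        // Assumption: the site searches the default web index.
        var index = ContentSearchManager.GetIndex("sitecore_web_index");

        using (var context = index.CreateSearchContext())
        {
            // Culture of the language the visitor currently has selected.
            var culture = Sitecore.Context.Language.CultureInfo;

            // The execution context tells the provider which culture to use for this query.
            return context
                .GetQueryable<SearchResultItem>(new CultureExecutionContext(culture))
                .Where(i => i.Content.Contains(term))
                .ToList();
        }
    }
}
[/csharp]

Note that this sketch only selects the culture for the query. It does not add a language filter of its own.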

This is a nice way to search within the language the site is currently displayed in, but what about that multilingual visitor? What if they speak English and Spanish and don’t know the site will filter to English content? This is where searching across languages can be useful. Here’s what you can do:

  1. Search for the term without a language filter.
  2. For each SearchResultItem, check whether the item is in the context language and, if it is, return it.
  3. If not, check whether there’s a language version of the same item in the context language and return that.

Here’s an example with a direct 1-to-1 mapping of content in English and Spanish:

English version | Spanish version
products | productos

If the context language is English and the visitor searches for “productos”, the code:

  1. Finds the Spanish document as a SearchResultItem
  2. Determines that the SearchResultItem’s language (Spanish) is not the context language (English)
  3. Tries to get the item in the context language e.g. GetItem(sri.ItemId, contextLanguage)
  4. If the item is not null and has a version in the context language, displays it in the results. In this case it will render “products” in the results.

Here’s another scenario where there is not a true 1-to-1 mapping between languages:

English version | Spanish version
<no version exists> | productos

Now in this scenario, if the context language is English and the visitor searches for “productos”, the code:

  1. Finds the Spanish document as a SearchResultItem
  2. Determines that the SearchResultItem’s language (Spanish) is not the context language (English)
  3. Tries to get the item in the context language e.g. GetItem(sri.ItemId, contextLanguage)
  4. Finds no English version of the item (GetItem returns null, or an item with no versions in that language), so returns nothing since we want to maintain the results in the context language.
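
To make the two scenarios concrete, here is a small in-memory sketch of the same decision logic with no Sitecore dependency. IndexedDoc and CrossLanguageSearch are hypothetical stand-ins invented for this example: each IndexedDoc plays the role of one indexed language version of an item.

[csharp]
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one indexed document (one per item per language).
public class IndexedDoc
{
    public string ItemId { get; private set; }
    public string Language { get; private set; }
    public string Content { get; private set; }

    public IndexedDoc(string itemId, string language, string content)
    {
        ItemId = itemId;
        Language = language;
        Content = content;
    }
}

public static class CrossLanguageSearch
{
    // Returns the context-language content of every item matching the term in any language.
    public static List<string> Search(IEnumerable<IndexedDoc> index, string term, string contextLanguage)
    {
        var docs = index.ToList();
        var seen = new HashSet<string>(); // item IDs already returned
        var results = new List<string>();

        // Match the term with no language filter.
        foreach (var hit in docs.Where(d => d.Content.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0))
        {
            // Use the hit itself if it is in the context language,
            // otherwise look for the same item's context-language version.
            var match = hit.Language == contextLanguage
                ? hit
                : docs.FirstOrDefault(d => d.ItemId == hit.ItemId && d.Language == contextLanguage);

            // No counterpart in the context language: skip the hit.
            // seen.Add returns false for an item that was already returned.
            if (match != null && seen.Add(match.ItemId))
            {
                results.Add(match.Content);
            }
        }

        return results;
    }
}
[/csharp]

With an index that holds item-1 in "en" (“products”) and "es" (“productos”), plus item-2 only in "es", calling Search(index, "productos", "en") returns just “products”: item-1 resolves to its English version and item-2 is dropped because it has no English counterpart.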

Here’s some sample code to get started:

[csharp]
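// Search every language: there is no Language filter on the query.
var results = context.GetQueryable<SearchResultItem>()
    .Where(p => p.Content.Contains(term));

// contextLanguage is a Sitecore.Globalization.Language, e.g. Sitecore.Context.Language
foreach (var r in results)
{
    // From the SRI, try to get the underlying item in the context language.
    // SearchResultItem.GetItem() takes no arguments, so load it from the database instead.
    var item = Sitecore.Context.Database.GetItem(r.ItemId, contextLanguage);

    // GetItem can return a non-null item that has no version in the requested language,
    // so check the version count as well as null.
    if (item != null && item.Versions.Count > 0)
    {
        // The item has a counterpart version in the context language, so return it.
    }
}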
[/csharp]
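
One thing to watch for: because the query has no language filter, an item that is indexed in both English and Spanish can match more than once and end up in the results twice. Below is a sketch of one way to wrap the search in a method that returns each item only once. The class name, the method name and the "sitecore_web_index" default are assumptions for illustration, and the Versions.Count check guards against items that load in the context language but have no version in it.

[csharp]
using System.Collections.Generic;
using System.Linq;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.SearchTypes;
using Sitecore.Data;
using Sitecore.Data.Items;

// Hypothetical helper class, for illustration only.
public static class CrossLanguageItemSearch
{
    public static List<Item> Search(string term, string indexName = "sitecore_web_index")
    {
        var contextLanguage = Sitecore.Context.Language;
        var database = Sitecore.Context.Database;
        var seen = new HashSet<ID>(); // item IDs already handled
        var items = new List<Item>();

        using (var context = ContentSearchManager.GetIndex(indexName).CreateSearchContext())
        {
            // No language filter: hits can come from any language.
            var hits = context.GetQueryable<SearchResultItem>()
                .Where(p => p.Content.Contains(term))
                .ToList();

            foreach (var hit in hits)
            {
                // Skip hits for an item we have already resolved.
                if (!seen.Add(hit.ItemId))
                {
                    continue;
                }

                // Load the item in the context language and keep it only if a version exists there.
                var item = database.GetItem(hit.ItemId, contextLanguage);
                if (item != null && item.Versions.Count > 0)
                {
                    items.Add(item);
                }
            }
        }

        return items;
    }
}
[/csharp]

This sketch materializes every hit with ToList(), which is fine for a small site but would likely need paging on a large index.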

 

Mark Ursino

Mark is Sr. Director at Rightpoint and a Sitecore MVP.

 
