Options For Querying Items from Sitecore

Sitecore’s API provides several ways to query its databases for content items, but not all approaches are the same. I’m going to cover some of the options available to developers for querying Sitecore. These options vary primarily based on the expected number of items to query. Some CMS-enabled sites only require a small number of items to be returned from a single query, whereas larger sites with more content need larger result sets. The three main approaches that I’ll cover are Sitecore Query, Fast Query, and Lucene Search.

Sitecore Query

Sitecore Query is probably the most commonly used approach as it’s the standard way to query content. Sitecore-certified developers likely all know Sitecore Query as a starting point. An easy way to build up your queries correctly is to use the XPath Builder in the Developer Center.

Unfortunately, Sitecore Query has performance issues depending on how you build your queries. Creating very specific queries has a performance hit because the query magic built into Sitecore has to do more checks on each item. A recommended practice when using Sitecore Query is to make your queries as generic as you can and use .NET to loop over the results and filter them further.

Let’s consider the following type of query and how it can be adjusted. Say you want to retrieve all items at a specific path of a certain template type where each item has specific field values. A good approach would be to not include the check for the field values in the query, but filter the results using C#. So for example, let’s find all Employees under the About section and filter them with C# based on some field criteria. The example below makes a fairly generic Sitecore query and uses LINQ to filter the results:

[csharp]
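// Requires: using System.Linq; using System.Collections.Generic; using Sitecore.Data.Items;
// Generic query: immediate children of "about" that use the Employee template.
Item[] allEmployees = db.SelectItems("query:/sitecore/content/home/about/*[@@templatename='Employee']");

// Filter in .NET with LINQ: keep only employees with both name fields filled in.
IEnumerable<Item> employeesWithNames = allEmployees.Where(item => item.Fields["First Name"].Value != "" && item.Fields["Last Name"].Value != "");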
[/csharp]

The lines of code above will (1) get all immediate children of template “Employee” under the “about” item, and (2) use LINQ to loop over each resulting Sitecore Item and check that both the “First Name” and “Last Name” fields are not empty strings. The resulting IEnumerable<Item> now contains only Employee items with those fields filled in. The above code is fairly useless on its own, but it shows how you can use LINQ to filter your general result sets as opposed to writing a long, very specific Sitecore Query.

One important thing to note is that a descendant-or-self selector (i.e. //*) is not recommended. It runs a recursive query down through all subitems and will cause performance issues. Neither front-end code nor template field queries should use descendant-or-self, as it can slow down the front-end of the site or even the Content Editor itself.

It’s also important to note that special characters in queries need to be escaped with a hash. For example, a hyphen in an item name will break the parser, so the name needs to be wrapped in #.

This query will not work:

[code]
/sitecore/content/home/about-us/*
[/code]

This one will:

[code]
/sitecore/content/home/#about-us#/*
[/code]
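
To avoid hand-escaping every path, a small helper can wrap the offending segments for you. The sketch below is plain C# with no Sitecore dependency. The names QueryPathEscaper and EscapeHyphens are hypothetical, and it assumes a simple item path without predicates; it only handles the hyphen case described above.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the Sitecore API.
public static class QueryPathEscaper
{
    // Wraps every path segment that contains a hyphen in '#' characters.
    // Segments without a hyphen, or already wrapped in '#', are left as-is.
    // Assumption: the input is a plain item path with no [predicates].
    public static string EscapeHyphens(string path)
    {
        if (string.IsNullOrEmpty(path)) return path;

        var segments = path.Split('/').Select(segment =>
        {
            bool alreadyEscaped = segment.Length >= 2
                && segment[0] == '#'
                && segment[segment.Length - 1] == '#';
            bool needsEscaping = segment.Contains("-") && !alreadyEscaped;
            return needsEscaping ? "#" + segment + "#" : segment;
        });

        return string.Join("/", segments);
    }
}
```

Calling QueryPathEscaper.EscapeHyphens("/sitecore/content/home/about-us/*") produces the working query shown above, and a path that is already escaped passes through unchanged.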

Fast Query

Fast Query is very similar to Sitecore Query but is more restrictive in what you can query with; however, it performs faster than a regular Sitecore Query. Fast Query is recommended when you need to retrieve many more items than you would with a Sitecore Query. The prefix for a fast query is “query:fast:”. Fast queries are translated directly into SQL statements and executed on the database, so there are limitations. I recommend you read the Fast Query doc on the SDN.

Lucene Search

My experience with querying Sitecore with Lucene is fairly new, and is the result of Alex Shyba’s work at Sitecore with Lucene. Alex has created a great toolkit called the Advanced Database Crawler to make it easier for developers to set up and query Lucene with some pretty simple C#. The best way to get started using Alex’s toolkit is to watch his video series on sample search scenarios and configuration settings. In the future I will publish a summary of my findings with this new tool and the things I’ve learned while using it.

Determine How You Should Query Sitecore

Below is a table of recommended approaches to querying Sitecore based on the number of items in the result sets. Note that the web.config setting Query.MaxItems affects the number of items that a Sitecore query can return. Say you want to return 150 items from a Sitecore query (not recommended); you’d need to adjust the setting. The higher this number is, the more items you can get back from your Sitecore queries, and the worse your app can perform because of these large queries.
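
For reference, the setting lives in the Sitecore settings section of web.config. The fragment below is only a sketch: the value 150 is the number from the example above, not a recommendation, and the surrounding elements in your configuration may differ by Sitecore version.

```xml
<configuration>
  <sitecore>
    <settings>
      <!-- Maximum number of items a Sitecore query returns; 150 is illustrative only -->
      <setting name="Query.MaxItems" value="150" />
    </settings>
  </sitecore>
</configuration>
```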

Query Approach    Result Items
Sitecore Query    100 items or fewer
Fast Query        1000 items or fewer
Lucene Search     1000+ items
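
If you want this rule of thumb in code (for logging or a code review checklist, say), the table can be expressed as a tiny lookup. This is a sketch with hypothetical names (QueryApproach, QueryApproachAdvisor); the thresholds are the ones from the table, and treating exactly 1000 items as a Fast Query case is my assumption since the table overlaps there.

```csharp
using System;

// Hypothetical names; thresholds are taken from the table above.
public enum QueryApproach { SitecoreQuery, FastQuery, LuceneSearch }

public static class QueryApproachAdvisor
{
    // Maps an expected result-set size to the recommended approach.
    public static QueryApproach Recommend(int expectedResultCount)
    {
        if (expectedResultCount < 0)
            throw new ArgumentOutOfRangeException(nameof(expectedResultCount));

        if (expectedResultCount <= 100) return QueryApproach.SitecoreQuery;
        if (expectedResultCount <= 1000) return QueryApproach.FastQuery; // assumption: 1000 counts as Fast Query
        return QueryApproach.LuceneSearch;
    }
}
```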
 

Mark Ursino

Mark is Sr. Director at Rightpoint and a Sitecore MVP.

 

6 thoughts on “Options For Querying Items from Sitecore”

  1. Solid article, like the approach you’re taking. I also like the chart at the end, would suggest moving it to the top as it provides a hint toward which solution a reader should focus on depending on individual needs

  2. Great article Mark, I agree with Matt the table at the end is really helpful.

    The only thing I’d suggest is that people try to avoid querying as much as they reasonably can to begin with, if possible always have the ID you need on hand. One example would be items that are unlikely to change such as the site root, store those in a static configuration class. Another example is item relations, (eg article items pointing to author items) instead of using a query to find all articles that point to a given author use the Link database.

  3. Steve, I agree. Many things that are fixed within the tree (like globally re-usable elements) should be looked up in a static helper class. I do that myself, but contrary to your recommendation, I actually do it with a query! Whoops. I’m now considering dropping in the GUID instead since these items don’t change, only the data stored in their fields. The only key thing is keeping exact syncs between dev and prod DBs since the GUIDs will change if they’re not exactly the same.

    I also agree about using the Link DB to find related items. Alex Shyba’s new tool uses the Link DB in one of its query types (using a query parameter called RelatedIds) to find related items that are referenced in the field(s) of other items. This makes it very easy to find any relations without querying everything and then doing intensive loops through all the items to filter them. I recently used this on a project with many articles and author assignments, exactly like the example you gave.

  4. These are great options for getting a few items. It sounds as though if many items are required, storing them in SQL is the best option. Also, most of our content is stored in many folders. Children work for a single level, but to traverse multiple levels it sounds as though a descendant query is required (//*).
    I’m concluding that storing all content in SQL is the best option.
    Thanks for the blog.
