Sitecore Advanced Database Crawler Occasionally Provides Null Results

If you’ve ever used the Advanced Database Crawler as your toolkit for sophisticated searching and querying of Sitecore content, you may have noticed that your results sometimes include null or empty entries. This is a common issue that I’ve faced many times, and I’ve tracked it down to the decoupled nature of the search index and publishing to the Sitecore database. This blog post explains the issue and how you can resolve it.

The crux of this issue is that the process of rebuilding the index is not directly synced or timed to the publishing process. Alex Shyba has discussed this before in the context of syncing your HTML cache clearer to the index build. Additionally, if you have an index that uses the master database or a location in the master database, that index is updated incrementally and frequently. Either way, there can be a window in which the index and the database disagree about an item.

If you look at the SkinnyItem.cs class, there’s a GetItem() method that converts the SkinnyItem into an Item. You can see it uses the database to get the item by its ID, language, and version number. It’s possible that when you publish from master to web, you are publishing a new version of an existing item, so the new version exists in the web database while the index has not been updated yet and still references the old version. In that case, the GetItem() call would ask for the previous version number, which may no longer be present in the web database, and the item would be null. One way to fix this is to skip that GetItem() method and use your own code to get the latest version of that item from Sitecore, e.g.

[csharp]
Item item = Sitecore.Context.Database.GetItem(someSkinnyItem.ItemID);
[/csharp]

Instead of

[csharp]
Item item = someSkinnyItem.GetItem();
[/csharp]
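
When you are converting a whole result set rather than a single item, the same idea can be wrapped in a small helper. The following is only a sketch, not part of the Advanced Database Crawler: the class name SearchResultResolver and the method ResolveLatest are hypothetical, and it assumes the item IDs come from the ItemID property used above and that your project references Sitecore.Kernel. It looks up each item without a version number and skips anything that still comes back null, for example an item that is in the index but has been removed from the database.

[csharp]
using System.Collections.Generic;
using Sitecore.Data;
using Sitecore.Data.Items;

// Hypothetical helper for illustration; not shipped with the crawler.
public static class SearchResultResolver
{
    public static List<Item> ResolveLatest(IEnumerable<string> itemIds, Database database)
    {
        var items = new List<Item>();

        foreach (string id in itemIds)
        {
            // No version number is passed, so a stale version stored
            // in the index is not used for the lookup.
            Item item = database.GetItem(id);

            // Skip entries the database can no longer resolve.
            if (item != null)
            {
                items.Add(item);
            }
        }

        return items;
    }
}
[/csharp]

Assuming skinnyItems is your collection of search results and System.Linq is imported, a call might look like SearchResultResolver.ResolveLatest(skinnyItems.Select(s => s.ItemID), Sitecore.Context.Database). The trade-off is one database lookup per result, which is the same cost as calling GetItem() on each SkinnyItem.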

If you have any other tips and tricks when using the Advanced Database Crawler and Lucene.NET with Sitecore, feel free to share in the comments!

 

Mark Ursino

Mark is Sr. Director at Rightpoint and a Sitecore MVP.

 

2 thoughts on “Sitecore Advanced Database Crawler Occasionally Provides Null Results”

  1. One thing that I found out Mark is that in your config file you can set that how often the index should be getting updated. Default, I guess is 30 min. Sometimes it takes time to update the crawler.
