Tuesday, August 31, 2010

Adding custom fields to the index

In this post I want to show how to bring back a feature that was part of the “old” Lucene index implementation but is missing from the new one. This article provides an example of how to customize the Lucene search configuration so that custom fields can be added to the index.

First off, let’s create a configuration that would allow us to add additional fields to the indexed data.

<index id="News" type="Sitecore.Search.Index, Sitecore.Kernel">
  <param desc="name">$(id)</param>
  <param desc="folder">_news</param>
  <Analyzer ref="search/analyzer" />
  <locations hint="list:AddCrawler">
    <examples-news type="LuceneExamples.DatabaseCrawler,LuceneExamples">
      <Database>web</Database>
      <Root>/sitecore/content</Root>
      <IndexAllFields>true</IndexAllFields>
      <include hint="list:IncludeTemplate">
        <news>{788EF1BE-B71E-4D59-9276-50519BD4F641}</news>
        <tag>{4DD970FB-2695-4E50-96F3-A766F7D6CAF1}</tag>
      </include>
      <fields hint="raw:AddCustomField">
        <field luceneName="author" storageType="no" indexType="tokenized">__updated by</field>
        <field luceneName="changed" storageType="yes" indexType="untokenized">__updated</field>
      </fields>
    </examples-news>
  </locations>
</index>


This example contains a new configuration section: the <fields> section, which introduces two fields, “author” and “changed”. These fields will be added to the fields collection of each indexed item. Because of the raw:AddCustomField hint, the AddCustomField method gets called for every <field> configuration entry, which lets the crawler register each custom field that is going to be added to the fields collection.


Description of configuration attributes:


  • luceneName is the field name that appears in the Lucene index.
  • storageType is the storage type of the Lucene field. It can have the following values:
    • no
    • yes
    • compress
  • indexType is the index type of the Lucene field. It can have the following values:
    • no
    • tokenized
    • untokenized
    • nonorms

Refer to the Lucene documentation for Field.Store and Field.Index to find out what each of these options means.


Now all you need to do is to loop through the collection of custom fields in the overridden AddAllFields method and add them to the indexed data.


I created a custom class called CustomField that helps to manage custom field entries. Below is the code for this class, as well as the additional methods for the extended DatabaseCrawler. Since the code for the DatabaseCrawler itself was already published in an earlier blog post, I’m not going to duplicate it here.


Here is the code for the CustomField class.


using System.Xml;
using Sitecore.Data;
using Sitecore.Data.Items;
using Sitecore.Xml;
using Lucene.Net.Documents;

namespace LuceneExamples
{
    // Describes one <field> configuration entry: which Sitecore field to read
    // and how to write it to the Lucene document.
    public class CustomField
    {
        public CustomField()
        {
            FieldID = ID.Null;
            FieldName = "";
            LuceneFieldName = "";
        }

        // Sitecore field ID (set when the entry value is an ID)
        public ID FieldID { get; private set; }

        // Sitecore field name (set when the entry value is not an ID)
        public string FieldName { get; private set; }

        public Field.Store StorageType { get; set; }

        public Field.Index IndexType { get; set; }

        // Name of the field as it appears in the Lucene index
        public string LuceneFieldName { get; private set; }

        // Builds a CustomField from a <field> node; returns null if the entry is invalid.
        public static CustomField ParseConfigNode(XmlNode node)
        {
            CustomField field = new CustomField();
            string fieldName = XmlUtil.GetValue(node);
            if (ID.IsID(fieldName))
            {
                field.FieldID = ID.Parse(fieldName);
            }
            else
            {
                field.FieldName = fieldName;
            }
            field.LuceneFieldName = XmlUtil.GetAttribute("luceneName", node);
            field.StorageType = GetStorageType(node);
            field.IndexType = GetIndexType(node);

            if (!IsValidField(field))
            {
                return null;
            }

            return field;
        }

        // Reads the field value from the item, by ID if available, otherwise by name.
        public string GetFieldValue(Item item)
        {
            if (!ID.IsNullOrEmpty(FieldID))
            {
                // FieldID is already an ID, so there is no need to parse it again
                return item[FieldID];
            }
            if (!string.IsNullOrEmpty(FieldName))
            {
                return item[FieldName];
            }
            return string.Empty;
        }

        // An entry is valid when it has a Lucene name and either a field name or a field ID.
        private static bool IsValidField(CustomField field)
        {
            if ((!string.IsNullOrEmpty(field.FieldName) || !ID.IsNullOrEmpty(field.FieldID)) && !string.IsNullOrEmpty(field.LuceneFieldName))
            {
                return true;
            }
            return false;
        }

        // Maps the indexType attribute to Field.Index; defaults to TOKENIZED.
        private static Field.Index GetIndexType(XmlNode node)
        {
            string indexType = XmlUtil.GetAttribute("indexType", node);
            if (!string.IsNullOrEmpty(indexType))
            {
                switch (indexType.ToLowerInvariant())
                {
                    case "no":
                        return Field.Index.NO;
                    case "tokenized":
                        return Field.Index.TOKENIZED;
                    case "untokenized":
                        return Field.Index.UN_TOKENIZED;
                    case "nonorms":
                        return Field.Index.NO_NORMS;
                }
            }
            return Field.Index.TOKENIZED;
        }

        // Maps the storageType attribute to Field.Store; defaults to NO.
        private static Field.Store GetStorageType(XmlNode node)
        {
            string storage = XmlUtil.GetAttribute("storageType", node);
            if (!string.IsNullOrEmpty(storage))
            {
                switch (storage.ToLowerInvariant())
                {
                    case "no":
                        return Field.Store.NO;
                    case "yes":
                        return Field.Store.YES;
                    case "compress":
                        return Field.Store.COMPRESS;
                }
            }
            return Field.Store.NO;
        }
    }
}
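
One detail that is easy to miss: ParseConfigNode checks the entry value with ID.IsID, so a <field> entry can reference a Sitecore field either by name or by ID. Here is a small sketch of what such a configuration could look like. The “summary” Lucene name and the GUID are placeholders made up for illustration, not real fields.

```xml
<fields hint="raw:AddCustomField">
  <!-- Field referenced by name, as in the configuration above -->
  <field luceneName="author" storageType="no" indexType="tokenized">__updated by</field>
  <!-- Field referenced by ID; this GUID is a placeholder, replace it with a real field ID -->
  <field luceneName="summary" storageType="yes" indexType="tokenized">{11111111-2222-3333-4444-555555555555}</field>
</fields>
```

Referencing a field by ID should keep the index configuration working if the field is ever renamed on the template.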


And here is the code for the additional DatabaseCrawler methods.


// Custom fields registered from configuration through AddCustomField.
// This declaration is not part of the original listing; it is assumed here so that
// the methods below compile (requires using System; and using System.Collections.Generic;).
private readonly List<CustomField> _customFields = new List<CustomField>();

/// <summary>
/// Loops through the collection of custom fields and adds them to fields collection of each indexed item.
/// </summary>
/// <param name="document">Lucene document</param>
/// <param name="item">Sitecore data item</param>
private void AddCustomFields(Document document, Item item)
{
    foreach (CustomField field in _customFields)
    {
        document.Add(CreateField(field.LuceneFieldName, field.GetFieldValue(item), field.StorageType, field.IndexType, Boost));
    }
}

/// <summary>
/// Creates a Lucene field.
/// </summary>
/// <param name="fieldKey">Field name</param>
/// <param name="fieldValue">Field value</param>
/// <param name="storeType">Storage option</param>
/// <param name="indexType">Index type</param>
/// <param name="boost">Boosting parameter</param>
/// <returns>The created Lucene field</returns>
private Fieldable CreateField(string fieldKey, string fieldValue, Field.Store storeType, Field.Index indexType, float boost)
{
    Field field = new Field(fieldKey, fieldValue, storeType, indexType);
    field.SetBoost(boost);
    return field;
}

/// <summary>
/// Parses a configuration entry for a custom field and adds it to a collection of custom fields.
/// </summary>
/// <param name="node">Configuration entry</param>
public void AddCustomField(XmlNode node)
{
    CustomField field = CustomField.ParseConfigNode(node);
    if (field == null)
    {
        throw new InvalidOperationException("Could not parse custom field entry: " + node.OuterXml);
    }
    _customFields.Add(field);
}


The last thing left to do is to call the AddCustomFields method from AddAllFields.


protected override void AddAllFields(Document document, Item item, bool versionSpecific)
{
    ………………………………………
    AddCustomFields(document, item);
}


You can take it even further and add support for a field interpreter for each field configuration entry.


Hope you'll find it useful.

Tuesday, August 3, 2010

Language filtered Multilist field

Recently I happened to help one of our clients create a custom Multilist field that offers an item as a selectable option only if that item has at least one version in the current content language. From a content author’s point of view it makes a lot of sense: why give them an option that is not useful?

It’s been a while since I last created a custom field that has to take the content language selection into account. The main challenge for me was retrieving that content language. I remembered that there is a property that the Content Editor (CE) sets at run time for every field, but could not recall its name. After several minutes of browsing my stash of code samples for all Sitecore versions, I finally found it. Here are the properties that you need to define in your custom field if you plan to use them afterwards:

- ItemLanguage – represents the content language selected in the CE.

- ItemID – contains the ID of the item the field belongs to.

- ItemVersion – contains selected item version.

- ReadOnly – indicates whether the field is read-only. If it is, it will be grayed out and not editable.

- Source – represents the source value from field definition on a data template.

- FieldID – contains the field ID.

In my example I needed only the ItemLanguage property. When I put things together, the code looked like this:

using System.Linq;
using Sitecore.Shell.Applications.ContentEditor;
using Sitecore.Data.Items;

namespace Sitecore.Shell.Applications.ContentEditor.CustomExtensions
{
    public class LanguageFilteredMultilist : Sitecore.Shell.Applications.ContentEditor.MultilistEx
    {
        #region Overrides

        protected override Item[] GetItems(Item current)
        {
            Item[] items = base.GetItems(current);

            // Keep only the items that have at least one version in the current content language
            var filteredItems = items.Where(item =>
            {
                string lang = ItemLanguage;
                if (!string.IsNullOrEmpty(lang))
                {
                    var versions = Sitecore.Data.Managers.ItemManager.GetVersions(item, Sitecore.Data.Managers.LanguageManager.GetLanguage(lang, item.Database));
                    return versions.Count > 0;
                }
                // No content language available: nothing is offered
                return false;
            }
            );

            return filteredItems.ToArray<Item>();
        }

        #endregion Overrides

        // Content Editor sets this property. It has a content language from the Content Editor.
        public string ItemLanguage
        {
            get;
            set;
        }
    }
}


Simple enough, isn’t it? Don’t forget to add your custom field to the /App_Config/FieldTypes.config file if it is supposed to contain references to other items. In my case it should be added, since it’s a multilist. If you forget to do it, the LinkDatabase won’t update references for your field.
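
For reference, the FieldTypes.config entry could look roughly like the sketch below. The name “Language Filtered Multilist” is only an assumption for illustration; it has to match whatever name you gave the field type when you registered it in the core database. Since the field stores its value the same way as a regular Multilist, the sketch reuses the stock MultilistField class from Sitecore.Kernel.

```xml
<!-- /App_Config/FieldTypes.config -->
<!-- The name attribute is a placeholder; use the name of your own field type -->
<fieldType name="Language Filtered Multilist" type="Sitecore.Data.Fields.MultilistField,Sitecore.Kernel" />
```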



That’s all for now.