Personalising Search using Sitecore CDP and Personalise (Boxever)

Personalising Search using Sitecore CDP and Personalise (Boxever) - Part 1

- October 26, 2021

I recently had the opportunity to present at a Sitecore User Group session with Sameer Maggon, the founder of SearchStax. We've used SearchStax for many years to provide cloud-hosted, managed Solr Cloud for customers (Sitecore even use SearchStax in their own Managed Cloud offering), and they now have a new SaaS offering called Search Studio that lets you define different models to apply to searches, which alter how relevance scoring is calculated.

In part one of this blog post, I'll cover the main moving parts of SearchStax and Sitecore Personalise used to drive the solution. In part two I go through some example code that uses the two configured products together to render personalised search results.

Relevance scoring

When content is indexed for searching, the contents are tokenised and put into an index. Information is stored in a "Document" (either tokenised or stored directly) with term vectors associated. Term vectors are essentially the individual words from the document (with stop words ignored) and how many times they appear. This is generally captured per field in the document. These term vectors form the basis of relevance scoring when you submit a search term.

The algorithm usually used to score a document's relevance to a particular search is called "Term Frequency, Inverse Document Frequency" (TF-IDF). This generally works by giving a higher relevance score to documents where the search terms occur more often in a particular document (term frequency) than they do on average across all documents (inverse document frequency). The relevance score is calculated as a number (often normalised to between 0 and 1), and relevance normally dictates the order of results, so the more relevant ones are displayed first (and results below some relevance threshold aren't displayed at all). Search engines then normally apply boosting to tune the relevance scoring, for example by raising the importance of the term being found in certain fields in the Document. This applies a multiplier effect to certain relevance score components.
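
To make the term frequency and inverse document frequency idea concrete, here is a minimal Python sketch of a textbook TF-IDF calculation. It is only an illustration: the three documents are made up, there is no stop-word removal or field boosting, and a real search engine such as Solr uses a more refined scoring formula than this.

```python
import math
from collections import Counter

# Tiny made-up corpus, purely for illustration.
DOCS = {
    "doc1": "solr search relevance tuning for solr",
    "doc2": "sitecore personalisation guide",
    "doc3": "search results and personalisation",
}


def tokenise(text):
    """Lower-case and split on whitespace (no stop-word removal here)."""
    return text.lower().split()


def tf_idf(term, doc_id, docs):
    """Score one term against one document with a basic TF-IDF formula."""
    tokens = tokenise(docs[doc_id])
    # Term frequency: share of this document's tokens that match the term.
    tf = Counter(tokens)[term] / len(tokens)
    # Document frequency: how many documents contain the term at all.
    containing = sum(1 for text in docs.values() if term in tokenise(text))
    if containing == 0:
        return 0.0
    # Inverse document frequency: rarer terms get a larger weight.
    idf = math.log(len(docs) / containing)
    return tf * idf


def rank(query, docs):
    """Sum per-term scores; return (ids ordered by relevance, raw scores)."""
    scores = {
        doc_id: sum(tf_idf(term, doc_id, docs) for term in tokenise(query))
        for doc_id in docs
    }
    return sorted(scores, key=scores.get, reverse=True), scores
```

Calling rank("solr", DOCS) puts doc1 first, because "solr" appears twice in that document and nowhere else in the corpus.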

This is where relevance models come in.

Applying a relevance model to a search

More advanced search engines allow you to define different models that adjust how relevance scoring is calculated for returned query results. These can include manually configured boosting for fields or specific documents, promoted results, or even machine learning models fed by client data coming back from search results, and even by external data.

In this case, Search Studio has been set up with a number of relevance models, including one aimed at Developers, which boosts any content that has the content type "Documentation", as we know that is what most developers are searching for.

I can now call an endpoint for my Search Studio instance using a URL to get JSON search results. I can also pass a model in the querystring to ask SearchStax to change how the relevance is calculated.


  https://xxxxxxxxx-us-east-1-aws.searchstax.com/solr/xxxxxxx-SearchStudioCorpSite/emselect?q=Keyword&model=Developer
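
As a rough sketch, the same request could be made from Python along these lines. The helper names build_search_url and search are my own, and the optional headers argument is an assumption: it is there only so that you can pass whatever authentication your own instance requires.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(base_url, keyword, model=None):
    """Build the search URL, adding the model parameter only when one is given."""
    params = {"q": keyword}
    if model:
        params["model"] = model
    return base_url + "?" + urllib.parse.urlencode(params)


def search(base_url, keyword, model=None, headers=None):
    """Call the search endpoint and return the parsed JSON response.

    headers is a hypothetical hook for any auth headers your instance needs.
    """
    request = urllib.request.Request(
        build_search_url(base_url, keyword, model),
        headers=headers or {},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Leaving model as None produces a plain query with no model parameter, which is how a search without any relevance model would be requested.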

Choosing a relevance model based on personalisation

Sitecore's new CDP offering allows you to collect customer information centrally, and build models that utilise this information to help drive personalised experiences. The decisioning logic can utilise business rules encoded into decision tables, along with custom code and machine learning models, to decide what sort of experience should be served up for a particular customer. This can be done for either identified or unidentified users, although the more you know about a user, the more information you can utilise to determine the best experience to serve.

For my demonstration I've created a simple decision model that takes in guest data and feeds it into a decision table. That decision table uses the gender of the user (as this is simple data to use!) to determine which relevance model to use.

Male - no relevance model, to get general results
Female - Developer relevance model, to get results targeted at developers as outlined above
No gender supplied - a specific but different relevance model in this case
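
The decision table itself is configured in the Sitecore Personalise UI rather than in code, but its logic is equivalent to something like the Python sketch below. The name "NoGenderModel" is a placeholder I have invented for the third model, and treating unrecognised values the same as a missing gender is also my assumption.

```python
from typing import Optional

# Placeholder name: stands in for the model used when no gender is supplied.
UNKNOWN_GENDER_MODEL = "NoGenderModel"


def choose_search_model(gender: Optional[str]) -> Optional[str]:
    """Mirror the decision table: map a guest's gender to a relevance model.

    Returns None when no relevance model should be applied.
    """
    if not gender:
        return UNKNOWN_GENDER_MODEL
    normalised = gender.strip().lower()
    if normalised == "male":
        return None  # No relevance model, so general results are returned.
    if normalised == "female":
        return "Developer"  # Results targeted at developers.
    # Assumption: any other value is handled like a missing gender.
    return UNKNOWN_GENDER_MODEL
```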

To expose the decision model through an API, I now need to set up a full stack experience. I attach the decision model that I created earlier, and supply the API code that will expose the output column of my decision model given the user I provide.

The code to pull the decision table output column is as follows. The output column "search_model" corresponds to the ID of the output column in the decision table that I created above.

  <#assign DecTable = getDecisionModelResultNode("Decision Table 1")>
  <#-- Construct the API response using Freemarker -->
  <#-- For your Experience to run your API tab must have, at a minimum, open and closing brackets -->
  {
    "decisionOutput": "search_model",
    "searchModel": "<#if DecTable.outputs[0].search_model??>${DecTable.outputs[0].search_model}<#else></#if>"
  }

The full stack experience can then be called with an HTTP POST to the Boxever callFlows endpoint.


  https://api.boxever.com/v2/callFlows

I pass a payload including the name of the full stack experience I set up above, and an identifier for the user that I want to personalise for.

{
        "clientKey": "xxxxxxxxxxxx",
        "channel": "WEB",
        "language": "EN",
        "currencyCode": "AUD",
        "email": "joe@bloggs.com",
        "friendlyId": "search_segment"
}
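
A minimal Python sketch of making this call and reading the result might look like the following. The function names are hypothetical, and it assumes that the response body is the JSON produced by the Freemarker template above, with the model name in "searchModel" (empty when the decision table returns nothing).

```python
import json
import urllib.request

CALLFLOWS_URL = "https://api.boxever.com/v2/callFlows"


def build_payload(client_key, email, friendly_id):
    """Build the callFlows request body using the fields shown above."""
    return {
        "clientKey": client_key,
        "channel": "WEB",
        "language": "EN",
        "currencyCode": "AUD",
        "email": email,
        "friendlyId": friendly_id,
    }


def extract_search_model(response_body):
    """Return the searchModel value, or None when it is empty or missing."""
    data = json.loads(response_body)
    return data.get("searchModel") or None


def get_search_model(client_key, email, friendly_id="search_segment"):
    """POST the payload to the callFlows endpoint and return the chosen model."""
    body = json.dumps(build_payload(client_key, email, friendly_id)).encode("utf-8")
    request = urllib.request.Request(
        CALLFLOWS_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return extract_search_model(response.read())
```

Returning None for an empty value keeps the "no relevance model" case explicit, so the caller can simply leave the model parameter off the search request.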

Putting it all together

This post outlines the main moving parts required to provide a personalised search experience with Search Studio and Sitecore Personalise. In part two I work through example code that pulls these products together to show end-to-end search capability using what we have set up here.
