Elasticsearch Hints Cheat Sheet

Updated: November 13, 2017

This page lists interesting use cases of Elasticsearch Hints.

 

Fuzzy Search on Full Text Index

Configuration

  • Drop any string field in the search layout of your content view
  • Use the following values for the ES hints configuration:
    • Index: all_field
    • Analyzer: fulltext
    • Operator: fuzzy

Test case

  • Create a new document with an attached text file that contains the string "Nuxeo rocks"
  • Search for "Nuxo". The document created previously should appear in the results
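
The test case above works because fuzzy matching tolerates a small number of character edits between the search term and an indexed term. The following Python sketch illustrates that idea with a plain Levenshtein distance. It is not Elasticsearch's implementation; the function names, the lowercasing step and the default of one allowed edit are assumptions made for the illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def fuzzy_match(query: str, indexed_term: str, max_edits: int = 1) -> bool:
    # Assumption: both sides are lowercased before being compared.
    return edit_distance(query.lower(), indexed_term.lower()) <= max_edits


print(fuzzy_match("Nuxo", "nuxeo"))  # True: one missing letter
print(fuzzy_match("Nux", "nuxeo"))   # False: two letters are missing
```

"Nuxo" is one insertion away from "nuxeo", which is why the document is expected in the results.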

Using the Common Operator on the Main Attachment Content

Extract from the Nuxeo University course "What's New in Nuxeo Platform LTS 2015?"

Suppose you want to be able to search using the common operator on your documents' main attachment content. This Elasticsearch operator is interesting for two reasons:

  • The common operator can be seen as an alternative to full-text search. One notable difference is that it lets you search on terms that the full-text analyzer would have removed. For example, to find the “Not Beyond Space Travel Agencies” document, you may want to search for the “Not” keyword.
  • The common operator is smart. It splits query terms into those that are rare in the index and those that are common in it. Rare terms get a boost, while common terms are given less weight. Say you have lots of contracts in your repository and you search for "confidentiality clause". If both query terms were given the same importance, the most relevant results might be drowned out. The common operator recognizes that the term "confidentiality" is rare and boosts it, while lowering the importance of the common term "clause". This helps you get the most relevant results first.
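
The rare/common split described above can be sketched in a few lines of Python. This is only an illustration of the principle, not the code Elasticsearch runs. The document frequencies, the repository size and the 10% cutoff are hypothetical values. Elasticsearch's common terms query exposes a comparable threshold as `cutoff_frequency`.

```python
def split_by_frequency(query_terms, doc_frequency, total_docs, cutoff=0.1):
    """Separate query terms into rare and common ones.

    A term is treated as common when the share of documents that
    contain it reaches the cutoff, and as rare otherwise.
    """
    rare, common = [], []
    for term in query_terms:
        ratio = doc_frequency.get(term, 0) / total_docs
        (common if ratio >= cutoff else rare).append(term)
    return rare, common


# Hypothetical statistics for a repository of 1000 contracts.
doc_frequency = {"confidentiality": 12, "clause": 900}

rare, common = split_by_frequency(
    ["confidentiality", "clause"], doc_frequency, total_docs=1000
)
print(rare)    # ['confidentiality']
print(common)  # ['clause']
```

With these numbers, "confidentiality" appears in 1.2% of the documents and would drive the ranking, while "clause" appears in 90% of them and would only refine it.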

To implement this use case:

  • In the analyzer configuration, add an analyzer that will be used to index the main attachment's content:
"my_attachment_analyzer" : {
  "type" : "custom",
  "filter" : [
    "word_delimiter_filter",
    "lowercase",
    "asciifolding"
  ],
  "tokenizer" : "standard"
}
  • In the properties configuration, update the ecm:binarytext field mapping configuration to the following:
"ecm:binarytext" : {
  "type" : "text",
  "analyzer" : "fulltext",
  "copy_to" : "all_field",
  "fields" : {
    "common" : {
      "type" : "text",
      "analyzer" : "my_attachment_analyzer",
      "include_in_all" : false
    }
  }
}
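
To see why indexing the attachment with an analyzer that only lowercases and folds accents keeps a word like "Not" searchable, here is a rough Python approximation. It is a sketch under assumptions: the regular expression stands in for the standard tokenizer, the word delimiter filter is not reproduced, and the stop word list is a hypothetical sample rather than the list used by the fulltext analyzer.

```python
import re
import unicodedata

# Hypothetical sample of English stop words.
STOP_WORDS = {"a", "an", "and", "not", "of", "the", "to"}


def fold_ascii(token: str) -> str:
    """Strip accents, roughly like the asciifolding filter."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def attachment_analyzer(text: str) -> list:
    """Tokenize on word characters, then lowercase and fold accents."""
    return [fold_ascii(token.lower()) for token in re.findall(r"\w+", text)]


def stop_word_analyzer(text: str) -> list:
    """Same pipeline, followed by stop word removal."""
    return [t for t in attachment_analyzer(text) if t not in STOP_WORDS]


text = "Not Beyond Space Travel Agency"
print(attachment_analyzer(text))  # ['not', 'beyond', 'space', 'travel', 'agency']
print(stop_word_analyzer(text))   # ['beyond', 'space', 'travel', 'agency']
```

Only the first pipeline keeps the token "not", so only an index built that way can answer a search on it.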

You can now configure hints in Nuxeo Studio using the common operator when querying on the ecm:binarytext.common index.

Nuxeo Studio Configuration

  • Drop any string field in the search layout of your content view
  • Use the following values for the ES hints configuration:
    • Index: ecm:binarytext.common
    • Analyzer: my_attachment_analyzer
    • Operator: common
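
For reference, the same three settings can also be written as an inline hint in an NXQL query. The Python helper below assembles such a query string. Treat it as a sketch: the helper name, the `dc:title` field and the quote escaping are assumptions made for the example, and the hint syntax should be checked against the NXQL documentation of your Nuxeo version.

```python
def build_hinted_nxql(field, value, index, analyzer, operator):
    """Return an NXQL query whose WHERE clause carries an Elasticsearch hint."""
    # Assumption: single quotes in the value are escaped with a backslash.
    escaped = value.replace("'", "\\'")
    hint = f"/*+ES: INDEX({index}) ANALYZER({analyzer}) OPERATOR({operator}) */"
    return f"SELECT * FROM Document WHERE {hint} {field} = '{escaped}'"


query = build_hinted_nxql(
    field="dc:title",  # hypothetical field dropped on the search layout
    value="Not",
    index="ecm:binarytext.common",
    analyzer="my_attachment_analyzer",
    operator="common",
)
print(query)
```

The hint tells the search to use the given index, analyzer and operator instead of the default behavior for the field in the predicate.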

Test case

  • Create a new document with an attachment that contains the string "Not Beyond Space Travel Agency"
  • Search for "Not". The document created previously should appear in the results

Please note this is a basic test case. The common operator is best used on very large indexes.


History: Created by Alain Escaffre