Search Indexing Logic

Updated: April 14, 2025

Hyland University
Watch the related courses on Hyland University:
Configuring Searches in Nuxeo Studio Modeler & Designer.

Indexing

Indexing commands are stacked whenever a session creates, updates, or deletes documents. When the transaction is committed, these commands are emitted as indexing domain events into the source/indexing stream.
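The stack-then-emit behavior can be pictured with a minimal sketch. Every name in it (IndexingCommandStack, Command, stack, commit) is hypothetical and chosen for illustration; it is not the Nuxeo API, and a plain list stands in for the source/indexing stream.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: these class and method names are hypothetical, not the Nuxeo API.
class IndexingCommandStack {
    enum Type { INSERT, UPDATE, DELETE }

    static final class Command {
        final Type type;
        final String docId;

        Command(Type type, String docId) {
            this.type = type;
            this.docId = docId;
        }
    }

    // Commands stacked during the current transaction, not yet visible to any processor.
    private final List<Command> pending = new ArrayList<>();
    // Stand-in for the source/indexing stream (assumption: a simple in-memory list).
    private final List<Command> stream;

    IndexingCommandStack(List<Command> stream) {
        this.stream = stream;
    }

    // Called whenever the session creates, updates, or deletes a document.
    void stack(Type type, String docId) {
        pending.add(new Command(type, docId));
    }

    // Called at transaction commit: emits every stacked command in order
    // and returns how many were emitted.
    int commit() {
        int emitted = pending.size();
        stream.addAll(pending);
        pending.clear();
        return emitted;
    }
}
```

In this sketch nothing reaches the stream until commit() is called, which mirrors the point that indexing events only appear once the transaction is committed.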

There are two indexing processors:

  • A synchronous processor: When a UI request is involved, some indexing commands are marked as synchronous. After the transaction commits, the thread waits for these specific commands to be processed by the synchronous processor before returning. The next UI request can therefore search the updated documents, which gives the appearance of real-time indexing. The synchronous processor reads the source/indexing stream and processes simple events marked as synchronous. It also refreshes the index to ensure that updates are searchable.
  • An asynchronous processor: It reads the same source/indexing stream and processes all commands not handled by the synchronous processor. This includes heavy operations such as indexing all children (recursive commands) when moving a folder or changing an ACL.
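The split between the two processors can be sketched as a routing rule over a shared stream. The names below (IndexingRouter, Command, syncDocIds, asyncDocIds) are hypothetical, and the rule itself is an assumption: a "simple" command is taken here to mean one that is not recursive.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, not the Nuxeo API.
class IndexingRouter {
    static final class Command {
        final String docId;
        final boolean sync;       // marked as synchronous by a UI request
        final boolean recursive;  // e.g. reindex all children of a moved folder

        Command(String docId, boolean sync, boolean recursive) {
            this.docId = docId;
            this.sync = sync;
            this.recursive = recursive;
        }
    }

    // Assumption: the synchronous processor only takes commands that are
    // marked synchronous and are not recursive.
    static boolean isForSyncProcessor(Command c) {
        return c.sync && !c.recursive;
    }

    // Document ids the synchronous processor would handle, in stream order.
    static List<String> syncDocIds(List<Command> stream) {
        List<String> ids = new ArrayList<>();
        for (Command c : stream) {
            if (isForSyncProcessor(c)) {
                ids.add(c.docId);
            }
        }
        return ids;
    }

    // Everything else is left to the asynchronous processor.
    static List<String> asyncDocIds(List<Command> stream) {
        List<String> ids = new ArrayList<>();
        for (Command c : stream) {
            if (!isForSyncProcessor(c)) {
                ids.add(c.docId);
            }
        }
        return ids;
    }
}
```

Both methods read the same list, just as both processors read the same stream, and each command is handled by exactly one of them.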

When indexing a document, the Nuxeo Platform sends a JSON representation to be indexed. A creation or update command submits the complete document. For OpenSearch/Elasticsearch engines, the JSON document can be viewed in the _source field. It is possible to override the default JSON writer (DefaultIndexingJsonWriter).

Note that this indexing logic is new in LTS 2025 and no longer relies on the WorkManager.

Searching and Limitations

NXQL queries are translated by the SearchClient. Some implementations may have limitations or behave differently; these are documented in the NXQL documentation and below.

OpenSearch 1 Search Client

This search client provides access to OpenSearch 1.x and to Elasticsearch 7.x and 8.x search engines.

When the query does not specify an ordering, the results are sorted by descending relevance, as described in the Elasticsearch documentation. There are multiple ways to tune relevance.

Operators and Mapping

Some operators need an explicit mapping to work properly. This is the case for the FULLTEXT, LIKE, and ILIKE operators (STARTSWITH on ecm:path has a special mapping set up by default). See the page Configuring the OpenSearch Mapping for more information.

Security and ACLs

The security clause is automatically added to match the principal and its groups. Each document contains the list of the users or groups that have permission to browse the document.

Only the simplified ACL is supported with Elasticsearch (this is the default security mode since 6.0). Simplified ACL means that only a DENY on Everyone (which blocks all rights) is handled; a DENY on individual principals is not.
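The simplified ACL model can be illustrated with a short sketch. All names (SimplifiedAcl, Ace, browseList, canBrowse) are hypothetical and not the Nuxeo API. Two assumptions are made: entries are evaluated in order until a DENY on Everyone is reached, and a DENY on any other principal is rejected as unsupported.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: hypothetical names, not the Nuxeo API.
class SimplifiedAcl {
    static final String EVERYONE = "Everyone";

    static final class Ace {
        final String principal;  // user or group name
        final boolean granted;   // true = GRANT, false = DENY

        Ace(String principal, boolean granted) {
            this.principal = principal;
            this.granted = granted;
        }
    }

    // Builds the list of users and groups allowed to browse a document.
    // Assumption: entries are read in evaluation order and a DENY on
    // Everyone blocks everything that follows it.
    static Set<String> browseList(List<Ace> aces) {
        Set<String> allowed = new LinkedHashSet<>();
        for (Ace ace : aces) {
            if (ace.granted) {
                allowed.add(ace.principal);
            } else if (EVERYONE.equals(ace.principal)) {
                break; // block all remaining rights
            } else {
                // Assumption: this sketch rejects the unsupported case outright.
                throw new IllegalArgumentException(
                        "DENY on a principal is not supported: " + ace.principal);
            }
        }
        return allowed;
    }

    // The security clause: match the principal or any of its groups.
    static boolean canBrowse(Set<String> browseList, String user, List<String> groups) {
        if (browseList.contains(user)) {
            return true;
        }
        for (String group : groups) {
            if (browseList.contains(group)) {
                return true;
            }
        }
        return false;
    }
}
```

Because each document carries a flat list of allowed names, the security check reduces to a simple membership test against the principal and its groups, which is what makes it cheap to apply as a search filter.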