Search Setup

This page provides several configuration use cases for Elasticsearch and OpenSearch.

Setting up an OpenSearch 1.x, Elasticsearch 7.x or 8.x Cluster

To support OpenSearch 1.x, Elasticsearch 7.x or 8.x clusters, you need to install the nuxeo-search-client-opensearch1 package.

OpenSearch 1 is a fork of Elasticsearch 7. Except for some advanced features (not used by Nuxeo), they are fully compatible.

The nuxeo-search-client-opensearch1 package defines index settings and mappings, and uses the REST API of Elasticsearch 7 (equivalent to OpenSearch 1). It relies on the OpenSearch 1.x client library to access the search cluster.

Note that for historical reasons you may find some "Elasticsearch" occurrences in configuration properties. Because the two are compatible for these versions, Elasticsearch and OpenSearch are used interchangeably in this documentation.

In addition to OpenSearch 1 and Elasticsearch 7, Nuxeo also works with an Elasticsearch 8 cluster, because Elasticsearch 8 is backward compatible and able to honor the Elasticsearch 7 API.

Please refer to the Compatibility Matrix page for more information on the exact supported versions.

Embedded Mode

Unlike previous versions, there is no default embedded mode in Nuxeo LTS 2025. If you want to set up an OpenSearch server that runs in the same JVM as the Nuxeo Platform, you have to explicitly install the nuxeo-opensearch1-embed package.

This embedded mode is for testing purposes only and should not be used in production: neither OpenSearch nor Nuxeo can support an embedded installation.

For production you need to set up a search cluster.

Installing an Elasticsearch Cluster

Refer to the Elasticsearch documentation to install and secure your cluster. Basically:

Don't run Elasticsearch open to the public.
Don't run Elasticsearch as root.
Secure the connection between Nuxeo and Elasticsearch:
- Elasticsearch 7 requires the X-Pack extension to enable secured communication between Nuxeo and Elasticsearch. Please follow this guide to Securing Elasticsearch.
- Elasticsearch 8 security is enabled by default. Follow this guide for further security configuration.
Follow the Elasticsearch REST Security APIs documentation for configuring a user and role.

An example of how to create a role:

curl -XPOST -u elastic 'localhost:9200/_security/role/nuxeo_role' -H "Content-Type: application/json" -d '{
  "cluster" : [
    "all"
  ],
  "indices" : [
    {
      "names" : [ "nuxeo*" ],
      "privileges" : [ "all" ]
    }
  ]
}'

An example of how to create a user for that role:

curl -XPOST -u elastic 'localhost:9200/_security/user/nuxeo_user' -H "Content-Type: application/json" -d '{
  "password" : "nuxeo_secret_password",
  "full_name" : "Nuxeo User",
  "roles" : [ "nuxeo_role" ]
}'

Recommended Tuning

If you have a large number of documents or if you use Nuxeo in cluster mode, you may reach the limits of the default configuration. Here is some recommended tuning:

Consider disabling OS swapping or using another Elasticsearch option to prevent the heap from being swapped.

In the /etc/default/elasticsearch file you can increase the JVM heap to half of the available OS memory:

# For a dedicated node with 12g of RAM
ES_JAVA_OPTS="-Xms6g -Xmx6g"
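
As a rough illustration of the "half of the available memory" rule, the following sketch computes the heap flags from a RAM size given in MiB. The helper name heap_opts_for_ram_mb is hypothetical and not part of Elasticsearch. The 26 GiB cap is an assumption that reflects the Elasticsearch advice to keep the heap below the compressed object pointers threshold; adjust it to your system.

```shell
# Sketch only: heap_opts_for_ram_mb is a hypothetical helper.
# Prints JVM heap flags equal to half of the given RAM (in MiB),
# capped at an assumed 26 GiB to stay under the compressed oops threshold.
heap_opts_for_ram_mb() {
  heap_mb=$(( $1 / 2 ))
  cap_mb=$(( 26 * 1024 ))
  if [ "$heap_mb" -gt "$cap_mb" ]; then
    heap_mb=$cap_mb
  fi
  echo "-Xms${heap_mb}m -Xmx${heap_mb}m"
}

# Dedicated node with 12 GiB of RAM
heap_opts_for_ram_mb 12288   # prints: -Xms6144m -Xmx6144m
```

The printed value can be used as the content of ES_JAVA_OPTS. Setting the minimum and maximum to the same value avoids heap resizing at runtime.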

Installing an OpenSearch Cluster

Refer to the OpenSearch documentation to install OpenSearch. Basically:

Don't run OpenSearch open to the public.
Don't run OpenSearch as root.
Secure the connection between Nuxeo and OpenSearch. The security plugin is enabled by default with demo values, which need to be replaced. See OpenSearch Security Configuration for guidance.
Follow the OpenSearch Access Control API documentation for configuring a user and role.

An example of how to create a role:

curl -XPUT -u admin http://localhost:9200/_plugins/_security/api/roles/nuxeo_role -H "Content-Type: application/json" -d '{
  "cluster_permissions" : [
    "all"
  ],
  "index_permissions" : [
    {
      "index_patterns" : [ "nuxeo*" ],
      "allowed_actions" : [ "all" ]
    }
  ]
}'

An example of how to create a user for that role:

curl -XPUT -u admin http://localhost:9200/_plugins/_security/api/internalusers/nuxeo_user -H "Content-Type: application/json" -d '{
  "password" : "nuxeo_secret_password",
  "description" : "Nuxeo User",
  "backend_roles" : [ "nuxeo_role" ]
}'

Recommended Tuning

If you have a large number of documents or if you use Nuxeo in cluster mode, you may reach the limits of the default configuration. Here are some recommended OpenSearch tuning options:

You can increase the JVM heap to half of the available OS memory:

# For a dedicated node with 12g of RAM
OPENSEARCH_JAVA_OPTS="-Xms6g -Xmx6g"

Configuring Nuxeo to Access the Search Cluster

Nuxeo uses the REST client protocol, so you have to configure the access:

nuxeo.opensearch1.client.server=http://somenode:9200,https://anothernode:443

Where:

nuxeo.opensearch1.client.server is a comma-separated list of URLs.

This property supersedes elasticsearch.addressList.
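
Before starting Nuxeo, it can help to check that every URL in the list is reachable. The sketch below splits a comma-separated value and probes each URL with curl. The function names split_servers and check_servers are hypothetical, and the probe is unauthenticated: on a secured cluster you would need to add credentials (for instance with curl's -u option), otherwise the request is rejected and reported as a failure.

```shell
# Sketch only: split_servers and check_servers are hypothetical helpers.

# Print one URL per line from a comma-separated list.
split_servers() {
  printf '%s\n' "$1" | tr ',' '\n'
}

# Probe each URL; the root endpoint of a node returns its name and version.
check_servers() {
  split_servers "$1" | while read -r url; do
    if curl -s -f -o /dev/null --max-time 5 "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
    fi
  done
}

# Usage (requires network access to the cluster):
# check_servers "http://somenode:9200,https://anothernode:443"
```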

Basic Authentication

If you have chosen to configure Basic Authentication, you can set up Nuxeo using nuxeo.conf with the following properties:

nuxeo.opensearch1.client.username=your_username
nuxeo.opensearch1.client.password=your_password

These properties supersede elasticsearch.restClient.username and elasticsearch.restClient.password.

TLS/SSL Configuration

If you have chosen to configure Elasticsearch TLS/SSL or OpenSearch TLS/SSL then you can set up Nuxeo using nuxeo.conf with the following properties:

nuxeo.opensearch1.client.truststore.path
nuxeo.opensearch1.client.truststore.password
nuxeo.opensearch1.client.truststore.type
nuxeo.opensearch1.client.keystore.path
nuxeo.opensearch1.client.keystore.password
nuxeo.opensearch1.client.keystore.type

These properties supersede all elasticsearch.restClient.* properties.

If you are using TLS/SSL, the URLs in nuxeo.opensearch1.client.server need to be updated to use https.

See the Trust Store and Key Store Configuration page for more information.

Index Names

Nuxeo manages 3 Elasticsearch indexes:

The repository index, used to index document content. This index can be rebuilt from scratch by extracting content from the repository.
The audit logs index, used to store audit entries. This index is a primary storage and cannot be rebuilt.
A sequence index, used to serve unique values that can be used as primary keys. This index is also a primary storage.

To make the connection between the Nuxeo Platform instance and the Search cluster, check the following options in the nuxeo.conf file and edit if you need to change the default value:

nuxeo.search.client.default.opensearch1.index.name=nuxeo
nuxeo.search.client.default.opensearch1.settings.numberOfReplicas=0
nuxeo.audit.backend.default.opensearch1.index.name=nuxeo-audit
nuxeo.audit.backend.default.opensearch1.settings.numberOfReplicas=0
nuxeo.uidsequencer.default.opensearch1.index.name=nuxeo-uidgen

Where:

nuxeo.search.client.default.opensearch1.index.name is the name of the OpenSearch index for the default document repository.
nuxeo.search.client.default.opensearch1.settings.numberOfReplicas is the number of replicas. By default you have 1 shard and 1 replica. If you have a single node in your cluster you should set this property to 0. Visit the Elasticsearch Scalability documentation for more information on shards and replicas.
nuxeo.audit.backend.default.opensearch1.index.name is the name of the OpenSearch index for audit logs.
nuxeo.uidsequencer.default.opensearch1.index.name is the name of the OpenSearch index for the uid sequencer, extensively used for audit logs.

You can find all the available options in the nuxeo.defaults file.

These properties supersede: elasticsearch.indexName, elasticsearch.indexNumberOfReplicas, audit.elasticsearch.indexName, seqgen.elasticsearch.indexName.

Index Aliases and Reindexing without Service Interruption

This feature is planned but not available in the first release of Nuxeo LTS 2025.0.

Translog Tuning

To reduce disk I/O you should consider changing the default translog durability from request to async. This can be done from nuxeo.conf:

nuxeo.search.client.default.opensearch1.settings.indexTranslogDurability=async
nuxeo.audit.backend.default.opensearch1.settings.indexTranslogDurability=async
nuxeo.uidsequencer.default.opensearch1.settings.indexTranslogDurability=async

These properties supersede elasticsearch.index.translog.durability.

If your indexes are already created, you need to change the translog durability manually:

curl -H "Content-Type: application/json" -XPUT "http://localhost:9200/nuxeo-uidgen/_settings" -d '{
  "index.translog.durability" : "async"
}'

curl -H "Content-Type: application/json" -XPUT "http://localhost:9200/nuxeo-audit/_settings" -d '{
  "index.translog.durability" : "async"
}'

curl -H "Content-Type: application/json" -XPUT "http://localhost:9200/nuxeo/_settings" -d '{
  "index.translog.durability" : "async"
}'
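
To confirm that the change was applied, you can read the setting back from each index. The following sketch builds the settings URL and queries the three default index names used on this page. The helper names translog_url and show_translog_durability are hypothetical, and the example assumes an unsecured cluster on localhost. A setting that was never set explicitly may not appear in the response.

```shell
# Sketch only: translog_url and show_translog_durability are hypothetical helpers.

# Build the URL that returns only the translog durability setting of an index.
# $1 = base URL of the cluster, $2 = index name
translog_url() {
  printf '%s/%s/_settings/index.translog.durability?pretty\n' "$1" "$2"
}

# Print the current setting for the three default Nuxeo indexes.
show_translog_durability() {
  for index in nuxeo nuxeo-audit nuxeo-uidgen; do
    curl -s "$(translog_url "$1" "$index")"
  done
}

# Usage (requires access to the cluster):
# show_translog_durability http://localhost:9200
```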

Disabling Elasticsearch

Simply don't install the nuxeo-search-client-opensearch1 package.

Disabling Elasticsearch for Audit Logs

Simply don't install the nuxeo-audit-opensearch1 package.

Rebuilding the Repository Index

If you need to reindex the whole repository, you can use the Bulk Service as described below.

Re-index Repository Using Bulk Service

Use the management API to re-index the repository. The command id is returned:

curl -X POST -u Administrator:<PASSWORD> "<SERVER_URL>/nuxeo/api/v1/management/search/reindex"

{"commandId": "21aeaea1-0ef0-4a89-a92d-fa8f679361de"}

At any time, you can request the status of the re-indexing using the previous command id:

curl -X GET -u Administrator:<PASSWORD> "<SERVER_URL>/nuxeo/api/v1/management/bulk/21aeaea1-0ef0-4a89-a92d-fa8f679361de"
{
  "entity-type": "bulkStatus",
  "commandId": "21aeaea1-0ef0-4a89-a92d-fa8f679361de",
  "state": "RUNNING",
  "processed": 200,
  "error": false,
  "errorCount": 0,
  "total": 42932,
  "action": "index",
  "username": "Administrator",
  "submitted": "2020-11-16T15:26:50.346Z",
  "scrollStart": "2020-11-16T15:26:50.432Z",
  "scrollEnd": "2020-11-16T15:26:50.446Z",
  "processingStart": null,
  "processingEnd": null,
  "completed": null,
  "processingMillis": 0
}
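
Rather than checking the status by hand, you can poll the endpoint until the command finishes. The sketch below extracts the state field with sed (to avoid extra dependencies) and loops until a final state is reached. The function names bulk_state and wait_for_bulk are hypothetical, and treating COMPLETED and ABORTED as the final states is an assumption about the Bulk Service that you should verify against your Nuxeo version.

```shell
# Sketch only: bulk_state and wait_for_bulk are hypothetical helpers.

# Extract the "state" value from a bulkStatus JSON document read on stdin.
bulk_state() {
  sed -n 's/.*"state"[[:space:]]*:[[:space:]]*"\([A-Z_]*\)".*/\1/p' | head -n 1
}

# Poll the status endpoint every 10 seconds until a final state is reached.
# $1 = server URL, $2 = command id, $3 = user:password
wait_for_bulk() {
  while :; do
    state=$(curl -s -u "$3" "$1/nuxeo/api/v1/management/bulk/$2" | bulk_state)
    echo "state: $state"
    case "$state" in
      COMPLETED|ABORTED) return 0 ;;   # assumed final states
      "") echo "no state returned" >&2; return 1 ;;
    esac
    sleep 10
  done
}

# Usage (requires a running Nuxeo server):
# wait_for_bulk "<SERVER_URL>" 21aeaea1-0ef0-4a89-a92d-fa8f679361de "Administrator:<PASSWORD>"
```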

Changing Mappings and Settings of Indexes

Updating Repository Index Configuration

Nuxeo comes with a default mapping that sets the locale for full-text and declares some fields as being date or numeric.

For fields that are not explicitly defined in the mapping, Elasticsearch will try to guess the type the first time it indexes the field. If the field is empty it will be treated as a String field. This is why most of the time you need to explicitly set the mapping for your custom fields that are of type date, numeric or full-text. Also, fields that are used for sorting and that could be empty need to be defined to prevent an unmapped field error.

The default mapping is located in the ${NUXEO_HOME}/templates/opensearch1-search-client/nxserver/config/opensearch1-search-client-config.xml.nxftl.

To override and tune the default mapping, you can simply override the default mapping or settings JSON files instead of overriding the extension point:

Create a custom template like myapp with a nuxeo.defaults file that contains:
```
 myapp.target=.
```
In this custom template, create a file named ${NUXEO_HOME}/templates/myapp/nxserver/config/opensearch1-doc-mapping.json to override the mapping. You can also create a file named ${NUXEO_HOME}/templates/myapp/nxserver/config/opensearch1-doc-settings.json.nxftl to override the settings.
Important: You must add your custom mapping/settings to the existing ones. You cannot put only your custom mapping in the file, because Nuxeo does not merge your mapping with the default one. So, you must duplicate the original file and modify the copy.

Update the nuxeo.conf to use your custom template.

 nuxeo.templates=default,opensearch1-search-client,myapp

Restart and re-index the entire repository (see previous section). Re-indexing is needed to apply the new settings and mapping.

For mapping customization examples, see the page Configuring the Elasticsearch Mapping.

Updating the Audit Logs Index Configuration

Here the index is a primary storage and you cannot rebuild it. So we need a tool that extracts the _source of documents from one index and submits it to a new index that has been set up with the new configuration.

Update the mappings or settings configuration by overriding ${NUXEO_HOME}/templates/opensearch1-audit/nxserver/config/opensearch1-audit-config.xml.nxftl (follow the same procedure as in the section above for the repository index).
Use a new name for nuxeo.audit.backend.default.opensearch1.index.name (like nuxeo-audit2).
Start the Nuxeo Platform.
The new index is created with the new mapping.
Stop the Nuxeo Platform.

Copy the audit log entries to the new index using the _reindex endpoint. Here we copy nuxeo-audit to nuxeo-audit2.

 curl -X POST http://localhost:9200/_reindex -H 'Content-Type: application/json' -d '{
   "source": {
     "index": "nuxeo-audit"
   },
   "dest": {
     "index": "nuxeo-audit2"
   }
 }'
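
After the copy, a simple sanity check is to compare the document counts of the two indexes with the _count endpoint. The sketch below is one possible way to do it. The function names count_from_json and compare_counts are hypothetical, and the example assumes an unsecured cluster. Counts may lag for a moment until the new index has been refreshed.

```shell
# Sketch only: count_from_json and compare_counts are hypothetical helpers.

# Extract the "count" value from a _count response read on stdin.
count_from_json() {
  sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Print both counts and succeed only if they are equal.
# $1 = base URL, $2 = source index, $3 = destination index
compare_counts() {
  src=$(curl -s "$1/$2/_count" | count_from_json)
  dst=$(curl -s "$1/$3/_count" | count_from_json)
  echo "$2: $src, $3: $dst"
  [ -n "$src" ] && [ "$src" = "$dst" ]
}

# Usage (requires access to the cluster):
# compare_counts http://localhost:9200 nuxeo-audit nuxeo-audit2 && echo "counts match"
```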

Configuration for Multi Repositories

You need to define an index for each repository. This is done by adding a searchIndex contribution.

Create a custom template as described in the above section "Changing Mappings and Settings of Indexes".

Add the following contribution:

 <extension target="org.nuxeo.ecm.core.search" point="searchIndex">
   <searchIndex name="enhanced-repo2" searchClient="opensearch" repository="repo2" default="true" />
 </extension>

 <extension target="org.nuxeo.ecm.core.search.client.opensearch1" point="searchClient">
   <searchClient name="opensearch">
     <searchIndex name="enhanced-repo2" technicalName="nuxeo-repo2" />
   </searchClient>
 </extension>

 <extension target="org.nuxeo.runtime.opensearch1.OpenSearchComponent" point="index">
   <index name="nuxeo-repo2">
     <client id="search/default" />
   </index>
 </extension>

Where repo2 is the name of the second repository and nuxeo-repo2 the OpenSearch index name.

Investigating and Reporting Problems

Activate Traces

To understand why a document is not present in search results or not indexed, you can activate a debug trace.

Look at lib/log4j2.xml: you will find a commented-out configuration to trace OpenSearch requests and responses.

Reporting Settings and Mapping

It is also important to report the current settings and mapping of an Elasticsearch index (here called nuxeo):

# URLs are quoted so that the shell does not interpret the "?" character
curl "localhost:9200/nuxeo/_settings?pretty" > /tmp/nuxeo-settings.json
curl "localhost:9200/nuxeo/_mapping?pretty" > /tmp/nuxeo-mapping.json
# misc info and stats on Elasticsearch
curl "localhost:9200" > /tmp/es-info.txt
curl "localhost:9200/_cluster/stats?pretty" >> /tmp/es-info.txt
curl "localhost:9200/_nodes/stats?pretty" >> /tmp/es-info.txt
curl "localhost:9200/_cat/health?v" >> /tmp/es-info.txt
curl "localhost:9200/_cat/nodes?v" >> /tmp/es-info.txt
curl "localhost:9200/_cat/indices?v" >> /tmp/es-info.txt

Testing an Analyzer

To test the full-text analyzer:

curl -s -X GET "localhost:9200/nuxeo/_analyze" -H 'Content-Type: application/json' -d '{
  "analyzer" : "fulltext",
  "text" : "This is a text for testing, file_name/1-foos-BAR.jpg"
}'

To test an analyzer derived from the mapping:

curl -s -X GET "localhost:9200/nuxeo/_analyze" -H 'Content-Type: application/json' -d '{
  "field" : "ecm:path.children",
  "text" : "workspaces/main folder/sub-folder"
}'

Viewing Indexed Terms for Document Field

This can be done using a tool like Luke to analyze the index at the Lucene level. It is also possible to use an aggregation on fields that are not text, or on text fields with the fielddata option:

# view indexed tokens for dc:title.fulltext of document 3d50118c-7472-4e99-9cc9-321deb4fe053
curl -XGET 'localhost:9200/nuxeo/_search?pretty' -H 'Content-Type: application/json' -d'{
 "query" : {"ids" : { "values" : ["3d50118c-7472-4e99-9cc9-321deb4fe053"] }},
 "aggs": {"my_aggs": {"terms": {"field": "dc:title.fulltext", "order" : { "_count" : "desc" }, "size": 1000}}}}'

You may need to change the size parameter to get more or fewer indexed terms.

Explain and Profile Elasticsearch Queries

When trace level logs are active, the Elasticsearch curl commands appear in the server log file. More details on what happens during query execution can be obtained using either explain or profile. These two approaches help to understand the mapping and the field scoring, and can also give hints about unmapped fields, for example.
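
As an illustration, both options are enabled by adding a boolean flag to the search request body. The sketch below wraps a query clause with the chosen flag and sends it to the nuxeo index. The function names search_body and run_diagnostic are hypothetical, the example assumes an unsecured cluster on localhost, and the dc:title.fulltext field used in the usage line is an assumption about your mapping.

```shell
# Sketch only: search_body and run_diagnostic are hypothetical helpers.

# Print a search body that enables a diagnostic flag for the given query.
# $1 = "explain" or "profile", $2 = JSON query clause
search_body() {
  printf '{"%s": true, "query": %s}\n' "$1" "$2"
}

# Send the query to the nuxeo index with the chosen flag.
# $1 = base URL, $2 = "explain" or "profile", $3 = JSON query clause
run_diagnostic() {
  curl -s -XGET "$1/nuxeo/_search?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(search_body "$2" "$3")"
}

# Usage (requires access to the cluster):
# run_diagnostic http://localhost:9200 explain '{"match": {"dc:title.fulltext": "testing"}}'
# run_diagnostic http://localhost:9200 profile '{"match_all": {}}'
```

With explain, each hit carries a description of how its score was computed. With profile, the response contains timing details for the query components.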