Backup and Restore

Backing Up

Nuxeo supports hot backup of your data.

If you have followed the recommendations, then you have configured Nuxeo to use a production-safe database (instead of the default embedded H2) and have set a path for nuxeo.data.dir in your nuxeo.conf. In that case:

First, back up your database (make an SQL dump).
Then back up your data on the filesystem.

Performing the backup in that order (the database first, then the filesystem) will ensure backup consistency.
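The ordering above can be scripted. The following is a minimal sketch, assuming a PostgreSQL database dumped with pg_dump and a data directory archived with tar. The database name, the paths and the function names are hypothetical and need to be adapted to your own setup.

```python
import subprocess
from pathlib import Path


def build_backup_commands(db_name, data_dir, backup_dir):
    """Return the backup commands in the required order: database, then filesystem."""
    data_dir = Path(data_dir)
    backup_dir = Path(backup_dir)
    dump_file = backup_dir / f"{db_name}.dump"
    archive = backup_dir / "nuxeo-data.tar.gz"
    return [
        # Step 1: dump the database (assumption: PostgreSQL, custom-format dump).
        ["pg_dump", "--format=custom", f"--file={dump_file}", db_name],
        # Step 2: archive the data directory, only after the dump has finished.
        ["tar", "-czf", str(archive), "-C", str(data_dir.parent), data_dir.name],
    ]


def run_backup(db_name, data_dir, backup_dir):
    """Run each command in order and stop at the first failure."""
    for command in build_backup_commands(db_name, data_dir, backup_dir):
        subprocess.run(command, check=True)
```

Calling run_backup("nuxeo", "/var/lib/nuxeo/data", "/backup") would run the dump first and the archive second. On a running server the archiving tool may report files that changed while being read; handling that case is left out of this sketch.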

If you didn't configure Nuxeo to use an external database, then the default embedded database (H2) is stored in the data directory: stop the server before backing up.

If you didn't configure the Nuxeo data directory (nuxeo.data.dir in nuxeo.conf), then the default path is $TOMCAT/nxserver/data. If you're not sure, look at the data directory value in the Admin Center.

Restoring

Restore the database and data filesystem you had previously backed up.
Configure Nuxeo to use this database and data directory.
Start Nuxeo.

Backing Up and Restoring the Audit Elasticsearch Index

If Elasticsearch is used as a backend for audit logs, meaning the following properties are set in nuxeo.conf:

elasticsearch.enabled=true
audit.elasticsearch.enabled=true

you need to back up and restore the ${audit.elasticsearch.indexName} Elasticsearch index defined in nuxeo.conf, following the Elasticsearch Snapshot And Restore documentation.

Note that since Nuxeo 9.10, the sequence index ${seqgen.elasticsearch.indexName} can be regenerated quickly at startup, so it is not mandatory to back up this index.

Backing up the audit index is really important: if you decide to use Elasticsearch as a backend for audit logs, it becomes the reference (there is no more SQL backend), so backing up a Nuxeo instance implies backing up the audit Elasticsearch index.
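As an illustration, the snapshot and restore calls for the audit index can be sketched with the Elasticsearch snapshot REST endpoints. This is a minimal sketch that only builds the HTTP requests. The server URL, the repository name, the snapshot name and the index name used in the usage note are placeholders, and the snapshot repository is assumed to be registered already.

```python
import json
import urllib.request


def snapshot_request(es_url, repository, snapshot, index_name):
    """Build the request that snapshots one index into an existing repository."""
    body = {"indices": index_name, "include_global_state": False}
    return urllib.request.Request(
        url=f"{es_url}/_snapshot/{repository}/{snapshot}?wait_for_completion=true",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def restore_request(es_url, repository, snapshot, index_name):
    """Build the request that restores one index from an existing snapshot."""
    body = {"indices": index_name, "include_global_state": False}
    return urllib.request.Request(
        url=f"{es_url}/_snapshot/{repository}/{snapshot}/_restore",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

A request is sent with urllib.request.urlopen, for example urllib.request.urlopen(snapshot_request("http://localhost:9200", "nuxeo_backup", "audit-snapshot-1", "nuxeo-audit")). Pass the actual value of ${audit.elasticsearch.indexName} as the index name. Elasticsearch restores over an existing index only if that index is closed, so check the Snapshot And Restore documentation for your version before restoring.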

Reminder: as stated in Setting up an Elasticsearch Cluster, the embedded Elasticsearch mode is only for testing purposes and should not be used in production.

If you nevertheless decide to use it for development or tests, you will need to make the embedded Elasticsearch server accept HTTP requests on port 9200 to perform the backup / restore operations, by setting elasticsearch.httpEnabled=true in nuxeo.conf.

Make sure you set elasticsearch.httpEnabled back to false once the backup / restore operations are over.

Additional Information

Two properties make it safe to back up the filesystem after the database has been dumped:

When you add a document to the repository, VCS computes the digest of the blob, and this digest is used as the filename of the blob stored on the filesystem. That way, if a user uploads a different document that has the same filename, the blob already stored on the filesystem is not changed: a new blob with a different digest is added to the blobstore.
Blobs are not deleted as soon as the document is removed from the repository.

These two points ensure that no data is modified or deleted after you dump your database. Only creations can happen, so the backup of the filesystem is consistent with the backup of the database.
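The digest-based storage described above can be illustrated with a small content-addressed store. This is a simplified sketch, not the Nuxeo implementation: the flat directory layout, the store_blob function and the SHA-256 digest are assumptions chosen for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def store_blob(blobstore, content):
    """Store content under its digest and return the digest (the blob's filename)."""
    digest = hashlib.sha256(content).hexdigest()
    target = Path(blobstore) / digest
    if not target.exists():
        # Existing blobs are never overwritten, so a backup only ever sees new files.
        target.write_bytes(content)
    return digest


with tempfile.TemporaryDirectory() as blobstore:
    first = store_blob(blobstore, b"report v1")
    # Different content uploaded under the same user-facing filename: new blob.
    second = store_blob(blobstore, b"report v2")
    # Same content uploaded a second time: the existing blob is reused.
    again = store_blob(blobstore, b"report v1")
    stored = sorted(p.name for p in Path(blobstore).iterdir())
```

After the three uploads the store holds two blobs: the first and third calls return the same digest, and the second call creates a separate file without touching the first one.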

Some remarks about VCS:

As VCS uses the digest of the blob, this ensures a document will be stored only once in the blobstore, even if it is uploaded several times.
As VCS doesn't delete blobs once a document is removed from the repository, you should run a clean-up regularly from the Admin Center in the menu System Information / Repository binaries.

Some remarks about Nuxeo Stream:

Nuxeo 9.10 introduced Nuxeo Stream. It makes sense to back up streams so that a restored instance can continue the stream processing.

When the underlying streams are stored using Chronicle Queues files, they are located inside ${nuxeo.data.dir} and are already taken into account by the procedure described above.

When the underlying streams are stored using Kafka, the case is more complex because records written after the backup date need to be discarded. Restoring therefore means replicating the existing topics up to the backup date and reconfiguring Nuxeo Kafka access to use the new topics.