Nuxeo Server

File Storage Configuration

Updated: March 18, 2024

The files attached to a document in Nuxeo are usually stored separately from the main document database. The way in which they are stored is configurable using the concepts of a Binary Manager, Blob Provider and Blob Dispatcher.

This page gives operational information on targeted configurations. For a full understanding of how Nuxeo Platform stores binaries and what options are available, please read the dedicated documentation page in the main documentation section.

Configuring the Default BlobProvider

The default blob provider for Nuxeo Platform stores files on the local filesystem at a configurable location and with filenames based on their hash (digest).

Standard DefaultBinaryManager Configuration

<extension target="org.nuxeo.ecm.core.blob.BlobManager" point="configuration">
  <blobprovider name="default">
    <class>org.nuxeo.ecm.core.blob.binary.DefaultBinaryManager</class>
    <property name="path">binaries</property>
  </blobprovider>
</extension>

The path property is used to specify the filesystem path at which the binaries will be stored. A relative path will be resolved under $NUXEO_HOME/nxserver/data, but an absolute path can be used as well if needed.
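As a sketch of the absolute-path case, the contribution below points the default blob provider at a dedicated directory. The location /var/lib/nuxeo/binaries is a hypothetical example, not a Nuxeo default; substitute a directory that exists on your server and is writable by the Nuxeo process.

```xml
<extension target="org.nuxeo.ecm.core.blob.BlobManager" point="configuration">
  <blobprovider name="default">
    <class>org.nuxeo.ecm.core.blob.binary.DefaultBinaryManager</class>
    <!-- Hypothetical absolute path: used as-is, not resolved under $NUXEO_HOME/nxserver/data -->
    <property name="path">/var/lib/nuxeo/binaries</property>
  </blobprovider>
</extension>
```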

The binaries directory (or whichever path you configured) has the following layout:

  • config.xml: a file containing the configuration used.
  • data/: the hierarchy with the actual binaries, dispatched in subdirectories.
  • tmp/: a temporary storage location used during creation (this must be on the same filesystem as data/).

The config.xml file looks like this:

<?xml version="1.0"?>
<binary-store>
  <digest>MD5</digest>
  <depth>2</depth>
</binary-store>

Nuxeo generates it automatically when initializing an empty binary storage, but you can also create it manually to change the configuration:

  • digest is a Java MessageDigest name, for example MD5 or SHA-256.
  • depth is the number of subdirectory levels in which the files are nested, to avoid having too many files in a single directory.
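For illustration, a manually created config.xml that switches to SHA-256 and adds one more level of nesting might look like the sketch below. The values are examples only, and the sketch assumes it is put in place for a binary store that does not yet contain any data.

```xml
<?xml version="1.0"?>
<binary-store>
  <!-- Any Java MessageDigest name; SHA-256 is one of the examples named above -->
  <digest>SHA-256</digest>
  <!-- Illustrative value: three levels of subdirectories under data/ -->
  <depth>3</depth>
</binary-store>
```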

Registering Another BlobProvider

To register a new blob provider, contribute a blobprovider element to the configuration extension point of org.nuxeo.ecm.core.blob.BlobManager, specifying the class of your binary manager:

<extension target="org.nuxeo.ecm.core.blob.BlobManager" point="configuration">
  <blobprovider name="default">
    <class>org.nuxeo.ecm.core.blob.binary.DefaultBinaryManager</class>
    <property name="path">binaries</property>
  </blobprovider>
</extension>

Existing implementations are listed on the main binary store documentation page, with links to their specific configuration instructions.

Usually, if you don't use the advanced Blob Dispatcher capabilities, you will need one binary manager per Nuxeo repository. By default Nuxeo uses a binary manager with the same name as each repository, for instance the "default" repository will use the "default" binary manager. For a standard Nuxeo instance with a single repository, this is all taken care of for you by the default template.
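As a sketch of this naming convention, an instance with a second repository would declare a blob provider with the same name as that repository. The repository name archive and the path binaries-archive are hypothetical and only illustrate the pattern.

```xml
<extension target="org.nuxeo.ecm.core.blob.BlobManager" point="configuration">
  <!-- Hypothetical: serves a repository that is also named "archive" -->
  <blobprovider name="archive">
    <class>org.nuxeo.ecm.core.blob.binary.DefaultBinaryManager</class>
    <!-- Relative path, kept distinct from the "binaries" path of the default provider -->
    <property name="path">binaries-archive</property>
  </blobprovider>
</extension>
```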

Blob Dispatcher

Without specific configuration, the DefaultBlobDispatcher stores each document's blobs in the blob provider that has the same name as the document's repository.

Advanced dispatching configuration is possible using properties. Each property name is a list of comma-separated clauses, and each clause consists of a property, an operator and a value. The property value is the name of the blob provider to use when the clauses match. The property in a clause can be a document property XPath, or ecm:repositoryName, ecm:path, or, to match the current blob being dispatched, blob:name, blob:mime-type, blob:encoding, blob:digest, blob:length or blob:xpath. Comma-separated clauses are ANDed together. The special property name default defines the default provider, and must be present.

Available operators between property and value are =, !=, <, >, ~ and ^. The operators < and > work with integer values. The operator ~ does glob matching using ? to match a single arbitrary character, and * to match any number of characters (including none). The operator ^ does full regexp matching.
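The sketch below combines several of these operators in one dispatcher contribution. The provider names large, small, scratch and legal, the size thresholds and the path pattern are all hypothetical, and it assumes blob:length is expressed in bytes. Because the clauses are written inside an XML attribute, the < operator has to be escaped as &lt; (escaping > as &gt; is optional but keeps the rules symmetrical).

```xml
<extension target="org.nuxeo.ecm.core.blob.DocumentBlobManager" point="configuration">
  <blobdispatcher>
    <class>org.nuxeo.ecm.core.blob.DefaultBlobDispatcher</class>
    <!-- Two clauses ANDed together: PDF blobs longer than 10000000 -->
    <property name="blob:mime-type=application/pdf,blob:length&gt;10000000">large</property>
    <!-- Integer comparison with <, escaped for XML -->
    <property name="blob:length&lt;1024">small</property>
    <!-- Glob matching: any blob whose name ends in .tmp -->
    <property name="blob:name~*.tmp">scratch</property>
    <!-- Regular expression matching on the document path -->
    <property name="ecm:path^/default-domain/workspaces/legal/.*">legal</property>
    <!-- Mandatory fallback provider -->
    <property name="default">default</property>
  </blobdispatcher>
</extension>
```

Each provider named in such rules must also be declared as a blobprovider contribution.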

For example, videos could be stored in one location, attachments in another, documents from a secret source in an encrypted area, and everything else in a default location. To do this, you would specify the following:

Example Blob Dispatcher Configuration

<extension target="org.nuxeo.ecm.core.blob.DocumentBlobManager" point="configuration">
  <blobdispatcher>
    <class>org.nuxeo.ecm.core.blob.DefaultBlobDispatcher</class>
    <property name="dc:format=video">videos</property>
    <property name="blob:mime-type=video/mp4">videos</property>
    <property name="blob:xpath~files/*/file">attachments</property>
    <property name="dc:source=secret">encrypted</property>
    <property name="default">default</property>
  </blobdispatcher>
</extension>

This assumes that you have four blob providers configured: the default one and three additional ones named videos, attachments and encrypted. For example, you could have:

Defining Additional Binary Managers

<extension target="org.nuxeo.ecm.core.blob.BlobManager" point="configuration">
  <blobprovider name="videos">
    <class>org.nuxeo.ecm.core.blob.LocalBlobProvider</class>
    <property name="path">binaries-videos</property>
  </blobprovider>
  <blobprovider name="attachments">
    <class>org.nuxeo.ecm.core.blob.LocalBlobProvider</class>
    <property name="path">binaries-attachments</property>
  </blobprovider>
  <blobprovider name="encrypted">
    <class>org.nuxeo.ecm.core.blob.AESBlobProvider</class>
    <property name="key">password=secret</property>
  </blobprovider>
</extension>

Separating Binaries
It is CRITICAL to keep the binaries of each provider separate. Otherwise, the providers end up in a shared storage configuration, which prevents the Orphaned Blobs GC from running efficiently.

Always define a different path for each local blob provider. When using Amazon S3 Online Storage (or any other cloud provider), always define a different bucket_prefix (or container prefix) for each provider that uses the same bucket (or container).

If storage is shared, a WARN message like the following is displayed at server startup:

Shared storages detected: [path] this must be avoided, review your blob providers configuration.

The DefaultBlobDispatcher class can be replaced by your own implementation.