Repository and BinaryManager
Each content repository has to be associated with a BinaryManager implementation. The BinaryManager is a low-level interface that only deals with binary streams:
Binary getBinary(InputStream in) throws IOException;
Binary getBinary(String digest);
As you can see, the methods do not take any document-related parameters. This means the binary storage is independent of the documents:
- Moving a document does not impact the binary stream;
- Updating a document does not impact the binary stream.
In addition, the streams are stored using their digest. Thanks to that:
- The BlobStore automatically manages de-duplication;
- The BlobStore can be safely snapshotted (files are never moved or updated, and they are only removed via a GarbageCollection process).
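The digest-based storage principle described above can be sketched as follows. This is a minimal illustration, not the Nuxeo implementation: the class name DigestStoreSketch, the flat directory layout and the choice of SHA-256 are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical content-addressed store: one file per digest, written once. */
class DigestStoreSketch {

    private final Path root;

    DigestStoreSketch(Path root) {
        this.root = root;
    }

    /** Copies the stream into the store and returns the digest it is filed under. */
    String store(InputStream in) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256"); // algorithm choice is an assumption
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        // The digest is only known once the stream is fully read, so write to a temp file first.
        Path tmp = Files.createTempFile(root, "upload-", ".tmp");
        try (DigestInputStream din = new DigestInputStream(in, md)) {
            Files.copy(din, tmp, StandardCopyOption.REPLACE_EXISTING);
        }
        String digest = toHex(md.digest());
        Path target = root.resolve(digest);
        if (Files.exists(target)) {
            Files.delete(tmp);        // identical content is already stored: de-duplication
        } else {
            Files.move(tmp, target);  // stored once, never moved or updated afterwards
        }
        return digest;
    }

    /** Opens a stored binary; no document information is needed, only the digest. */
    InputStream open(String digest) throws IOException {
        return Files.newInputStream(root.resolve(digest));
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

Storing the same content twice returns the same digest and leaves a single file on disk, which is why documents can be moved or updated without touching the stored stream.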
From Simple FS to S3 Binary Manager
The default BinaryManager implementation is based on a simple filesystem: considering the storage principles, it is safe to use this implementation even on an NFS-like filesystem (since there are no conflicts).
You can also use the S3 Binary Manager to store the binaries in AWS S3.
A local temporary storage is used to avoid delays when the same stream is read several times (e.g. multiple conversions) inside the Nuxeo Server.
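The role of the temporary storage can be illustrated with a small local cache placed in front of a slow remote store. This is only a sketch under assumptions: LocalBinaryCacheSketch and its remoteFetcher function are hypothetical names standing in for whatever actually downloads the binary (for example an S3 client), and eviction of old files is left out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.function.Function;

/** Hypothetical local cache keyed by digest, in front of a remote binary store. */
class LocalBinaryCacheSketch {

    private final Path cacheDir;
    // Assumption: given a digest, returns a stream read from the remote store.
    private final Function<String, InputStream> remoteFetcher;

    LocalBinaryCacheSketch(Path cacheDir, Function<String, InputStream> remoteFetcher) {
        this.cacheDir = cacheDir;
        this.remoteFetcher = remoteFetcher;
    }

    /** Returns a stream on the local copy, downloading it only on the first access. */
    InputStream open(String digest) throws IOException {
        // The digest is assumed to be a plain hex string, safe to use as a file name.
        Path local = cacheDir.resolve(digest);
        if (!Files.exists(local)) {
            Path tmp = Files.createTempFile(cacheDir, "fetch-", ".tmp");
            try (InputStream remote = remoteFetcher.apply(digest)) {
                Files.copy(remote, tmp, StandardCopyOption.REPLACE_EXISTING);
            }
            // Publish the file under its digest only once it is complete.
            Files.move(tmp, local, StandardCopyOption.REPLACE_EXISTING);
        }
        return Files.newInputStream(local);
    }
}
```

Because a binary never changes once stored under its digest, the cached copy never needs to be invalidated; it only needs to be evicted when space runs out.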
Encryption
A common question regarding the BinaryManager is the support for encryption. See Implementing Encryption for more on the configuration options.
AES Encryption
Since Nuxeo 6.0, it's possible to use a BinaryManager that encrypts files using AES. Two modes are possible:
- a fixed AES key retrieved from a Java KeyStore,
- an AES key derived from a human-readable password using the industry-standard PBKDF2 mechanism.
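The two ways of obtaining the AES key listed above can be sketched with the standard Java security APIs. This is an illustration under assumptions, not the Nuxeo code: the class name AesKeyModesSketch, the iteration count, the 128-bit key size, the PBKDF2WithHmacSHA256 variant and the AES/CBC/PKCS5Padding transformation are choices made for the example only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyStore;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper showing both key modes; all parameters are illustrative. */
class AesKeyModesSketch {

    static final int ITERATIONS = 10_000; // assumption, tune for your environment
    static final int KEY_BITS = 128;      // assumption

    /** Mode 1: a fixed AES key retrieved from a Java KeyStore. */
    static SecretKey keyFromKeyStore(InputStream keyStoreData, String type, char[] storePassword,
            String alias, char[] keyPassword) throws GeneralSecurityException, IOException {
        KeyStore ks = KeyStore.getInstance(type);
        ks.load(keyStoreData, storePassword);
        Key key = ks.getKey(alias, keyPassword);
        if (!(key instanceof SecretKey)) {
            throw new GeneralSecurityException("No secret key under alias " + alias);
        }
        return (SecretKey) key;
    }

    /** Mode 2: an AES key derived from a password with PBKDF2. */
    static SecretKey keyFromPassword(char[] password, byte[] salt) throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
            byte[] raw = factory.generateSecret(spec).getEncoded();
            return new SecretKeySpec(raw, "AES");
        } finally {
            spec.clearPassword(); // do not keep the password in memory longer than needed
        }
    }

    /** Encrypts with the given key; the IV must be 16 bytes and unique per file. */
    static byte[] encrypt(SecretKey key, byte[] iv, byte[] clear) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        return cipher.doFinal(clear);
    }

    /** Decrypts data produced by encrypt() with the same key and IV. */
    static byte[] decrypt(SecretKey key, byte[] iv, byte[] encrypted) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        return cipher.doFinal(encrypted);
    }
}
```

In both modes the result is an ordinary SecretKey, so the encryption code itself does not need to know where the key came from. With password derivation, the salt has to be stored alongside the encrypted data so that the same key can be derived again.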
While the files are in use by the application, a temporary cleartext copy is created. It is removed as soon as possible.
Built-in S3 Encryption
If we take the example of the S3 BinaryManager, the AWS S3 client library supports both client-side and server-side encryption:
- With server-side encryption, the encryption is completely transparent.
- In client-side encryption mode, the S3 client manages the encrypt/decrypt process. The local temporary file is in clear.
Custom Encryption
You can contribute a custom implementation of the BinaryManager: since the interface is very simple, the implementation can be simple too.
The first possible approach is to handle custom encryption/decryption on top of the AWS S3 client library. In that case, the local temporary file is in clear.
The second possible approach is to handle the encrypt/decrypt process on the fly.
This means that the temporary file is encrypted, but as a trade-off:
- Decryption has to run on the fly each time the stream is read.
- Determining the stream size requires more work.
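Both trade-offs of the on-the-fly approach can be seen in a short sketch built on the JDK's CipherInputStream and CipherOutputStream. It is an illustration under assumptions, not a Nuxeo class: EncryptedTempFileSketch is a hypothetical name, and the AES/CBC/PKCS5Padding transformation and the way the key and IV are supplied are choices made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

/** Hypothetical temp file that is only ever stored encrypted on disk. */
class EncryptedTempFileSketch {

    private static final String TRANSFORMATION = "AES/CBC/PKCS5Padding"; // assumption

    private final Path file;
    private final SecretKey key;
    private final byte[] iv; // 16 bytes, assumed unique per file

    EncryptedTempFileSketch(Path file, SecretKey key, byte[] iv) {
        this.file = file;
        this.key = key;
        this.iv = iv.clone();
    }

    /** Encrypts the clear stream while writing it to the temp file. */
    void write(InputStream clear) throws IOException, GeneralSecurityException {
        Cipher cipher = cipher(Cipher.ENCRYPT_MODE);
        try (OutputStream out = new CipherOutputStream(Files.newOutputStream(file), cipher)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = clear.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
    }

    /** Every read pays the decryption cost again: nothing clear is kept on disk. */
    InputStream openClear() throws IOException, GeneralSecurityException {
        Cipher cipher = cipher(Cipher.DECRYPT_MODE);
        return new CipherInputStream(Files.newInputStream(file), cipher);
    }

    /** The file size includes padding, so the clear size is found by decrypting and counting. */
    long clearSize() throws IOException, GeneralSecurityException {
        long total = 0;
        byte[] buf = new byte[8192];
        try (InputStream in = openClear()) {
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
        }
        return total;
    }

    private Cipher cipher(int mode) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(mode, key, new IvParameterSpec(iv));
        return cipher;
    }
}
```

With this transformation, a 5-byte stream occupies 16 bytes on disk, so the file length alone cannot be used as the stream size. An implementation could avoid the extra decryption pass by recording the clear size separately when the file is written.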