Immediate Garbage Collection
Since LTS 2021-HF35 (see NXP-31594), the Nuxeo Platform deletes orphaned blobs whenever:
- a document is removed
- a document blob property is edited
- a document blob property is dispatched to another blob provider
Preconditions
Only deployments using the MongoDB backend can benefit from this feature. The following conditions must also be met:
- repositories must have the queryBlobKeys capability
- repositories must use LocalBlobProvider or S3BlobProvider
Repository with queryBlobKeys capability
This new GC implementation only works for repositories that have the queryBlobKeys capability.
Since LTS 2021-HF02 and NXP-29516, the blob keys referenced by a document are stored in its ecm:blobKeys field.
If ALL documents of a repository have this field computed, then the repository has the queryBlobKeys capability. In other words, a repository containing documents created by a Nuxeo server with a version prior to LTS 2021.2 / LTS 2021-HF02 does NOT have this capability.
You can query the capability endpoint to check whether a repository has the queryBlobKeys capability.
In a multi-repository configuration, all the repositories must have this capability.
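The "all repositories" rule can be sketched as a small check over the payload returned by the capability endpoint. This is only an illustration: the payload shape used here (a "repository" object keyed by repository name, each entry carrying a boolean queryBlobKeys) is an assumption and should be verified against the response of your own server. The function name and the sample repository names are hypothetical.

```python
from typing import Any, Mapping


def all_repositories_can_query_blob_keys(capabilities: Mapping[str, Any]) -> bool:
    """Return True only if every repository reports queryBlobKeys as true.

    Assumption: the payload has a "repository" object keyed by repository
    name. This shape is not confirmed by this page.
    """
    repositories = capabilities.get("repository", {})
    if not repositories:
        # No repository information: report "not capable" rather than guess.
        return False
    return all(
        repo.get("queryBlobKeys") is True for repo in repositories.values()
    )


# Hypothetical payload with two repositories, one of which lacks the capability.
sample = {
    "repository": {
        "default": {"queryBlobKeys": True},
        "archive": {"queryBlobKeys": False},
    }
}
print(all_repositories_can_query_blob_keys(sample))
```

With the sample above the check fails, because a single repository without the capability is enough to rule out Immediate Garbage Collection.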
Supported Blob Provider implementations
This GC implementation only works with Blob Providers that extend BlobStoreBlobProvider, namely:
- S3BlobProvider (when using amazon-s3-online-storage)
- LocalBlobProvider
nuxeo.core.binarymanager=org.nuxeo.ecm.core.blob.LocalBlobProvider
See NXP-31876.
Disablement
Immediate Garbage Collection is enabled by default. You can disable it with the following configuration property:
nuxeo.bulk.action.blobGC.enabled=false
Full Garbage Collection
Since LTS 2021-HF38 (see NXP-28565), a new Full GC implementation is available to scan your blob store in order to detect and delete the blobs that are no longer referenced in your repository.
This Full GC leverages the Bulk Action Framework. Like other bulk actions, the following configuration properties can be tweaked to fit your environment:
nuxeo.bulk.action.garbageCollectOrphanBlobs.defaultConcurrency=2
nuxeo.bulk.action.garbageCollectOrphanBlobs.defaultPartitions=4
See the dedicated Blobs Management REST endpoint to invoke and monitor a Blob Full GC.
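Conceptually, detecting orphaned blobs comes down to a set difference between the keys present in the blob store and the keys referenced by documents through ecm:blobKeys. The sketch below shows that idea on in-memory data only. It is not the Nuxeo implementation, which runs as a bulk action; the function name and sample keys are hypothetical.

```python
from typing import Iterable, Mapping, Sequence, Set


def find_orphaned_blob_keys(
    stored_keys: Iterable[str],
    documents: Iterable[Mapping[str, Sequence[str]]],
) -> Set[str]:
    """Return the stored blob keys that no document references."""
    referenced: Set[str] = set()
    for doc in documents:
        # A document without the field contributes no references here, which
        # is why every document must have ecm:blobKeys computed before a GC
        # can safely rely on it.
        referenced.update(doc.get("ecm:blobKeys", ()))
    return set(stored_keys) - referenced


# Hypothetical blob store content and documents.
stored = ["k1", "k2", "k3"]
docs = [
    {"ecm:blobKeys": ["k1"]},
    {"ecm:blobKeys": ["k1", "k3"]},
    {},
]
print(sorted(find_orphaned_blob_keys(stored, docs)))
```

In this example only k2 is unreferenced, so it is the only candidate for deletion.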