Class BlobsExtractor


  • public class BlobsExtractor
    extends Object
    Extractor for all the blobs of a document.
    • Field Detail

      • LIST_ONLY_DOC_TYPE_BLOB_PROPERTY_NAME

        public static final String LIST_ONLY_DOC_TYPE_BLOB_PROPERTY_NAME
        Framework boolean property name to fall back on legacy behavior. If true, only blobs referenced by static schemas (those attached to the document's doc type) are listed; blobs added through dynamic facets are ignored.
        Since:
        2021.37
        See Also:
        Constant Field Values
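
        The effect of this flag can be pictured with a small, self-contained sketch. It does not use the Nuxeo API: the helper `blobPaths`, the class name, and the `myfacet:attachment` path are all hypothetical and only mirror the behavior described above (static schema paths are always listed, facet paths only when the legacy flag is off).

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not the BlobsExtractor implementation.
class ListOnlyDocTypeBlobsSketch {

    /**
     * Hypothetical helper.
     * staticPaths: blob paths from schemas attached to the doc type.
     * facetPaths: blob paths from schemas brought in by dynamically added facets.
     * listOnlyDocTypeBlobs: stands in for the framework boolean property.
     */
    static List<String> blobPaths(List<String> staticPaths, List<String> facetPaths,
            boolean listOnlyDocTypeBlobs) {
        List<String> result = new ArrayList<>(staticPaths);
        if (!listOnlyDocTypeBlobs) {
            // Default behavior: blobs added through dynamic facets are listed too.
            result.addAll(facetPaths);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> staticPaths = List.of("file:content");
        List<String> facetPaths = List.of("myfacet:attachment"); // hypothetical facet schema

        System.out.println(blobPaths(staticPaths, facetPaths, true));  // legacy behavior
        System.out.println(blobPaths(staticPaths, facetPaths, false)); // default behavior
    }
}
```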
      • docBlobPaths

        protected final Map<String, List<String>> docBlobPaths
        Local cache of blob paths per doc type.
      • docBlobPathsPerSchema

        protected final Map<String, List<String>> docBlobPathsPerSchema
        Local cache of blob paths per schema.
    • Constructor Detail

      • BlobsExtractor

        public BlobsExtractor()
    • Method Detail

      • isInterestingPath

        protected boolean isInterestingPath(String path)
      • normalizePaths

        protected Set<String> normalizePaths(Set<String> paths)
        Removes the "/data" suffix used by FulltextConfiguration.

        Adds missing schema name as prefix if no prefix ("content" -> "file:content").
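
        The two normalization steps above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the actual method body: the `SCHEMA_BY_FIELD` lookup map is a hypothetical stand-in for however the real class resolves the schema of an unprefixed field, and the input paths are examples chosen for illustration.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: mirrors the documented behavior, not the Nuxeo implementation.
class NormalizePathsSketch {

    static final String DATA_SUFFIX = "/data";

    // Assumption: a fixed lookup from a top-level field name to its schema name.
    static final Map<String, String> SCHEMA_BY_FIELD = Map.of("content", "file");

    static Set<String> normalizePaths(Set<String> paths) {
        Set<String> normalized = new LinkedHashSet<>();
        for (String path : paths) {
            // Step 1: remove the "/data" suffix.
            if (path.endsWith(DATA_SUFFIX)) {
                path = path.substring(0, path.length() - DATA_SUFFIX.length());
            }
            // Step 2: add the schema name as prefix if the path has none.
            if (!path.contains(":")) {
                String firstSegment = path.split("/", 2)[0];
                String schema = SCHEMA_BY_FIELD.get(firstSegment);
                if (schema != null) {
                    path = schema + ":" + path;
                }
            }
            normalized.add(path);
        }
        return normalized;
    }

    public static void main(String[] args) {
        Set<String> input = new LinkedHashSet<>(
                List.of("content/data", "content", "files:files/*/file/data"));
        // Prints [file:content, files:files/*/file]
        System.out.println(normalizePaths(input));
    }
}
```

        Because the result is a set, "content/data" and "content" collapse into the single entry "file:content".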

      • getBlobs

        public List<Blob> getBlobs(DocumentModel doc)
        Gets the blobs of the document.
        Parameters:
        doc - the document
        Returns:
        the list of blobs
      • getBlobsProperties

        public List<Property> getBlobsProperties(DocumentModel doc)
        Gets the blob properties of the document.
        Parameters:
        doc - the document
        Returns:
        the list of blob properties
      • getBlobPaths

        public List<String> getBlobPaths(DocumentType documentType)
        Gets the blob paths of the document type. Extractor properties are ignored.
        Parameters:
        documentType - the document type
        Returns:
        the list of blob paths
        Since:
        8.3
      • getBlobPaths

        public List<String> getBlobPaths(Schema schema)
        Gets the blob paths of the given schema. Extractor properties are ignored.
        Parameters:
        schema - the schema
        Returns:
        the list of blob paths
        Since:
        2021.32