Class StringsExtractor


  • public class StringsExtractor
    extends Object
    Finds the strings in a document, that is, the values of its string properties.

    This class is not thread-safe.

    Since:
    10.3
    • Constructor Detail

      • StringsExtractor

        public StringsExtractor()
    • Method Detail

      • findStrings

        public List<String> findStrings(DocumentModel document,
                                        Set<String> includedPaths,
                                        Set<String> excludedPaths)
        Finds strings from the document for a given set of included and excluded paths.

        Paths must always be specified in normalized form, that is, with a schema prefix.

        Parameters:
        document - the document
        includedPaths - the paths to include, or null for all paths
        excludedPaths - the paths to exclude, or null for none
        Returns:
        a list of strings; no element of the list is null
      • isInterestingPath

        protected boolean isInterestingPath(String path)
      • findStrings

        protected void findStrings(Property property,
                                   String path)
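
The null handling of includedPaths and excludedPaths described for the public findStrings method can be illustrated with a small stand-alone sketch. This is not the Nuxeo implementation. The class PathFilterSketch, its static methods, the flat Map used in place of a DocumentModel, and the property path ex:pageCount are all hypothetical. The sketch also assumes that paths are matched exactly and that an excluded path is skipped even when it is also included; the documentation above confirms neither.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for illustration only; not the Nuxeo StringsExtractor. */
class PathFilterSketch {

    /**
     * Assumed rule: a path listed in excludedPaths is skipped;
     * otherwise a null includedPaths accepts every path.
     */
    static boolean isInterestingPath(String path, Set<String> includedPaths, Set<String> excludedPaths) {
        if (excludedPaths != null && excludedPaths.contains(path)) {
            return false;
        }
        return includedPaths == null || includedPaths.contains(path);
    }

    /**
     * Collects the non-null String values whose schema-prefixed path passes the filter.
     * The flat map stands in for a document's properties (an assumption of this sketch).
     */
    static List<String> findStrings(Map<String, Object> properties,
                                    Set<String> includedPaths,
                                    Set<String> excludedPaths) {
        List<String> strings = new ArrayList<>();
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            Object value = entry.getValue();
            // instanceof is false for null, so null values are never added
            if (value instanceof String && isInterestingPath(entry.getKey(), includedPaths, excludedPaths)) {
                strings.add((String) value);
            }
        }
        return strings;
    }

    public static void main(String[] args) {
        Map<String, Object> properties = new LinkedHashMap<>();
        properties.put("dc:title", "Quarterly report");
        properties.put("dc:description", "Draft for review");
        properties.put("dc:creator", "jdoe");
        properties.put("ex:pageCount", 12L); // hypothetical non-string property, ignored

        // null, null: every string property is returned
        System.out.println(findStrings(properties, null, null));
        // exclude a single schema-prefixed path
        System.out.println(findStrings(properties, null, Collections.singleton("dc:creator")));
    }
}
```

With the real class, the call has the shape given by the signature above: new StringsExtractor().findStrings(document, includedPaths, excludedPaths). Because the class is not thread-safe, it is probably simplest to give each thread its own instance rather than share one.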