Class FullTextUtils

java.lang.Object
org.nuxeo.common.utils.FullTextUtils

public class FullTextUtils extends Object
Utility functions for simple fulltext parsing. They are not exhaustive, but they handle simple cases.
  • Method Details

    • parseFullText

      public static Set<String> parseFullText(String string, boolean removeDiacritics)
      Extracts the words from a string for simple fulltext indexing.

      Initial order is kept, but duplicate words are removed.

      Short words and stop words are omitted, accents are removed when removeDiacritics is true, and pseudo-stemming is applied.

      Parameters:
      string - the string to parse
      removeDiacritics - whether diacritics (accents) should be removed
      Returns:
      an ordered set of resulting words
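
      The "order kept, duplicates removed" behavior described above can be sketched with a LinkedHashSet. This is a minimal illustration and not the Nuxeo implementation: the class name FullTextSketch, the method extractWords, the minimum length of 3 and the stop-word list are all assumptions made for the example, and diacritics removal and pseudo-stemming are left out.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Illustrative sketch only; it does not call or reproduce FullTextUtils.
final class FullTextSketch {

    // Hypothetical values: the real minimum length and stop words are not stated in this page.
    static final int MIN_LENGTH = 3;
    static final Set<String> STOP_WORDS = Set.of("the", "and", "for");

    static Set<String> extractWords(String text) {
        // LinkedHashSet keeps first-insertion order and silently drops duplicates.
        Set<String> words = new LinkedHashSet<>();
        if (text == null) {
            return words;
        }
        // Split on any run of characters that are neither letters nor digits.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.length() < MIN_LENGTH || STOP_WORDS.contains(token)) {
                continue; // skip short words, stop words and empty tokens
            }
            words.add(token);
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [quick, fox, dog]
        System.out.println(extractWords("The quick fox and the quick dog"));
    }
}
```

      An insertion-ordered set is a natural fit here because it gives both guarantees from the description (original order, no duplicates) without a separate de-duplication pass.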
    • parseWord

      public static String parseWord(String string, boolean removeDiacritics)
      Parses a word and returns a simplified lowercase form.
      Parameters:
      string - the word to parse
      removeDiacritics - whether diacritics (accents) should be removed
      Returns:
      the simplified word, or null if it was removed as a stop word or a short word
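
      As a rough sketch of what "simplified lowercase form, or null" can look like for a single word, the example below lowercases the input, optionally strips diacritics with java.text.Normalizer, and returns null for short words and stop words. It is not the Nuxeo code: the names WordSketch and simplify, the minimum length of 3 and the stop-word list are assumptions, and no pseudo-stemming is attempted.

```java
import java.text.Normalizer;
import java.util.Locale;
import java.util.Set;

// Illustrative sketch only; it does not call or reproduce FullTextUtils.
final class WordSketch {

    // Hypothetical values: the real minimum length and stop words are not stated in this page.
    static final int MIN_LENGTH = 3;
    static final Set<String> STOP_WORDS = Set.of("the", "and", "for");

    static String simplify(String word, boolean removeDiacritics) {
        String w = word.toLowerCase(Locale.ROOT);
        if (removeDiacritics) {
            // Decompose accented characters (NFD), then drop the combining marks.
            w = Normalizer.normalize(w, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
        }
        if (w.length() < MIN_LENGTH || STOP_WORDS.contains(w)) {
            return null; // removed as a short word or a stop word
        }
        return w;
    }

    public static void main(String[] args) {
        System.out.println(simplify("Caf\u00e9", true)); // prints cafe
        System.out.println(simplify("The", true));       // prints null
    }
}
```

      Because the result can be null, callers of a method with this contract need to check for null before adding the word to an index.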