Package org.nuxeo.ecm.platform.pdf
Class PDFTextExtractor
java.lang.Object
org.nuxeo.ecm.platform.pdf.PDFTextExtractor
Extracts raw text from a PDF.
- Since:
- 8.10
-
Constructor Summary
Constructor    Description
PDFTextExtractor(Blob inBlob)
PDFTextExtractor(DocumentModel inDoc, String inXPath)    Constructor with a DocumentModel.
-
Method Summary
Modifier and Type    Method    Description
extractLastPartOfLine(String string)
extractLineOf(String inString)
void    setPassword(String password)
-
Constructor Details
-
PDFTextExtractor
-
PDFTextExtractor
Constructor with a DocumentModel. The default value for inXPath (if passed null or "") is file:content.
- Parameters:
inDoc - Input DocumentModel.
inXPath - Input XPath.
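The fallback rule for inXPath described above can be sketched in plain Java. This is only an illustration of the documented behavior, not the class's actual implementation: the XPathDefaults class, the resolveXPath method, and the sample XPath in main are hypothetical names introduced for this example.

```java
/**
 * Hypothetical helper (not part of the Nuxeo API) illustrating the documented
 * rule: a null or empty inXPath falls back to "file:content".
 */
final class XPathDefaults {

    /** Default XPath named in the constructor documentation. */
    static final String DEFAULT_XPATH = "file:content";

    private XPathDefaults() {
        // Utility class, no instances.
    }

    /** Returns inXPath unchanged unless it is null or "", in which case the default is used. */
    static String resolveXPath(String inXPath) {
        return (inXPath == null || inXPath.isEmpty()) ? DEFAULT_XPATH : inXPath;
    }

    public static void main(String[] args) {
        System.out.println(resolveXPath(null));                 // prints "file:content"
        System.out.println(resolveXPath(""));                   // prints "file:content"
        System.out.println(resolveXPath("files:files/0/file")); // sample value, returned unchanged
    }
}
```

Under this reading, callers that store the PDF in the document's main file can pass null or "" for inXPath, and only need an explicit XPath when the blob lives in another property.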
-
-
Method Details
-
getAllExtractedLines
- Throws:
NuxeoException
-
extractLineOf
- Throws:
IOException
-
extractLastPartOfLine
- Throws:
IOException
-
setPassword
-