Package org.nuxeo.ecm.platform.pdf
Class PDFTextExtractor
- java.lang.Object
  - org.nuxeo.ecm.platform.pdf.PDFTextExtractor

public class PDFTextExtractor extends Object
Extracts raw text from a PDF.
- Since:
  8.10

Constructor Summary
Constructors
- PDFTextExtractor(Blob inBlob)
- PDFTextExtractor(DocumentModel inDoc, String inXPath)
  Constructor with a DocumentModel.

Method Summary
Methods
- String extractLastPartOfLine(String string)
- String extractLineOf(String inString)
- String getAllExtractedLines()
- void setPassword(String password)

Constructor Detail
PDFTextExtractor
public PDFTextExtractor(Blob inBlob)

PDFTextExtractor
public PDFTextExtractor(DocumentModel inDoc, String inXPath)
Constructor with a DocumentModel. The default value for inXPath (if passed null or "") is file:content.
- Parameters:
  inDoc - Input DocumentModel.
  inXPath - Input XPath.

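The defaulting rule for inXPath in the two-argument constructor can be sketched as below. This is an illustration only, not the Nuxeo source: the class XPathDefaults and the method resolveXPath are hypothetical names, the sample XPath in main is just an example value, and the sketch assumes that "" means exactly the empty string (no whitespace trimming).

```java
// Hypothetical helper illustrating the documented defaulting rule.
// It is not part of the Nuxeo API.
class XPathDefaults {

    // Default XPath named in the constructor documentation.
    static final String DEFAULT_XPATH = "file:content";

    // Returns the default when inXPath is null or empty;
    // otherwise returns inXPath unchanged.
    static String resolveXPath(String inXPath) {
        if (inXPath == null || inXPath.isEmpty()) {
            return DEFAULT_XPATH;
        }
        return inXPath;
    }

    public static void main(String[] args) {
        System.out.println(resolveXPath(null));                 // file:content
        System.out.println(resolveXPath(""));                   // file:content
        System.out.println(resolveXPath("files:files/0/file")); // files:files/0/file
    }
}
```

Under this reading, callers that keep the PDF in the document's main file can pass null for inXPath and only need to supply a value when the blob lives in another property.
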
Method Detail
getAllExtractedLines
public String getAllExtractedLines() throws NuxeoException
- Throws:
  NuxeoException

extractLineOf
public String extractLineOf(String inString) throws IOException
- Throws:
  IOException

extractLastPartOfLine
public String extractLastPartOfLine(String string) throws IOException
- Throws:
  IOException

setPassword
public void setPassword(String password)