Package org.nuxeo.ecm.platform.pdf
Class PDFInfo
- java.lang.Object
-
- org.nuxeo.ecm.platform.pdf.PDFInfo
-
public class PDFInfo extends Object
This class parses the information embedded in a PDF and returns it either globally (toHashMap() or toString()) or via individual getters.
The PDF is parsed only at the first call to run(). Values are cached during that first call.
About page sizes, see PDF page boxes for details. Here, the info is read from the first page only. The dimensions are in points. Divide by 72 to get inches.
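The point-to-inch conversion mentioned above can be sketched as follows. This is a minimal illustration, not part of PDFInfo: the class name `PageSizeUnits` and the helper `pointsToInches` are hypothetical, and the 612 x 792 sample values stand in for what getMediaBoxWidthInPoints() and getMediaBoxHeightInPoints() might return for a US Letter page.

```java
import java.util.Locale;

public class PageSizeUnits {
    /** Points per inch, as stated in the class description (divide by 72). */
    public static final float POINTS_PER_INCH = 72f;

    /** Converts a length in points to inches (hypothetical helper). */
    public static float pointsToInches(float points) {
        return points / POINTS_PER_INCH;
    }

    public static void main(String[] args) {
        // Sample values only: a US Letter page is 612 x 792 points.
        float widthInPoints = 612f;
        float heightInPoints = 792f;

        // Locale.ROOT keeps the decimal separator independent of the JVM locale.
        System.out.println(String.format(Locale.ROOT, "%.2f x %.2f in",
                pointsToInches(widthInPoints), pointsToInches(heightInPoints)));
        // prints "8.50 x 11.00 in"
    }
}
```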
- Since:
- 8.10
-
-
Constructor Summary
Constructor Description
PDFInfo(Blob inBlob)
Constructor with a Blob.
PDFInfo(Blob inBlob, String inPassword)
Constructor for Blob + encrypted PDF.
PDFInfo(DocumentModel inDoc)
Constructor with a DocumentModel.
PDFInfo(DocumentModel inDoc, String inXPath, String inPassword)
Constructor for DocumentModel + encrypted PDF.
-
Method Summary
Modifier and Type Method Description
String
getAuthor()
String
getContentCreator()
Calendar
getCreationDate()
float
getCropBoxHeightInPoints()
float
getCropBoxWidthInPoints()
String
getFileName()
long
getFileSize()
String
getKeywords()
float
getMediaBoxHeightInPoints()
float
getMediaBoxWidthInPoints()
Calendar
getModificationDate()
int
getNumberOfPages()
String
getPageLayout()
String
getPdfVersion()
org.apache.pdfbox.pdmodel.encryption.AccessPermission
getPermissions()
String
getProducer()
String
getSubject()
String
getTitle()
String
getXmp()
boolean
isEncrypted()
void
run()
After building the object with the correct constructor, and after possibly setting a parsing property (setParseWithXMP(), for example), this method extracts the information from the PDF.
void
setParseWithXMP(boolean inValue)
If set to true, parsing will extract the XMP metadata.
DocumentModel
toFields(DocumentModel inDoc, HashMap<String,String> inMapping, boolean inSave, CoreSession inSession)
The inMapping map is a HashMap where the key is the xpath of the destination field, and the value is the exact label of a PDF info as returned by toHashMap().
HashMap<String,String>
toHashMap()
Returns all parsed info in a String HashMap.
String
toString()
Wrapper for toHashMap().toString().
-
-
-
Constructor Detail
-
PDFInfo
public PDFInfo(Blob inBlob)
Constructor with a Blob.
Parameters:
inBlob - Input blob.
-
PDFInfo
public PDFInfo(Blob inBlob, String inPassword)
Constructor for Blob + encrypted PDF.
Parameters:
inBlob - Input blob.
inPassword - The password, if the PDF is encrypted.
-
PDFInfo
public PDFInfo(DocumentModel inDoc)
Constructor with a DocumentModel. Uses the default file:content xpath to get the blob from the document.
Parameters:
inDoc - Input DocumentModel.
-
PDFInfo
public PDFInfo(DocumentModel inDoc, String inXPath, String inPassword)
Constructor for DocumentModel + encrypted PDF.
If inXPath is null or "", it is set to the default file:content value.
Parameters:
inDoc - Input DocumentModel.
inXPath - Input XPath.
inPassword - The password, if the PDF is encrypted.
-
-
Method Detail
-
setParseWithXMP
public void setParseWithXMP(boolean inValue)
If set to true, parsing will extract the XMP metadata.
The value cannot be modified once run() has been called.
Parameters:
inValue - true to extract XMP.
-
run
public void run() throws NuxeoException
After building the object with the correct constructor, and after possibly setting a parsing property (setParseWithXMP(), for example), this method extracts the information from the PDF.
After extraction, the info is available either all at once (toHashMap() or toString()) or individually (see the getters).
Throws:
NuxeoException
-
toHashMap
public HashMap<String,String> toHashMap()
Returns all parsed info in a String HashMap.
Possible values are:
- File name
- File size
- PDF version
- Page count
- Page size
- Page width
- Page height
- Page layout
- Title
- Author
- Subject
- PDF producer
- Content creator
- Creation date
-
toFields
public DocumentModel toFields(DocumentModel inDoc, HashMap<String,String> inMapping, boolean inSave, CoreSession inSession)
The inMapping map is a HashMap where the key is the xpath of the destination field, and the value is the exact label of a PDF info as returned by toHashMap(). For example:
pdfinfo:title=Title
pdfinfo:producer=PDF Producer
pdfinfo:mediabox_width=Media box width
...
If inSave is false, inSession can be null.
Parameters:
inDoc - Input DocumentModel.
inMapping - Input mapping.
inSave - Whether the document should be saved.
inSession - If saving, the session in which to save.
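As a rough sketch of how such a mapping could be built, the example below fills a HashMap with the three entries quoted in the description. The class name `PdfInfoMappingExample` and the helper `buildMapping` are hypothetical; the xpaths and labels are copied from the example above and would need to match the actual destination schema and the exact labels returned by toHashMap(). The toFields call itself is shown only as a comment, since it requires a Nuxeo runtime.

```java
import java.util.HashMap;

public class PdfInfoMappingExample {
    /** Builds a mapping: destination field xpath -> PDF info label (hypothetical helper). */
    public static HashMap<String, String> buildMapping() {
        HashMap<String, String> mapping = new HashMap<>();
        // Entries taken from the example in the toFields description.
        mapping.put("pdfinfo:title", "Title");
        mapping.put("pdfinfo:producer", "PDF Producer");
        mapping.put("pdfinfo:mediabox_width", "Media box width");
        return mapping;
    }

    public static void main(String[] args) {
        HashMap<String, String> mapping = buildMapping();
        mapping.forEach((xpath, label) -> System.out.println(xpath + " <- " + label));

        // Assuming `info` is a PDFInfo and `doc` a DocumentModel, the map would be passed as:
        //   info.toFields(doc, mapping, false, null);
        // With inSave set to false, inSession can be null, as noted above.
    }
}
```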
-
toString
public String toString()
Wrapper for toHashMap().toString().
-
getNumberOfPages
public int getNumberOfPages()
-
getMediaBoxWidthInPoints
public float getMediaBoxWidthInPoints()
-
getMediaBoxHeightInPoints
public float getMediaBoxHeightInPoints()
-
getCropBoxWidthInPoints
public float getCropBoxWidthInPoints()
-
getCropBoxHeightInPoints
public float getCropBoxHeightInPoints()
-
getFileSize
public long getFileSize()
-
isEncrypted
public boolean isEncrypted()
-
getAuthor
public String getAuthor()
-
getContentCreator
public String getContentCreator()
-
getFileName
public String getFileName()
-
getKeywords
public String getKeywords()
-
getPageLayout
public String getPageLayout()
-
getPdfVersion
public String getPdfVersion()
-
getProducer
public String getProducer()
-
getSubject
public String getSubject()
-
getTitle
public String getTitle()
-
getXmp
public String getXmp()
-
getCreationDate
public Calendar getCreationDate()
-
getModificationDate
public Calendar getModificationDate()
-
getPermissions
public org.apache.pdfbox.pdmodel.encryption.AccessPermission getPermissions()
-
-