java.lang.Object

org.nuxeo.ecm.platform.pdf.PDFInfo

public class PDFInfo extends Object

This class parses the information embedded in a PDF and returns it either as a whole (toHashMap() or toString()) or through individual getters.

The PDF is parsed only on the first call to run(). The values are cached during that first call.
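
The parse-once behaviour described above can be illustrated with a small stand-alone sketch. It is not the PDFInfo source: the class `ParseOnceSketch`, its fields, and the placeholder title value are all hypothetical, and the real class may guard its parsing differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the parse-once idea; this is NOT the PDFInfo implementation.
class ParseOnceSketch {
    private boolean alreadyParsed = false;
    private int parseCount = 0; // how many times the expensive step really ran
    private final Map<String, String> cachedValues = new HashMap<>();

    // Does the work on the first call only; later calls return immediately.
    void run() {
        if (alreadyParsed) {
            return;
        }
        parseCount++;
        cachedValues.put("Title", "Example title"); // placeholder for real PDF parsing
        alreadyParsed = true;
    }

    // Getters read from the cache filled by the first run().
    String getTitle() {
        return cachedValues.get("Title");
    }

    int getParseCount() {
        return parseCount;
    }
}
```

With this shape, calling run() several times is harmless, and getters stay cheap because they never touch the PDF again.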

For page sizes, see PDF page boxes for details. The information is read from the first page only. Dimensions are in points; divide by 72 to get inches.
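
As a worked example of the points-to-inches conversion, here is a minimal sketch. The helper class `PageSizeUnits` and its method are hypothetical and are not part of PDFInfo; in practice the input would be a value such as the one returned by getMediaBoxWidthInPoints().

```java
// Hypothetical helper; not part of the PDFInfo API.
class PageSizeUnits {
    // One inch is 72 points.
    static final float POINTS_PER_INCH = 72f;

    // Converts a dimension expressed in points to inches.
    static float pointsToInches(float points) {
        return points / POINTS_PER_INCH;
    }
}
```

For instance, a US Letter page measures 612 x 792 points, which this helper converts to 8.5 x 11 inches.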

Since: 8.10

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| PDFInfo(Blob inBlob) | Constructor with a Blob. |
| PDFInfo(Blob inBlob, String inPassword) | Constructor for Blob + encrypted PDF. |
| PDFInfo(DocumentModel inDoc) | Constructor with a DocumentModel. |
| PDFInfo(DocumentModel inDoc, String inXPath, String inPassword) | Constructor for DocumentModel + encrypted PDF. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| String | getAuthor() | |
| String | getContentCreator() | |
| Calendar | getCreationDate() | |
| float | getCropBoxHeightInPoints() | |
| float | getCropBoxWidthInPoints() | |
| String | getFileName() | |
| long | getFileSize() | |
| String | getKeywords() | |
| float | getMediaBoxHeightInPoints() | |
| float | getMediaBoxWidthInPoints() | |
| Calendar | getModificationDate() | |
| int | getNumberOfPages() | |
| String | getPageLayout() | |
| String | getPdfVersion() | |
| org.apache.pdfbox.pdmodel.encryption.AccessPermission | getPermissions() | |
| String | getProducer() | |
| String | getSubject() | |
| String | getTitle() | |
| String | getXmp() | |
| boolean | isEncrypted() | |
| void | run() | Extracts the information from the PDF. Call it after building the object with the appropriate constructor and after setting any parsing property (setParseWithXMP(), for example). |
| void | setParseWithXMP(boolean inValue) | If set to true, parsing will also extract the XMP metadata. |
| DocumentModel | toFields(DocumentModel inDoc, HashMap<String,String> inMapping, boolean inSave, CoreSession inSession) | The inMapping map is a HashMap where the key is the xpath of the destination field and the value is the exact label of a PDF info as returned by toHashMap(). |
| HashMap<String,String> | toHashMap() | Returns all parsed information in a HashMap of strings. |
| String | toString() | Wrapper for toHashMap().toString() |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Constructor Details
- PDFInfo
  
  public PDFInfo(Blob inBlob)
  
  Constructor with a Blob.
  
  Parameters:
  
  inBlob - Input blob.
- PDFInfo
  
  public PDFInfo(Blob inBlob, String inPassword)
  
  Constructor for Blob + encrypted PDF.
  
  Parameters:
  
  inBlob - Input blob.
  
  inPassword - Password to use if the PDF is encrypted.
- PDFInfo
  
  public PDFInfo(DocumentModel inDoc)
  
  Constructor with a DocumentModel. Uses the default file:content xpath to get the blob from the document.
  
  Parameters:
  
  inDoc - Input DocumentModel.
- PDFInfo
  
  public PDFInfo(DocumentModel inDoc, String inXPath, String inPassword)
  
  Constructor for DocumentModel + encrypted PDF.
  If inXPath is null or "", the default file:content xpath is used.
  
  Parameters:
  
  inDoc - Input DocumentModel.
  
  inXPath - Input XPath.
  
  inPassword - Password to use if the PDF is encrypted.
Method Details
- setParseWithXMP
  
  public void setParseWithXMP(boolean inValue)
  
  If set to true, parsing will also extract the XMP metadata.
  The value cannot be changed once run() has been called.
  
  Parameters:
  
  inValue - true to extract XMP.
- run
  
  public void run() throws NuxeoException
  
  Extracts the information from the PDF. Call it after building the object with the appropriate constructor and after setting any parsing property (setParseWithXMP(), for example).
  After extraction, the information is available either all at once (toHashMap() or toString()) or through the individual getters.
  
  Throws:
  
  NuxeoException
- toHashMap
  
  public HashMap<String,String> toHashMap()
  
  Returns all parsed information in a HashMap of strings.
  Possible labels (the keys of the map) are:
  
  File name
  
  File size
  
  PDF version
  
  Page count
  
  Page size
  
  Page width
  
  Page height
  
  Page layout
  
  Title
  
  Author
  
  Subject
  
  PDF producer
  
  Content creator
  
  Creation date
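  
  Because toHashMap() returns every value as a String, numeric entries have to be parsed by the caller. The sketch below shows one defensive way to do that. The class `InfoMapReader` is hypothetical, and it assumes that an entry such as "Page count" holds a plain decimal integer, which this page does not state.
  
  ```java
  import java.util.Map;
  
  // Hypothetical helper for reading numeric entries from a String map; not part of PDFInfo.
  class InfoMapReader {
      // Returns the entry stored under label as an int, or fallback when the
      // entry is missing or is not a plain decimal integer (assumed format).
      static int intValue(Map<String, String> info, String label, int fallback) {
          String raw = info.get(label);
          if (raw == null) {
              return fallback;
          }
          try {
              return Integer.parseInt(raw.trim());
          } catch (NumberFormatException e) {
              return fallback;
          }
      }
  }
  ```
  
  When a typed value is needed, the individual getters (getNumberOfPages(), for example) avoid this parsing step altogether.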
- toFields
  
  public DocumentModel toFields(DocumentModel inDoc, HashMap<String,String> inMapping, boolean inSave, CoreSession inSession)
  
  The inMapping map is a HashMap where the key is the xpath of the destination field and the value is the exact label of a PDF info as returned by toHashMap(). For example:
  
  pdfinfo:title=Title
  pdfinfo:producer=PDF Producer
  pdfinfo:mediabox_width=Media box width
  ...
  
  If inSave is false, inSession can be null.
  
  Parameters:
  
  inDoc - Input DocumentModel.
  
  inMapping - Input Mapping.
  
  inSave - Whether to save the document.
  
  inSession - Session to use for saving when inSave is true.
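  
  To make the direction of the mapping concrete (xpath as key, label as value), here is a minimal stand-alone sketch. The class `FieldMappingSketch` and its `resolve` method are hypothetical stand-ins that work on plain maps; they do not call toFields() and do not show how Nuxeo writes the values into a DocumentModel. The xpaths and labels are copied from the example in the description above, and returning null for a label that is absent is an assumption of this sketch only.
  
  ```java
  import java.util.HashMap;
  import java.util.Map;
  
  // Hypothetical illustration of the xpath -> label mapping; not the toFields() implementation.
  class FieldMappingSketch {
      // Builds a mapping shaped like inMapping: key = destination xpath, value = info label.
      static HashMap<String, String> exampleMapping() {
          HashMap<String, String> mapping = new HashMap<>();
          mapping.put("pdfinfo:title", "Title");
          mapping.put("pdfinfo:producer", "PDF Producer");
          mapping.put("pdfinfo:mediabox_width", "Media box width");
          return mapping;
      }
  
      // For each xpath, looks up the value stored under its label in the info map.
      // A label that is absent from info yields null (an assumption of this sketch).
      static Map<String, String> resolve(Map<String, String> info, Map<String, String> mapping) {
          Map<String, String> fieldValues = new HashMap<>();
          for (Map.Entry<String, String> entry : mapping.entrySet()) {
              fieldValues.put(entry.getKey(), info.get(entry.getValue()));
          }
          return fieldValues;
      }
  }
  ```
  
  A map built like exampleMapping() is what would be passed as inMapping. Because the labels must match exactly, it is worth checking them against the keys actually returned by toHashMap().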
- toString
  
  public String toString()
  
  Wrapper for toHashMap().toString()
  
  Overrides:
  
  toString in class Object
- getNumberOfPages
  
  public int getNumberOfPages()
- getMediaBoxWidthInPoints
  
  public float getMediaBoxWidthInPoints()
- getMediaBoxHeightInPoints
  
  public float getMediaBoxHeightInPoints()
- getCropBoxWidthInPoints
  
  public float getCropBoxWidthInPoints()
- getCropBoxHeightInPoints
  
  public float getCropBoxHeightInPoints()
- getFileSize
  
  public long getFileSize()
- isEncrypted
  
  public boolean isEncrypted()
- getAuthor
  
  public String getAuthor()
- getContentCreator
  
  public String getContentCreator()
- getFileName
  
  public String getFileName()
- getKeywords
  
  public String getKeywords()
- getPageLayout
  
  public String getPageLayout()
- getPdfVersion
  
  public String getPdfVersion()
- getProducer
  
  public String getProducer()
- getSubject
  
  public String getSubject()
- getTitle
  
  public String getTitle()
- getXmp
  
  public String getXmp()
- getCreationDate
  
  public Calendar getCreationDate()
- getModificationDate
  
  public Calendar getModificationDate()
- getPermissions
  
  public org.apache.pdfbox.pdmodel.encryption.AccessPermission getPermissions()
