A few months ago, someone asked on the WebCenter Content OTN Discussion Forums whether WebCenter Content stores the number of pages for each document:
I am wondering how can i know the number of pages of each document in the UCM, is there a table that stores this information ? is there a script that i can run to know this information?
Keeping in mind that all the documents are in PDF format.
The question recently came up again on the forums:
I'm trying to write a component that uses iText to get a page count of web viewable pdfs and then updates the content item's metadata with the count. I'm struggling to find a good filter to use.
postWebfileCreation seemed like the ideal candidate but it wouldn't let me update the metadata. I get a 'csJdbcStartTranWithinATranNotAllowed' error.
Ideally the filter will be executed before the content item has been indexed as I assume that if the metadata is updated afterwards it will need reindexing?
Any help/advice would be much appreciated.
Out of the box, WebCenter Content does not store the number of pages for checked-in documents. However, like many such problems, this one can be solved with a custom component. I created a custom component, PDFPageNumbers, which uses the Java library iText to calculate the number of pages in a PDF.
iText is a Java API that was developed to do the following (and much more):
iText has quite a few neat functions. In PDFPageNumbers, however, I simply use the library to calculate the number of pages. The component calculates this value during check-in and stores it in a metadata field (which can be configured in a preference prompt). This component is just a simple example of how iText can be used.
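To give a rough feel for what "counting pages" involves, here is a minimal sketch that uses only the Java standard library. It is not the component's code and it is not how iText works internally: it is a naive heuristic that counts `/Type /Page` dictionary markers in the raw bytes, so it will undercount PDFs that keep their page objects in compressed object streams. The class name `PdfPageCountSketch` and the method `countPages` are hypothetical names chosen for this illustration. With iText 5 on the classpath, the same number comes from `PdfReader.getNumberOfPages()`, which parses the document structure properly.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: a naive page counter, not the PDFPageNumbers source.
final class PdfPageCountSketch {

    // Matches "/Type /Page" (with or without whitespace) but not "/Type /Pages",
    // which marks a page-tree node rather than an individual page.
    private static final Pattern PAGE_OBJECT =
            Pattern.compile("/Type\\s*/Page(?![A-Za-z])");

    // Counts page dictionaries that appear uncompressed in the given PDF bytes.
    static int countPages(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to exactly one char, so binary
        // stream data in the file cannot break the decoding step.
        String text = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        Matcher matcher = PAGE_OBJECT.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // A tiny hand-written fragment with one page tree and two pages.
        String fragment = "%PDF-1.4\n"
                + "1 0 obj << /Type /Pages /Kids [2 0 R 3 0 R] /Count 2 >> endobj\n"
                + "2 0 obj << /Type /Page /Parent 1 0 R >> endobj\n"
                + "3 0 obj << /Type/Page /Parent 1 0 R >> endobj\n";
        byte[] bytes = fragment.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(countPages(bytes)); // prints 2
    }
}
```

For real content items, a proper PDF library such as iText is the safer choice, since it handles compressed object streams, encryption, and incremental updates that this heuristic ignores.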
In compliance with the open source version of iText (and the Affero General Public License), the component includes the full source code, which can be viewed/cloned/forked here.