Mythics Blog

PDF Page Numbers in WebCenter Content

Posted on May 8, 2013 by Jonathan Hult

Tags: Mythics Consulting, WebCenter

A question was asked a few months ago on the WebCenter Content OTN Discussion Forums about whether WebCenter Content stores page counts for documents.

I am wondering how can i know the number of pages of each document in the UCM, is there a table that stores this information ? is there a script that i can run to know this information?

Keeping in mind that all the documents are in PDF format.

This question was recently brought up again on the forums.

I'm trying to write a component that uses iText to get a page count of web viewable pdfs and then updates the content item's metadata with the count. I'm struggling to find a good filter to use.

postWebfileCreation seemed like the ideal candidate but it wouldn't let me update the metadata. I get a 'csJdbcStartTranWithinATranNotAllowed' error.

Ideally the filter will be executed before the content item has been indexed as I assume that if the metadata is updated afterwards it will need reindexing?

Any help/advice would be much appreciated.

Out of the box, WebCenter Content does not store the number of pages for checked-in documents. However, like many such gaps, this one can be filled with a custom component. I created a custom component, PDFPageNumbers, which uses the Java library iText to count the pages in a PDF.

iText is a Java API that was developed to do the following (and much more):

  • Generate documents and reports based on data from an XML file or a database
  • Create maps and books, exploiting numerous interactive features available in PDF
  • Add bookmarks, page numbers, watermarks, and other features to existing PDF documents
  • Split or concatenate pages from existing PDF files
  • Fill out interactive forms
  • Serve dynamically generated or manipulated PDF documents to a web browser

As the list above shows, iText can do quite a lot. In PDFPageNumbers, however, I use the library only to count pages. PDFPageNumbers calculates the page count during check-in and stores it in a metadata field (which can be configured in a preference prompt). This component is just a simple example of how iText can be utilized.
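To make the check-in flow concrete, here is a minimal sketch of the two steps: count the pages, then write the count into a configurable metadata field. It is not the PDFPageNumbers source. To keep the example free of external dependencies, the counting step is a crude stand-in that scans the raw bytes for page objects; with iText 5.x on the classpath, that step would typically be `new PdfReader(path).getNumberOfPages()` instead. The class name, the method names, and the field name `xPageCount` are all hypothetical, and a plain `Map` stands in for the content item's metadata.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not part of WebCenter Content or iText.
class PdfPageCountSketch {

    // Matches "/Type /Page" or "/Type/Page", but not the page-tree node "/Type /Pages".
    private static final Pattern PAGE_OBJECT =
            Pattern.compile("/Type\\s*/Page(?![A-Za-z])");

    // Stand-in for iText: counts page objects visible in the raw bytes.
    // Assumption: page objects are not packed into compressed object streams,
    // which this heuristic cannot see. A real PDF library handles that case.
    static int countPages(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to exactly one char, so binary data is preserved.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        Matcher matcher = PAGE_OBJECT.matcher(raw);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Stores the count under a configurable field name, mirroring the
    // "metadata field chosen in a preference prompt" idea from the post.
    static void storePageCount(Map<String, String> metadata, String fieldName, byte[] pdfBytes) {
        metadata.put(fieldName, Integer.toString(countPages(pdfBytes)));
    }

    public static void main(String[] args) {
        // A tiny, incomplete PDF-like body with two page objects, for illustration only.
        String tinyPdf = "%PDF-1.4\n"
                + "1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj\n"
                + "2 0 obj << /Type /Pages /Kids [3 0 R 4 0 R] /Count 2 >> endobj\n"
                + "3 0 obj << /Type /Page /Parent 2 0 R >> endobj\n"
                + "4 0 obj << /Type /Page /Parent 2 0 R >> endobj\n";

        Map<String, String> metadata = new HashMap<>();
        storePageCount(metadata, "xPageCount", tinyPdf.getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(metadata);
    }
}
```

Keeping the counting step behind a single method means the stand-in heuristic can be swapped for a library call without touching the code that writes the metadata field.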

In compliance with the open source version of iText (and the Affero General Public License), the component includes the full source code, which can be viewed/cloned/forked here.
