Mythics Blog

Understanding WebCenter Capture’s Document Processing Model

Posted on July 26, 2013 by Caleb Ely

Tags: Mythics Consulting, WebCenter

Oracle’s WebCenter Capture is document capture software that lets users scan documents, index them and send them to a repository. There is often confusion about the WebCenter Capture components involved in these stages of document processing and how they fit together. To understand how these components work together, you must first know how they are structured.

The main structural component of WebCenter Capture’s document management model is the File Cabinet. A File Cabinet is a logical bucket in the capture system that stores, processes and delivers documents based on its configuration. A File Cabinet includes four main components: Batches, Users, Index Fields and Profiles. The diagram below shows how these components fit together.

 

Below are descriptions of the four main components within this diagram. These are high-level descriptions and don’t necessarily address how the components work at a detailed level or their idiosyncrasies.

  1. Batches: Batches are ‘stacks of documents’ that are input into the system. All of the documents in a Batch should meet the standard for the File Cabinet they're scanned into. For example, you can have a File Cabinet for ‘Accounting Documents’ and a File Cabinet for ‘HR Documents’. Documents should be scanned together as Batches according to the File Cabinet they belong to.
  2. Users: Users are assigned directly to File Cabinets. Once a User is assigned to a File Cabinet, they have rights to access all of the Profiles and Batches within the File Cabinet. WebCenter Capture also allows you to create Administrative Users who automatically have access to all File Cabinets and all administrative options.
  3. Index Fields: Index Fields are used to assign important information to documents while they are still within Capture. Each field has a label and a data type, and can be assigned a value for an individual document during indexing. All Index Fields created in a File Cabinet are available to every document in that File Cabinet, but are not necessarily required. Index Fields should be consistent with the information required for the document types used with the File Cabinet. For example, you could have a File Cabinet for ‘Accounting Documents’ which could include an Index Field like ‘Invoice #’ and a File Cabinet for ‘HR Documents’ which could include an Index Field like ‘Employee #’.
  4. Profiles: Profiles allow you to create separate settings for different document types. There are three types of Profiles: Scan, Index, and Commit. A File Cabinet can contain several Profiles of each type, but each individual Profile can be assigned to only one File Cabinet and cannot be copied. Below is a description of each.
  • Scan Profiles: A Scan Profile contains all the scan settings for a particular batch. It defines the scanner used, the scanner settings, the batch name, and image processing options such as rotating, cropping, noise removal and blank page removal. For example, you could have one Scan Profile for ‘Invoices’ and one for ‘Receipts’ in an ‘Accounting Documents’ File Cabinet.
  • Index Profiles: An Index Profile contains all the settings for assigning index values (metadata) to documents. It identifies which fields will be indexed and how they will be indexed. Index Fields not marked as required in an Index Profile can be left null and hidden from the indexing screen. Index Profiles can map to an external database to look up values, identify how separator sheets are used, set up certain fields to be auto-populated, and set up zones for character recognition to populate Index Fields.
  • Commit Profiles: A Commit Profile determines the repository where a File Cabinet’s documents are sent for final storage. It also maps the metadata between WebCenter Capture and the repository.
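
To make the relationship between Index Fields and Index Profiles concrete, here is a minimal sketch in Python. It is not Oracle's API: the class names, the validate method and the ‘Vendor’ field are hypothetical and exist only to illustrate the structure described above. It assumes that every Index Field in a File Cabinet is available to every document, and that an Index Profile decides which of those fields are required.

```python
from dataclasses import dataclass, field


@dataclass
class IndexField:
    label: str
    data_type: type  # e.g. str or int (illustrative only)


@dataclass
class IndexProfile:
    name: str
    required_fields: frozenset  # labels that must have a value


@dataclass
class FileCabinet:
    name: str
    index_fields: dict = field(default_factory=dict)
    index_profiles: dict = field(default_factory=dict)

    def add_index_field(self, label, data_type):
        self.index_fields[label] = IndexField(label, data_type)

    def add_index_profile(self, name, required_fields):
        # A profile may only require fields that exist in this File Cabinet.
        unknown = set(required_fields) - set(self.index_fields)
        if unknown:
            raise ValueError(f"fields not in File Cabinet: {sorted(unknown)}")
        self.index_profiles[name] = IndexProfile(name, frozenset(required_fields))

    def validate(self, profile_name, values):
        """Return a list of problems; an empty list means the values are acceptable."""
        profile = self.index_profiles[profile_name]
        problems = []
        for label in sorted(profile.required_fields):
            if values.get(label) is None:
                problems.append(f"missing required field: {label}")
        for label, value in values.items():
            fld = self.index_fields.get(label)
            if fld is None:
                problems.append(f"unknown field: {label}")
            elif value is not None and not isinstance(value, fld.data_type):
                problems.append(f"wrong type for field: {label}")
        return problems


# Hypothetical setup mirroring the 'Accounting Documents' example.
cabinet = FileCabinet("Accounting Documents")
cabinet.add_index_field("Invoice #", str)
cabinet.add_index_field("Vendor", str)  # assumed extra field, optional
cabinet.add_index_profile("Invoices", ["Invoice #"])
```

In this sketch, ‘Vendor’ can be left empty because the ‘Invoices’ profile does not require it, while a document without an ‘Invoice #’ is reported as incomplete.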

Note that you can create a single File Cabinet and create all your Profiles within this one File Cabinet. The main consideration here is that all Users assigned to the File Cabinet will have access to all of its Profiles.
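
The access trade-off can be sketched in a few lines of Python. This is an illustration under assumptions, not Capture's actual security implementation: the User and FileCabinet classes, the can_access and visible_profiles helpers, and the user names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    is_admin: bool = False  # Administrative Users see every File Cabinet


@dataclass
class FileCabinet:
    name: str
    profiles: list = field(default_factory=list)
    assigned_users: set = field(default_factory=set)


def can_access(user, cabinet):
    # Access is granted per File Cabinet, never per Profile.
    return user.is_admin or user.name in cabinet.assigned_users


def visible_profiles(user, cabinets):
    """List (cabinet name, profile name) pairs the user can reach."""
    return [(c.name, p) for c in cabinets if can_access(user, c) for p in c.profiles]


# One combined File Cabinet: every assigned user sees every Profile.
combined = FileCabinet("All Documents", ["Invoices", "Employee Files"], {"alice", "bob"})

# Separate File Cabinets: access follows the assignment.
accounting = FileCabinet("Accounting Documents", ["Invoices"], {"alice"})
hr = FileCabinet("HR Documents", ["Employee Files"], {"bob"})

alice = User("alice")
admin = User("carol", is_admin=True)
```

With the combined File Cabinet, alice can reach the ‘Employee Files’ profile. With separate File Cabinets she sees only ‘Invoices’, while the administrative user sees both.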

There are three additional WebCenter Capture components that aren't included in the File Cabinet structure. This is because they are not assigned directly to a File Cabinet and can be bypassed altogether. These components are the Import Server, Recognition Server and Commit Server. The diagram below shows an example File Cabinet including these additional components highlighted in yellow. The diagram also shows a typical workflow that a Batch would go through from Scan to Commit.

 

The Import Server, Recognition Server and Commit Server run independently and process Batches that meet criteria across all File Cabinets. Each of these components processes Batches based on criteria and settings set up in groupings called Batch Jobs. Once a Batch meets the criteria for a Batch Job, it is processed. Below is a description of the function of each of these components.

  1. Import Server: The Import Server allows you to import electronic documents into WebCenter Capture from multiple sources such as a folder, FTP, email and faxes.
  2. Recognition Server: The Recognition Server can determine how images are separated into documents, how barcodes are read and assigned to Index Fields, and can automatically assign values from an external database.
  3. Commit Server: The Commit Server allows you to schedule the process of rendering compiled documents and sending them to a repository. It allows you to free up resources on the workstation while batches are actively being scanned and indexed. The Commit Server is separate from Commit Profiles.
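
The Batch Job idea described above can be sketched as follows. This is a simplified illustration, not how the servers are actually implemented: the Batch and BatchJob classes, the run_server function and the status values are hypothetical. The point it shows is that a job selects Batches by criteria, regardless of which File Cabinet they belong to.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Batch:
    name: str
    file_cabinet: str
    status: str  # assumed values: "scanned", "indexed", "committed"


@dataclass
class BatchJob:
    name: str
    criteria: Callable[[Batch], bool]  # decides whether a Batch qualifies
    action: Callable[[Batch], None]    # the processing step to apply


def run_server(jobs, batches):
    """Apply the first matching job to each Batch, across all File Cabinets."""
    processed = []
    for batch in batches:
        for job in jobs:
            if job.criteria(batch):
                job.action(batch)
                processed.append((job.name, batch.name))
                break  # a Batch is handled by at most one job per pass
    return processed


def mark_committed(batch):
    # Stand-in for rendering the documents and sending them to a repository.
    batch.status = "committed"


commit_job = BatchJob(
    name="commit-indexed",
    criteria=lambda b: b.status == "indexed",
    action=mark_committed,
)

batches = [
    Batch("B1", "Accounting Documents", "indexed"),
    Batch("B2", "HR Documents", "scanned"),
    Batch("B3", "HR Documents", "indexed"),
]
result = run_server([commit_job], batches)
```

Here B1 and B3 are committed even though they sit in different File Cabinets, and B2 is left alone because it has not been indexed yet.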

Understanding the structure of the components described in this article will help you visualize how WebCenter Capture processes documents. Keep this structure in mind when designing your business process, security model and metadata model, because each must conform to it when using WebCenter Capture. Beyond its document processing structure, WebCenter Capture has other features that make it unique. These include:

  • Advanced out-of-the-box features such as processing documents from electronic sources like MFPs, barcode recognition with 1D and 2D symbologies, and zonal recognition with rubberbanding (on-the-fly OCR);
  • Expansion and integration with WebCenter Distributed Capture, Oracle’s web-accessible capture solution;
  • Additional application customization with WebCenter Macros;
  • A per-processor licensing model (many other document capture solutions run on a per-user or per-click model).

Leveraging these features can increase efficiency and lower cost, making WebCenter Capture a unique and viable web capture solution.
