Partitioning DocuVaults

This topic provides instructions for partitioning your DocuVault.

Overview

The DocuVault partition feature, which you invoke using the partition_prog utility, is an extension of the DocuVault architecture that enables you to move DocuVault content from the DocuVault’s primary disk to a secondary storage environment. Storing documents on disk creates an active storage environment in which all content is instantly retrievable. You also can migrate content from active storage to an inactive storage environment (e.g., HSM or another separately managed storage system). Inactive storage might not meet the needs of your user base, however, because delays can occur when users attempt to retrieve documents.

The Partition tab in the Administration Tools module allows you to manage your DocuVault partitions. See Using the Partition Tab for more information.

partition_prog.exe recognizes and uses many Content Processing Facility (CPF) commands as well as partition_prog-exclusive commands. See the ASG-Cypress Content Processing Facility Reference topic for more information.

DocuVault partitioning splits a DocuVault into read-only pieces, or partitions, that contain the documents you specify. As documents are moved into partitions and written to disk, the document partition system index is updated to reflect each document’s new location. Cypress automatically closes all DocuVault partition files if they have been idle for more than 15 minutes so that the partitions can be moved, destroyed, or otherwise managed.

When a document that has been moved to a partition is selected as a result of a DocuVault query, Cypress transparently and automatically identifies the partition containing the document, retrieves it from the partition, and processes and delivers it as appropriate.

DocuVault partitions—and the documents they contain—are still an integral part of the DocuVault even though they are not stored on the DocuVault’s primary disk. ASG strongly encourages you to leave DocuVault partitions on a secondary disk to gain performance, cost, and reliability benefits. You can configure an HSM system to migrate (and de-migrate) partitions written to a secondary disk.

DocuVault Partition Operation and Event Flow

You can use partition_prog, as well as the Partition tab in the Administration Tools module, to create partitions and move documents from one partition to another.

Creating and Populating Partitions

Cypress uses a partition_prog script to create and populate partitions. The partition_prog script is a program that specifies documents to be moved into the partition, where the partition is to be written, and partition management information (e.g., partition name and description).

To create partition_prog scripts, you can use commands that are common to both CPF and partition_prog as well as commands that are unique to partition_prog. When you execute a partition_prog script, these events occur:

Your script queries the DocuVault to identify documents that match your selection criteria.
The script sends the document list to Cypress.
Cypress creates the partition at the specified location.
Cypress populates the partition with the documents on the list and their indexes.
Cypress executes the backup command specified in the begin_partition function, if any.
Because the partition files are now a permanent part of the DocuVault, ASG strongly recommends that you perform a backup of these files.
Cypress updates the document partition system index.
Cypress deletes the partitioned documents from the DocuVault.
Cypress executes the complete command specified in the begin_partition function, if any.

For more information on using scripts, see Creating and Executing partition_prog Scripts.
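
The event sequence above can be modeled as a small simulation. This is a minimal Python sketch for illustration only, not partition_prog or CPF code: the ToyVault class, its fields, and the create_partition function are hypothetical names, and the backup and complete callbacks merely stand in for the commands specified in the begin_partition function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ToyVault:
    """Hypothetical in-memory stand-in for a DocuVault (illustration only)."""
    # doc_id -> creation year, for documents still on the primary disk
    primary: Dict[str, int] = field(default_factory=dict)
    # partition number -> doc_ids stored in that partition
    partitions: Dict[int, List[str]] = field(default_factory=dict)
    # document partition system index: doc_id -> partition number (0 = primary disk)
    partition_index: Dict[str, int] = field(default_factory=dict)


def create_partition(
    vault: ToyVault,
    select: Callable[[int], bool],
    backup: Optional[Callable[[int], None]] = None,
    complete: Optional[Callable[[int], None]] = None,
) -> Tuple[int, List[str]]:
    """Walk through the documented event order and return (partition number, event log)."""
    log: List[str] = []
    # Query the vault and build the document list.
    doc_ids = sorted(d for d, year in vault.primary.items() if select(year))
    log.append("query")
    # Create the partition with the next number in sequence.
    number = max(vault.partitions, default=0) + 1
    vault.partitions[number] = []
    log.append("create")
    # Populate the partition with the listed documents.
    vault.partitions[number].extend(doc_ids)
    log.append("populate")
    # Run the optional backup command.
    if backup is not None:
        backup(number)
        log.append("backup")
    # Update the document partition system index.
    for doc_id in doc_ids:
        vault.partition_index[doc_id] = number
    log.append("update_index")
    # Delete the partitioned documents from the primary disk.
    for doc_id in doc_ids:
        del vault.primary[doc_id]
    log.append("delete_from_primary")
    # Run the optional complete command.
    if complete is not None:
        complete(number)
        log.append("complete")
    return number, log
```

In this sketch the documents are removed from primary storage only after the index has been updated, which mirrors the order of the events listed above.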

Retrieving Documents from Partitions

Document retrieval is completely transparent and fully automated. Cypress processes end-user and CPF document queries normally and returns results that include partitioned documents. Cypress will not retrieve a partitioned document unless it is explicitly selected for viewing or processing.

If a user or program selects a partitioned document for retrieval, Cypress takes these actions:

Identifies the partition in which the document is stored
Retrieves the document
Processes the document as requested
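
The lookup that precedes retrieval can be pictured as follows. This is a minimal sketch, assuming the partition index and the partition locations are available as plain dictionaries; locate_document, its parameters, and the paths are hypothetical and do not correspond to any Cypress API.

```python
from typing import Dict

PRIMARY_DISK = 0  # in this topic, an index value of 0 means the primary disk


def locate_document(
    doc_id: str,
    partition_index: Dict[str, int],
    partition_paths: Dict[int, str],
    primary_path: str,
) -> str:
    """Return the storage location that holds doc_id (illustration only)."""
    number = partition_index[doc_id]  # raises KeyError for unknown documents
    if number == PRIMARY_DISK:
        return primary_path
    return partition_paths[number]
```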

Indexing Documents in a Partition

The document partition index, Document Partition Number, is a numeric system index that identifies the location of all documents within a DocuVault.

Documents stored on the DocuVault’s primary disk are indexed with a value of 0. Documents stored on partitions are indexed with a value equal to the partition number on which they reside. A partition number (a nonzero unsigned integer) is automatically generated and assigned each time a new partition is created. The first partition you create is assigned a value of 1. This value is incremented by one each time you create a new partition.

This numeric system index provides you with new ways to control access to content. Users can query the document partition index from the client user interface using a folder query or expression query, and developers can build document partition index queries into CPF, Cypress.Web, and Cypress Server Pages applications.

For example, you might want to create Cypress.Web or Cypress Server Pages that only query active content. You could limit your query to the primary disk and partitions you have identified in secondary disk storage. Similarly, partition_prog scripts could choose documents from multiple partitions and move them to a new partition or combine multiple partitions into a new single partition by adding the partition number index to an expression query.
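
The idea of limiting a query to active content can be sketched in a few lines. The example below is Python rather than CPF or an expression query, and it assumes the document partition index is available as a dictionary; active_documents and its parameters are hypothetical names used only for illustration.

```python
from typing import Dict, Iterable, List, Set


def active_documents(
    partition_index: Dict[str, int],
    active_partitions: Iterable[int],
) -> List[str]:
    """Return documents on the primary disk (0) or in the listed partitions."""
    allowed: Set[int] = {0, *active_partitions}
    return sorted(doc for doc, number in partition_index.items() if number in allowed)
```

For example, with an index of {"a": 0, "b": 1, "c": 2} and partition 1 identified as residing on secondary disk, the function returns documents a and b and leaves out document c.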

Creating and Executing partition_prog Scripts

The partition_prog utility is a variant of CPF. The scripts you create for partition_prog are based on CPF programming principles and conventions, and adhere to CPF syntax. Several functions unique to partition_prog are provided to control partition creation, identify documents to be moved into a partition, etc.

You must execute partition_prog scripts in batch mode from a command prompt window. You can install and execute partition_prog either at the Cypress Server or from a remote workstation, if more convenient. Keep in mind, however, that performance can suffer if you do not execute the program at the Server.

ASG recommends that, when possible, you design partition_prog scripts to accept parameters directly from the command string entered at execution. This eliminates the need to continually modify existing partition_prog scripts.

This string runs a script that takes five parameters and moves all documents created before October 17, 2008, into a partition named part1:

partition_prog.exe input=d:\prod\partition_by_date.cpf output=d:\prod\partition_by_date.lst debug param=2008 param=10 param=17 param=part1 param=docs before 101700
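
If you launch partition_prog from a scheduler or wrapper script, you might assemble the command string programmatically. The following Python sketch only builds an argument list from the tokens shown in the command string above (input=, output=, debug, and param=); build_partition_command is a hypothetical helper, and how partition_prog expects parameter values that contain spaces to be quoted is an assumption this topic does not confirm.

```python
from typing import List, Sequence


def build_partition_command(
    script: str,
    listing: str,
    params: Sequence[str],
    debug: bool = False,
) -> List[str]:
    """Assemble an argument list; nothing is executed here."""
    args = ["partition_prog.exe", f"input={script}", f"output={listing}"]
    if debug:
        args.append("debug")
    # Each script parameter becomes one param=<value> token, in order.
    args.extend(f"param={value}" for value in params)
    return args
```

Keeping the arguments as a list, rather than one concatenated string, means a value such as the partition description stays a single token when the list is later passed to a process launcher.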

When you implement DocuVault partitions, keep these restrictions in mind:

You cannot rename a partition after you create it.
You cannot create a partition when your DocuVault is in online dump mode or start an online dump while Cypress is creating a partition.
The user who runs the partition_prog script must have read access to all documents that Cypress queries. No other security permissions are necessary. ASG strongly recommends that you control access to partition_prog scripts tightly.

Partitioning Best Practices

Unless you already have an external document storage solution in place, you must establish policies and procedures for implementing DocuVault partitioning. These policies and procedures will create an efficient and flexible external document storage environment. Consider the issues discussed in this section when you are developing your site’s partitioning policies and procedures.

Choosing a Secondary Storage Environment

ASG strongly encourages you to store partitions on disk unless you are required by law or company policy to store documents on tape or other offline media or you have very little need to access documents in partitions. Disk storage enables you to maximize knowledge worker productivity, since document retrieval times are near-instant. By contrast, storing partitions on tape can delay access to needed documents. Disk storage is often less expensive and usually more reliable than tape.

To maximize throughput and minimize disk contention, ensure that your partitions, DocuVault data files, and log files each reside on separate disks.

Ensuring Sufficient Disk Space to Accommodate HSM Systems

If you are storing partitions on HSM or another external disk storage system, ensure that the amount of space available in your secondary disk environment can accommodate new partition creation and partitions moved back to secondary disk by an HSM system. When you determine the necessary amount of space, consider the size of your partitions, end-user access to partitions in inactive storage systems, partition creation frequency, and other related factors.

See DocuVault Partitions and HSM Systems for more information.
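
A rough space estimate can help when you weigh these factors. The Python sketch below is one possible back-of-the-envelope calculation, not an ASG sizing formula: the function name, the parameters, and the 20 percent default headroom are all assumptions chosen for illustration.

```python
def required_secondary_space_gb(
    avg_partition_gb: float,
    new_partitions: int,
    recalled_partitions: int,
    headroom: float = 0.2,
) -> float:
    """Estimate the secondary disk space needed for one planning period.

    new_partitions: partitions expected to be created in the period.
    recalled_partitions: partitions the HSM system may move back to disk
        at the same time because of end-user access.
    headroom: fractional safety margin (assumed value, adjust for your site).
    """
    if min(avg_partition_gb, new_partitions, recalled_partitions, headroom) < 0:
        raise ValueError("all inputs must be non-negative")
    base = avg_partition_gb * (new_partitions + recalled_partitions)
    return base * (1 + headroom)
```

With 50 GB partitions, four new partitions, and two recalled partitions, the estimate is about 360 GB.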

Choosing an Organization Strategy

The strategy you use to divide and organize your content into partitions is completely up to you. Because Cypress identifies documents for partitioning based on page content and document information (e.g., creation date, deletion date, etc.), you can organize partitions in the way that best suits your users and organization. You can create partitions based on these organizing principles:

Report type (e.g., Sales Reports, HR Reports, and Financial Reports)
Source application (e.g., SAP, HBOC, and PeopleSoft)
Organization (e.g., Accounts Payable, Marketing, and Admitting)
Retention time (e.g., January 1, 2010; April 15, 2005; and never delete)
Any combination of document characteristics

Determining the Frequency of Partition Creation

You should determine the frequency of partition creation based on your DocuVault’s growth rate (i.e., the amount of free disk space you wish to maintain on your DocuVault’s primary disk) and your site’s business requirements (e.g., government regulations).
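
One way to turn a growth rate into a partitioning interval is to compute how long the primary disk can grow before free space drops below the level you want to keep. This is a minimal sketch under the assumption of steady linear growth; the function and parameter names are hypothetical.

```python
import math


def days_until_partition_needed(
    free_gb: float,
    growth_gb_per_day: float,
    min_free_gb: float,
) -> int:
    """Return the whole days left before free space falls below min_free_gb."""
    if growth_gb_per_day <= 0:
        raise ValueError("growth_gb_per_day must be positive")
    surplus = free_gb - min_free_gb
    if surplus <= 0:
        return 0  # already at or below the threshold
    return math.floor(surplus / growth_gb_per_day)
```

For example, with 500 GB free, growth of 10 GB per day, and a 200 GB minimum, the result is 30 days, which suggests partitioning at least monthly unless business requirements call for a different schedule.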

Determining Partition Size

Although the number of documents that you store in a partition is largely a function of how frequently you partition your DocuVault, ASG strongly encourages you to create partitions containing many documents if you plan to store partitions on HSM solutions. If you store documents on tape, an HSM system might require longer to process a request and load a tape. Accordingly, your goal should be to migrate as many documents with each tape access as is practical. This will make more documents available for retrieval (query) and result in fewer tape loads.

Choosing the Best Time to Create Partitions

ASG recommends that you create partitions during non-peak printing and processing hours. Additionally, you should ensure that partition creation does not conflict with your online backup schedule. You cannot partition a DocuVault during an online backup or start an online backup while a partition is actively being created.
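
If you keep your partitioning and backup schedules in a script, a simple overlap check can flag conflicts before they occur. The sketch below is plain Python with hypothetical names; it checks only whether two time windows intersect and knows nothing about Cypress itself.

```python
from datetime import datetime


def windows_overlap(
    start_a: datetime,
    end_a: datetime,
    start_b: datetime,
    end_b: datetime,
) -> bool:
    """Return True if the half-open windows [start, end) share any time."""
    if end_a <= start_a or end_b <= start_b:
        raise ValueError("each window must end after it starts")
    return start_a < end_b and start_b < end_a
```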

Using the Partition Tab

In addition to using partition_prog scripts, you can manage DocuVault partitions from the Partition tab in Cypress’s Administration Tools module. From this tab, you can delete partitions, resume suspended partitions, and set partition security. This section discusses the elements of the Partition tab.

The Button Panel

The button panel to the right of the Partition tab differs from the standard Cypress button panel:

The New and Clone buttons are always disabled when you are using the Partition tab. In addition, the Partition tab’s button panel includes an extra button: Destroy. Clicking the Destroy button deletes the partition.

The General Tab

The General tab holds basic information about your partitions.

The Name field displays the name of the selected partition.
The Description field enables you to enter a string that identifies the contents of the partition.
The File Path field displays the location of the partition.
The Status field displays the current status of the partition. You can monitor this field to determine when partition creation is complete. If the status is Suspended, you can click the Resume button to bring the partition back online.
The Created field displays the date and time the partition was created.
The Sch. Delete field displays the date and time that Cypress is scheduled to delete the partition. If you select the Never check box, the partition will be retained indefinitely.

The File List Tab

The File List tab displays the contents of the selected partition. This tab is most useful when you are migrating partitions, because it enables you to determine the total size of each partition.

The File Name column lists the names of the Cypress data files contained in the selected partition. The files are located in the directory specified in the General tab’s File Path field.
The Size (KB) column displays the size of each file in the partition, in kilobytes.
The Total File Size xxx KB field displays the size of the entire partition, in kilobytes.
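
If you want to cross-check these figures from a script before migrating a partition, you can total the files in the partition directory yourself. This is a minimal sketch using only the Python standard library; it assumes sizes are rounded up to whole kilobytes, which may not match how the File List tab rounds.

```python
import math
import os
from typing import Dict, Tuple


def partition_file_sizes_kb(directory: str) -> Tuple[Dict[str, int], int]:
    """Return ({file name: size in KB}, total size in KB) for one directory."""
    sizes: Dict[str, int] = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                # Round up so a partially filled kilobyte still counts (assumption).
                sizes[entry.name] = math.ceil(entry.stat().st_size / 1024)
    return sizes, sum(sizes.values())
```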

The Security Tab

The Security tab enables you to set the administration security permissions for access to the selected partition.

This table describes the available permission settings, in order by least access to most access:

Permission | Description
Read | The assignee can view the partition’s properties.
Write | The assignee can modify the selected partition’s description and scheduled deletion time.
Delete | The assignee can delete the selected partition.
Owner | The assignee can modify partition security.

ASG recommends these administration security assignments:

End users have only Read access.
Administrators have all permissions.
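
The ordering from least access to most access can be modeled as an ordered enumeration. The Python sketch below assumes that each level also grants everything below it; this topic lists the permissions in order but does not state that they are cumulative, so treat that as an assumption. The class, the action names, and the allows function are hypothetical.

```python
from enum import IntEnum


class PartitionPermission(IntEnum):
    """Permission levels in the order given in the table (least to most)."""
    READ = 1
    WRITE = 2
    DELETE = 3
    OWNER = 4


# Lowest level assumed to be required for each administrative action.
REQUIRED = {
    "view_properties": PartitionPermission.READ,
    "edit_description": PartitionPermission.WRITE,
    "edit_scheduled_deletion": PartitionPermission.WRITE,
    "delete_partition": PartitionPermission.DELETE,
    "edit_security": PartitionPermission.OWNER,
}


def allows(granted: PartitionPermission, action: str) -> bool:
    """Return True if the granted level covers the action (cumulative model)."""
    return granted >= REQUIRED[action]
```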

DocuVault Partitions and HSM Systems

DocuVault partitioning enables Cypress to be compatible with hierarchical storage management (HSM) systems. To support HSM systems and other external storage systems, you must configure the HSM product to enable it to migrate DocuVault partitions to and from secondary disk storage.

As Cypress writes DocuVault partitions to disk, you can configure the HSM solution to acquire the partition and migrate it to tape or to another library, based on criteria that you have defined. When the HSM system migrates the partition, it records the new location in the operating system’s internal file table.

When you need to retrieve a document from a partition stored in an HSM system, Cypress requests the partition from the platform operating system, not the HSM system directly. Then the operating system requests the partition from the HSM system, which then loads, retrieves, and writes it to the specified location on disk. Once the partition is on disk, Cypress can retrieve the requested document from the partition for processing.

If you plan to store partitions on HSM solutions, ASG recommends that you create large partitions of related documents. When a document must be retrieved from tape, the HSM system must access the tape, read the contents of the partition, and write it to disk (i.e., active storage). Depending on the HSM system’s capabilities and the number of user requests waiting to be processed, the time to migrate a partition can be several seconds, several minutes, or hours. Creating large partitions of related documents makes more documents available for retrieval from disk and probably will minimize the number of times a tape must be loaded.
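
The effect of partition size on tape loads can be illustrated with a toy count. This Python sketch assumes that a partition, once recalled, stays on disk for the rest of the request sequence, which a real HSM system might not guarantee; count_recalls and the sample data are hypothetical.

```python
from typing import Dict, Iterable, Set


def count_recalls(requests: Iterable[str], doc_to_partition: Dict[str, int]) -> int:
    """Count how many partitions must be recalled to serve the requests."""
    on_disk: Set[int] = set()
    recalls = 0
    for doc in requests:
        partition = doc_to_partition[doc]
        if partition not in on_disk:
            on_disk.add(partition)  # recalled once, then served from disk
            recalls += 1
    return recalls
```

With four related documents spread over four small partitions, four recalls are needed; with the same documents grouped into two larger partitions, two recalls are enough.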

Monitoring partition_prog Events

Cypress records partition_prog status information in these forms and/or places:

Windows Event Viewer messages
db_prog console window
pcon_log files

Microsoft Event Viewer Messages

When you execute a partition_prog script, Cypress writes the messages in this table to the Microsoft Event Viewer’s application event log. To find DocuVault partition-related events quickly, filter the Event Viewer by event ID and scroll to numbers listed in the Event ID column:

Message | Event | Event ID
Cypress DocuVault partition # <x> has been completed successfully. | Partition completed successfully | 190
Cypress DocuVault partition has been cancelled. | Partition creation cancelled | 189
Cypress DocuVault started a DocuVault partition. The partition name is <y> partition # <x>. | Begin partition creation | 185

where:

<x> is the internally generated document partition number that uniquely identifies a partition. Not only is this number required for Cypress-internal operations, but it also is very helpful if your DocuVault has multiple partitions with the same name. You also can use it to control the scope of a query.

<y> is the name of the partition used in the partition_prog script.

In addition, if Cypress encounters an error, it writes a message specific to that error to the application event log.
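
If you export or forward these events to a monitoring script, the event IDs and message patterns above are enough to classify them. The following Python sketch assumes that <x> appears in the message as plain digits and <y> as the bare partition name; describe_event is a hypothetical helper, and reading the Windows event log itself is outside the scope of the example.

```python
import re
from typing import Dict, Optional, Union

PARTITION_EVENTS = {
    185: "Begin partition creation",
    189: "Partition creation cancelled",
    190: "Partition completed successfully",
}

# Patterns derived from the message texts in the table (assumed substitution format).
_STARTED = re.compile(r"The partition name is (?P<name>.+) partition # (?P<number>\d+)\.")
_COMPLETED = re.compile(r"partition # (?P<number>\d+) has been completed successfully\.")


def describe_event(event_id: int, message: str) -> Optional[Dict[str, Union[str, int]]]:
    """Return event details, or None if the ID is not a partition event."""
    if event_id not in PARTITION_EVENTS:
        return None
    info: Dict[str, Union[str, int]] = {"event": PARTITION_EVENTS[event_id]}
    if event_id == 185:
        match = _STARTED.search(message)
    elif event_id == 190:
        match = _COMPLETED.search(message)
    else:
        match = None  # the cancellation message carries no name or number
    if match:
        for key, value in match.groupdict().items():
            info[key] = int(value) if key == "number" else value
    return info
```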

db_prog Console Window

The partition_prog utility writes status information to the db_prog.exe window when partition creation begins. It provides a convenient and efficient method to track the creation progress.

pcon_log Files

Every day, Cypress creates a pcon_log file containing a list of all Cypress-related events over a 24-hour period. This feature enables you to quickly view Cypress events when you are troubleshooting problems or verifying that specific events have occurred as expected.

The log files contain detailed information describing all DocuVault partition activities. By viewing the logs, you can follow every step in the partition creation process.

As of Cypress 6.5, events in the pcon_log file are timestamped, and all DocuVault repairs and repacks include start and finish events.

Scheduling the Deletion Time of a DocuVault Partition

To schedule the deletion time of a DocuVault partition

1. Open the Administration Tools module, and select the Partition tab.
2. Select the partition for which you want to modify the deletion time from the tree view at the left side of the window.

3. On the General tab, enter the desired deletion date and time in the Sch. Delete field.
Before you enter values in the Sch. Delete field, you must clear the Never check box. Cypress will automatically set the Sch. Delete field to one year from the current date.
4. Click Apply on the button panel on the right side of the application window to save your changes.

To prevent Cypress from deleting a DocuVault partition automatically

1. Open the Administration Tools module, and select the Partition tab.
2. Select the partition in the tree view at the left side of the window, and then select the Never check box on the General tab to retain the partition indefinitely.
3. Click Apply on the button panel on the right side of the application window to save your change.

Destroying a DocuVault Partition

Destroying a partition permanently deletes it from the specified directory and removes from the primary DocuVault all information associated with its documents.

To destroy a DocuVault partition

1. Open the Administration Tools module, and select the Partition tab.
2. Select the partition that you want to delete from the tree view at the left side of the window.
3. Click the Destroy button, or select Edit > Delete.
A message box displays.
4. Click Yes to destroy the partition, or click No to keep the partition.