Introduction to ArtiSan Rules

The purpose of the ArtiSan rules system is to evaluate whether an item needs to be archived or stubbed and, if so, what retention period should be applied to the item. There are three aspects to the rules system: the rules tree, the scripting engine and the built-in objects.

In essence, the rules system uses information about a message and its context to examine the rules tree. From the tree, it determines which rules should be applied to a given message. Associated with each rule in the tree is a VBScript scriptlet and some other properties. The rules system executes each scriptlet in turn, using custom objects that represent the message being examined and the context in which that message is found to determine what actions should be performed on the message.

The Rules Tree

The rules tree structure is created through a combination of queries to Active Directory (ADS) and the information in the file Rules.xml in the data directory. In some ways, the rules tree is just a model of the various users and groups in the system, built on the fly from ADS and WebDAV queries. There are nodes representing the various users and groups in the system as well as nodes representing user mailbox folders and public mailbox folders. There are also nodes that represent an internal rule mechanism called a template.

The rules system has a built-in strategy for traversing the tree. In general, the system looks for a node specific to the message and then works outward to progressively more general rules that might apply to the message. Successive versions of ArtiSan have refined this algorithm, so the process varies slightly between versions.

In general, we refer to each node in the rules tree as a rule and denote it by the path that describes its location in the tree. There are four root folders (TEMPLATES, GROUPS, USERS and PUBLIC FOLDERS). Under USERS is a list of all the user mailboxes in the system (if the list is large, these may be grouped according to the first letter of the account name). Let's suppose you have three users: Tom, Dick and Harry. Then there are rules for /USERS/Tom, /USERS/Dick and /USERS/Harry. Tom and Dick may be members of a group called "Marketing" and all three may be members of a group called "Sales". Consequently, simply because the groups Marketing and Sales exist in the ADS, there are rules for /GROUPS/Marketing and /GROUPS/Sales. Equally, Tom may have a folder in his mailbox called "Inbox" and so there is a rule for /USERS/Tom/Inbox. NOTE: the fact that nothing is defined for a particular node in the tree does not mean that the rule does not exist, simply that there is no scriptlet or other definition associated with that rule.

Each rule in the tree has three properties: a scriptlet, an inherit flag and a delegate. Any or all of these properties can be undefined. The scriptlet is the VBScript that will be executed to evaluate the rule. The inherit flag determines whether the rules system should inherit rule information from the parent in the tree. The delegate is the name of a template rule to use for this node in the tree. If a rule is not defined, the system assumes that the script is empty, that the inherit flag is true and that there is no delegate. Let's suppose that a message is being evaluated in Tom's Inbox and that there are no rules defined. The rules system would search the tree in the following order:

/USERS/Tom/Inbox
/USERS/Tom
/USERS
/GROUPS/Marketing
/GROUPS
/GROUPS/Sales
/GROUPS

The system progresses from the specific (/USERS/Tom/Inbox) to the general (/USERS) because the inherit flag is true by default. In version 2.1, the order is slightly amended so that the /USERS and /GROUPS nodes, being the most general, are left to the end, but the principle is the same. Now, suppose we define a rule for Tom that has no scriptlet but has the inherit flag set to false. The order of evaluation would then be:

/USERS/Tom/Inbox
/USERS/Tom
/GROUPS/Marketing
/GROUPS
/GROUPS/Sales
/GROUPS

/USERS is missing because /USERS/Tom does not inherit from /USERS. We could also amend the system to define a template rule "MKT_RULE" and to delegate from the Marketing rule to that template. The order now becomes:

/USERS/Tom/Inbox
/USERS/Tom
/GROUPS/Marketing
/TEMPLATES/MKT_RULE
/GROUPS
/GROUPS/Sales
/GROUPS

We could add a script to the rule for /GROUPS/Sales. This would not affect the traversal of the tree, only the scripts that are collected during that traversal.
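The orderings above can be reproduced with a small model. The following is a minimal sketch, not ArtiSan's actual traversal code: the in-memory table of rule definitions and the names rules, InheritFlag, DelegateName and Walk are invented for illustration, and it follows the pre-2.1 ordering described above. It runs on its own under cscript.

```vbscript
Option Explicit

' Hypothetical stand-in for the rule definitions: path -> Array(inherit, delegate).
' Paths with no entry use the defaults (inherit = True, no delegate).
Dim rules
Set rules = CreateObject("Scripting.Dictionary")
rules.Add "/USERS/Tom", Array(False, "")
rules.Add "/GROUPS/Marketing", Array(True, "MKT_RULE")

Function InheritFlag(path)
    Dim entry
    InheritFlag = True
    If rules.Exists(path) Then
        entry = rules(path)
        InheritFlag = entry(0)
    End If
End Function

Function DelegateName(path)
    Dim entry
    DelegateName = ""
    If rules.Exists(path) Then
        entry = rules(path)
        DelegateName = entry(1)
    End If
End Function

' Visit a node, then its delegate template (if any), then its parent
' for as long as the inherit flag is True.
Sub Walk(path, visited)
    Dim current, cut
    current = path
    Do
        visited.Add visited.Count, current
        If DelegateName(current) <> "" Then
            visited.Add visited.Count, "/TEMPLATES/" & DelegateName(current)
        End If
        If Not InheritFlag(current) Then Exit Do
        cut = InStrRev(current, "/")
        If cut <= 1 Then Exit Do    ' reached a root folder such as /USERS
        current = Left(current, cut - 1)
    Loop
End Sub

Dim visited, group
Set visited = CreateObject("Scripting.Dictionary")
Walk "/USERS/Tom/Inbox", visited
For Each group In Split("Marketing:Sales", ":")
    Walk "/GROUPS/" & group, visited
Next

WScript.Echo Join(visited.Items, vbCrLf)
```

With the two definitions shown, the sketch visits the nodes in the same order as the last list above. Removing the /USERS/Tom entry restores /USERS to the walk, and removing the /GROUPS/Marketing entry drops the template.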

The Scripting Engine

The rules system uses a standard VBScript engine, exactly the same as the one used by ASP files and cscript/wscript. Many administrators will therefore have some familiarity with it, as these are common tools for automating administration tasks. The script engine has some COM modules automatically added to it: the "ArtiSan Type Library" and the Microsoft Scripting Runtime. The latter is a standard type library. The former is the basic type library of the ArtiSan system and defines key elements such as MessageInfo, Management, the Engine and so on. Because the rules system uses a standard and extensible scripting engine, it is very flexible while remaining transparent to the programmer. You can Dim variables, Set objects, call methods and do anything else you would normally do in VBScript/ASP.

When executing scripts, the rules system automatically adds "Option Explicit" to the top of each script. This forces the VBScript engine to require that all variables are declared before use. This is important because the rules system has to be able to run unattended, so script errors need to be isolated as early as possible.
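To see what this buys you, consider a misspelt variable name. The following is a minimal stand-alone sketch (the variable names are invented for illustration, and it can be run under cscript). With Option Explicit in force, the assignment to the misspelt name raises a "Variable is undefined" error when that line runs. Without it, VBScript would silently create a second variable and the mistake would go unnoticed.

```vbscript
Option Explicit

Dim retentionDays, failed
retentionDays = 365

On Error Resume Next
retentoinDays = 730        ' misspelt on purpose: this name was never declared
failed = (Err.Number <> 0)
On Error GoTo 0

If failed Then
    WScript.Echo "Script error caught; retentionDays is still " & retentionDays
End If
```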

You can add to the list of preloaded objects using the registry key HKLM\Software\Saxonite\AMM\Rules\GlobalObjects. Under this key, you create values whose name is the name of the object as you want it to appear in script and whose value is the ProgId of the object (e.g. "Scripting.FileSystemObject", "ADODB.Connection"). The system will then load these objects on the fly when the script engine is launched. This allows you to cache objects in the engine rather than constantly creating and destroying them, but be aware that these objects are shared across evaluations and so carry state between separate invocations of the rules. For example, if you preload a database connection (because you don't want to keep logging on and off a database), that connection is shared across all evaluations in the rules system. In v2.1, you can also add preprocessed files to define common functions.

The system will execute the scripts in the order in which it found them until either there are no more scripts or the Item.Archive property is set to True. In general, if there is no rule defined, the system will do nothing. Indeed, the system attempts to detect this during the scanning process. Before opening a user mailbox, the system checks whether any rules are defined for that user. If none are defined, the user is skipped: with no rules, there can be no work to do.
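The stopping condition can be illustrated with a small model. This is a sketch under stated assumptions, not the real evaluator: StubItem is a hypothetical stand-in for the object that the rules system supplies as Item, the three scriptlets are invented, and ExecuteGlobal is used here only to mimic running each scriptlet in turn.

```vbscript
Option Explicit

' Hypothetical stand-in for the item that the rules system normally supplies.
Class StubItem
    Public Archive
    Public UserName
End Class

Dim Item
Set Item = New StubItem
Item.Archive = False
Item.UserName = "Tom"

' Invented scriptlets, in the order the tree traversal is assumed to have found them.
Dim scriptlets
scriptlets = Array( _
    "If Item.UserName = ""Harry"" Then Item.Archive = True", _
    "Item.Archive = True", _
    "Item.UserName = ""never reached""")

' Run each scriptlet in turn and stop as soon as Archive has been set.
Dim i, executed
executed = 0
For i = 0 To UBound(scriptlets)
    ExecuteGlobal scriptlets(i)
    executed = executed + 1
    If Item.Archive Then Exit For
Next

WScript.Echo "Scriptlets run: " & executed
```

Here the first scriptlet leaves Archive unchanged, the second sets it, and the third is never executed.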

The rules system can execute rules very quickly: it manages thousands of evaluations per second, which is significantly quicker than fetching the data from Exchange so that ArtiSan can execute the rules in the first place. So, in general, scripting performance is not a concern. However, bear in mind that evaluation is performed on a per-message basis, so a very complex user script relying on external resources, such as database connections or web services, could have a significant impact on the ArtiSan system.

The Built-In Objects

The main type libraries described here are the ones called "ArtiSan Type Library" and "ArtiSan Rules Type Library". The ArtiSan Type Library defines general purpose objects such as Management (used for managing the system), the Engine interface (the thing used to operate on mail items) and so on. The ArtiSan Rules Type Library contains objects specific to the rules engine.

NOTE: One easy way to browse the objects in the system is to use the Visual Basic object browser. To do this, launch Visual Basic, create a new empty project and add references to the type libraries. Now, open the object browser. You can select the type libraries from the drop-down combo on the top left of the display. (This, of course, assumes that you have the relevant files installed on a machine that also has Visual Basic installed.)

The main objects in the rules type library are:

MessageInfo - an abstract of an item in Exchange

RuleItem - the context of a rule evaluation (contains the message and ancillary information such as folder size, user name and so on)

Evaluator - the rule engine itself

In general, the rules system is launched by the engine or another part of the system, and these type libraries are added.

In more detail, as each mail is tested for action, the system creates an object called a MessageInfo. The MessageInfo object contains a read-only abstract of the information in the mail item. It is actually a specialised version of a general purpose property bag, so it is possible to add your own properties to a MessageInfo object and to retrieve them. The main purpose of this is to make the data being passed into rules extensible so that new features can be added later. Incidentally, .art files are the result of serialising the MessageInfo object to a disk file.

When a test on a mail is to be performed, the system uses the strategy described above to create a list of scriptlets that need to be run, based on properties such as user, groups, folders and so on. These are the scripts used to evaluate the status of the message (should it be archived or stubbed, what is the retention and so on). The rule evaluator then creates a RuleItem object with the MessageInfo and the other properties of the message. The evaluator then evaluates each script in turn, passing the RuleItem object into the scripting environment and testing the results. From the perspective of scripting, the RuleItem object appears as a global variable called Item. Hence the usage:

item.Archive = True

or

if item.Message.Size then ...

The Item object has a property called Message which is set to the MessageInfo object.

The Item object represents the context of evaluation.

It has the following read-only properties:

Message - The message being examined
FolderSize - The size of the folder containing the message
UserName - The name of the user (e.g. Tom)
GroupNames - The group names of the user (e.g. Marketing:Sales:Domain Users)
FolderName - The name of the folder where the message is located

It has the following read-write properties:

Archive - Whether to archive the message (defaults to False)
Retention - The retention period (defaults to the system default retention period)
StubMessage - Whether to stub the message (defaults to True)
SkipRules - Whether to skip any further rules for this message
DeleteMessage - Whether this message should be deleted from the mailbox
MarkCompleted - Whether to mark this message as complete so that future scans skip it
NextProcessDate - The date at which this message should be reprocessed

It has the following method:

Log - Allows a rule to generate ArtiSan logs (see below)
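
Because GroupNames arrives as a single string, a rule that needs to test group membership has to pick the list apart. The following is a minimal sketch, assuming the list is always colon-separated as in the example above. IsInGroup is a hypothetical helper, not part of ArtiSan. It is written as a plain function so that it can be tried under cscript without the rules system.

```vbscript
Option Explicit

' True if groupName appears in a colon-separated list such as
' "Marketing:Sales:Domain Users". Matches whole names only and ignores case.
Function IsInGroup(groupNames, groupName)
    IsInGroup = (InStr(1, ":" & groupNames & ":", ":" & groupName & ":", vbTextCompare) > 0)
End Function

WScript.Echo IsInGroup("Marketing:Sales:Domain Users", "sales")
WScript.Echo IsInGroup("Marketing:Sales:Domain Users", "Domain")

' In a rule scriptlet, where the rules system supplies Item, a call might look like:
'   If IsInGroup(Item.GroupNames, "Marketing") Then Item.Archive = True
```

Wrapping both strings in colons before searching means that "Domain" does not match "Domain Users", which a bare InStr on the raw list would.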

The MarkCompleted and NextProcessDate properties are part of the Smart Rules technology. In general, if an item is either stubbed or deleted, it is automatically marked as complete. Once an item is marked complete, the scanner can skip evaluation of the item in future. However, in some circumstances, you may wish to mark an item as complete without stubbing or deleting the message. This is achieved by setting MarkCompleted to True. Equally, you may know that the effect of your rules means that no further operation is necessary on the item until a specific date in the future. You can achieve this by setting NextProcessDate to the date when you want to reprocess the message.
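
As an illustration of NextProcessDate, suppose a rule only acts on messages that are at least 180 days old. For a younger message, nothing can change until the day it reaches that age, so the rule can ask the scanner to wait until then. The sketch below is an assumption-laden example rather than a recommended rule: FirstEligibleDate is a hypothetical helper, the 180-day threshold is invented, and the helper is written as a plain function so it can be run under cscript.

```vbscript
Option Explicit

' Date on which a message dated messageDate becomes minAgeDays old.
Function FirstEligibleDate(messageDate, minAgeDays)
    FirstEligibleDate = DateAdd("d", minAgeDays, messageDate)
End Function

WScript.Echo FirstEligibleDate(DateSerial(2005, 1, 1), 180)

' In a rule scriptlet, where the rules system supplies Item, usage might look like:
'   If Item.Message.Date > Now - 180 Then
'       ' Too young to act on: do not look at this message again until it is old enough.
'       Item.NextProcessDate = FirstEligibleDate(Item.Message.Date, 180)
'   End If
```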

Careful application of smart rules can reduce the number of items the scanner has to evaluate on each scan.

If you update the rules, all the relevant items will be reprocessed to ensure that the changes are taken into account. For instance, if you change a rule for a user, then all mail in the user's mailbox will be reprocessed on the next scan.

The MessageInfo properties are:

Subject - The subject of the message
From - The sender(s) of the message
To - The main recipients of the message
CC - Individuals who were copied on the message
BCC - Individuals who were blind copied on the message
Date - The most recent of the send or received date
Size - The size of the message
HasAttachments - Whether the message has attachments
ArchiveDate - The date on which the message was archived
MessageClass - The MAPI message class of the message
Importance - The importance of the message
Sensitivity - The sensitivity of the message
BodyText - The body text of the message
BackupCount - The number of times the message has been successfully backed up (see below)
OriginalBackup - Backup information for the first time the item was backed up
LastBackup - Backup information for the last backup of the message
PreviousBackup - Backup information for the previous backup of the message

The backup information associated with a message is only maintained if the administrator records the time of each backup using the supplied BackupAMM.vbs script. Each backup has the following properties:

IsValid - Whether the backup was made
ID - The unique identifier of the backup
Operator - The name of the operator performing the backup
Reason - The reason the backup was made
Incremental - Whether the backup was full or incremental
Date - The date of the backup

Debugging Rules

In general, the main tools for testing rules are:

a) The Management UI itself

b) RuleChecker

c) Debug information within rules

d) Test Mode Accounts

A key feature of the management UI rules pages is the ability to select a node in the tree and ask the system to display the rules that would apply in that context. This allows the user to tell whether a given rule would be applied to a given message. The management UI also forces the rule engine to parse new scriptlets as they are defined, giving the user an early indication of whether a rule could execute.

The RuleChecker is a program supplied as part of the installation that allows end-users to test the outcome of rule definitions for given message properties. For instance, you can use RuleChecker to ask "what would happen if Tom had a message that was three months old in his Sent Items folder?"

It is also possible to generate log information from rules as they execute in the system using the Log method of the RuleItem. So you can add things in rules like:

Item.Log 1, "Message has size " & Item.Message.Size & " bytes"

This will emit a log at trace level 1 to AMM.log assuming that the "AMM Rules" value (under HKLM\Software\Saxonite\AMM\Trace) is set to log at level 1. You will see something like:

rulename(0): AMM Rules: 06/08/05 15:43:59: LEVEL 1: 1692: 1636: Message has size 15346 bytes

Finally, there are the test mode accounts. These are defined in the management UI. The "Test Mode Account" restricts the actions of the system as a whole to one specific account for the purposes of testing. The "Test Mode Alias" is the name of the account that should be used for executing rules.

Using Management

One key trick in the system is to use the various objects in the ArtiSan type library, such as the Management object. Management lets you work with configurations (including creating your own) and gives you access to key system information such as the name of the domain server, licensing details, access statistics from the engine, version information and so on. Indeed, the management UI is largely built from functions provided by Management (in association with ADSI and ADO). Management also gives you access to the logging system, which means you can generate trace logs from within the rules system. To do so, create a Management object, set its LogName property (this name will appear in the registry under the Trace key, where you set the trace level) and then call the Log method with the relevant log level and trace string. The trace information will turn up in AMM.log. But, I hear you say, "how is that different from the Log method of the RuleItem object?" The point is that you can create your own trace types, which are listed separately under the Trace key and can therefore be controlled separately. The Log method on the RuleItem object shares its trace context with the rules system, so raising the log level affects both.

A Real World Example of Using Rules

To illustrate the use of rules, we will create a real-world scenario. Let's suppose that you have a couple of hundred users of whom thirty are candidates for mail archival. You maintain a journal because you need to capture all mail messages.

You begin by creating an ADS group called "AMM Users" and placing the thirty individuals in that group. You can now edit a rule for the group to get their mail into the archive. We are going to archive their messages immediately and stub each message once it is at least 180 days old and has been archived for more than 30 days. The rule for "AMM Users" would look something like:

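' Archive every message immediately.
Item.Archive = True
' StubMessage defaults to True, so switch it off until the message qualifies.
Item.StubMessage = False
If (Item.Message.Date < Now - 180) And (Item.Message.ArchiveDate < Now - 30) Then
    Item.StubMessage = True
End If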

A rule would also be placed on the owner of the journal mailbox. The rule we will apply will state that the item should be archived immediately and deleted once it has been archived for at least thirty days and at least two backups have been completed. The rule to implement this would be:

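' Archive every message immediately.
Item.Archive = True
' Delete once archived for over thirty days and backed up at least twice.
If (Item.Message.BackupCount >= 2) And (Item.Message.ArchiveDate < Now - 30) Then
    Item.DeleteMessage = True
End If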