Class arrays are useful only in the LHS of rules.
The sendmail program offers two ways to use them:
$=X
The $= prefix causes sendmail to seek a match between the workspace and one of the words in a class list. [5]
$~X
The $~ prefix causes sendmail to accept only an entry in the workspace that does not match any of the words in a class list.
[5] Under V8, words in a class can be multitokened.
The list of words that form a class array is searched by prefixing
the class name with the characters $=.
R$=X $@<$1>
In this rule, the expression $=X
causes sendmail to search a class for
the word that is in the current workspace.
If sendmail finds that the word has been defined, and if it
finds that the word is associated with the class X
,
then a match is made.
The matching word
is then made available for use in the RHS rewriting.
Because the value of $=X
is not known ahead of time, the
matched word can be referenced in the RHS with the $digit
positional operator.
Consider the following example. Two classes have been declared
elsewhere in the configuration file. The first, w
, contains all the
possible names for the local host:
Cw localhost mailhost server1 server2
The second, D
, contains the domain names of the two
different networks on which this host sits:
CD domain1 domain2
If the object of a rule is to match any variation on the local hostname at either of the domains and to rewrite the result as the official hostname at the appropriate domain, the following rule can be used:
R $=w . $=D $@ $w . $2 make any variations "official"
If the workspace contains the tokenized address server1.domain2,
sendmail first checks to see whether the word server1
has been defined as part of the class w
. If it has,
the dot in the rule and workspace match each other, and then sendmail
looks up domain2.
If both the host part and the domain part are found to be members of
their respective classes,
the RHS of the rule is called to rewrite the workspace.
The $2 in the RHS corresponds to the $=D in the LHS.
The $=D matches the domain2 from the workspace,
so that text is used to rewrite the new workspace.
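The walk-through above can be modeled with a short Python sketch. It is not how sendmail is implemented; it only mimics the rule R $=w . $=D $@ $w . $2 for a three-token workspace. The function name rewrite and the value chosen for the $w macro (server1) are assumptions made for illustration.

```python
# Hypothetical model of the rule:  R $=w . $=D   $@ $w . $2
CLASS_W = {"localhost", "mailhost", "server1", "server2"}  # Cw ...
CLASS_D = {"domain1", "domain2"}                           # CD ...
OFFICIAL_NAME = "server1"  # assumed value of the $w macro


def rewrite(workspace):
    """Return the rewritten token list, or None if the LHS does not match."""
    if len(workspace) != 3:
        return None
    host, dot, domain = workspace
    # $=w must match the host, the literal dot must match, $=D the domain
    if host.lower() in CLASS_W and dot == "." and domain.lower() in CLASS_D:
        # $1 is the host and $2 is the domain; the RHS uses $w and $2
        return [OFFICIAL_NAME, ".", domain]
    return None


print(rewrite(["mailhost", ".", "domain2"]))  # ['server1', '.', 'domain2']
print(rewrite(["mailhost", ".", "domain3"]))  # None
```

Note that $1 (the host that matched $=w) is available to the RHS but is deliberately unused, because the rule substitutes the official name from $w instead.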
Note that prior to V8 sendmail,
only single-token words could be in classes. When sendmail looked up
the workspace to check for a match to a class, it looked up only
a single token.
That is, if the workspace contained server1.domain2.edu,
the LHS expression $=D
would cause sendmail to
look up only the domain2 part. The .edu part would
not be looked up, and consequently the rule would fail.
Note that the V8 and IDA versions of sendmail allow multitoken class matching.
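The difference between single-token and multitoken class lookup can be sketched as follows. This is a rough model rather than sendmail's code: the two function names are hypothetical, the class here is assumed to hold the multitoken word domain2.edu, and the order in which longer runs of tokens are tried is also an assumption.

```python
CLASS_D = {"domain2.edu"}  # assumed multitoken class entry, stored in lowercase


def match_class_single(tokens, start, members):
    """Pre-V8 model: look up exactly one token. Returns tokens consumed."""
    return 1 if tokens[start].lower() in members else 0


def match_class_multi(tokens, start, members):
    """V8/IDA model: try successively longer runs of tokens, joined together.
    Returns the number of tokens consumed, or 0 if nothing matches."""
    for end in range(start + 1, len(tokens) + 1):
        if "".join(tokens[start:end]).lower() in members:
            return end - start
    return 0


workspace = ["server1", ".", "domain2", ".", "edu"]
print(match_class_single(workspace, 2, CLASS_D))  # 0: "domain2" alone is not in the class
print(match_class_multi(workspace, 2, CLASS_D))   # 3: "domain2", ".", "edu" match together
```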
The inverse of the $=
prefix is the $~
prefix.
It is used to match any single token in
the workspace that is not in a class.
It is seldom used in production configuration
files; but when the need for its properties arises, it
can be very useful.
To illustrate, consider a network with three PC machines on it.
The PC machines cannot receive mail, whereas all the other machines on the network can. Suppose the list of PC hostnames is defined in
the class {PChosts}:
C{PChosts} pc1 pc2 pc3
Then a rule can be designed that will match any hostname except that of a PC:
R$*<@$~{PChosts}> $:$1<@$2> filter out the PC hosts
Here the LHS looks for an address of the form
"user" "<" "@" "not-a-PC" ">"
This matches only if the @
token is not followed
by one of the PC hosts listed in class {PChosts}
.
If the part of the workspace that is tested against the list provided
by $~
is found in that list, then the match fails.
Note that the $digit
positional operator in the RHS (the $2
above) references the token in the workspace that matched the $~,
that is, the token that is not in the class. If the
workspace contains ben<@philly>, the $2
references the philly.
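As a minimal sketch of the behavior just described, the following Python function models the LHS $*<@$~{PChosts}> for an already tokenized workspace. The function name match_not_pc is hypothetical, and the model assumes that the host is a single token and that the angle-bracketed part ends the workspace.

```python
PC_HOSTS = {"pc1", "pc2", "pc3"}  # C{PChosts} pc1 pc2 pc3


def match_not_pc(workspace):
    """Model the LHS  $*<@$~{PChosts}>.
    Returns ($1, $2) on a match, or None if the LHS fails."""
    if len(workspace) < 4:
        return None
    # $* takes everything before the final "<" "@" host ">"
    *user, lt, at, host, gt = workspace
    if (lt, at, gt) != ("<", "@", ">"):
        return None
    if host.lower() in PC_HOSTS:
        return None  # the token is in the class, so $~ fails
    return user, host


print(match_not_pc(["ben", "<", "@", "philly", ">"]))  # (['ben'], 'philly')
print(match_not_pc(["ben", "<", "@", "pc2", ">"]))     # None
```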
Also note that multitoken expressions in the workspace will not match in the way you
might expect. That is, for multitoken expressions in the workspace,
$~
is not the opposite of $=
. To illustrate,
consider this miniconfiguration file:
CX hostA.com
R$~X $@ $1 is not in X
R$=X $@ yes $1 is in X
R$* $@ neither
Now feed a multitokened address through these rules in rule testing mode:
% /usr/lib/sendmail -Cx.cf -bt
ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
Enter <ruleset> <address>
> 0 hostC.com
rewrite: ruleset 0 input: hostC . com
rewrite: ruleset 0 returns: neither
Whenever $~
is given a multitoken expression, it behaves as though it had
found the expression in the class, and so it always fails.
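One way to reason about the result above is to model the three rules in Python. The sketch rests on an assumption that is consistent with the rule-testing output but is not stated in the text: that $~X tests exactly one token, whereas $=X under V8 can match a run of tokens against a multitoken class word. The function name run_rules is hypothetical, and the result is returned as a plain string rather than as tokens.

```python
CLASS_X = {"hosta.com"}  # CX hostA.com, stored in lowercase


def run_rules(tokens):
    """Apply the three rules of the miniconfiguration file in order."""
    # R$~X: assumed to accept only a single token that is not in the class
    if len(tokens) == 1 and tokens[0].lower() not in CLASS_X:
        return f"{tokens[0]} is not in X"
    # R$=X: the joined tokens may match a multitoken class word
    joined = "".join(tokens)
    if joined.lower() in CLASS_X:
        return f"yes {joined} is in X"
    # R$*: matches anything
    return "neither"


print(run_rules(["hostC", ".", "com"]))  # neither
print(run_rules(["hostA", ".", "com"]))  # yes hostA.com is in X
print(run_rules(["hostB"]))              # hostB is not in X
```

Under this model, hostC.com falls through to the last rule because it is three tokens long, not because it is actually listed in the class.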
Multitoken matching operators,
such as $+
, always try to match the least that they can
(see Section 28.5.2, "Backup and Retry").
Such a simple-minded approach could lead to problems in matching (or not matching) classes in the LHS.
However, the ability of sendmail to back up and retry alleviates
this problem.
For example,
consider the following three tokens in the workspace:
"A" "." "B"
and consider the following LHS rule:
R$+ $=X $*
Because the $+
tries to match the minimum, it first
matches only the A
in the workspace. The $=X
then
tries to match the .
to the class X
. If this
match fails, sendmail backs up and tries again.
The next time through, the $+ matches both the A and the . tokens,
and the $=X tries to match the B in the workspace.
If B is not in the class X, the entire LHS fails.
The ability of the sendmail program to back up and retry LHS matches eliminates much of the ambiguity from rule design. The multitoken matching operators try to match the minimum but match more if necessary for the whole LHS to match.
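The backup-and-retry behavior can be illustrated with a small Python sketch of the LHS $+ $=X $*. It is a simplified model, not sendmail's matcher: the function name is hypothetical, and class words are assumed to be single tokens.

```python
def match_plus_class_star(tokens, cls):
    """Model the LHS  $+ $=X $*  with minimal matching and backup-and-retry.
    Returns ($1, $2, $3) on a match, or None if the whole LHS fails."""
    members = {word.lower() for word in cls}
    # $+ must take at least one token; try the fewest first, then back up
    # and let it take one more each time $=X fails on the next token.
    for n in range(1, len(tokens)):
        candidate = tokens[n]
        if candidate.lower() in members:
            # $* takes whatever remains, possibly nothing
            return tokens[:n], candidate, tokens[n + 1:]
    return None


workspace = ["A", ".", "B"]
print(match_plus_class_star(workspace, {"B"}))  # (['A', '.'], 'B', [])
print(match_plus_class_star(workspace, {"C"}))  # None
```

In the first call the initial attempt (with $+ holding only A) fails because the dot is not in the class; the second attempt succeeds after $+ grows to hold A and the dot.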
When comparing a token in the workspace to a list of words in a class array, sendmail tries to be as efficient as possible. Instead of comparing the token to each word in the list, one by one, it simply looks up the token in its internal string pool. If the token is in the pool and if the pool listing is marked as belonging to the class being sought, then a match is found.
The comparison of tokens to entries in the string pool is case-insensitive. Each token is converted to lowercase before the comparison, and all strings in the string pool are stored in lowercase.
Because strings are stored in the pool as text with a type, the same string value may be used for different types with no conflict. For example, the symbolic name of a delivery agent and a word in a class may be identical, yet they will still be separate entries in the string pool.
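The idea of a typed, case-insensitive string pool can be sketched in Python as a dictionary keyed by the lowercased string together with its type. The class StringPool, its method names, and the type labels are all hypothetical; they are not the names used in the sendmail source.

```python
ST_CLASS = "class"    # hypothetical type labels
ST_MAILER = "mailer"


class StringPool:
    """Toy symbol table: entries are keyed by (lowercased name, type)."""

    def __init__(self):
        self._entries = {}

    def enter(self, name, kind, payload):
        self._entries[(name.lower(), kind)] = payload

    def find(self, name, kind):
        return self._entries.get((name.lower(), kind))


def in_class(pool, token, class_name):
    """One lookup, then a check of which classes the word belongs to."""
    classes = pool.find(token, ST_CLASS)
    return classes is not None and class_name in classes


pool = StringPool()
pool.enter("local", ST_MAILER, "delivery agent definition")
pool.enter("local", ST_CLASS, {"w"})   # same text, different type
pool.enter("Server1", ST_CLASS, {"w"})

print(in_class(pool, "SERVER1", "w"))  # True: comparison ignores case
print(in_class(pool, "server1", "D"))  # False: word is not marked for class D
print(pool.find("local", ST_MAILER))   # delivery agent definition
```

The membership test costs one dictionary lookup regardless of how many words the class contains, which is the point the text makes about avoiding a word-by-word comparison.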
The sendmail program uses a simple hashing algorithm to ensure that the token is compared to the fewest possible strings in the string pool. In normal circumstances that algorithm performs its job well. At sites with unusually large classes (perhaps a few thousand hosts in a class of host aliases), it may be necessary to tune the hashing algorithm. The code is in the file stab.c in the sendmail source. The size of the symbol table hash is set by the constant STABSIZE.
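To see why the table size matters for large classes, consider the following Python sketch. It does not reproduce the hash function in stab.c; toy_hash is a stand-in chosen for illustration, and the two table sizes and the generated hostnames are likewise assumptions.

```python
STABSIZE_SMALL = 101   # hypothetical table sizes, for illustration only
STABSIZE_LARGE = 4001


def toy_hash(name, size):
    """Stand-in hash (not sendmail's): polynomial hash of the lowercased name."""
    h = 0
    for ch in name.lower():
        h = h * 31 + ord(ch)
    return h % size


def longest_chain(names, size):
    """Length of the fullest bucket, that is, the worst-case number of
    string comparisons needed for one lookup."""
    buckets = [0] * size
    for name in names:
        buckets[toy_hash(name, size)] += 1
    return max(buckets)


hosts = [f"host{i:04d}" for i in range(3000)]  # a class of 3000 made-up hosts
print(longest_chain(hosts, STABSIZE_SMALL))
print(longest_chain(hosts, STABSIZE_LARGE))
```

With 3000 entries and only 101 buckets, at least one bucket must hold 30 or more strings, so the larger table gives shorter chains and fewer comparisons per lookup.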
As an alternative to large class arrays, sendmail offers external database macros (see Section 33.1, "Enable at Compile Time"). No information is currently available contrasting the efficiency of the various approaches.