An expression can be a constant, a variable, a call to a typed PROCEDURE (a PROCEDURE expression), a substring, an IF expression, an assignment expression, a compiletime pseudoprocedure, or a combination built up with operators and parentheses. Unless otherwise stated, the order of evaluation of the components of an expression is unspecified.
The data type and other attributes of a variable are given in the variable's declaration; see Chapter 6.
The value of a variable may be changed by an assignment statement (Section 5.1), by an assignment expression (Section 5.2), by a dotted operator (Section 4.8.6), or by use as a PROCEDURE argument corresponding to a MODIFIES or PRODUCES parameter (Section 7.5).
A variable is either a simple variable, a subscripted variable, or a field variable.
A simple variable is an identifier associated with a single value that may be changed during program execution as governed by its data type. For example, if n is an INTEGER variable, it may be assigned INTEGER values during program execution.
Subscripted variables are used to access the elements of an ARRAY (see Section 12.6) and field variables are used to access the fields of a record (see Section 9.5) and the interface fields of a MODULE (see Section 11.3).
A call to a typed PROCEDURE in an expression invokes execution of the PROCEDURE body and then uses the value it returns in place of the PROCEDURE call in the expression.
A substring of a STRING s is specified by:
s[e1 TO e2] or s[e1 FOR e2]
where e1 and e2 are INTEGER expressions. e1 is the start position. e2 is the stop position in the TO form, and the length in the FOR form. The characters of a STRING are numbered from left to right, the first position being number one. If e1 is less than one, then the effect is the same as if one were used for e1, and if e2 is greater than the length of s, the effect is the same as if the length of s were used for e2. For example, if:
s = "yellow"
then:
s[1 TO 4] = s[1 FOR 4] = s[-3 TO 4] = "yell",
s[4 TO 6] = s[4 FOR 3] = s[4 FOR 99] = "low", and
s[7 FOR 1] = "" (since s doesn't have a 7th character).
A substring cannot be assigned a value, since it is not a variable.
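The clamping rules above can be modeled in a few lines of Python. This is only a sketch of the behavior described in the text, not MAINSAIL code: the function names `substring_to` and `substring_for` are hypothetical, and the handling of ranges the text does not spell out (for example a stop position before the start position) is an assumption.

```python
def substring_to(s: str, e1: int, e2: int) -> str:
    """Model of s[e1 TO e2]: 1-based start and stop positions, both inclusive."""
    start = max(e1, 1)        # e1 < 1 behaves as if 1 were used
    stop = min(e2, len(s))    # e2 > length(s) behaves as if length(s) were used
    # Assumption: a range that ends before it starts yields the empty string.
    return s[start - 1:stop] if stop >= start else ""


def substring_for(s: str, e1: int, e2: int) -> str:
    """Model of s[e1 FOR e2]: 1-based start position and a length."""
    start = max(e1, 1)        # e1 < 1 behaves as if 1 were used
    length = min(e2, len(s))  # e2 > length(s) behaves as if length(s) were used
    # substring_to clips the stop position at the end of the string.
    return substring_to(s, start, start + length - 1)


if __name__ == "__main__":
    s = "yellow"
    print(substring_to(s, -3, 4), substring_for(s, 4, 99), repr(substring_for(s, 7, 1)))
```

The model reproduces the "yellow" examples given above; it says nothing about how a MAINSAIL implementation actually computes substrings.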
The s in s[e1 TO/FOR e2] may be a STRING variable, a STRING constant, a parenthesized STRING expression, or a call to a STRING PROCEDURE. If s is a STRING constant, and both e1 and e2 are INTEGER constant expressions, then the substring is evaluated at compiletime.
The system procedure $nth (see Section 42.23) is sometimes more convenient than a one-character substring; it returns the character code for the character at a specified position in a STRING.
For example, with INF used in a substring bound, if:
s = "brown"
then:
s[1 TO INF - 1] = s[1 TO length(s) - 1] = s[1 TO 4] = "brow"
INF stands for “infinity”. It gives the rightmost character position, no matter what the length of the STRING.
INF is evaluated at compiletime if the STRING is a constant. In this case the length returned is the same as the length that would have been obtained at runtime.
An IF expression has the form:
IF e1 THEN e2 ELSE e3
where e1, e2, and e3 are expressions. IF expressions may be nested, e.g.,
IF e1 THEN e2
ELSE IF e3 THEN e4
ELSE e5
MAINSAIL provides the abbreviations EF for ELSE IF and EL for ELSE. They allow alignment of conditions in IF expressions for clarity. Thus, the example above could be written as:
IF e1 THEN e2
EF e3 THEN e4
EL e5
In an IF expression, each possible result expression following THEN is preceded by a condition expression following IF or EF. The condition expressions are evaluated one by one (starting with the first) until one evaluates to a non-Zero value. Its associated result expression becomes the value of the IF expression, and no further condition expressions are evaluated. If all condition expressions evaluate to Zero, the expression after the final ELSE (or EL) becomes the value of the IF expression. Unselected result expressions are not evaluated.
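The evaluation rule above (conditions tried in order, only the selected result evaluated) can be sketched in Python by passing each condition and result as a zero-argument callable. The helper `if_expr` is hypothetical, and Python truthiness stands in for MAINSAIL's non-Zero test, which is an assumption of this model.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def if_expr(arms: Sequence[Tuple[Callable[[], object], Callable[[], T]]],
            otherwise: Callable[[], T]) -> T:
    """Model of IF c1 THEN r1 EF c2 THEN r2 ... EL r.

    Each arm is a (condition, result) pair of callables, so nothing is
    evaluated until this function decides to call it.
    """
    for condition, result in arms:
        if condition():          # first condition that is "non-Zero" wins
            return result()      # only the selected result is evaluated
    return otherwise()           # every condition was Zero: use the EL part


def classify(i: int, k: int) -> int:
    """Mirrors: IF i < 0 THEN k EF i = 0 THEN k + 1 EL k * 10."""
    return if_expr(
        [(lambda: i < 0, lambda: k),
         (lambda: i == 0, lambda: k + 1)],
        lambda: k * 10,
    )


if __name__ == "__main__":
    print(classify(-1, 5), classify(0, 5), classify(3, 5))
```

Wrapping the operands in callables is what makes the "unselected result expressions are not evaluated" property visible; a plain function call would evaluate every argument first.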
This use of an IF expression in an assignment statement:
var := IF i < 0 THEN k
EF i = 0 THEN k + 1
EL k * 10
is equivalent to the IF statement (see Section 5.6):
IF i < 0 THEN var := k
EF i = 0 THEN var := k + 1
EL var := k * 10
If the expression governing an IF expression is a constant, the IF expression is evaluated at compiletime.
If the governed expression is also a constant, the IF expression can appear anywhere a constant would be allowed. For example, the following code defines a as 2:
DEFINE a = IF FALSE THEN 1 EL 2;
When an IF expression is used as an operand, it must be enclosed in parentheses; e.g.:
result := a + (IF b > 10 THEN c ELSE d)
CASE i := (IF j THEN k ELSE m) OFB ...
For simple data types (those other than ADDRESSes, POINTERs, inplace records, $PROCVARs, and (inplace) ARRAYs), the rule is that all the result expressions of an IF expression must be of the same type, and that type is the result type of the IF expression. The rule is more complicated for non-simple data types.
For purposes of the discussion below, a prefix CLASS is considered smaller than any of its prefixed CLASSes (or their prefixed CLASSes, and so on); likewise, a prefixed CLASS is considered larger than any of its prefix CLASSes. Two unrelated CLASSes (see Section 9.8 for a discussion of related and unrelated classes) are not considered to have an ordering.
The rules for determining the result type of an IF expression, where the result expressions (i.e., the THEN expressions and ELSE expression) are ADDRESSes, POINTERs, records, $PROCVARs, or (inplace) ARRAYs, are as follows:
For example, given the following declarations:
CLASS c (...);
CLASS(c) d (...);
CLASS(c) e (...);
POINTER(c) pc;
POINTER(d) pd;
POINTER(e) pe;
then the result of IF bo THEN pc EL pd is classified as POINTER(c) (c is the smaller of c and d), but IF bo THEN pd EL pe is an unclassified POINTER (because d and e are considered unrelated even though they have a common ancestor, c).
Note that even though two POINTERs or ADDRESSes of unrelated CLASSes are not assignment compatible, they can be result expressions of the same IF expression.
For example, given the following declarations:
CLASS c (...);
CLASS(c) d (...);
CLASS(c) e (...);
CLASS x (...);
$RECORD(c) rc;
$RECORD(d) rd;
$RECORD(e) re;
$RECORD(x) rx;
then the result of IF bo THEN rc EL rd is classified as c (c is the smaller of c and d), and IF bo THEN rd EL re is also classified as c (c is the largest common ancestor of both d and e). IF bo THEN rc EL rx is not legal.
If a particular ARRAY bound is a constant for all result expressions, then that constant is the corresponding ARRAY bound in the IF expression result. Otherwise, the corresponding ARRAY bound in the IF expression result is *.
If the selected result expression's value is computed into a temporary before being selected as the IF expression's result (which would happen, e.g., if the result expression were a call that returned an inplace record), then a reference to the temporary ends up being passed as the argument to the call. However, such a temporary is not created just because of the IF expression.
If a result expression is an assignment expression, then that result expression's reference is to the assignment's destination.
If a result expression is itself an IF expression, then the result of the inner IF expression is a reference to the result expression selected by the inner IF expression's BOOLEAN expressions.
For example, if proc is a PROCEDURE that expects a $REFERENCE $RECORD(c) parameter, and rec1, rec2, and rec3 are variables of type $RECORD(c), then:
proc(IF bo THEN rec1 EL rec2 := rec3)
passes the address of either rec1 or rec2 as the argument to proc, depending on the value of bo. In this example, the call is equivalent to:
IF bo THEN proc(rec1) EB rec2 := rec3; proc(rec2) END
An example of the use of an assignment expression i := j + 2:
IF i := j + 2 THEN s...
which is equivalent to:
i := j + 2; IF i THEN s...
An expression that would depend on whether the variable on the left or the expression on the right of the := is evaluated first is undefined. Thus, i := 0; a[i] := (i := 2) may assign to a[0], assign to a[2], give an error message, or produce undefined behavior. It is the programmer's responsibility to avoid assignment expressions that are affected by the order of evaluation of the variable and the expression.
If the variable assigned to is changed before the assigned value is used, the result of the assignment expression is undefined. For example, the results of the expressions (v := 3) > (v := 0) and (a[i] := 4) > (i := 0) are undefined. It is the programmer's responsibility to avoid undefined expressions.
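For readers who know Python, its `:=` operator gives a rough analogue of the assignment expression in the IF example above. This is a sketch for comparison only: the function name `describe` is hypothetical, and Python, unlike MAINSAIL, defines a left-to-right evaluation order, so it cannot illustrate the undefined cases.

```python
def describe(j: int) -> str:
    """Mirrors: IF i := j + 2 THEN ...  (assign, then test the assigned value)."""
    if (i := j + 2):                 # i receives j + 2; the test uses that value
        return f"i = {i} (non-zero)"
    return f"i = {i} (zero)"


def describe_two_steps(j: int) -> str:
    """The equivalent two-statement form: i := j + 2; IF i THEN ..."""
    i = j + 2
    if i:
        return f"i = {i} (non-zero)"
    return f"i = {i} (zero)"


if __name__ == "__main__":
    for j in (-2, 1):
        print(describe(j), "|", describe_two_steps(j))
```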
DSP and $LDSP are compiletime pseudoprocedures that return the offset of a field in a record; see Section 16.8.
$compileTimeValue is a compiletime PROCEDURE that provides a number of miscellaneous compiletime values; see Section 32.48.
$expr and $STMT are not compiletime pseudoprocedures themselves, but are facilities allowing arbitrary MAINSAIL code to be executed at compiletime (see Chapter 19).
The second column of each table gives data type information in the general format:
t_{1}, ..., t_{n} -> t
t_{i} gives the allowed data types for the ith operand (the leftmost operand is number one) by listing the possible data type abbreviations (see Table 1–1) for the ith operand, separated by dashes. t is the data type of the result.
For example, in the shift right operation e1 SHR e2, e1 is the first operand, SHR is the operator, and e2 is the second operand. From Table 4–2, it can be seen that the “Data Types” entry contains the lines b,i -> b and lb,i -> lb; this means that the first operand must be a (LONG) BITS, the second must be an INTEGER, and the result of e1 SHR e2 is of the same data type as the first operand (i.e., a (LONG) BITS).
n (standing for “numeric”) is an abbreviation for i-li-r-lr, and ninp is an abbreviation for all types except inplace records and ARRAYs, i.e., bo-n-b-lb-s-a-c-p-array-procvar; all stands for all types, i.e., bo-n-b-lb-s-a-c-p-array-procvar-inplaceRecord-inplaceArray.
If all t_{i} must be the same data type, then only t_{1} is given. For example, in the test operation e1 TST e2, the first and second operands (e1 and e2) must both be the same data type, either BITS or LONG BITS. If the result value t has the same data type as all the operands, then -> t is omitted. For example, in the concatenation operation e1 & e2, both operands must be STRINGs and the result is also a STRING, so the “Data Types” entry for this operation is just s.
In the tables, e, e1, and e2 stand for expressions and v for a variable.
Operation | Data Types | Description of Result |
---|---|---|
NOT e | ninp -> bo | IF e THEN FALSE ELSE TRUE (if e is non-Zero the result is FALSE, otherwise it is TRUE) |
- e | n | negation of e |
Operation | Data Types | Description of Result |
---|---|---|
v := e | all | Assign e to v. The result is the value assigned. See Section 4.6. e and v must be assignment compatible (see Section 4.9). |
e1 OR e2 | ninp,ninp -> bo | IF e1 THEN TRUE EF e2 THEN TRUE EL FALSE (e2 is evaluated only if e1 is FALSE) |
e1 AND e2 | ninp,ninp -> bo | IF NOT e1 THEN FALSE EF e2 THEN TRUE EL FALSE (e2 is evaluated only if e1 is TRUE) |
e1 = e2 | ninp -> bo | TRUE if e1 is equal to e2. See Section 4.8.2 regarding STRING comparisons. |
e1 NEQ e2 | ninp -> bo | TRUE if e1 is not equal to e2. See Section 4.8.2 regarding STRING comparisons. |
e1 < e2 | n-s-a-c -> bo | TRUE if e1 is less than e2. See Section 4.8.2 regarding STRING comparisons. |
e1 LEQ e2 | n-s-a-c -> bo | TRUE if e1 is less than or equal to e2. See Section 4.8.2 regarding STRING comparisons. |
e1 > e2 | n-s-a-c -> bo | TRUE if e1 is greater than e2. See Section 4.8.2 regarding STRING comparisons. |
e1 GEQ e2 | n-s-a-c -> bo | TRUE if e1 is greater than or equal to e2. See Section 4.8.2 regarding STRING comparisons. |
e1 TST e2 | b-lb -> bo | TRUE if any 1-bit in e2 is a 1-bit in e1. Same as (e1 MSK e2) NEQ '0. TST stands for “test”. |
e1 NTST e2 | b-lb -> bo | TRUE if no 1-bit in e2 is a 1-bit in e1. Same as NOT (e1 TST e2). NTST stands for “not test”. |
e1 TSTA e2 | b-lb -> bo | TRUE if all 1-bits in e2 are 1-bits in e1. Same as (e1 MSK e2) = e2. TSTA stands for “test all”. |
e1 NTSTA e2 | b-lb -> bo | TRUE if not all 1-bits in e2 are 1-bits in e1. Same as NOT (e1 TSTA e2). NTSTA stands for “not test all”. |
e1 MIN e2 | n-s-a-c | Minimum of e1 and e2. |
e1 MAX e2 | n-s-a-c | Maximum of e1 and e2. |
e1 + e2 | n | Sum of e1 and e2. |
e1 - e2 | n | Difference of e1 and e2. |
e1 IOR e2 | b-lb | Inclusive or of e1 and e2. See Section 4.8.3. |
e1 XOR e2 | b-lb | Exclusive or of e1 and e2. See Section 4.8.3. |
e1 MSK e2 | b-lb | Mask e1 with e2; i.e., the result has 1-bits only where both e1 and e2 have 1-bits (logical and). See Section 4.8.3. |
e1 CLR e2 | b-lb | Clear e2 from e1; i.e., the result has 1-bits only where e1 has a 1-bit and e2 has a 0-bit. See Section 4.8.3. |
e1 ! e2 | b-lb | Same as e1 IOR e2, except has higher precedence (see Section 4.8.5). |
e1 * e2 | n | Product of e1 and e2. |
e1 / e2 | r-lr | Quotient (REAL) of e1 and e2. |
e1 DIV e2 | i-li | Quotient (INTEGER) of e1 and e2. The remainder is discarded. Undefined if e1 is negative or e2 is not positive; see Section 4.8.1. |
e1 MOD e2 | i-li | Remainder of e1 divided by e2. Same as e1 - e2 * (e1 DIV e2). Undefined if e1 is negative or e2 is not positive; see Section 4.8.1. MOD stands for modulus, another name for remainder. |
e1 SHL e2 | b,i -> b; lb,i -> lb | e1 shifted left by e2 bits; leftmost e2 bits are lost (i.e., logical shift rather than arithmetic shift). 0-bits are brought in from the right. Undefined if e2 < 0 or GEQ the number of bits in the data type of e1. |
e1 SHR e2 | b,i -> b; lb,i -> lb | e1 shifted right by e2 bits. 0-bits are brought in from the left (i.e., logical shift rather than arithmetic shift). Undefined if e2 < 0 or GEQ the number of bits in the data type of e1. |
e1 & e2 | s | e1 concatenated with e2; the STRING consisting of the characters of e1 immediately followed by the characters of e2. |
e1 ^ e2 | i,i -> i; li,i -> li; r,i -> r; lr,i -> lr; r,r -> r; lr,r -> lr; lr,lr -> lr | e1 raised to the power e2. If e1 is an INTEGER or LONG INTEGER, undefined if e2 is negative. If e2 is not a positive INTEGER, undefined if e1 is negative. Undefined if e1 and e2 are both zero. |
The operations shown in Table 4–2 are evaluated at compiletime if all the operands are evaluated at compiletime, except when any operands are of type (LONG) REAL.
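The bit-test and shift rows of the table can be checked with a small Python model. It is a sketch under stated assumptions: BITS values are represented as non-negative Python integers, the `width` parameter is a stand-in for the machine-dependent number of bits in the data type, and raising `ValueError` for the cases the table calls undefined is a choice of this model, not documented MAINSAIL behavior.

```python
def msk(e1: int, e2: int) -> int:
    """e1 MSK e2: 1-bits only where both operands have 1-bits."""
    return e1 & e2


def tst(e1: int, e2: int) -> bool:
    """e1 TST e2: TRUE if any 1-bit in e2 is a 1-bit in e1."""
    return msk(e1, e2) != 0


def ntst(e1: int, e2: int) -> bool:
    """e1 NTST e2: same as NOT (e1 TST e2)."""
    return not tst(e1, e2)


def tsta(e1: int, e2: int) -> bool:
    """e1 TSTA e2: TRUE if all 1-bits in e2 are 1-bits in e1."""
    return msk(e1, e2) == e2


def ntsta(e1: int, e2: int) -> bool:
    """e1 NTSTA e2: same as NOT (e1 TSTA e2)."""
    return not tsta(e1, e2)


def _check_shift(e2: int, width: int) -> None:
    # Assumption: the model rejects shift counts the table calls undefined.
    if not 0 <= e2 < width:
        raise ValueError("shift count outside 0 .. width - 1 is undefined")


def shl(e1: int, e2: int, width: int) -> int:
    """e1 SHL e2: logical left shift; bits shifted past `width` are lost."""
    _check_shift(e2, width)
    return (e1 << e2) & ((1 << width) - 1)


def shr(e1: int, e2: int, width: int) -> int:
    """e1 SHR e2: logical right shift; 0-bits come in from the left."""
    _check_shift(e2, width)
    return e1 >> e2


if __name__ == "__main__":
    print(tst(0b1010, 0b0010), tsta(0b1010, 0b0110), bin(shl(0b1001, 1, 4)))
```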
Truncate toward zero | Truncate toward negative infinity |
---|---|
7 DIV 3 = 2 | 7 DIV 3 = 2 |
-7 DIV 3 = -2 | -7 DIV 3 = -3 |
7 DIV -3 = -2 | 7 DIV -3 = -3 |
-7 DIV -3 = 2 | -7 DIV -3 = 2 |
The truncate-toward-zero convention has the advantage of obeying these relations that also apply to real numbers:
a DIV b = - (-a DIV b) = - (a DIV -b) = -a DIV -b
The truncate-toward-negative-infinity convention has the advantage that division by positive powers of two can be implemented on two's-complement machines (by far the most common) as an arithmetic right shift for both positive and negative dividends, since, e.g.:
-1 DIV 4 = -1
the same as the result of an arithmetic right shift of two bits.
MOD is usually defined in terms of DIV by the relation:
(a DIV b) * b + a MOD b = a
i.e.:
a MOD b = a - (a DIV b) * b
The following examples show MOD operations obeying this relation under each convention, corresponding to the DIV operations above:
Truncate toward zero | Truncate toward negative infinity |
---|---|
7 MOD 3 = 1 | 7 MOD 3 = 1 |
-7 MOD 3 = -1 | -7 MOD 3 = 2 |
7 MOD -3 = 1 | 7 MOD -3 = -2 |
-7 MOD -3 = -1 | -7 MOD -3 = -1 |
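
The two conventions and the relation a MOD b = a - (a DIV b) * b can be reproduced in Python, whose `//` operator happens to truncate toward negative infinity. The function names below are hypothetical, and the code only models the arithmetic in the tables; it makes no claim about what any particular MAINSAIL implementation returns.

```python
def div_trunc(a: int, b: int) -> int:
    """Quotient truncated toward zero."""
    q = abs(a) // abs(b)                 # magnitude of the quotient
    return q if (a < 0) == (b < 0) else -q


def div_floor(a: int, b: int) -> int:
    """Quotient truncated toward negative infinity (Python's // does this)."""
    return a // b


def mod_trunc(a: int, b: int) -> int:
    """Remainder paired with div_trunc via a MOD b = a - (a DIV b) * b."""
    return a - div_trunc(a, b) * b


def mod_floor(a: int, b: int) -> int:
    """Remainder paired with div_floor via the same relation."""
    return a - div_floor(a, b) * b


if __name__ == "__main__":
    for a, b in [(7, 3), (-7, 3), (7, -3), (-7, -3)]:
        print(f"{a:>2} DIV {b:>2}: {div_trunc(a, b):>2} | {div_floor(a, b):>2}    "
              f"{a:>2} MOD {b:>2}: {mod_trunc(a, b):>2} | {mod_floor(a, b):>2}")
```

Running the loop prints one row per operand pair, with the truncate-toward-zero result on the left of each bar and the truncate-toward-negative-infinity result on the right.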
For reasons of efficiency, XIDAK implements DIV and MOD on each machine with whatever instruction the hardware provides. This leads to different results for these operations on different machines when negative INTEGERs are involved. XIDAK does not plan to change this behavior, because it would require extra tests that would result in significantly slower DIV and MOD on some machines (the code generated for DIV and MOD would have to implement the logic of the routines below, which is not trivial).
The following INLINE PROCEDUREs can be used to implement the truncate toward zero convention:
INLINE INTEGER PROCEDURE iDivZ (INTEGER dividend,divisor);
BEGIN
    IF NOT divisor THEN errMsg("iDivZ: division by zero","",fatal);
    RETURN(
        IF dividend > 0 THEN
            IF divisor > 0 THEN dividend DIV divisor
            EL - (dividend DIV - divisor)
        EL IF divisor > 0 THEN - (- dividend DIV divisor)
        EL - dividend DIV - divisor);
END;
INLINE INTEGER PROCEDURE iModZ (INTEGER dividend,divisor);
BEGIN
    IF NOT divisor THEN errMsg("iModZ: division by zero","",fatal);
    RETURN(
        IF dividend > 0 THEN dividend MOD abs(divisor)
        EL - (- dividend MOD abs(divisor)));
END;
The following INLINE PROCEDUREs can be used to implement the truncate toward negative infinity convention:
INLINE INTEGER PROCEDURE iDivN (INTEGER dividend,divisor);
BEGIN
    IF NOT divisor THEN errMsg("iDivN: division by zero","",fatal);
    RETURN(
        IF dividend > 0 THEN
            IF divisor > 0 THEN dividend DIV divisor
            EL - (dividend DIV - divisor) -
                (IF dividend MOD - divisor THEN 1 EL 0)
        EL IF divisor > 0 THEN - (- dividend DIV divisor) -
                (IF - dividend MOD divisor THEN 1 EL 0)
        EL - dividend DIV - divisor);
END;
INLINE INTEGER PROCEDURE iModN (INTEGER dividend,divisor);
BEGIN
    INTEGER i;
    IF NOT divisor THEN errMsg("iModN: division by zero","",fatal);
    IF dividend > 0 THEN
        IF divisor > 0 THEN RETURN(dividend MOD divisor)
        EB  i := dividend MOD - divisor;
            IF i THEN i .+ divisor END
    EL IF divisor > 0 THENB
        i := - dividend MOD divisor;
        IF i THEN i := divisor - i END
    EL RETURN(- (- dividend MOD - divisor));
    RETURN(i);
END;
Analogous PROCEDUREs can be written for LONG INTEGERs.
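A line-by-line Python transcription of the four PROCEDUREs makes it easy to check that they only ever apply DIV and MOD to a non-negative dividend and a positive divisor, which is the range the operator table treats as defined. This is a hedged sketch: the helpers `_div` and `_mod` and the snake_case names are hypothetical, and `ZeroDivisionError` stands in for the fatal `errMsg` call.

```python
def _div(a: int, b: int) -> int:
    """DIV restricted to the operands for which the text calls it defined."""
    assert a >= 0 and b > 0, "DIV is undefined for these operands"
    return a // b


def _mod(a: int, b: int) -> int:
    """MOD restricted to the operands for which the text calls it defined."""
    assert a >= 0 and b > 0, "MOD is undefined for these operands"
    return a % b


def _check(divisor: int, name: str) -> None:
    if divisor == 0:  # stands in for the fatal errMsg call
        raise ZeroDivisionError(f"{name}: division by zero")


def i_div_z(dividend: int, divisor: int) -> int:
    """Transcription of iDivZ (truncate toward zero)."""
    _check(divisor, "iDivZ")
    if dividend > 0:
        if divisor > 0:
            return _div(dividend, divisor)
        return -_div(dividend, -divisor)
    if divisor > 0:
        return -_div(-dividend, divisor)
    return _div(-dividend, -divisor)


def i_mod_z(dividend: int, divisor: int) -> int:
    """Transcription of iModZ (remainder takes the sign of the dividend)."""
    _check(divisor, "iModZ")
    if dividend > 0:
        return _mod(dividend, abs(divisor))
    return -_mod(-dividend, abs(divisor))


def i_div_n(dividend: int, divisor: int) -> int:
    """Transcription of iDivN (truncate toward negative infinity)."""
    _check(divisor, "iDivN")
    if dividend > 0:
        if divisor > 0:
            return _div(dividend, divisor)
        return -_div(dividend, -divisor) - (1 if _mod(dividend, -divisor) else 0)
    if divisor > 0:
        return -_div(-dividend, divisor) - (1 if _mod(-dividend, divisor) else 0)
    return _div(-dividend, -divisor)


def i_mod_n(dividend: int, divisor: int) -> int:
    """Transcription of iModN (remainder takes the sign of the divisor)."""
    _check(divisor, "iModN")
    if dividend > 0:
        if divisor > 0:
            return _mod(dividend, divisor)
        i = _mod(dividend, -divisor)
        return i + divisor if i else i
    if divisor > 0:
        i = _mod(-dividend, divisor)
        return divisor - i if i else i
    return -_mod(-dividend, -divisor)


if __name__ == "__main__":
    print(i_div_z(-7, 3), i_mod_z(-7, 3), i_div_n(-7, 3), i_mod_n(-7, 3))
```

Because the helpers assert their operand ranges, any call path that strayed outside the defined range would fail immediately rather than silently depend on hardware behavior.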
Two STRINGs are compared according to the following definition:
Because the uppercase and lowercase letters are alphabetically ordered (see Section 2.1), this algorithm produces an alphabetical ordering for STRINGs that are either all uppercase or all lowercase; e.g., "ABC" is less than "ABD" or "ABCD". STRINGs with the same length and sequence of characters are equal. The empty STRING is less than any other STRING.
For example, STRING comparison produces the following ordering:
"" < "A" < "AA" < "AB" < "B"
STRINGs of mixed case may be compared in a “caseless” comparison by specifying the upperCase option to the system PROCEDURE compare or equ.
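The example ordering above can be reproduced with Python's built-in string comparison, which compares character codes position by position and orders a proper prefix before the longer string. That this matches MAINSAIL's definition in every case is an assumption based only on the examples given here. The function `compare_caseless` is a hypothetical illustration of a "caseless" comparison by uppercasing both operands; it is not the system PROCEDURE compare, and its return convention is invented for this sketch.

```python
def compare_caseless(a: str, b: str) -> int:
    """Return -1, 0, or 1 after mapping both operands to uppercase.

    Hypothetical helper: the return convention is chosen for this sketch.
    """
    ua, ub = a.upper(), b.upper()
    return (ua > ub) - (ua < ub)


if __name__ == "__main__":
    # Sorting with the built-in ordering reproduces the chain given in the text.
    print(sorted(["B", "AB", "", "AA", "A"]))
    # Mixed-case operands compare equal once case is ignored.
    print(compare_caseless("Yellow", "YELLOW"))
```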
Let a be a (LONG) BITS value, and a_{i} denote the ith bit of a; similarly for b, b_{i} and c, c_{i}. In the computation c := a operator b, c_{i} is computed from a_{i} and b_{i} as:
a_{i} | b_{i} | c_{i} for IOR | c_{i} for XOR | c_{i} for MSK | c_{i} for CLR |
---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 |
0 | 1 | 1 | 1 | 0 | 0 |
1 | 0 | 1 | 1 | 0 | 1 |
1 | 1 | 1 | 0 | 1 | 0 |
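
The truth table can be regenerated from Python's integer operators, which gives a quick check of each column. This is a sketch that models BITS values as non-negative Python integers; the lowercase function names are hypothetical stand-ins for the MAINSAIL operators.

```python
def ior(a: int, b: int) -> int:
    """a IOR b: 1 where either operand has a 1-bit."""
    return a | b


def xor(a: int, b: int) -> int:
    """a XOR b: 1 where exactly one operand has a 1-bit."""
    return a ^ b


def msk(a: int, b: int) -> int:
    """a MSK b: 1 where both operands have a 1-bit."""
    return a & b


def clr(a: int, b: int) -> int:
    """a CLR b: 1 where a has a 1-bit and b has a 0-bit."""
    return a & ~b


def truth_table():
    """Rows of (a_i, b_i, IOR, XOR, MSK, CLR) for single-bit operands."""
    return [(a, b, ior(a, b), xor(a, b), msk(a, b), clr(a, b))
            for a in (0, 1) for b in (0, 1)]


if __name__ == "__main__":
    for row in truth_table():
        print(" | ".join(str(bit) for bit in row))
    # The same operators applied to multi-bit values work bit by bit.
    print(bin(clr(0b1100, 0b1010)))
```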
In words:
e_{1} op_{1} e_{2} op_{2} e_{3} op_{3} ... e_{n-1} op_{n-1} e_{n}
where the e_{i} are expressions and the op_{i} any of the following operators:
= NEQ < LEQ >
GEQ TST TSTA NTST NTSTA
Such a comparison chain is equivalent to the expanded form:
(e_{1} op_{1} e_{2}) AND (e_{2} op_{2} e_{3}) AND (e_{3} op_{3} ...) ... (e_{n-1} op_{n-1} e_{n})
except that the e_{i} are evaluated just once. For example, if p is a PROCEDURE, then in the comparison chain:
i < p(...) < j
p is called just once.
A chain may be composed of any combination of data types and operators, provided that the expanded form is valid. The chain is effectively enclosed in parentheses, with ANDs inserted at the “shared” expressions. Thus, NOT e_{1} = e_{2} TST e_{3} is equivalent to NOT ((e_{1} = e_{2}) AND (e_{2} TST e_{3})), except that e_{2} is evaluated once.
Consistent with the evaluation of AND, only as many e_{i} are evaluated as necessary to determine the value of e_{1} op_{1} e_{2} op_{2} e_{3} .... For example, in e1 < e2 = e3, e3 is evaluated only if e1 < e2 is true (otherwise the entire expression is false, so there is no reason to proceed any further).
A comparison chain is undefined if its expanded form is undefined; e.g., 1 > v > (v := 2) is undefined since v > (v := 2) is undefined.
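Python's chained comparisons follow the same two rules described above (shared operands are evaluated once, and evaluation stops as soon as one link is false), so they make a convenient executable illustration. This is an analogy rather than MAINSAIL code: the `Counted` class is a hypothetical helper for counting calls, and Python additionally fixes a left-to-right evaluation order that MAINSAIL does not promise.

```python
class Counted:
    """Callable that returns a fixed value and counts how often it is called."""

    def __init__(self, value: int) -> None:
        self.value = value
        self.calls = 0

    def __call__(self) -> int:
        self.calls += 1
        return self.value


def between(i: int, p: Counted, j: int) -> bool:
    """Mirrors the chain i < p(...) < j: p is called just once."""
    return i < p() < j


def less_then_equal(e1: Counted, e2: Counted, e3: Counted) -> bool:
    """Mirrors e1 < e2 = e3: e3 is evaluated only if e1 < e2 is true."""
    return e1() < e2() == e3()


if __name__ == "__main__":
    p = Counted(5)
    print(between(1, p, 10), p.calls)

    e1, e2, e3 = Counted(3), Counted(2), Counted(2)
    print(less_then_equal(e1, e2, e3), e2.calls, e3.calls)
```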
(least precedence—least binding)
OR
AND
NOT
= NEQ < LEQ > GEQ TST NTST TSTA NTSTA
:=
MIN MAX
+ - (binary) IOR XOR MSK CLR
* / & DIV MOD SHL SHR
! ^
- (unary)
(greatest precedence—most binding)
Operators of equal precedence are associated from left to right; e.g., a + b + c is equivalent to (a + b) + c, with two exceptions:
No exception is made for ^ (exponentiation), as in some other programming languages; i.e., a ^ b ^ c is equivalent to (a ^ b) ^ c, not a ^ (b ^ c).
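The difference between the two groupings is easy to see numerically. The sketch below folds a list of operands from the left, as the text says MAINSAIL does for ^, and contrasts that with Python's own `**` operator, which groups from the right. The function name `pow_chain_left` is hypothetical.

```python
from functools import reduce


def pow_chain_left(*operands: int) -> int:
    """Evaluate a ^ b ^ c ... with left-to-right grouping: ((a ^ b) ^ c) ..."""
    return reduce(lambda acc, exponent: acc ** exponent, operands)


if __name__ == "__main__":
    left = pow_chain_left(2, 3, 2)   # (2 ^ 3) ^ 2
    right = 2 ** 3 ** 2              # Python groups this as 2 ** (3 ** 2)
    print(left, right)
```

A programmer who wants the right-grouped value in MAINSAIL would, following the text, have to write the parentheses explicitly as a ^ (b ^ c).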
Since the order of evaluation of the operands of an operator is usually not specified, the programmer must be careful to avoid expressions that could depend on the order of evaluation. For example, in p(a) + q(b), where p and q are PROCEDUREs, it is not specified which of p and q is called first. If it is important that p be called first, then a separate statement must be used to force the evaluation order, e.g., t := p(a); ... t + q(b).
Figure 4–3. Precedence of the Assignment Operator in Expressions and Statements
IF v := e1 OR e2 THEN ...
is equivalent to:
IF (v := e1) OR e2 THEN ...
not equivalent to:
IF v := (e1 OR e2) THEN ...
But the statement:
v := e1 OR e2;
is equivalent to:
v := (e1 OR e2);
The precedence of operators may be explicitly altered by using parentheses. An operand enclosed in parentheses is evaluated before the operation is performed. For example, in ((a + b) * c), a + b is evaluated before its result is multiplied by c (however, it is not specified whether c itself is evaluated before or after a + b; the only thing determined by the parentheses is that the addition takes place before the multiplication).
Parentheses may enclose any MAINSAIL expression, whether or not they are required in order to change the operator precedence that would prevail in the absence of parentheses. Redundant parentheses may be used to make source code easier to read.
The expression v .op e (where v is a variable, op is one of the binary operators that may be dotted, e is an expression, and v := v op e is well defined) is equivalent to the assignment expression v := v op e, except that:
.- v is a short form of v := - v, except that if v is a non-simple variable, then the location of v within its data structure is evaluated just once.
If i, j, and k are simple variables, a is a one-dimensional INTEGER ARRAY, and proc is an INTEGER PROCEDURE, then:
i .+ 1 is equivalent to i := i + 1
.- i is equivalent to i := - i
i .+ j * k is equivalent to i := i + j * k
a[proc] .+ i is equivalent to j := proc; a[j] := a[j] + i,
since proc is called just once.
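Python's augmented assignment has the same "locate the variable once" property as the a[proc] .+ i example, so it can serve as a runnable analogy. This is a sketch only: `Counted` and `bump` are hypothetical names, and unlike a MAINSAIL dotted operator, a Python augmented assignment is a statement and does not itself yield a value.

```python
class Counted:
    """Callable that returns a fixed index and counts how often it is called."""

    def __init__(self, value: int) -> None:
        self.value = value
        self.calls = 0

    def __call__(self) -> int:
        self.calls += 1
        return self.value


def bump(a: list, proc: Counted, i: int) -> None:
    """Mirrors a[proc] .+ i: the subscript is computed once, then updated."""
    a[proc()] += i


def bump_expanded(a: list, proc: Counted, i: int) -> None:
    """The expanded form from the text: j := proc; a[j] := a[j] + i."""
    j = proc()
    a[j] = a[j] + i


if __name__ == "__main__":
    a = [10, 20, 30]
    proc = Counted(1)
    bump(a, proc, 5)
    print(a, proc.calls)
```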
All operators more binding than the assignment operator (see Section 4.8.5) may be dotted:
MIN MAX
+ - (binary)
IOR XOR MSK CLR
* / & DIV MOD SHL SHR
! ^
- (unary)
A dotted operator has the same precedence as its corresponding non-dotted operator. An expression v .op e containing a dotted operator is undefined if e contains an operator that is evaluated after op; e.g., a .* b + c is undefined, since + has a lower precedence than *.
An expression containing a dotted operator is undefined if its equivalent assignment expression is undefined; e.g., (v .+ 5) = (v := 2) is undefined. It is the programmer's responsibility to avoid the use of such expressions.
Dotted operators can be used in expression statements (Section 5.2).
Whitespace characters are permitted between the dot and the following operator, although XIDAK recommends against a style that puts any whitespace characters between them.
Other MAINSAIL language constructs that may trigger collections are the Init Statement and many system PROCEDUREs.
The assignment compatibility rules are modified when the compiler subcommand (or $DIRECTIVE directive) STRICTCLASSES is in effect. When STRICTCLASSES is in effect, the following are illegal:
See Section 4.42 of the MAINSAIL Compiler User's Guide for a detailed discussion of STRICTCLASSES.
MAINSAIL Language Manual, Chapter 4