Associated with each data type is a set of values and a set of operations that may be performed on the values. The set of values associated with each data type includes a value called the Zero of the data type. The memory representation of the Zero of every data type consists entirely of 0-bits.
There is no implicit data type conversion in MAINSAIL. For example, if i is an INTEGER variable and r a REAL variable, then i + r is an illegal expression. Conversion procedures are provided to convert arguments to another data type. They are discussed in Section 3.11. cvi, for example, is a procedure that converts its argument to an INTEGER; i + cvi(r) is a legal expression.
The difference between the data types INTEGER, REAL, and BITS and their corresponding LONG types is in the range of values the corresponding types may take on. For example, the guaranteed range (range of values that may be assumed in a portable program) of a LONG REAL specifies more digits than that of a REAL. You should use the LONG forms only when necessary, since LONG values may take more space and LONG operations more time on some processors.
The guaranteed range of a data type applies only to a value of the data type for which no explicit size is given; an explicit size may override the default range. See Section 3.12 for details.
For each data type discussed in this chapter, a list of the operators that may be used with values of the data type is given. All operators are described in more detail in Section 4.8. Each data type description also includes a brief description of some of the system procedures that may be used with values of the data type. Complete system procedure descriptions are given starting in Chapter 30.
XIDAK reserves the right to create new MAINSAIL data types, and to enhance any system procedure, macro, or variable to handle such new data types.
3.1. BOOLEAN
BOOLEAN values are the logical values true and false.
The boolean constants are TRUE and FALSE; these represent the only two values a BOOLEAN variable may have.
The boolean Zero is FALSE.
The following operators may be used with boolean expressions:
OR AND NOT = NEQ :=
An INTEGER constant is composed of an optional minus sign (-) followed by decimal digits (0 through 9). Some examples are 1874, -53, and 0.
A LONG INTEGER constant is like an INTEGER constant except that it must be immediately followed by the letter L (or lowercase l), e.g., 1874L, -53L, 0L, or 29875234l.
A character enclosed in single quotes represents the INTEGER constant of which the value is the target-machine character code of the enclosed character. For example, 'A' represents the INTEGER constant that is the character code of the letter “A” on the target machine. Character codes are discussed in Section 2.1.
The INTEGER Zero is 0, and the LONG INTEGER Zero is 0L (or 0l).
The following operators may be used with (LONG) INTEGER expressions:
OR = LEQ := DIV +
AND NEQ > MIN MOD - (unary and binary)
NOT < GEQ MAX * ^
The following system procedure may operate on (LONG) INTEGER expressions:
| abs | absolute value of a (LONG) INTEGER |
A REAL constant is like an INTEGER constant except that it has either a decimal point, an exponent, or both. An exponent immediately follows the last digit (or the decimal point if it is last), and is the letter E (or e) immediately followed by an integer. A nonnegative exponent may be separated from E by +. Some REAL constants are:
1874.56
-.78E-3 (= -.00078)
0.
1E3 (= 1e3 = 1E+3 = 1.E3 = 1.E+3 = 1000.)
A LONG REAL constant is like a REAL constant except that it must be immediately followed by the letter L (or lowercase l), e.g.:
12387658.5L
-.57E28L
0.0L (= 0.L)
The REAL Zero is 0., and the LONG REAL Zero is 0.L (or 0.l).
The following operators may be used with REAL and LONG REAL expressions:
OR = LEQ := + /
AND NEQ > MIN ^ - (unary and binary)
NOT < GEQ MAX *
The system procedures below may be used with REAL and LONG REAL expressions. Mathematical functions such as sin, cos, and log are also provided:
| abs | absolute value of a (LONG) REAL |
| ceiling | smallest (LONG) INTEGER not less than a (LONG) REAL |
| floor | largest (LONG) INTEGER not exceeding a (LONG) REAL |
| truncate | truncate a (LONG) REAL to a (LONG) INTEGER |
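The three rounding procedures in the table differ only for non-integral values, and most visibly for negative ones. The sketch below is Python, not MAINSAIL: math.ceil, math.floor, and math.trunc stand in for ceiling, floor, and truncate. It assumes that truncate discards the fractional part (rounds toward zero), which the table suggests but does not state. The helper name rounding_row is hypothetical.

```python
import math

def rounding_row(x):
    """Return (ceiling, floor, truncate) of x using Python stand-ins."""
    return (math.ceil(x), math.floor(x), math.trunc(x))

# Compare a positive, a negative, and an integral value.
for x in (2.5, -2.5, 3.0):
    print(x, rounding_row(x))
```

For a negative argument, truncate agrees with ceiling rather than floor under this assumption.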
The guaranteed range of BITS and LONG BITS is the same, namely, at least 32 bits. However, on platforms that support 64-bit addresses, a BITS is typically 4 bytes (32 bits) and a LONG BITS 8 bytes (64 bits).
BITS and LONG BITS differ from INTEGER and LONG INTEGER in that (LONG) BITS operations are bitwise logical operations, whereas (LONG) INTEGER operations are arithmetic (numerical) operations. Values of one data type may easily be converted to the other when a value must be viewed alternately as a bit pattern and as a number.
A bit has two states, 0 and 1, sometimes called 0-bit and 1-bit or clear and set. To cause a bit to enter the 0 state is to clear it; to cause it to enter the 1 state, to set it.
A BITS constant is a sequence of characters preceded by a single quote and a letter that indicates the base: B (or b) for binary (base 2), O (or o) for octal (base 8), or H (or h) for hexadecimal (base 16). The base letter may be omitted for octal; i.e., octal is the default.
Each binary character (0 or 1) represents a single bit. Each octal character (0 through 7) represents three bits (000 through 111). Each hexadecimal character (0 through 9, A through F) represents four bits (0000 through 1111). The lowercase letters a through f, like A through F, can be used to represent the bit patterns 1010 through 1111 in hexadecimal constants.
The bits for each character are concatenated to obtain the bits of the constant. For example, 'B101011, 'O53 (or just '53) and 'H2B all represent the same bit sequence 101011 (ignoring leading zeros).
Other examples of BITS constants are '573, 'B10111, and 'H82A3.
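The rules for reading a BITS constant can be checked mechanically. The following is a minimal Python sketch, not MAINSAIL: bits_constant_value is a hypothetical helper that decodes the text of a constant into an integer holding the same bit pattern. It assumes the constant is well formed and treats a trailing L as the LONG marker.

```python
def bits_constant_value(text):
    """Decode a BITS constant written as in the text above into an int.

    Hypothetical helper for illustration; not a MAINSAIL procedure.
    """
    if not text.startswith("'"):
        raise ValueError("a BITS constant starts with a single quote")
    body = text[1:]
    # A trailing L (or l) marks a LONG BITS constant; it does not change the bits.
    if body[-1:] in ("L", "l"):
        body = body[:-1]
    bases = {"B": 2, "O": 8, "H": 16}
    base = 8  # octal is the default when no base letter is given
    if body[:1].upper() in bases:
        base = bases[body[0].upper()]
        body = body[1:]
    return int(body, base)

# The four spellings of the same bit sequence 101011
for text in ("'B101011", "'O53", "'53", "'H2B"):
    print(text, bin(bits_constant_value(text)))
```

Because an octal constant without a base letter always begins with a digit, the leading B, O, or H can be recognized without ambiguity.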
A LONG BITS constant is like a BITS constant except that it must be immediately followed by the letter L (or lowercase l), e.g., '743L (= 'B111100011L = 'H1E3L = 'h1e3l).
The BITS Zero is '0 (or equivalently 'B0 or 'O0 or 'H0); the LONG BITS Zero is the BITS Zero followed by L (or lowercase l), e.g., '0L.
Bits are numbered from right to left starting with zero.
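As a small illustration of this numbering, the Python sketch below (not MAINSAIL) lists which bit numbers are set in the pattern 101011. The helper bit_is_set is hypothetical; bit n is taken to be the one with weight 2^n.

```python
def bit_is_set(value, n):
    """True if bit n of value is 1, counting from 0 at the right."""
    return (value >> n) & 1 == 1

pattern = 0b101011
# Collect the numbers of all 1-bits, rightmost bit first.
set_bits = [n for n in range(pattern.bit_length()) if bit_is_set(pattern, n)]
print(set_bits)
```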
The following operators may be used with BITS and LONG BITS expressions:
OR = NTST := MSK SHR
AND NEQ TSTA IOR CLR !
NOT TST NTSTA XOR SHL
The following system procedures may operate on BITS and LONG BITS expressions:
| bMask | form a BITS mask (sequence of 1-bits) |
| lbMask | form a LONG BITS mask (sequence of 1-bits) |
| $lbOnes | LONG BITS value consisting of all 1-bits |
A STRING is a variable-length sequence of characters. MAINSAIL automatically keeps track of how many characters are in a STRING.
The limit on the number of characters in a STRING is the maximum INTEGER that can be represented; thus, the smallest maximum STRING length any MAINSAIL implementation enforces is 2147483647 characters (although the effective maximum length will be less if you have less than 2 GB of memory available).
The constant $maxStringLength represents the maximum allowable STRING length, and is defined to be $maxInteger.
A STRING constant is a sequence of characters enclosed in double quotes. Some examples are shown below. A double quote is represented in a STRING constant with two consecutive double quotes. Each such pair of double quotes stands for one double quote inside the STRING. For example, the last STRING in the list below contains two embedded double quote characters. It contains 23 characters; the two extra double quotes are not retained as part of the STRING constant, since they are only indicators to the compiler.
"Hello"
"She is 12 years old"
"The umbrella cost $2.50"
"He cried ""Wolf!"" again."
A STRING constant may extend across line and page boundaries; the characters that indicate the boundaries are part of the constant. For example, the STRING:
"This is a STRING constant that extends
across a line boundary in the source text"
has an embedded eol. It could also be written:
"This is a STRING constant that extends " & eol &
"across a line boundary in the source text"
The concatenations are performed at compiletime, since all the STRINGs involved are constants.
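The doubled-quote rule for STRING constants described earlier in this section can be sketched in a few lines. This is Python, not MAINSAIL, and string_constant_value is a hypothetical helper: it takes the source text of a constant, including the enclosing double quotes, and returns the characters the constant denotes. It handles only the doubling rule and does no other validation.

```python
def string_constant_value(source):
    """Return the characters denoted by a double-quoted STRING constant.

    Hypothetical helper for illustration; it handles only the
    doubled-double-quote rule described above.
    """
    if len(source) < 2 or source[0] != '"' or source[-1] != '"':
        raise ValueError("a STRING constant is enclosed in double quotes")
    inner = source[1:-1]
    # Each pair of consecutive double quotes stands for one double quote.
    return inner.replace('""', '"')

value = string_constant_value('"He cried ""Wolf!"" again."')
print(value, len(value))
```

The length printed counts each doubled quote once, which is why the example constant has 23 characters rather than 25.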
The STRING Zero (sometimes called the null STRING or the empty STRING) is "". It is the STRING consisting of no characters.
& is the concatenation operator. s1 & s2 is the STRING consisting of the characters of s1 immediately followed by the characters of s2. Thus, if s1 has the value:
"This is "
and s2 has the value:
"a concatenated STRING"
then the expression s1 & s2 has the value:
"This is a concatenated STRING"
Substrings are described in Section 4.4, and STRING comparison in Section 4.8.2.
The following operators may be used with STRING expressions:
OR = LEQ := &
AND NEQ > MIN
NOT < GEQ MAX
The following system procedures may operate on STRING expressions:
| length | number of characters in a STRING |
| cvu | convert a STRING to upper case |
| cvl | convert a STRING to lower case |
| compare | return -1, 0, or 1 to indicate comparison of two STRINGs (see Section 4.8.2). Can be made to treat upper and lower case identically, i.e., a “caseless” comparison |
| equ | returns TRUE if two STRING arguments are equal. Like compare, can do a caseless comparison |
| first | first character of a STRING |
| last | last character of a STRING |
| $nth | nth character of a STRING |
| read | reads a value from a STRING |
| write | writes a value to a STRING |
| cRead | reads a character from a STRING |
| cWrite | writes a character to a STRING |
| rcRead | reads a character from the end of a STRING (reverse cRead) |
| rcWrite | writes a character to the front of a STRING (reverse cWrite) |
| $dup | reduplicate a STRING (concatenate with itself) |
| scan | scans a STRING according to a scan specification |
| scanSet | sets up scan bits to be used with scan |
| $scanSet | sets up scan integers to be used with scan |
| scanRel | releases scan bits or integers used with scan |
| $cScan | scan for single character |
| newString | create a STRING descriptor from a CHARADR and a length |
| $getInArea | ensure that a STRING is in MAINSAIL's STRING space |
| $removeLeadingBlankSpace, $removeTrailingBlankSpace | remove blank space from a STRING |
| $removeWord | remove non-blank characters from a STRING |
| $removeLastWord | remove trailing non-blank characters |
| $removeBoolean, $removeBits, $removeInteger, $removeReal | parse STRING of specified data type |
| $formParagraph | fill and justify STRING |
3.5.1. Low-Level STRING Manipulation
STRINGs are represented in memory as STRING descriptors, composed of a length and a character address. STRING descriptors usually point to characters stored in a region of memory called STRING space. The characters stored in STRING space are subject to garbage collection if they become inaccessible (i.e., no STRING descriptor points to them). The characters of a STRING allocated in scratch space or created by a foreign language procedure do not reside in STRING space. The user who needs to move such a STRING into MAINSAIL's STRING space may do so by means of the system procedure $getInArea.
Most programs that do not call foreign language procedures do not need to manipulate STRINGs or STRING descriptors explicitly with newString or $getInArea.
3.5.2. STRING Constants and Garbage Collection
The first time each STRING constant in a MODULE is used, its characters may be copied into STRING space. This can trigger a garbage collection. Subsequent uses of the same STRING constant (in the same MODULE) use the previously copied characters, and so do not cause a collection.
3.6. POINTER
POINTER is a data type for referencing dynamic objects, i.e., dynamic records, data sections, and dynamic ARRAYs. Records are described in Chapter 10, data sections in Chapter 11, and ARRAYs in Chapter 12. POINTERs are frequently classified, i.e., associated with a particular CLASS, as described in Section 9.2.
Only unclassified POINTERs can be used to refer to ARRAYs; see Section 9.3. If you know that a dynamic object is an ARRAY, you should use an ARRAY variable rather than a POINTER variable to refer to it; see Chapter 12.
The only POINTER constant is NULLPOINTER, which is the POINTER Zero. A NULLPOINTER references no object.
The following operators may be used with POINTER expressions:
OR AND NOT = NEQ :=
The only $PROCVAR constant is $NULLPROCVAR, which is the $PROCVAR Zero. It references no PROCEDURE.
The following operators may be used with $PROCVAR expressions:
OR AND NOT = NEQ :=
Chapter 8 describes $PROCVARs in more detail.
3.8. ADDRESS
ADDRESS is a data type for representing the location of a storage unit in memory. ADDRESSes may be used for loading and storing values of any data type to and from memory. Individual characters are usually loaded and stored by means of the data type CHARADR.
ADDRESS is a “low-level” data type; many user programs can be written without the use of ADDRESSes.
Not every ADDRESS representable on a processor is a valid MAINSAIL ADDRESS. On some implementations of MAINSAIL, an ADDRESS that is not a multiple of the size of the smallest data type is considered unaligned (or non-data-type-aligned) and is invalid. Portable programs must therefore compute ADDRESSes as linear combinations of exact multiples of the sizes of MAINSAIL data types, starting from some ADDRESS that is known to be properly aligned (e.g., an ADDRESS obtained from the system procedure newScratch or newPage, or an ADDRESS that is the start of a dynamic object). Furthermore, during any particular execution of MAINSAIL, some ADDRESSes may be invalid for reading or writing because the storage units they reference are protected by the operating system; therefore, ADDRESSes should point into memory that has been properly requested from the operating system. Storage units in regions of memory allocated by the system procedures newScratch and newPage will always have been properly requested from the operating system. Storage units in regions of memory allocated by the system procedure new are also valid for reading and writing, although invalid values (e.g., POINTERs not pointing to a valid MAINSAIL data structure) should not be stored into a MAINSAIL data structure. Storing into arbitrary or unallocated memory has undefined effects; for example, doing so may overwrite executable code or MAINSAIL runtime data structures, thereby damaging them.
The use of an invalid ADDRESS is undefined.
On some processors, the system procedures read and write may align an ADDRESS before performing the operation. The ADDRESS is increased, if necessary, to the minimum alignment the processor requires for the data type being read or written. For example, on the PA64 processor, where INTEGERs occupy 4 bytes and LONG INTEGERs 8 bytes, read(a,ii1,i,ii2), where i is an INTEGER and ii1 and ii2 are LONG INTEGERs, automatically aligns by skipping 4 bytes before reading ii2.
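The skipped bytes in the PA64 example can be reproduced with a short calculation. The Python sketch below (not MAINSAIL) computes the offset, relative to a, at which each value would be read. It assumes that a itself is aligned and that the required alignment of each value equals its size; neither assumption is stated by the manual for processors in general. The helper names align_up and read_offsets are hypothetical.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment

def read_offsets(sizes):
    """Offsets at which successive values are read, aligning before each.

    Assumes each value's required alignment equals its size.
    """
    offsets = []
    position = 0
    for size in sizes:
        position = align_up(position, size)
        offsets.append(position)
        position += size
    return offsets

# read(a,ii1,i,ii2) with 8-byte LONG INTEGERs and a 4-byte INTEGER
print(read_offsets([8, 4, 8]))
```

After ii1 and i have been read, the position is 12, which is not a multiple of 8, so 4 bytes are skipped before ii2.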
The address of a collectable MAINSAIL data structure may change if a garbage collection occurs; an ADDRESS variable is not updated in such a case. Collectable data are normally referenced with the POINTER and STRING data types, which are updated when a garbage collection occurs.
ADDRESSes may be classified like POINTERs; see Section 9.2.
The only ADDRESS constant is NULLADDRESS, which is the ADDRESS Zero.
ADDRESSes are ordered with respect to the relative position of the referenced storage units in memory. It is this order that is used when comparing ADDRESSes, or using MIN or MAX on an ADDRESS.
The following operators may be used with ADDRESS expressions:
OR AND NOT = NEQ <
LEQ > GEQ := MIN MAX
The following system procedures may be used in operations with ADDRESSes:
| clear | clears storage units of memory |
| copy | copies storage units from one memory location to another |
| xLoad | loads a value (of data type x) from memory; see Section 40.16 |
| store | stores a value into memory |
| displace | returns an ADDRESS that is displaced a given number of storage units from another ADDRESS |
| displacement, lDisplacement | computes the distance between two ADDRESSes |
| newPage | gets some memory pages |
| pageDispose | disposes of pages obtained with newPage |
| newScratch | returns the ADDRESS of some memory for scratch space |
| scratchDispose | disposes of scratch space |
| read | reads a value from an ADDRESS |
| write | writes a value to an ADDRESS |
CHARADR is a “low-level” data type; many user programs can be written without the use of CHARADRs.
As with ADDRESSes, there may be CHARADR values at which the effect of performing a load or store is undefined, because the memory has not been properly allocated; see Section 3.8. Unlike ADDRESSes, however, there is never any alignment requirement for storing at or loading from a CHARADR.
The only CHARADR constant is NULLCHARADR, which is the CHARADR Zero.
The following operators may be used with CHARADR expressions:
OR AND NOT = NEQ <
LEQ > GEQ := MIN MAX
The following system procedures may be used in operations with CHARADRs:
| clear | clears character units of memory |
| copy | copies characters from memory starting at one CHARADR to memory starting at another CHARADR |
| cLoad | loads a character from memory |
| store | stores a character into memory |
| cRead | reads a character from memory |
| cWrite | writes a character to memory |
| displace | returns a CHARADR that is displaced a given number of characters from another CHARADR |
| displacement | computes the distance between two CHARADRs |
| newString | makes a STRING (descriptor) from a CHARADR and an INTEGER (length) |
It is an error to use a constant that cannot be represented on the target machine, e.g., an INTEGER that is too large.
In general, MAINSAIL does not support a portable notion of arithmetic exceptions, especially overflow, and particularly floating-point overflow. The notion of overflow visible to a MAINSAIL program is whatever notion is supported by the underlying hardware, with all its quirks. Since different platforms have different notions of overflow, overflow is not a portable concept. Programs should not be written expecting the rules for overflow to be the same across different platforms.
In particular, on some platforms, intermediate calculations are done using more precision than can be represented by variables of a given type. Thus, overflow (as defined by the underlying hardware) might occur only when a result is finally stored in memory, since the representable range of exponents is smaller for memory operands than for intermediate operands. Usually results are stored in memory at a point in the program fairly near the point where they were calculated; however, the store could be far away from the calculation, maybe even in a different procedure.
For example, a value returned by a RETURN statement might be too large to be representable in memory, but not too large for an intermediate representation. It is possible for such a value to be calculated by the processor with no overflow, and returned (in a processor register large enough to hold the intermediate representation) to the caller where it is then stored in a variable, at which point overflow would occur. A programmer expecting overflow to occur when the value was originally calculated would be disappointed. According to the processor's notion of overflow, the calculation didn't overflow at all; overflow occurred only when the caller eventually stored the result in memory.
Programs that calculate values that are outside the allowed range of values for a given type are not legal programs, and the results of executing such programs are undefined. XIDAK does what it can, within reason, to do something sensible, but there are no guarantees. Overflow ultimately occurs only when the processor says it occurs.
3.10.1. How to Write (LONG) INTEGER Addition and Multiplication Routines That Do Not Overflow
Some MAINSAIL programmers have developed code on platforms where arithmetic overflow is not detected by default, and have come to depend on the lack of overflow. These users have been unpleasantly surprised when moving their programs to a platform where (LONG) INTEGER overflow is detected by default. MAINSAIL does not provide a way to disable (LONG) INTEGER overflow detection on such platforms.
If you have such a program, you should replace the overflowing multiply and add operators with calls to portable arithmetic routines that perform addition and multiplication without triggering overflow. These routines should look something like the routines noOverflowAdd and noOverflowMultiply in the MODULE FOO below. This MODULE has been written specifically to handle 32-bit LONG INTEGERs on processors that support two's-complement arithmetic; to handle different conventions, you would need to modify the code somewhat.
BEGIN "foo"

IFC size(longIntegerCode) NEQ 4 THENC
    # This code works only for 32-bit integers
    MESSAGE "Long integers are not 32 bits!","error";
ENDC

IFC $attributes TST $onesComplement THENC
    # This code works only for two's-complement arithmetic
    MESSAGE "Integer arithmetic is ones' complement!","error";
ENDC

LONG INTEGER PROCEDURE noOverflowAdd (LONG INTEGER a,b);
# Add two 32-bit long integers a and b without overflow.
BEGIN
    BOOLEAN carry;
    INTEGER i;
    LONG BITS aa,bb,res,m,n,am,bm;
    # Add the low-order 30 bits; the sum cannot overflow.
    aa := cvlb(a) MSK 'H3FFFFFFFL;
    bb := cvlb(b) MSK 'H3FFFFFFFL;
    res := cvlb(cvli(aa) + cvli(bb));
    carry := res TST 'H40000000L;
    # Keep only the low-order 30 bits of the sum; bits 30 and 31
    # are recomputed below from a, b, and the carry.
    res := res MSK 'H3FFFFFFFL;
    FOR i := 30 UPTO 31 DOB
        m := '1L SHL i;
        am := cvlb(a) MSK m; bm := cvlb(b) MSK m;
        n := am XOR bm;
        IF carry THEN n .XOR m;
        carry := (am AND bm) OR (am AND carry) OR (bm AND carry);
        res .IOR n END;
    RETURN(cvli(res));
END;

LONG INTEGER PROCEDURE noOverflowMultiply (LONG INTEGER a,b);
# Let au, bu be the uppermost 4 bits of a and b, am, bm the 14
# next lower-order bits, and al, bl be the lowest-order 14 bits.
# Then:
#     a * b = au * bu * 2 ^ 56
#           + (au * bm + am * bu) * 2 ^ 42
#           + (au * bl + am * bm + al * bu) * 2 ^ 28
#           + (am * bl + al * bm) * 2 ^ 14
#           + al * bl
# None of the intermediate multiplications or additions can
# overflow.
BEGIN
    LONG INTEGER al,au,am,bm,bl,bu;
    au := cvli(cvlb(a) SHR 28);
    IF cvlb(a) TST 'H80000000L THEN
        au := cvli(cvlb(au) IOR 'HFFFFFFF0L);
    am := cvli((cvlb(a) SHR 14) MSK 'H3FFFL);
    al := cvli(cvlb(a) MSK 'H3FFFL);
    bu := cvli(cvlb(b) SHR 28);
    IF cvlb(b) TST 'H80000000L THEN
        bu := cvli(cvlb(bu) IOR 'HFFFFFFF0L);
    bm := cvli((cvlb(b) SHR 14) MSK 'H3FFFL);
    bl := cvli(cvlb(b) MSK 'H3FFFL);
    RETURN(
        noOverflowAdd(
            cvli(cvlb(au * bl + am * bm + al * bu) SHL 28),
            noOverflowAdd(
                cvli(cvlb(am * bl + al * bm) SHL 14),
                al * bl)));
END;

INITIAL PROCEDURE;
BEGIN
    LONG INTEGER i,j;
    DOB i := $liGet("First long integer: ");
        j := $liGet("Second long integer: ");
        write(logFile,noOverflowMultiply(i,j),eol,i * j,eol) END;
END;

END "foo"
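The arithmetic behind the multiplication routine can be cross-checked outside MAINSAIL. The Python sketch below is not part of MODULE FOO; it assumes that "without overflow" means the result is the low-order 32 bits of the true sum or product, interpreted as a two's-complement value. The names wrap32 and split_multiply are hypothetical. Python integers are unbounded, so the true product is available for comparison.

```python
def wrap32(x):
    """Reduce x to the signed 32-bit two's-complement range."""
    return (x + 2**31) % 2**32 - 2**31

def split_multiply(a, b):
    """Low 32 bits of a * b via the 4/14/14-bit split described in the comments."""
    # Python's >> is an arithmetic shift, so au and bu keep the sign of a and b.
    au, am, al = a >> 28, (a >> 14) & 0x3FFF, a & 0x3FFF
    bu, bm, bl = b >> 28, (b >> 14) & 0x3FFF, b & 0x3FFF
    # Terms scaled by 2^42 and 2^56 vanish modulo 2^32 and are omitted.
    high = (au * bl + am * bm + al * bu) << 28
    mid = (am * bl + al * bm) << 14
    return wrap32(high + mid + al * bl)

# Compare the split product with the wrapped true product.
for a, b in ((65536, 65536), (2147483647, 2), (-12345, 678901)):
    print(a, b, split_multiply(a, b), wrap32(a * b))
```

The two printed results agree for each pair, which is the property the 2^28 and 2^14 terms in the MAINSAIL routine rely on.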
A conversion procedure for converting a value of type x to type y is provided for each x–y combination for which the box is marked in the following table (which uses the data type abbreviations listed in Table 1–1):
| y | |||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| i | li | r | lr | b | lb | s | a | c | p | ||
| x | i | * | * | * | * | * | * | * | |||
| li | * | * | * | * | * | * | * | * | |||
| r | * | * | * | * | * | ||||||
| lr | * | * | * | * | * | ||||||
| b | * | * | * | * | * | ||||||
| lb | * | * | * | * | * | * | * | ||||
| s | * | * | * | * | * | * | * | * | |||
| a | * | * | * | * | * | ||||||
| c | * | * | * | ||||||||
| p | * | * | |||||||||
MAINSAIL does not guarantee to catch underflow or overflow in conversions.
The effect of calling one of the MAINSAIL system routines that convert a STRING to a numeric value (e.g., read, cvi, cvli, cvr, cvlr) is undefined if the numeric value is outside the range supported by the processor.
3.12. Explicit Data Sizing
By default, a given MAINSAIL data type on a given processor always occupies the same amount of space in memory. You can override that default and specify an explicit size with many MAINSAIL data types.
There are two reasons why you might want to override the default size for a MAINSAIL data type:
3.12.1. Allowed Data Types and Sizes for Explicit Sizing
A type qualified with a size specification is said to be explicitly sized, and the type to which the size specification is applied is called the base type. A size specification takes the form of a parenthesized INTEGER following the data type name.
BOOLEAN and (LONG) BITS may be followed by (n) to indicate that only the low-order n bytes of the value are stored in memory, where n is 1, 2, or 4. If necessary, the value is zero-extended when accessed.
(LONG) INTEGER may be followed by (n) or (-n) to indicate that only the low-order n bytes of the value are stored in memory, where n is 1, 2, or 4. A size specification of (-n) forces the value to be sign-extended (if necessary) when accessed; (n) with no minus sign specifies zero-extension.
A value that is explicitly sized to be smaller than its base type is expanded to (at least) the size of its base type when it is accessed. This expansion sign-extends the value for a negative size specification, and zero-extends otherwise. Zero-extension does not mean that MAINSAIL uses unsigned arithmetic on the accessed value; it indicates only how to expand the value.
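The difference between the two kinds of expansion shows up only when the high-order bit of the stored field is 1. The Python sketch below (not MAINSAIL) models a field as the low-order n bytes of a value and then expands it both ways. The helper names store_low_bytes and expand are hypothetical, and a two's-complement representation is assumed.

```python
def store_low_bytes(value, n):
    """Keep only the low-order n bytes of value, as an n-byte field would."""
    return value & ((1 << (8 * n)) - 1)

def expand(stored, n, sign_extend):
    """Expand an n-byte stored field back to a full-width value."""
    sign_bit = 1 << (8 * n - 1)
    if sign_extend and stored & sign_bit:
        return stored - (1 << (8 * n))
    return stored

stored = store_low_bytes(-1, 1)               # all eight bits of the field are 1
print(expand(stored, 1, sign_extend=True))    # as for size specification (-1)
print(expand(stored, 1, sign_extend=False))   # as for size specification (1)
```

A stored field whose high-order bit is 0 expands to the same value either way.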
You cannot make a field larger than the natural size of its base type: if a base type is of size m, and n is greater than m, then an explicit size of n or -n for that type is a compiletime error. You cannot specify general unsigned arithmetic on data types that support only signed arithmetic: if the base type is INTEGER or LONG INTEGER, and the base type size is m, then a size of m is also a compiletime error, but -m is allowed (and has no effect, since it specifies the default behavior). If the base type is any type other than INTEGER or LONG INTEGER, then a negative size is prohibited, and an explicit size of m is legal and has no effect.
On platforms where the size of INTEGER is 2 bytes, 4 (or -4, in the case of LONG INTEGER) would be a legal size only for the LONG types (for which it currently has no effect, since these types are 4 bytes on all presently supported platforms).
In summary, MAINSAIL's guaranteed data type ranges in combination with the rules for explicit data type sizes imply the following:
| Data Type | Legal? |
|---|---|
| BOOLEAN(1) | always |
| BOOLEAN(2) | always |
| BOOLEAN(4) | on platforms where size(booleanCode) GEQ 4 |
| INTEGER(-1) | always |
| INTEGER(-2) | always |
| INTEGER(-4) | always |
| INTEGER(1) | always |
| INTEGER(2) | always |
| INTEGER(4) | illegal on all current platforms, but would be legal where size(integerCode) > 4 |
| LONG INTEGER(-1) | always |
| LONG INTEGER(-2) | always |
| LONG INTEGER(-4) | always |
| LONG INTEGER(1) | always |
| LONG INTEGER(2) | always |
| LONG INTEGER(4) | illegal on all current platforms, but would be legal where size(longIntegerCode) > 4 |
| BITS(1) | always |
| BITS(2) | always |
| BITS(4) | always |
| LONG BITS(1) | always |
| LONG BITS(2) | always |
| LONG BITS(4) | always |
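The rows of the table follow from the sizing rules stated above, which can be written as a small predicate. The Python sketch below (not MAINSAIL) is one reading of those rules; explicit_size_is_legal is a hypothetical name, and base_size stands for the size in bytes of the base type on the platform in question.

```python
def explicit_size_is_legal(n, base_size, integer_type):
    """Apply the sizing rules stated above to a size specification (n).

    n            -- the size specification; negative requests sign extension
    base_size    -- size in bytes of the base type on the platform (m)
    integer_type -- True for INTEGER and LONG INTEGER, False otherwise
    """
    if abs(n) not in (1, 2, 4):
        return False
    if abs(n) > base_size:
        return False                # cannot exceed the base type's size
    if integer_type:
        return n != base_size       # (m) would request unsigned arithmetic
    return n > 0                    # negative sizes only for integer types

# A 4-byte INTEGER: every listed size is legal except (4).
print([n for n in (-4, -2, -1, 1, 2, 4) if explicit_size_is_legal(n, 4, True)])
```

With base_size set to 4, the predicate rejects INTEGER(4) and accepts BOOLEAN(4) and BITS(4), matching the corresponding rows.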
The only types that are guaranteed to need zero-extension on all platforms are (LONG) INTEGER(1), LONG INTEGER(2), (LONG) BITS(1) and LONG BITS(2), since INTEGER and BITS are guaranteed to be at least 2 bytes, and LONG INTEGER and LONG BITS at least 4 bytes.
When a value is stored into an explicitly-sized variable that is smaller than its base type, the value must be truncated. If ACHECK is in effect, and the base type is (LONG) INTEGER, overflow is reported if the truncated value, when reexpanded to the size of the base type, would be different from the original value reexpanded to the size of the base type. High-order bits are silently discarded for a (LONG) BITS, whether or not ACHECK is in effect.
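The overflow condition described for (LONG) INTEGER can be expressed as a round trip: truncate, expand again, and compare with the original. The Python sketch below (not MAINSAIL) is a model of that condition only, assuming two's-complement representation; truncation_overflows is a hypothetical name, and the sketch says nothing about how ACHECK is implemented.

```python
def truncation_overflows(value, n, sign_extend):
    """True if storing value in an n-byte field and re-expanding changes it."""
    bits = 8 * n
    stored = value & ((1 << bits) - 1)      # keep the low-order n bytes
    if sign_extend and stored & (1 << (bits - 1)):
        stored -= 1 << bits                 # re-expand with sign extension
    return stored != value

# 200 fits a zero-extended 1-byte field but not a sign-extended one.
print(truncation_overflows(200, 1, sign_extend=False))
print(truncation_overflows(200, 1, sign_extend=True))
```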
In read(a,v) and write(a,v), the incrementing of a is not affected by the size of v; i.e., these procedures read or write a value at a of v's base type's size, then displace a by the base type's size. v's size affects only how v is represented in memory, not how the value at ADDRESS a is accessed. To load or store an explicitly-sized value from an ADDRESS, use the sized load and store procedures (see Sections 40.16.1 and 47.50), or use an explicitly sized record field (or inplace ARRAY element), as follows:
CLASS c (INTEGER(-2) f);
ADDRESS(c) a;
INTEGER v;
v := a.f; # load a 2-byte integer
a.f := v; # store a 2-byte integer
Elements must have the same base type, size, and sign extension in order for two inplace ARRAYs to be assignment compatible:
LONG INTEGER(1) $INPLACEARRAY(0 TO 10) ary1;
LONG INTEGER(2) $INPLACEARRAY(0 TO 10) ary2;
...
ary2[i] := ary1[i]; # legal
ary1[i] := ary2[i]; # legal, but could overflow
ary1 := ary2; # ILLEGAL
CLASS c (INTEGER(-1) i);
MODULE m (BITS(1) b);
BOOLEAN(1) ARRAY(*) ary;
BOOLEAN(1) PROCEDURE p (INTEGER(-1) i);
BEGIN
LONG INTEGER(1) ii;
OWN LONG BITS(1) bb;
...
END;
is currently treated as:
BOOLEAN PROCEDURE p (INTEGER i);
BEGIN
LONG INTEGER ii;
OWN LONG BITS bb;
...
END;
Note that declaring arrays of BOOLEAN to be arrays of BOOLEAN(1) will typically save space with little or no runtime penalty (depending on the relative efficiency of loading bytes and words on the processor in question).
| Temporary feature: subject to change |
Although any explicit size specified for local variables, parameters, PROCEDURE return values, outer variables, and local OWN variables is currently ignored (except for FLI parameters in MAINSAIL Version 16.29 and later), later versions of MAINSAIL may do something with explicit sizes specified in these contexts. Therefore, it is not advisable to specify explicit sizes unless those sizes actually correspond to the number of bits you intend to be represented in the value declared.
| Temporary feature: subject to change |
The GENERIC PROCEDURE selection mechanism (see Section 7.16) currently ignores explicit sizes associated with data types (other than array elements). Thus, given the following declarations:
PROCEDURE p1 (INTEGER(1) i);
PROCEDURE p2 (INTEGER(-2) i);
GENERIC PROCEDURE p "p1,p2";
p1 will be the only instance PROCEDURE ever chosen for p (because it comes first in p's instance PROCEDURE list), regardless of how or whether p's INTEGER argument is explicitly sized.
This behavior may change in future versions of MAINSAIL. In Version 16, XIDAK recommends against including a PROCEDURE with explicitly sized parameters in a GENERIC PROCEDURE instance list, as the semantics of doing so may change.
MAINSAIL Language Manual, Chapter 3