previous next top contents index framed top this page unframed
The sorting package is a program utility that provides the ability to sort one-dimensional dynamic ARRAYs based on the values of their elements, and facilities from which a programmer may generate sorting PROCEDUREs for other data structures. The sort may be an in-place sort (the elements of the ARRAY are rearranged) or an index sort (a parallel dynamic ARRAY of subscripts (indices) is rearranged to represent the sorted arrangement of the ARRAY to be sorted). The sorting algorithm used is a mixture of quicksort and bubble sort. PROCEDUREs are also provided to reverse the order of elements of a one-dimensional ARRAY or to rearrange them based on an index ARRAY.
The facilities in the sorting package are located in an intmod called SRTMOD. Therefore, to use the identifiers from it, either use the directive:
RESTOREFROM "srtmod";
or prefix each identifier described herein with srtmod$, e.g., srtmod$sort.
In the description of the PROCEDUREs in this chapter, where instances of a GENERIC PROCEDURE have analogous parameters of a number of different types, the notation dataType may be used in place of the actual data type names in the PROCEDURE headers. The description of the PROCEDURE enumerates the possible values for dataType.
31.1. Sorting Dynamic ARRAYs of Ordered Data Types
Ordered data types, for the purpose of this section,
are BOOLEAN, STRING,
(LONG) INTEGER, and (LONG) REAL.
In the case of BOOLEAN,
TRUE is considered greater than FALSE;
STRINGs are compared caselessly, as by the system PROCEDURE
compare with the upperCase bit set;
the other data types are ordered
as by the MAINSAIL comparison
operators.
PROCEDURE sort (dataType ARRAY(*) a;
OPTIONAL LONG INTEGER lb,ub);
PROCEDURE sort (dataType ARRAY(*) a;
OPTIONAL INTEGER lb,ub);
PROCEDURE sort (dataType ARRAY(*) a;
LONG INTEGER lb,ub;
MODIFIES LONG INTEGER
ARRAY(*) ind);
PROCEDURE sort (dataType ARRAY(*) a;
INTEGER lb,ub;
MODIFIES INTEGER ARRAY(*) ind);
dataType in the declaration of sort may be BOOLEAN, INTEGER, LONG INTEGER, REAL, LONG REAL, or STRING. All forms sort the elements of a from lb to ub, or from a.$lb1 to a.$ub1 if lb and ub are not specified. An error message is issued if lb and ub are specified and either is outside the bounds of the ARRAY.
The in-place forms of sort are those that take only one ARRAY parameter a. The elements of a are reordered so that every element has a value greater than or equal to the values at all lesser subscripts; i.e., greater values are at greater subscripts. An error occurs if a is Zero.
The index forms of sort are those that take an ARRAY parameter a and an ARRAY index ARRAY parameter ind. a is not modified. If ind is Zero or if its bounds do not equal the bounds of the subrange of a to be examined, it is reallocated to be the right size. The elements of ind are set to the ordered subscripts of a; i.e., successive elements of ind, when used as subscripts of a, result in a sorted series of values, greater values at greater subscripts of ind.
Index sorts are useful if several parallel ARRAYs with related values are to be sorted, e.g., ARRAYs of names and dates, where the date attached to a name has the same index in one ARRAY as the name in the other. Sorting each ARRAY in place would destroy the relationship between the names and the dates. An index sort does not modify the name and date ARRAYs, but can produce the ordering of the two ARRAYs based on either the name or the date. If the program of Example 31–2 is given the input file shown in Example 31–3, the output is as shown in Example 31–4.
Example 31–2. A Parallel Sorting Program
| BEGIN "parSrt" # Display a sorted listing of MODULEs along with dates of # creation. An unsorted list is read from a file in which # the first line has the number of MODULE names in the # file and each subsequent line has the name of a MODULE # followed by a date. RESTOREFROM "srtmod"; INITIAL PROCEDURE; BEGIN INTEGER i,j; STRING s; POINTER(textFile) f; INTEGER ARRAY(1 TO *) ind; LONG INTEGER ARRAY(1 TO *) date; STRING ARRAY(1 TO *) name; open(f,"Unsorted file: ",prompt!input); read(f,s); j := cvi(s); new(date,1,j); new(name,1,j); FOR i := 1 UPTO j DOB read(f,s); name[i] := $removeWord(s); date[i] := $strToDate(s) END; close(f); IF confirm("Sort by name (else by date)") THEN sort(name,1,j,ind) EL sort(date,1,j,ind); FOR i := 1 UPTO j DOB fldWrite(logFile,name[ind[i]],8,- ' '); write(logFile,$dateToStr(date[ind[i]],$hyphenateDate), eol) END; END; END "parSrt" |
Example 31–3. Input File parsrt.dat for PARSRT
| 8 AMOD 20-oct-86 ZMOD 20-jan-87 X 13-feb-87 PHOTO 12-jun-85 TIMER 9-oct-86 POST 11-jan-85 PRE 6-jul-86 INTER 9-aug-86 |
Example 31–4. PARSRT Execution
| *parsrt<eol> Unsorted file: parsrt.dat<eol> Sort by name (else by date) (Yes or No): y<eol> AMOD 20-Oct-86 INTER 9-Aug-86 PHOTO 12-Jun-85 POST 11-Jan-85 PRE 6-Jul-86 TIMER 9-Oct-86 X 13-Feb-87 ZMOD 20-Jan-87 *parsrt<eol> Unsorted file: parsrt.dat<eol> Sort by name (else by date) (Yes or No): n<eol> POST 11-Jan-85 PHOTO 12-Jun-85 PRE 6-Jul-86 INTER 9-Aug-86 TIMER 9-Oct-86 AMOD 20-Oct-86 ZMOD 20-Jan-87 X 13-Feb-87 |
The macro generateQuickSort is used to generate a GENERIC PROCEDURE to perform a sort with a user-defined ordering. This is useful, e.g., if the elements of the ARRAY to be sorted are records of which one or more fields must be examined to determine their relationship. generateQuickSort may be called wherever a PROCEDURE declaration is legal.
generateQuickSort takes five arguments as follows:
generateQuickSort (
procedureName,
elementType,
isLessThan,
isLEQ,
isEqual);
procedureName is the name of the GENERIC PROCEDURE to be generated. elementType is the data type of the elements of the ARRAY, e.g., POINTER(foo); the permissible data types are:
BOOLEAN INTEGER LONG INTEGER
REAL LONG REAL BITS
LONG BITS STRING ADDRESS
CHARADR POINTER $PROCVAR
where POINTER may be classified and $PROCVAR may be modeled. isLessThan, isLEQ, and isEqual are the names of BOOLEAN macros or PROCEDUREs that take two arguments of type elementType; they return TRUE if and only if the first element is, respectively, less than, less than or equal to, or equal to the second.
The generated GENERIC PROCEDURE declaration and the headers of the generated PROCEDUREs look like:
GENERIC PROCEDURE procedureName
"procedureNameInPlace,procedureNameInPlaceShort," &
"procedureNameIndexed,procedureNameIndexedShort";
PROCEDURE procedureNameInPlace
(elementType ARRAY(*) a;
OPTIONAL LONG INTEGER alb,aub);
PROCEDURE procedureNameInPlaceShort
(elementType ARRAY(*) a;
OPTIONAL INTEGER alb,aub);
PROCEDURE procedureNameIndexed
(elementType ARRAY(*) a;
LONG INTEGER alb,aub;
MODIFIES LONG INTEGER ARRAY(*) ind);
PROCEDURE procedureNameIndexedShort
(elementType ARRAY(*) a;
INTEGER alb,aub;
MODIFIES INTEGER ARRAY(*) ind);
Additional PROCEDUREs, the names of which begin with procedureName, may be generated but are not intended to be called directly.
The parameters a, alb, aub, and ind correspond to the parameters a, lb, ub, and ind to sort; provided that the arguments to generateQuickSort are well-formed and correct, the generated PROCEDURE functions exactly like sort except that the ordering criteria are provided by the programmer. For example, the standard REAL ARRAY forms of sort could be generated by:
DEFINE
rLessThan(a,b) = [((a) < (b))],
rLEQ(a,b) = [((a) LEQ (b))],
rEqual(a,b) = [((a) = (b))];
generateQuickSort(sort,REAL,rLessThan,rLEQ,rEqual);
The macro generateMultipleQuickSort may be used to generate a sort based on multiple ARRAYs, two- or three-dimensional ARRAYs, or non-ARRAY data structures. generateMultipleQuickSort takes four or five parameters as follows:
generateMultipleQuickSort (
procedureName,
isLessThan,
isLEQ,
isEqual);
or:
generateMultipleQuickSort (
procedureName,
isLessThan,
isLEQ,
isEqual,
LONG);
procedureName is the name of the GENERIC PROCEDURE to be generated. isLessThan, isLEQ, and isEqual are the names of BOOLEAN macros or PROCEDUREs that take two arguments of type INTEGER (if LONG is absent) or LONG INTEGER (if LONG is present); they return TRUE if and only if the (program-defined) quantity corresponding to the first argument is, respectively, less than, less than or equal to, or equal to the second.
If LONG is present, the generated GENERIC PROCEDURE declaration and the header of the generated PROCEDURE look like:
GENERIC PROCEDURE procedureName "procedureNameIndexed";
PROCEDURE procedureNameIndexed
(LONG INTEGER alb,aub;
MODIFIES LONG INTEGER ARRAY(*) ind);
If LONG is absent, the generated GENERIC PROCEDURE declaration and the header of the generated PROCEDURE look like:
GENERIC PROCEDURE procedureName "procedureNameIndexedShort";
PROCEDURE procedureNameIndexedShort
(INTEGER alb,aub;
MODIFIES INTEGER ARRAY(*) ind);
By calling generateMultipleQuickSort twice with the same procedureName, once with LONG present and once with it absent, a GENERIC PROCEDURE with two instances is generated. The generated PROCEDURE sets ind in the range alb to aub to reflect the ordering specified by isLessThan, isLEQ, and isEqual.
Example 31–5 shows how to sort two parallel ARRAYs a and b, with a as the primary and b the secondary sort. The ARRAYs a and b must be outer variables, since they cannot be passed as arguments to the generated sort PROCEDUREs.
Example 31–5. Use of generateMultipleQuickSort
| INTEGER ARRAY(*) a,b; DEFINE myLT(i,j) = [(a[i] < a[j] OR (a[i] = a[j] AND b[i] < b[j]))], myLEQ(i,j) = [(a[i] < a[j] OR (a[i] = a[j] AND b[i] LEQ b[j]))], myEQ(i,j) = [(a[i] = a[j] AND b[i] = b[j])]; generateMultipleQuickSort(myGenSort,myLT,myLEQ,myEQ); . . . PROCEDURE foo; BEGIN INTEGER ARRAY(*) ind; . . . myGenSort(1,1000,ind); # ind now contains the sorted subscripts. To reorder both # a and b according to ind: reorder(a,ind); reorder(b,ind); . . . |
Figure 31–6. reverse (GENERIC)
| PROCEDURE reverse (dataType ARRAY(*) a; OPTIONAL INTEGER lb,ub); PROCEDURE reverse (dataType ARRAY(*) a; OPTIONAL LONG INTEGER lb,ub); |
dataType in the declaration
of reverse may be any MAINSAIL data
type:
BOOLEAN, INTEGER, LONG INTEGER,
REAL, LONG REAL, BITS, LONG BITS,
STRING, ADDRESS, CHARADR, or POINTER.
reverse reverses an ARRAY a from lb to ub,
or from a.$lb1 to a.$ub1
if lb and ub are not specified.
The elements of a are rearranged to put the value at
the highest subscript at the lowest subscript and vice versa,
and the second highest at the second lowest and vice versa,
and so on.
An error occurs if a is Zero
or lb or ub falls outside the range
a.$lb1 to a.$ub1.
31.5. Reordering an ARRAY According to an Index ARRAY
Figure 31–7. reorder (GENERIC)
| PROCEDURE reorder (dataType ARRAY(*) a; INTEGER ARRAY(*) ind); PROCEDURE reorder (dataType ARRAY(*) a; LONG INTEGER ARRAY(*) ind); |
dataType in the declaration of reorder may be any MAINSAIL data type: BOOLEAN, INTEGER, LONG INTEGER, REAL, LONG REAL, BITS, LONG BITS, STRING, ADDRESS, CHARADR, or POINTER. reorder rearranges the elements of a so that the value at the subscript i is moved to the subscript ind[i] for i in the range ind.$lb1 to ind.$ub1. ind is not modified. An error occurs if a or ind is Zero or if the bounds of ind fall outside the bounds of a. The effect is undefined if ind is not a valid index ARRAY, i.e., does not contain exactly one occurrence of each value in the range ind.$lb1 to ind.$ub1.
MAINSAIL Utilities User's Guide, Chapter 31