MAINSAIL Utilities User's Guide, Chapter 31

previous   next   top   contents   index   framed top   this page unframed


31. SRTMOD, Sorting Package

The sorting package is a program utility that provides the ability to sort one-dimensional dynamic ARRAYs based on the values of their elements, and facilities from which a programmer may generate sorting PROCEDUREs for other data structures. The sort may be an in-place sort (the elements of the ARRAY are rearranged) or an index sort (a parallel dynamic ARRAY of subscripts (indices) is rearranged to represent the sorted arrangement of the ARRAY to be sorted). The sorting algorithm used is a mixture of quicksort and bubble sort. PROCEDUREs are also provided to reverse the order of elements of a one-dimensional ARRAY or to rearrange them based on an index ARRAY.

The facilities in the sorting package are located in an intmod called SRTMOD. Therefore, to use the identifiers from it, either use the directive:

RESTOREFROM "srtmod";

or prefix each identifier described herein with srtmod$, e.g., srtmod$sort.

In the description of the PROCEDUREs in this chapter, where instances of a GENERIC PROCEDURE have analogous parameters of a number of different types, the notation dataType may be used in place of the actual data type names in the PROCEDURE headers. The description of the PROCEDURE enumerates the possible values for dataType.

31.1. Sorting Dynamic ARRAYs of Ordered Data Types

Ordered data types, for the purpose of this section, are BOOLEAN, STRING, (LONG) INTEGER, and (LONG) REAL. In the case of BOOLEAN, TRUE is considered greater than FALSE; STRINGs are compared caselessly, as by the system PROCEDURE compare with the upperCase bit set; the other data types are ordered as by the MAINSAIL comparison operators.

Figure 31–1. sort (GENERIC)
PROCEDURE    sort       (dataType ARRAY(*) a;
                         
OPTIONAL LONG INTEGER lb,ub);

PROCEDURE    sort       (dataType ARRAY(*) a;
                         
OPTIONAL INTEGER lb,ub);

PROCEDURE    sort       (dataType ARRAY(*) a;
                         
LONG INTEGER lb,ub;
                         
MODIFIES LONG INTEGER
                             
ARRAY(*) ind);

PROCEDURE    sort       (dataType ARRAY(*) a;
                         
INTEGER lb,ub;
                         
MODIFIES INTEGER ARRAY(*) ind);

dataType in the declaration of sort may be BOOLEAN, INTEGER, LONG INTEGER, REAL, LONG REAL, or STRING. All forms sort the elements of a from lb to ub, or from a.$lb1 to a.$ub1 if lb and ub are not specified. An error message is issued if lb and ub are specified and either is outside the bounds of the ARRAY.

The in-place forms of sort are those that take only one ARRAY parameter a. The elements of a are reordered so that every element has a value greater than or equal to the values at all lesser subscripts; i.e., greater values are at greater subscripts. An error occurs if a is Zero.

The index forms of sort are those that take an ARRAY parameter a and an ARRAY index ARRAY parameter ind. a is not modified. If ind is Zero or if its bounds do not equal the bounds of the subrange of a to be examined, it is reallocated to be the right size. The elements of ind are set to the ordered subscripts of a; i.e., successive elements of ind, when used as subscripts of a, result in a sorted series of values, greater values at greater subscripts of ind.

Index sorts are useful if several parallel ARRAYs with related values are to be sorted, e.g., ARRAYs of names and dates, where the date attached to a name has the same index in one ARRAY as the name in the other. Sorting each ARRAY in place would destroy the relationship between the names and the dates. An index sort does not modify the name and date ARRAYs, but can produce the ordering of the two ARRAYs based on either the name or the date. If the program of Example 31–2 is given the input file shown in Example 31–3, the output is as shown in Example 31–4.

Example 31–2. A Parallel Sorting Program
BEGIN "parSrt"

Display a sorted listing of MODULEs along with dates of
creation.  An unsorted list is read from a file in which
the first line has the number of MODULE names in the
file and each subsequent line has the name of a MODULE
followed by a date.

RESTOREFROM "srtmod";

INITIAL PROCEDURE;
BEGIN
INTEGER i,j;
STRING s;
POINTER(textFilef;
INTEGER ARRAY(1 TO *) ind;
LONG INTEGER ARRAY(1 TO *) date;
STRING ARRAY(1 TO *) name;
open(f,"Unsorted file: ",prompt!input);
read(f,s); j := cvi(s); new(date,1,j); new(name,1,j);
FOR i := 1 UPTO j DOB
    
read(f,s); name[i] := $removeWord(s);
    
date[i] := $strToDate(sEND;
close(f);
IF confirm("Sort by name (else by date)") THEN
    
sort(name,1,j,ind)
EL  sort(date,1,j,ind);
FOR i := 1 UPTO j DOB
    
fldWrite(logFile,name[ind[i]],8,- ' ');
    
write(logFile,$dateToStr(date[ind[i]],$hyphenateDate),
        
eolEND;
END;

END "parSrt"

Example 31–3. Input File parsrt.dat for PARSRT
8
AMOD 20-oct-86
ZMOD 20-jan-87
X 13-feb-87
PHOTO 12-jun-85
TIMER 9-oct-86
POST 11-jan-85
PRE 6-jul-86
INTER 9-aug-86

Example 31–4. PARSRT Execution
*parsrt<eol>
Unsorted fileparsrt.dat<eol>
Sort by name (else by date) (Yes or No): y<eol>
AMOD    20-Oct-86
INTER   9-Aug-86
PHOTO   12-Jun-85
POST    11-Jan-85
PRE     6-Jul-86
TIMER   9-Oct-86
X       13-Feb-87
ZMOD    20-Jan-87
*
parsrt<eol>
Unsorted fileparsrt.dat<eol>
Sort by name (else by date) (Yes or No): n<eol>
POST    11-Jan-85
PHOTO   12-Jun-85
PRE     6-Jul-86
INTER   9-Aug-86
TIMER   9-Oct-86
AMOD    20-Oct-86
ZMOD    20-Jan-87
X       13-Feb-87

31.2. Sorting ARRAYs with User-Defined Ordering

The macro generateQuickSort is used to generate a GENERIC PROCEDURE to perform a sort with a user-defined ordering. This is useful, e.g., if the elements of the ARRAY to be sorted are records of which one or more fields must be examined to determine their relationship. generateQuickSort may be called wherever a PROCEDURE declaration is legal.

generateQuickSort takes five arguments as follows:

generateQuickSort (
    
procedureName,
    
elementType,
    
isLessThan,
    
isLEQ,
    
isEqual);

procedureName is the name of the GENERIC PROCEDURE to be generated. elementType is the data type of the elements of the ARRAY, e.g., POINTER(foo); the permissible data types are:

BOOLEAN         INTEGER         LONG INTEGER
REAL            LONG REAL       BITS
LONG BITS       STRING          ADDRESS
CHARADR         POINTER         $PROCVAR

where POINTER may be classified and $PROCVAR may be modeled. isLessThan, isLEQ, and isEqual are the names of BOOLEAN macros or PROCEDUREs that take two arguments of type elementType; they return TRUE if and only if the first element is, respectively, less than, less than or equal to, or equal to the second.

The generated GENERIC PROCEDURE declaration and the headers of the generated PROCEDUREs look like:

GENERIC PROCEDURE procedureName
    "
procedureNameInPlace,procedureNameInPlaceShort," &
    "
procedureNameIndexed,procedureNameIndexedShort";

PROCEDURE procedureNameInPlace
        (
elementType ARRAY(*) a;
         
OPTIONAL LONG INTEGER alb,aub);

PROCEDURE procedureNameInPlaceShort
        (
elementType ARRAY(*) a;
         
OPTIONAL INTEGER alb,aub);

PROCEDURE procedureNameIndexed
        (
elementType ARRAY(*) a;
         
LONG INTEGER alb,aub;
         
MODIFIES LONG INTEGER ARRAY(*) ind);

PROCEDURE procedureNameIndexedShort
        (
elementType ARRAY(*) a;
         
INTEGER alb,aub;
         
MODIFIES INTEGER ARRAY(*) ind);

Additional PROCEDUREs, the names of which begin with procedureName, may be generated but are not intended to be called directly.

The parameters a, alb, aub, and ind correspond to the parameters a, lb, ub, and ind to sort; provided that the arguments to generateQuickSort are well-formed and correct, the generated PROCEDURE functions exactly like sort except that the ordering criteria are provided by the programmer. For example, the standard REAL ARRAY forms of sort could be generated by:

DEFINE
    
rLessThan(a,b) = [((a) < (b))],
    
rLEQ(a,b) = [((aLEQ (b))],
    
rEqual(a,b) = [((a) = (b))];
generateQuickSort(sort,REAL,rLessThan,rLEQ,rEqual);

31.3. Multiple-ARRAY or Non-ARRAY Sorting

The macro generateMultipleQuickSort may be used to generate a sort based on multiple ARRAYs, two- or three-dimensional ARRAYs, or non-ARRAY data structures. generateMultipleQuickSort takes four or five parameters as follows:

generateMultipleQuickSort (
    
procedureName,
    
isLessThan,
    
isLEQ,
    
isEqual);

or:

generateMultipleQuickSort (
    
procedureName,
    
isLessThan,
    
isLEQ,
    
isEqual,
    
LONG);

procedureName is the name of the GENERIC PROCEDURE to be generated. isLessThan, isLEQ, and isEqual are the names of BOOLEAN macros or PROCEDUREs that take two arguments of type INTEGER (if LONG is absent) or LONG INTEGER (if LONG is present); they return TRUE if and only if the (program-defined) quantity corresponding to the first argument is, respectively, less than, less than or equal to, or equal to the second.

If LONG is present, the generated GENERIC PROCEDURE declaration and the header of the generated PROCEDURE look like:

GENERIC PROCEDURE procedureName "procedureNameIndexed";

PROCEDURE procedureNameIndexed
    (
LONG INTEGER alb,aub;
     
MODIFIES LONG INTEGER ARRAY(*) ind);

If LONG is absent, the generated GENERIC PROCEDURE declaration and the header of the generated PROCEDURE look like:

GENERIC PROCEDURE procedureName "procedureNameIndexedShort";

PROCEDURE procedureNameIndexedShort
    (
INTEGER alb,aub;
     
MODIFIES INTEGER ARRAY(*) ind);

By calling generateMultipleQuickSort twice with the same procedureName, once with LONG present and once with it absent, a GENERIC PROCEDURE with two instances is generated. The generated PROCEDURE sets ind in the range alb to aub to reflect the ordering specified by isLessThan, isLEQ, and isEqual.

Example 31–5 shows how to sort two parallel ARRAYs a and b, with a as the primary and b the secondary sort. The ARRAYs a and b must be outer variables, since they cannot be passed as arguments to the generated sort PROCEDUREs.

Example 31–5. Use of generateMultipleQuickSort
INTEGER ARRAY(*) a,b;

DEFINE
    
myLT(i,j) =
        [(
a[i] < a[jOR (a[i] = a[jAND b[i] < b[j]))],
    
myLEQ(i,j) =
        [(
a[i] < a[jOR
            (
a[i] = a[jAND b[iLEQ b[j]))],
    
myEQ(i,j) =
        [(
a[i] = a[jAND b[i] = b[j])];

generateMultipleQuickSort(myGenSort,myLT,myLEQ,myEQ);

        .
        .
        .

PROCEDURE foo;
BEGIN
INTEGER ARRAY(*) ind;
        .
        .
        .
myGenSort(1,1000,ind);
ind now contains the sorted subscripts.  To reorder both
a and b according to ind:
reorder(a,ind); reorder(b,ind);
        .
        .
        .

31.4. Reversing an ARRAY

Figure 31–6. reverse (GENERIC)
PROCEDURE   reverse     (dataType ARRAY(*) a;
                         
OPTIONAL INTEGER lb,ub);

PROCEDURE   reverse     (dataType ARRAY(*) a;
                         
OPTIONAL LONG INTEGER lb,ub);

dataType in the declaration of reverse may be any MAINSAIL data type: BOOLEAN, INTEGER, LONG INTEGER, REAL, LONG REAL, BITS, LONG BITS, STRING, ADDRESS, CHARADR, or POINTER. reverse reverses an ARRAY a from lb to ub, or from a.$lb1 to a.$ub1 if lb and ub are not specified. The elements of a are rearranged to put the value at the highest subscript at the lowest subscript and vice versa, and the second highest at the second lowest and vice versa, and so on. An error occurs if a is Zero or lb or ub falls outside the range a.$lb1 to a.$ub1.

31.5. Reordering an ARRAY According to an Index ARRAY

Figure 31–7. reorder (GENERIC)
PROCEDURE   reorder     (dataType ARRAY(*) a;
                         
INTEGER ARRAY(*) ind);

PROCEDURE   reorder     (dataType ARRAY(*) a;
                         
LONG INTEGER ARRAY(*) ind);

dataType in the declaration of reorder may be any MAINSAIL data type: BOOLEAN, INTEGER, LONG INTEGER, REAL, LONG REAL, BITS, LONG BITS, STRING, ADDRESS, CHARADR, or POINTER. reorder rearranges the elements of a so that the value at the subscript i is moved to the subscript ind[i] for i in the range ind.$lb1 to ind.$ub1. ind is not modified. An error occurs if a or ind is Zero or if the bounds of ind fall outside the bounds of a. The effect is undefined if ind is not a valid index ARRAY, i.e., does not contain exactly one occurrence of each value in the range ind.$lb1 to ind.$ub1.


previous   next   top   contents   index   framed top   this page unframed

MAINSAIL Utilities User's Guide, Chapter 31