MAINSAIL Language Manual, Chapter 25



25. Areas

Areas provide a method for more precise control of memory management. Correct explicit use of areas is relatively complex, and requires considerable care in order to avoid introducing bugs that may be difficult to track. Programmers need not understand areas in order to write correct programs, and programs that do not allocate significant amounts of memory will not benefit from explicit use of areas. This chapter may therefore be considered optional reading for programmers not involved in writing or maintaining such programs.

An area is a dynamically growing collection of “dynamic objects” or “chunks” (collectable data structures, i.e., dynamic records, dynamic ARRAYs, and data sections) and STRING text, which can be treated as a unit. An area can be disposed, which immediately causes all of the area's memory (an integral number of pages) to be released; this simplifies the task of the MAINSAIL runtime system's memory manager, which can more efficiently reclaim free pages than free chunks (unallocated or deallocated memory from which dynamic objects may be allocated) or inaccessible text in STRING space. The memory management algorithms applied to an area can be explicitly specified by a program, so that unnecessary memory management does not take place. In some programs, the explicit use of areas can substantially improve the efficiency of memory management; however, bugs introduced by the improper use of areas can be very difficult to track.

25.1. Examples and Motivation

The exact details of memory management and the MAINSAIL compiler implied by this section are subject to change, although the explanation of the advantages of areas is expected to be broadly applicable to future releases of MAINSAIL.

An area could contain a data structure, such as a linked-list structure, that is built up, processed, and then no longer needed. If all of the structure's data are allocated in a single area, disposing the area when the structure is no longer needed frees all the memory used by the structure in a single operation. This has several advantages over individually disposing each dynamic object:

If areas are not explicitly used and the dynamic objects are disposed individually, they are put on free lists (based on size). Even if two free chunks are contiguous, they are not coalesced until a chunk garbage collection occurs. Disposing a large number of dynamic objects results in a large number of free chunks, rather than free pages. If another data structure consisting of dynamic objects of the same size is built up soon afterward, then the free chunks are reallocated, and the free lists have played their role well. However, if different-sized objects or nonchunk data (e.g., STRING space, control sections, file buffers, etc.) are needed, the free chunks are wasting space, and a garbage collection is needed to coalesce the chunks and free the pages consisting only of free chunks.

By contrast, freeing an entire area immediately frees all the pages, making them available for any kind of use.
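The contrast can be illustrated with a toy model. The Python sketch below is not MAINSAIL code and does not reflect the actual memory manager; the class names, the 4096-byte page size, and the chunk sizes are assumptions chosen only for illustration. It shows individually disposed 32-byte chunks sitting on a size-based free list, where a later 64-byte request cannot use them, while disposing a whole area gives back its pages in one step.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # assumed page size in bytes, for illustration only


class FreeListHeap:
    """Toy heap: individually disposed chunks go on per-size free lists."""

    def __init__(self):
        self.free_lists = defaultdict(list)  # chunk size -> free chunk offsets
        self.next_offset = 0

    def allocate(self, size):
        if self.free_lists[size]:             # only an exact-size chunk is reused
            return self.free_lists[size].pop()
        offset = self.next_offset             # otherwise the heap grows
        self.next_offset += size
        return offset

    def dispose(self, offset, size):
        self.free_lists[size].append(offset)  # adjacent chunks are not coalesced

    def pages_in_use(self):
        return -(-self.next_offset // PAGE_SIZE)  # ceiling division


class ToyArea:
    """Toy area: everything in it is released together."""

    def __init__(self):
        self.used = 0

    def allocate(self, size):
        offset = self.used
        self.used += size
        return offset

    def pages(self):
        return -(-self.used // PAGE_SIZE)

    def dispose(self):
        released = self.pages()               # whole pages become free at once
        self.used = 0
        return released


heap = FreeListHeap()
chunks = [heap.allocate(32) for _ in range(256)]   # 8192 bytes, two pages
for offset in chunks:
    heap.dispose(offset, 32)
heap.allocate(64)                  # cannot reuse the 32-byte free chunks
heap_pages = heap.pages_in_use()   # the heap had to grow into a third page

area = ToyArea()
for _ in range(256):
    area.allocate(32)
released_pages = area.dispose()    # both pages are released together
```

In this model the free-list heap ends up holding three pages although nothing but the 64-byte chunk is live, whereas the area returns its two pages for any kind of use.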

An area can also be used as a “bag” containing unrelated objects and data structures, all of which should be disposed at the same time. A program (e.g., the MAINSAIL compiler) can allocate all its dynamic objects in the same area, then dispose the area when it is finished; this reclaims the whole bag. This is an especially convenient way to clean up when an exception occurs that causes a program to abort. For example, the MAINSAIL compiler disposes several areas when a compilation is aborted; in this way, the code to deallocate all current data structures can be simple and does not need to understand in detail any of the data structures.

25.2. Area Facilities

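Any number of areas can be created during a MAINSAIL execution. Facilities are provided for allocating dynamic objects and STRING text in specified areas, and system PROCEDUREs are provided to create, clear, and dispose areas and to specify their memory management attributes, as described in the following sections.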

25.2.1. Allocation, Clearing, and Disposal

An area is specified by a POINTER to a record of the CLASS $area, which contains information about an area. $newArea, which creates areas, returns a POINTER to the $area record for the newly created area, which is “empty” (it actually contains the $area record itself and some supporting data structures). A name (or “title”) may be given to an area when it is created, to help identify the area in various situations during execution.

An area automatically grows to accommodate new dynamic objects and STRING text. The memory pages occupied by an area are not necessarily contiguous. There is a single default area, $defaultArea (with the title "Default area"), into which all objects and STRING text are put unless explicitly specified otherwise (so that programs that make no explicit use of areas use only $defaultArea). Another area, entitled "Dscr area", contains all dynamic object descriptors. Other areas may be created by the MAINSAIL runtime system or utilities. The effect of disposing or clearing any system areas is undefined. The titles of the system areas are subject to change.

Clearing an area returns it to its state immediately after allocation (i.e., empty except for the $area record and supporting structures). This is useful when the current contents of an area are no longer needed, but more data are to be put into the area. It is slightly more efficient than disposing and reallocating the area; also, if several POINTERs are pointing at the $area record, all of them would have to be made to point to the new $area record if the old area were disposed and then re-allocated (since the $area record is also disposed when the area is disposed). The STRING space part of an area can be cleared separately with $clearStrSpc.
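The point about POINTERs to the $area record can be pictured with a small Python model. This is only a sketch: AreaRecord, clear_area, and dispose_and_recreate are hypothetical names invented for the illustration and do not correspond to MAINSAIL PROCEDUREs.

```python
class AreaRecord:
    """Toy stand-in for an $area record."""

    def __init__(self, title):
        self.title = title
        self.objects = []
        self.disposed = False


def clear_area(area):
    area.objects.clear()            # same record, contents gone


def dispose_and_recreate(area):
    area.objects.clear()
    area.disposed = True            # the old record goes away with its area
    return AreaRecord(area.title)   # the replacement is a different record


work = AreaRecord("work")
alias = work                        # a second reference to the same record
work.objects.append("tree")

clear_area(work)
alias_ok_after_clear = (alias is work) and not alias.disposed

work = dispose_and_recreate(work)
alias_stale_after_dispose = alias.disposed and (alias is not work)
```

After clearing, every existing reference still denotes the live area; after disposing and re-creating, each reference such as alias would have to be updated by hand.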

Most system PROCEDUREs that generate STRING text can be given an $area parameter to specify the area into which the text is to be put. In most cases, the text is put into the area only if new text is generated; otherwise, the text remains in the area where it started out (see Section 29.4 for examples). Alternatively, in the case of a file, the file record's $strArea field can be set to an area that is to receive input text by default:

STRING s;
POINTER(textFile) f;
POINTER($area) myArea;
...
f.$strArea := myArea;
read(f,s); # input text goes into myArea

25.2.2. Specifying Memory Management Attributes of an Area

Chunk collections, chunk compactions, and STRING collections are performed only on areas with memory management attributes permitting these operations. The memory management attributes of an area are established when the area is allocated by $newArea.

Chunk collection of an area is useful when many dynamic objects become inaccessible, i.e., when many objects are not referenced by any accessible POINTER. Chunk compaction is useful when dynamic objects become fragmented by a mixture of free and allocated chunks, and such fragmentation causes inefficient use of memory due to the inability to find free chunks to satisfy allocation requests (if all chunks in an area are the same size, fragmentation is not a problem). STRING collection (which marks all accessible text, then compacts it within STRING space) is useful when there is a lot of inaccessible STRING text, i.e., text not referenced by any accessible STRING descriptor (the CHARADR and length that represent a STRING). Each of these operations can take quite some time in a large address space, and can be particularly slow when virtual memory must be swapped in from disk, since they involve an essentially random pattern of many memory accesses. It saves time to avoid collections and compactions that do not reclaim much space.

By default, when an area is allocated, it is not marked as collectable or compactible. The assumption is that an area managed by a single program typically does not contain enough garbage, or become sufficiently fragmented, to justify the time spent collecting or compacting. It is expected that the programmer knows enough about the anticipated use of the area to know whether this assumption holds. For example, an area used to build up a binary tree, process it, and then dispose it may never contain any garbage, or even any free chunks. If the assumption that collections and compactions are unnecessary in an area does not hold, but the programmer does not contradict the default assumption by setting the correct bits in the call to $newArea that allocates the area, then the area could build up enough garbage to cause the MAINSAIL execution to abort due to lack of memory.

The programmer can control the memory management attributes of an area by specifying the following predefined LONG BITS constants for $newArea's attr argument:

attr Bit              Description
$collectableChkSpc    collect area's chunks
$compactableChkSpc    compact area's chunks
$collectableStrSpc    collect area's STRING text

For example, $defaultArea is allocated as if by the call:

$newArea("Default area",
    $collectableChkSpc!$compactableChkSpc!$collectableStrSpc)

since collections and compactions occur in $defaultArea.

By default, even if an area A is not marked $collectableChkSpc or $compactableChkSpc, all the POINTERs it contains must be examined when one or more other areas are being chunk collected or compacted in order to determine whether A has any POINTERs into any of the collected or compacted areas (each dynamic object that A references must be marked as accessible). Likewise, for STRING collections, all A's STRINGs must be examined to determine whether they reference STRING-collected areas. However, the programmer may know that a particular area contains no POINTERs into any area that could be chunk collected or chunk compacted, or no STRING variable referencing text in any area that could be STRING collected. In this case, the area need not be examined at all when a collection or compaction takes place (provided the area is not itself collectable or compactible); this may be specified in attr bits to $newArea, which may result in significantly faster collections and compactions. For example, suppose an area contains a symbol table that consists of records in which all the POINTER and STRING fields reference chunks and STRING text in the same area. A garbage collection or compaction can safely ignore the area, assuming the area itself is marked so as not to take part in the collection or compaction. As another example, an area may consist entirely of records with no STRING fields, so that there is never a need to examine the area when collecting STRINGs.

The following predefined LONG BITS constants for $newArea's attr argument are provided for indicating this kind of information:

attr Bit              Description
$noCollectablePtrs    This area contains no POINTERs into $collectableChkSpc areas, and hence its POINTERs need not be examined when collecting chunks.
$noCompactablePtrs    This area contains no POINTERs into $compactableChkSpc areas, and hence its POINTERs need not be examined when compacting chunks.
$noCollectableStrs    This area contains no STRING descriptors into $collectableStrSpc areas, and hence its STRING descriptors need not be examined when collecting STRINGs.

Be very careful when specifying these bits; if the conditions specified by the bits do not hold, effects are undefined and the resulting bugs can be strange, unpredictable, and delayed, and thus difficult to track down.
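To see why a false promise is dangerous, consider the following simplified Python model of a chunk collection. It is a sketch under stated assumptions, not the MAINSAIL collector: the Area and Obj classes and the collect function are hypothetical, and the marking strategy (treat every object in a noncollectable area as live and scan its POINTERs, unless the area is flagged as having no POINTERs into collectable areas) is a simplification of the behavior described above.

```python
class Area:
    def __init__(self, title, collectable=False, no_collectable_ptrs=False):
        self.title = title
        self.collectable = collectable
        self.no_collectable_ptrs = no_collectable_ptrs
        self.objects = []


class Obj:
    def __init__(self, name, area):
        self.name = name
        self.area = area
        self.refs = []              # outgoing POINTERs
        area.objects.append(self)


def collect(areas, roots):
    """Return the names of objects reclaimed from collectable areas."""
    pending = list(roots)
    for area in areas:
        # Noncollectable areas are scanned unless they promise that they
        # hold no POINTERs into collectable areas.
        if not area.collectable and not area.no_collectable_ptrs:
            pending.extend(area.objects)
    marked = set()
    while pending:
        obj = pending.pop()
        if id(obj) in marked:
            continue
        marked.add(id(obj))
        if obj.area.no_collectable_ptrs:
            continue                # flagged areas are never examined
        pending.extend(obj.refs)
    reclaimed = []
    for area in areas:
        if area.collectable:
            reclaimed += [o.name for o in area.objects if id(o) not in marked]
            area.objects = [o for o in area.objects if id(o) in marked]
    return reclaimed


def demo(promise_no_ptrs):
    heap = Area("heap", collectable=True)
    table = Area("table", no_collectable_ptrs=promise_no_ptrs)
    entry = Obj("entry", table)
    node = Obj("node", heap)
    entry.refs.append(node)         # a POINTER from table into heap
    return collect([heap, table], roots=[])


honest = demo(False)      # table is scanned, so node survives
violating = demo(True)    # table is skipped, so node is reclaimed
```

In the violating case the object named node is reclaimed even though entry still points to it, which leaves entry holding a dangling POINTER; in a real program the damage would surface only later, when that POINTER is used.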

Figure 25–1. POINTER Assignment That Violates the Bits Specified in $newArea's attr Argument
<diagram>

An area that contains a data section must not be marked $noCollectablePtrs or $noCompactablePtrs.

If you wish to change the memory management attributes of an area after it has been created, call $changeAreaParms.

25.3. Inaccessible Areas

Areas are not disposed by the garbage collector, even if no accessible POINTERs or STRING descriptors point into them, since such areas are still accessible by title, using $findArea. It is the programmer's responsibility to dispose an area that is no longer needed.

25.4. Anchored Areas

Values of the data types POINTER and STRING must be used with care if they are to be shared with a foreign language. POINTERs and STRINGs take part in garbage collection, and can reference data that are moved by the MAINSAIL memory manager. If the foreign language obtains a POINTER or STRING and holds onto it while MAINSAIL has an opportunity to do memory management, the data referenced by the POINTER or STRING must be allocated on static pages or in an anchored area that is not subject to garbage collection or chunk compaction. Otherwise, the data referred to by an address the foreign language holds may move, thereby invalidating the address. Foreign language code has no way to know that the address is no longer valid.

Anchored areas allow you to prevent the MAINSAIL memory manager from moving pages within a specified area. This makes the area suitable for allocating data that can be passed to a foreign language, since you do not have to worry that the MAINSAIL memory manager will move the data, thereby invalidating addresses held by foreign code. You can allocate and manipulate the data using all the high-level facilities of MAINSAIL.

As an alternative to the use of anchored areas, you can allocate all data to be passed to a foreign language either within the foreign language itself or in MAINSAIL's static memory. In either case, you have to manipulate the memory from within MAINSAIL using ADDRESSes instead of POINTERs and ARRAYs, because ADDRESSes are the data type returned by PROCEDUREs that allocate static memory. The use of ADDRESSes is syntactically clumsy and cannot benefit from MAINSAIL's automatic checking (e.g., for NULLPOINTER, NULLARRAY, or ARRAY index out of bounds), so anchored areas are usually an easier solution.

25.4.1. When to Use Anchored Areas

You should anchor an area only when it is to contain data structures whose addresses are passed to foreign code; otherwise, you may interfere with the MAINSAIL memory management algorithm in a way that adversely affects MAINSAIL's performance.

25.4.2. How to Anchor an Area

To anchor an area, pass the bit $anchored as the second parameter to $newArea or to $changeAreaParms:

$newArea("Area title",$anchored); # Create anchored area
$changeAreaParms(area,$anchored); # Anchor existing area

To unanchor an anchored area, pass $anchored as the third parameter to $changeAreaParms:

$changeAreaParms(area,'0L,$anchored); # Unanchor area

When you want to pass MAINSAIL data to a foreign routine, create an anchored area and allocate your data in the area by using the area parameter to MAINSAIL system PROCEDUREs that allocate memory.

25.4.3. Compatibility of $anchored with Other Area Attributes

It does not make sense for an anchored area to have the bits $collectableChkSpc, $compactableChkSpc, or $collectableStrSpc set (these bits are turned off by default when you call $newArea with no attr bits specified). Turning on $compactableChkSpc would defeat the purpose of turning on $anchored, since individual dynamic objects can still be moved around in the area even though entire pages cannot be moved. Turning on $collectableChkSpc or $collectableStrSpc would risk collecting data that the foreign language may be referencing if the foreign code holds the only reference to the data. Consequently, $newArea and $changeAreaParms issue an error message if you have any of these bits set in conjunction with $anchored, and turn off the bits that conflict with $anchored.
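The rule can be modeled in a few lines of Python. This is an illustrative sketch only: the numeric bit values, the function names, and the order in which bits are set and cleared are assumptions, and the real $newArea and $changeAreaParms may behave differently in detail.

```python
COLLECTABLE_CHK_SPC = 0x1   # bit values are assumed, not MAINSAIL's
COMPACTABLE_CHK_SPC = 0x2
COLLECTABLE_STR_SPC = 0x4
ANCHORED = 0x8

CONFLICTING = COLLECTABLE_CHK_SPC | COMPACTABLE_CHK_SPC | COLLECTABLE_STR_SPC


def resolve(attr):
    """Return (effective attr, messages) after applying the anchoring rule."""
    messages = []
    if attr & ANCHORED and attr & CONFLICTING:
        messages.append("attribute bits conflict with anchored; turned off")
        attr &= ~CONFLICTING
    return attr, messages


def change_parms(current, bits_to_set=0, bits_to_clear=0):
    # Assumed order: set the requested bits first, then clear.
    return resolve((current | bits_to_set) & ~bits_to_clear)


attr, messages = resolve(ANCHORED | COMPACTABLE_CHK_SPC)
unanchored, _ = change_parms(ANCHORED, bits_to_clear=ANCHORED)
```

Here the request for an anchored, compactable area yields one message and an area that is anchored only, and clearing the anchored bit returns the area to an unanchored state.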

25.4.4. Preallocated Space for Areas

Anchoring an area does not automatically cause its pages to be allocated contiguously within memory unless space for the area is preallocated. If sufficient space is not preallocated, then allocating many small data structures within an anchored area at different times during your program's execution may cause memory fragmentation problems. The area spreads across memory as it grows, often one page at a time. This can lead to a “checkerboard” effect of pages that cannot be moved interspersed with pages outside the area that can become free but are “trapped” among the anchored area pages. The trapped pages cannot be coalesced when a large contiguous piece of memory is needed. This can make the memory manager run slower or cause your program to terminate prematurely if it attempts to allocate a large ARRAY.

To solve this problem, $newArea and $changeAreaParms provide OPTIONAL parameters that let you preallocate space for an area. As the area grows, it first uses any preallocated space, then when that is exhausted it either preallocates additional space or uses nonpreallocated space, based on parameters described below.

You should preallocate space for any anchored area that will contain many different dynamic objects or STRINGs, and could therefore grow piecemeal in such a way as to cause memory management problems. Appropriate use of preallocated space requires that you make a reasonably good guess as to how much space will be needed by an area, though this guess need not be perfectly accurate, since the area will grow as necessary in any case.

When an area is created, you can specify an initial amount of preallocated space to be set aside for the area, and an increment size to be allocated whenever additional space is needed. A contiguous region of memory of the requested size is set aside to be used as needed. Even if the area does not need all the space, it is nevertheless set aside until the area is disposed, or until you explicitly free it by calling $changeAreaParms with an incrementalPreallocatedSpace value of -1L.

Preallocated space is used for both chunk- and STRING-space pages.
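The "checkerboard" effect and the benefit of preallocation can be pictured with a toy page map. The Python sketch below is an assumption-laden illustration, not a description of the MAINSAIL memory manager: memory is modeled as a simple sequence of pages, the anchored area grows one page at a time, and an unrelated page is allocated after each growth step.

```python
def simulate(preallocated_pages, rounds=4):
    """Return a page map: 'A' = anchored area page, 'o' = other page."""
    pages = ["A"] * preallocated_pages   # contiguous region set aside up front
    unused = preallocated_pages
    for _ in range(rounds):
        if unused > 0:
            unused -= 1                  # grow into preallocated space
        else:
            pages.append("A")            # grow by one page at the end of memory
        pages.append("o")                # unrelated allocation in between
    return "".join(pages)


def largest_free_run(page_map):
    """Longest run of free pages once all the 'o' pages have been released."""
    freed = page_map.replace("o", ".")
    return max(len(run) for run in freed.split("A"))


without_prealloc = simulate(0)   # 'AoAoAoAo'
with_prealloc = simulate(4)      # 'AAAAoooo'
```

Without preallocation, freeing the other pages leaves only isolated single free pages trapped between anchored ones; with four pages preallocated, the same freed pages form one contiguous run of four.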

The headers for $newArea and $changeAreaParms are:

POINTER($area)
PROCEDURE   $newArea    (OPTIONAL STRING title;
                         OPTIONAL LONG BITS attr;
                         OPTIONAL LONG INTEGER
                             loStrSpcChars,hiStrSpcChars,
                             initialPreallocatedSpace,
                             incrementalPreallocatedSpace);

PROCEDURE   $changeAreaParms
                        (POINTER($area) areaRec;
                         LONG BITS attrBitsToSet;
                         OPTIONAL LONG BITS
                             attrBitsToClear;
                         OPTIONAL LONG INTEGER
                             loStrSpcChars,hiStrSpcChars,
                             incrementalPreallocatedSpace);

In $newArea, initialPreallocatedSpace is the number of bytes set aside in the initial preallocation of space for the area, and incrementalPreallocatedSpace is the number of bytes of contiguous space allocated each time the area is expanded. Both numbers are rounded up to whole pages. incrementalPreallocatedSpace is ignored if initialPreallocatedSpace is less than or equal to 0L. If incrementalPreallocatedSpace is 0L, no additional preallocated space is obtained once the initial amount is used up, though the increment size can be changed by $changeAreaParms.

In $changeAreaParms, if incrementalPreallocatedSpace is nonzero, it changes the amount of memory requested the next time the area is expanded. To turn off further preallocation, you can set incrementalPreallocatedSpace to any value less than or equal to cvli($pageSize) (preallocating one page has no effect, since that is the minimum amount of memory added to an area when the area expands in any case). To discard any unused preallocated space, set incrementalPreallocatedSpace to -1L (you should do this when you know you are done allocating data in the area to avoid wasting space).
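The arithmetic in the two preceding paragraphs can be sketched in Python. This is one reading of the text, not the actual implementation: the 4096-byte page size is an assumption (a MAINSAIL program would use $pageSize), the function names are hypothetical, and treating -1L as a special case ahead of the "less than or equal to one page" rule is an interpretation.

```python
PAGE_SIZE = 4096  # assumed; a MAINSAIL program would use $pageSize


def round_up_to_pages(n_bytes):
    """Round a byte count up to a whole number of pages, as $newArea does."""
    return -(-n_bytes // PAGE_SIZE) * PAGE_SIZE


def interpret_increment(value):
    """Classify incrementalPreallocatedSpace as passed to $changeAreaParms."""
    if value == 0:
        return "leave increment unchanged"
    if value == -1:
        return "discard unused preallocated space"
    if value <= PAGE_SIZE:
        return "turn off further preallocation"
    return "request %d bytes at next expansion" % value


rounded = round_up_to_pages(10000)   # three pages under the assumed page size
```

For example, under the assumed page size a request for 10000 bytes is rounded up to three pages, and an increment of exactly one page is treated the same as turning preallocation off.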

25.5. Area Caveats

Read this section carefully before making use of areas!

When an area is disposed or cleared, all POINTERs and STRINGs outside the area that reference data inside the area are dangling, i.e., reference invalid data (when just the STRING space part of the area is cleared, only STRINGs referencing the area are dangling). The effects of using a dangling POINTER or STRING are undefined, just as in the case of a dangling POINTER that points to a dynamic object that has been individually disposed. The MAINSAIL memory manager has been designed to ignore dangling POINTERs and STRINGs; in particular, the garbage collector does not fail if it encounters a dangling POINTER or STRING.

Figure 25–2. Dangling POINTERs and STRINGs after an Area Is Disposed
<diagram>

A program must not use a dangling POINTER or STRING. Whenever a dynamic object or STRING text is allocated in an area, the programmer must be certain that it does not outlive the area, i.e., that the program does not attempt to reference the object or the STRING text after the area has been disposed.

Special care must be taken when putting STRING text into an area, to avoid dangling STRINGs. The effects are undefined if you assign a STRING system variable a value in an area, and then dispose the area. Also, if you pass a STRING as a field of a dynamic record to a PROCEDURE that holds onto the record (i.e., that sets a nonlocal POINTER to point to the record, like the global symbol table PROCEDUREs or the HSHMOD interface PROCEDUREs), the system PROCEDURE generally will not call $getInArea on the STRING fields to copy them into $defaultArea. You must assume that the system PROCEDURE will hold onto the STRING text (by holding onto the record), and you must cause the record to be removed from the data structure where the system PROCEDUREs have stored it (e.g., by removing the global symbol or hashed record) before you can safely dispose the area where the text for the STRING fields of the record is located.

Unless otherwise specified, it is permissible to pass a STRING as a USES or MODIFIES parameter to a system PROCEDURE when the STRING's text is in an area that will be disposed; unless otherwise stated, each system PROCEDURE calls $getInArea to make a copy of the STRING's text in $defaultArea if there is a chance that it will hold onto the STRING after the PROCEDURE returns. However, you must be careful in dealing with system PROCEDUREs that allow your code to gain control during a call to the PROCEDURE. For example, PROCEDUREs like errMsg and $raise take STRING parameters and also cause an exception. If the STRING parameters to these PROCEDUREs are in an area you have created, you must not dispose the area during the handling of the exception. After the PROCEDUREs return, disposing the area causes no problems.

Many PROCEDUREs return or produce substrings of their STRING parameters. Examples include scan, $removeWord, the substring operator itself, etc. Such substrings are in the same area as the STRING they are derived from. The documentation does not specifically point out such PROCEDUREs; wherever a PROCEDURE's output STRING is logically derived as a substring from an input STRING, you may assume that the output STRING descriptor may in fact point into the text of the input STRING, and therefore that the output STRING's text may be in the same area as the input STRING's.
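The following Python model illustrates why substrings inherit the area of their source. It is a sketch, not MAINSAIL: TextArea and Str are hypothetical classes, a descriptor is modeled as an area reference plus a start offset and length (standing in for the CHARADR and length), and copy_into plays the role that $getInArea plays in the text. In MAINSAIL the effect of using a dangling STRING is undefined; the model raises an error only to make the problem visible.

```python
class TextArea:
    """Toy area holding STRING text."""

    def __init__(self, title):
        self.title = title
        self.text = ""
        self.disposed = False

    def put(self, s):
        start = len(self.text)
        self.text += s
        return Str(self, start, len(s))

    def dispose(self):
        self.text = ""
        self.disposed = True


class Str:
    """Toy STRING descriptor: (area, start offset, length)."""

    def __init__(self, area, start, length):
        self.area, self.start, self.length = area, start, length

    def substring(self, first, count):
        # No characters are copied: the result points into the same area.
        return Str(self.area, self.start + first, count)

    def copy_into(self, area):
        return area.put(self.value())   # new text in the target area

    def value(self):
        if self.area.disposed:
            raise RuntimeError("dangling STRING: its area was disposed")
        return self.area.text[self.start:self.start + self.length]


default_area = TextArea("Default area")
scratch = TextArea("scratch")

line = scratch.put("alpha beta")
word = line.substring(0, 5)             # shares scratch's text
kept = word.copy_into(default_area)     # independent copy

scratch.dispose()
try:
    word.value()
    word_is_dangling = False
except RuntimeError:
    word_is_dangling = True
```

Only the explicitly copied STRING survives the disposal of the scratch area; the substring, although it looks like a fresh value, was never more than a view into the disposed text.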

Dangling POINTERs may produce bugs as difficult to track as those produced by dangling STRINGs, but are perhaps easier to manage conceptually: the programmer always allocates and deallocates dynamic objects explicitly, so the problem of dangling POINTERs already exists, even when areas are not used, in any program that calls dispose.

Of course, you must always set the area's attribute bits in a way that accurately reflects the kind of data that will be put in the area.

