MAINSAIL Language Manual, Chapter 26

26. Portable Data Format (PDF)

26.1. Introduction

Distributed processing involves the sharing of data across different kinds of machines. This is done, for example, when transmitting files across a network, or when making remote PROCEDURE calls to a process on another machine (see Chapter 5 of the MAINSAIL STREAMS User's Guide for a description of remote PROCEDURE calls). A common data representation spares each machine from having to cope with the data representation of every other kind of machine. MAINSAIL's portable data format (PDF) is provided for this purpose.

The term PDF data refers to data formatted according to PDF. Table 26–1 shows the PDF representation for each MAINSAIL data type. The characters composing the PDF data representation are not human-readable.

PDF text, PDF characters, and PDF STRINGs are just like host text, characters, and STRINGs except that the characters are translated to the PDF character set. This translation does nothing if the host and PDF character sets are identical, as they are on many ASCII machines; see Appendix F.

Table 26–1. Portable Data Format (PDF) Representation of Data
Data Type       Chars   PDF Representation
BOOLEAN         2       TRUE is INTEGER 1; FALSE is INTEGER 0
INTEGER         2       2's-complement
LONG INTEGER    4       2's-complement
REAL            4       IEEE floating point
LONG REAL       8       IEEE floating point
BITS            2       16 bits
LONG BITS       4       32 bits

Chars is the number of 8-bit characters (bytes).

All values are stored with the high-order byte first (“big-endian” representation).

PDF characters are represented as ASCII characters.

Appendix F gives the character set translation between host character sets and the PDF character set.
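
The layout in Table 26–1 can be modeled outside MAINSAIL. The following Python sketch is illustrative only and is not part of MAINSAIL; the names PDF_FORMATS, pdf_encode, and pdf_chars are hypothetical. It assumes that "IEEE floating point" means the IEEE single and double formats, which the 4- and 8-character sizes suggest.

```python
import struct

# struct format codes modeling each MAINSAIL data type in PDF form.
# ">" selects big-endian byte order with fixed, host-independent sizes.
PDF_FORMATS = {
    "BOOLEAN": ">h",       # stored as INTEGER 1 (TRUE) or 0 (FALSE)
    "INTEGER": ">h",       # 2 characters, 2's-complement
    "LONG INTEGER": ">i",  # 4 characters, 2's-complement
    "REAL": ">f",          # 4 characters, IEEE single (assumption)
    "LONG REAL": ">d",     # 8 characters, IEEE double (assumption)
    "BITS": ">H",          # 16 bits
    "LONG BITS": ">I",     # 32 bits
}


def pdf_encode(type_name, value):
    """Return the PDF byte sequence for one value (hypothetical helper)."""
    if type_name == "BOOLEAN":
        value = 1 if value else 0
    return struct.pack(PDF_FORMATS[type_name], value)


def pdf_chars(type_name):
    """Return the number of 8-bit characters the type occupies in PDF form."""
    return struct.calcsize(PDF_FORMATS[type_name])


if __name__ == "__main__":
    print(pdf_encode("INTEGER", 1).hex())             # high-order byte first
    print(pdf_encode("LONG BITS", 0x01020304).hex())
    print(pdf_chars("LONG REAL"))
```

In this model the high-order byte always comes first, so the same value produces the same byte sequence on any host.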

PDF text and data can be manipulated in the following ways:

26.2. PDF I/O

The MAINSAIL I/O system allows an application program to store and retrieve PDF data from a file with the conversion between host data and PDF data carried out automatically by the reads and writes. The application program need not be aware that such conversions are taking place.

When a data file is opened for PDF I/O, all data read as individual MAINSAIL data type values from the file are interpreted as PDF data and all data written to the file as individual MAINSAIL data type values are written as PDF data. Values read as characters or text are interpreted as PDF characters, and values written as characters or text are written as PDF characters. Data written en masse as storage units or pages undergo no change. If the PDF data format is the same as the host data format for all data types and the host character set is the same as the PDF character set, then host and PDF data files are treated identically.

PDF specifies only the representation of individual data types in a file. The individual data in the file are not “tagged”, so it is not possible to read the file unless the sequence of data types in the file is known. For example, if you write one INTEGER, followed by one REAL, followed by one LONG BITS with PDF I/O, you must read the same data using PDF I/O with an INTEGER read, followed by a REAL read, followed by a LONG BITS read. Otherwise, the data will not be correctly interpreted.
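
The consequence of untagged data can be seen in a small Python model (not MAINSAIL code; read_as is a hypothetical helper, and the byte layout follows Table 26–1 under the assumption that REAL is IEEE single precision). The same bytes yield the original values only when they are read with the same sequence of types that wrote them.

```python
import io
import struct

# Write one INTEGER, one REAL, and one LONG BITS in big-endian PDF layout.
buf = io.BytesIO()
buf.write(struct.pack(">h", 7))           # INTEGER, 2 characters
buf.write(struct.pack(">f", 1.5))         # REAL, 4 characters
buf.write(struct.pack(">I", 0xDEADBEEF))  # LONG BITS, 4 characters


def read_as(stream, fmt):
    """Read one value of the given struct format from the stream."""
    size = struct.calcsize(fmt)
    return struct.unpack(fmt, stream.read(size))[0]


# Reading with the same type sequence recovers the values.
buf.seek(0)
right = (read_as(buf, ">h"), read_as(buf, ">f"), read_as(buf, ">I"))

# Reading a LONG INTEGER first consumes the INTEGER plus half of the REAL,
# so every value from that point on is misinterpreted.
buf.seek(0)
wrong = (read_as(buf, ">i"), read_as(buf, ">h"))

if __name__ == "__main__":
    print(right)
    print(wrong)
```

Nothing in the file signals the mismatch; the second read sequence simply returns different numbers.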

When a datum is read from or written to a data file opened for PDF I/O, the effect of reading or writing a value outside the MAINSAIL guaranteed range for the datum's data type is undefined.

When a text file is opened for PDF I/O, all data read as individual MAINSAIL data type values from the file are read by scanning for the appropriate STRING representation and all data written to the file as individual MAINSAIL data type values are written as STRINGs. The STRINGs are in the PDF character set rather than the host character set. Values read as characters or text are interpreted as PDF characters, and values written as characters or text are written as PDF characters. If the host character set is the same as the PDF character set, then host and PDF text files are treated identically.

Figure 26–2. PDF Data Files vs. Host Data Files
<diagram>

When text is interpreted as PDF characters on input, it is translated to the host character set; when text in the host character set is written as PDF characters, a translation is also performed. A program reading and writing PDF files may therefore be written to deal only with the host character set, provided that it uses the subset of the host character set that has PDF equivalents.
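
The two directions of translation can be sketched in Python (not MAINSAIL code). The sketch assumes a host that uses the EBCDIC code page known to Python as cp037; the actual MAINSAIL translation is the one given in Appendix F, which may differ. The names host_to_pdf and pdf_to_host are hypothetical.

```python
HOST_CODEC = "cp037"  # assumption: an EBCDIC host character set
PDF_CODEC = "ascii"   # PDF characters are represented as ASCII characters


def host_to_pdf(host_bytes):
    """Translate host characters to PDF characters, as done on output."""
    return host_bytes.decode(HOST_CODEC).encode(PDF_CODEC)


def pdf_to_host(pdf_bytes):
    """Translate PDF characters to host characters, as done on input."""
    return pdf_bytes.decode(PDF_CODEC).encode(HOST_CODEC)


if __name__ == "__main__":
    host_text = "FVIEW".encode(HOST_CODEC)
    print(host_text.hex())         # host representation
    print(host_to_pdf(host_text))  # PDF (ASCII) representation
```

In this model a host character with no ASCII equivalent raises UnicodeEncodeError, which corresponds to the restriction that a program use only the subset of the host character set that has PDF equivalents.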

Some of the I/O PROCEDUREs provide a way to suppress the translation that normally takes place when using PDF I/O.

The choice of whether or not to perform PDF I/O on a file is made at execution time when the file is opened. By default, the host format is used. PDF I/O is specified by including additional information in the file name as described below.

26.2.1. PDF I/O and $storageUnit/$page I/O

$storageUnit I/O (using the PROCEDUREs $storageUnitRead and $storageUnitWrite) and $page I/O (using the PROCEDUREs $pageRead and $pageWrite) simply copy the bytes at the specified ADDRESS to a file or vice versa, with no modification. These PROCEDUREs do not understand or interpret the data written as MAINSAIL data types: the data read or written are just a sequence of bytes.

Each PROCEDURE performing PDF I/O must know the data type of each value read or written. If a PROCEDURE does not have a way to specify the data type of the datum read or written, it does not know how to do the translation to or from PDF format, and so cannot support PDF I/O.

$pageRead, $pageWrite, $storageUnitRead, and $storageUnitWrite do not have any way of specifying the data type(s) of the data involved. They may be given arbitrary ADDRESSes at which any sort of data could be stored. Therefore, they cannot support PDF translations.

By contrast, read and write are GENERIC PROCEDUREs, so a different instance PROCEDURE is called for each data type, and the appropriate PDF translation for that type can be performed. The other PROCEDUREs that perform PDF I/O (including $characterRead/$characterWrite) all operate on text, and so perform translations to and from the PDF character set.
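
The distinction between untyped and typed I/O can be modeled in Python (not MAINSAIL code). The sketch assumes a hypothetical little-endian host that stores an INTEGER in two bytes; raw_write and typed_write are hypothetical stand-ins for the two kinds of PROCEDURE.

```python
import struct

value = 258  # 0x0102, chosen so that the two bytes differ

# Assumption: the host keeps INTEGER in memory as 2 little-endian bytes.
host_memory = struct.pack("<h", value)


def raw_write(memory):
    """Model of untyped I/O: the bytes at the address are copied unchanged."""
    return bytes(memory)


def typed_write(memory, host_fmt, pdf_fmt):
    """Model of typed I/O: knowing the data type permits conversion to PDF."""
    (v,) = struct.unpack(host_fmt, memory)
    return struct.pack(pdf_fmt, v)


if __name__ == "__main__":
    print(raw_write(host_memory).hex())                 # host byte order
    print(typed_write(host_memory, "<h", ">h").hex())   # PDF byte order
```

Without the format arguments, typed_write could not locate value boundaries or reorder bytes, which is the position the $storageUnit and $page PROCEDUREs are in.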

26.3. Opening a File for PDF I/O

A file with a name starting with the device prefix PDF is opened for PDF I/O. Alternatively, the bit $pdf may be specified in the openBits parameter to open, which forces a file to be opened for PDF I/O whether or not its name contains the PDF device prefix.

A file opened for PDF I/O is sometimes referred to as a PDF file.

PDF must appear in the file name before any other device prefix specifications, e.g., where the $devModBrk character is >, PDF>LIB(foo.lib)>/baz.

PDF I/O is supported for the following PROCEDUREs:

read    cRead   $characterRead      fldRead     scan
write   cWrite  $characterWrite     fldWrite

i.e., if the file is opened for PDF I/O, then all forms of these PROCEDUREs access the data in the file as PDF text (for character and STRING operations and for text files) or PDF data (for non-character, non-STRING operations on data files). Character and STRING operations are considered to be cRead, cWrite, fldRead, fldWrite, scan, and the STRING forms of read and write.

By default, $characterRead and $characterWrite translate characters read from or written to PDF files. This translation can be suppressed by setting the $noTranslate bit in ctrlBits (an OPTIONAL BITS parameter to both $characterRead and $characterWrite).

26.4. Positions in a File Opened for PDF I/O

When a file is opened for PDF I/O, all file positions are in terms of character units; i.e., the units used for positioning by relPos and setPos are character units, and the positions returned by getPos and $getEofPos are character positions.

26.5. $ioSize

An application that does not know how data are formatted in a file must be careful when positioning within the file. For example, if a data file has been opened for PDF I/O, positions are character units; otherwise, they are storage units. Even if a storage unit is equal to a character unit, the sizes of host data and PDF data may differ. $ioSize is provided to simplify the writing of such applications.

$ioSize(f,x), where x is a MAINSAIL data type code, returns the size of x based on the format of the data in a data file f. For example, if f contains host data, $ioSize(f,x) returns the same value as size(x), but if f contains PDF data, $ioSize(f,x) returns the same value as pdfChars(x) (see the description of PDFMOD in Chapter 27 of the MAINSAIL Utilities User's Guide).

$ioSize returns 0 if f is a text file since there is no fixed size for the STRING representation of a data type.
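
The rules for $ioSize can be summarized in a Python model (not MAINSAIL code). The PDF sizes come from Table 26–1; the host sizes are hypothetical values for an imaginary host whose storage unit holds one INTEGER, and io_size and position_of are hypothetical names.

```python
# Sizes in characters of each data type in PDF form (Table 26-1).
PDF_CHARS = {
    "BOOLEAN": 2, "INTEGER": 2, "LONG INTEGER": 4,
    "REAL": 4, "LONG REAL": 8, "BITS": 2, "LONG BITS": 4,
}

# Assumption: sizes in storage units on an imaginary host.
HOST_SIZE = {
    "BOOLEAN": 1, "INTEGER": 1, "LONG INTEGER": 1,
    "REAL": 1, "LONG REAL": 2, "BITS": 1, "LONG BITS": 1,
}


def io_size(is_text, is_pdf, type_name):
    """Model of $ioSize: 0 for text files, else the PDF or host size."""
    if is_text:
        return 0
    return PDF_CHARS[type_name] if is_pdf else HOST_SIZE[type_name]


def position_of(index, is_pdf, type_name):
    """Position of the index-th value in a data file holding one type only."""
    return index * io_size(False, is_pdf, type_name)


if __name__ == "__main__":
    print(position_of(10, True, "INTEGER"))   # character units
    print(position_of(10, False, "INTEGER"))  # storage units
```

Computing positions through a single size function is what lets the same positioning logic serve both host and PDF data files.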

26.6. PDF Example

PDF is designed so that as many programs as possible can be written to operate on either PDF data or host data without any special logic to handle the two different cases; in other words, as many PROCEDUREs as possible “do the right thing” for each case.

Example 26–3 shows a sample program FVIEW that displays data in a file. It is written to work independently of the format of the data in the file. Figure 26–4 shows how to run FVIEW.

Example 26–3. Data-Format-Independent I/O
BEGIN "fView"

# fView is an interactive program that lets the user
# view the contents of a file.
#
# The use of $ioSize when positioning makes the program
# independent of the type of data stored in the file.

INTEGER iSize;

PROCEDURE examine (POINTER(dataFile) d);
BEGIN
INTEGER         t;
LONG INTEGER    l;
BITS            b;
l := getPos(d);
fldWrite(logFile,cvs(l),8,' '); write(logFile,"/   ");
read(d,t); b := cvb(t);
write(logFile,t,"  '",b,"  'H",cvs(b,hex),eol);
setPos(d,(getPos(d) - cvli(iSize)) MAX 0L);
END;


INITIAL PROCEDURE;
BEGIN
INTEGER             t,lastCmd,cmd;
LONG INTEGER        tt;
STRING              s;
POINTER(dataFile)   d;

s := $sGet(
    "File viewer (? for help)" & eol &
    "File to view: ");

open(d,s,random!input);
iSize := $ioSize(d,integerCode); examine(d); lastCmd := 0;

DOB s := $sGet("FVIEW>");
    CASE cmd := cvu(first(s)) OFB
        ['Q'] DONE;

        [-1] BEGIN
            IF NOT relPos(d,iSize,errorOK) THEN
                write(logFile,
                    "Cannot position beyond end of file"
                    & eol);
            examine(d) END;

        ['0' TO '9'] BEGIN
            tt := cvli(s);
            IF NOT setPos(d,cvli(cvlb(tt)),errorOK) THEN
                write(logFile,
                    "Cannot position beyond end of file"
                    & eol);
            examine(d) END;

        ['^'] BEGIN
            IF getPos(d) THEN relPos(d,- iSize,errorOK);
            examine(d) END;

        ['+']['-'] BEGIN
            IF length(s) = 1 THEN cWrite(s,'1');
            t := cvi(s) * iSize;
            IF NOT relPos(d,t,errorOK) THEN
                write(logFile,
                    "Cannot position beyond end of file"
                    & eol);
            examine(d) END;

        ['?']
            write(logFile,
                "n   Examine position n" & eol &
                "eol Step forward through file" & eol &
                "^   Step backward through file" & eol &
                "+n  Forward n integers" & eol &
                "-n  Backward n integers" & eol &
                "Q   Quit" & eol);

        [ ] write(logFile,"Type ? for help" & eol);
        END;

    lastCmd := cmd END;

close(d) END;

END "fView"

Figure 26–4. How to Run FVIEW
How to use FVIEW to examine a file that contains host data:

*fview<eol>
File viewer (? for help)
File to view: fview.dat<eol>
FVIEW displays the contents of fview.dat; fview.dat
contains host data

How to use FVIEW to examine a file that contains PDF data:

*fview<eol>
File viewer (? for help)
File to view: pdf>fview.dat<eol>
FVIEW displays the contents of fview.dat; fview.dat
contains PDF data

