The term PDF data refers to data formatted according to PDF. Table 26–1 shows the PDF representation for each MAINSAIL data type. The characters composing the PDF data representation are not human-readable.
PDF text, PDF characters, and PDF STRINGs are just like host text, characters, and STRINGs except that the characters are translated to the PDF character set. This translation does nothing if the host and PDF character sets are identical, as they are on many ASCII machines; see Appendix F.
Table 26–1. Portable Data Format (PDF) Representation of Data
Chars is the number of 8-bit characters (bytes). All values are stored with the high-order byte first (“big-endian” representation). PDF characters are represented as ASCII characters. Appendix F gives the character set translation between host character sets and the PDF character set.
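The high-order-byte-first layout can be illustrated outside MAINSAIL. The following Python sketch is an analogy only, not MAINSAIL code; the 4-byte width is an assumption chosen for the illustration and is not a size taken from Table 26–1.

```python
def to_big_endian(value: int, width: int = 4) -> bytes:
    """Encode a signed integer in `width` bytes, high-order byte first."""
    # `width` = 4 is an assumed size, used only for this illustration.
    return value.to_bytes(width, byteorder="big", signed=True)


def from_big_endian(data: bytes) -> int:
    """Decode a signed integer stored high-order byte first."""
    return int.from_bytes(data, byteorder="big", signed=True)


encoded = to_big_endian(258)   # 258 = 0x0102
print(encoded.hex())           # the high-order bytes come first in the output
```

Because the byte order is fixed by the format rather than by the machine, the same bytes decode to the same value on any host.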
PDF text and data can be manipulated in the following ways:
When a data file is opened for PDF I/O, all data read as individual MAINSAIL data type values from the file are interpreted as PDF data and all data written to the file as individual MAINSAIL data type values are written as PDF data. Values read as characters or text are interpreted as PDF characters, and values written as characters or text are written as PDF characters. Data written en masse as storage units or pages undergo no change. If the PDF data format is the same as the host data format for all data types and the host character set is the same as the PDF character set, then host and PDF data files are treated identically.
PDF specifies only the representation of individual data types in a file. The individual data in the file are not “tagged”, so it is not possible to read the file unless the sequence of data types in the file is known. For example, if you write one INTEGER, followed by one REAL, followed by one LONG BITS with PDF I/O, you must read the same data using PDF I/O with an INTEGER read, followed by a REAL read, followed by a LONG BITS read. Otherwise, the data will not be correctly interpreted.
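The consequence of untagged data can be sketched in Python (an analogy using the standard `struct` module, not MAINSAIL code). The field widths below are assumptions made for the illustration: a 4-byte integer, a 4-byte real, and a 4-byte bit pattern, all big-endian.

```python
import struct

# Assumed layout for illustration: integer, real, bits (4 bytes each, big-endian).
WRITE_ORDER = ">ifI"


def write_record(i: int, r: float, b: int) -> bytes:
    """Write three untagged values back to back."""
    return struct.pack(WRITE_ORDER, i, r, b)


def read_record(data: bytes, order: str) -> tuple:
    """Read the bytes back, interpreting them in the given type order."""
    return struct.unpack(order, data)


data = write_record(7, 1.5, 0xFF00FF00)
right = read_record(data, ">ifI")   # same type sequence as the write
wrong = read_record(data, ">fiI")   # integer and real swapped on input
print(right)
print(wrong)
```

Nothing in the byte stream signals the mistake in the second read: the bytes are simply reinterpreted, so the first two values come back as unrelated numbers.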
When a datum is read from or written to a data file opened for PDF I/O, the effect of reading or writing a value outside the MAINSAIL guaranteed range for the datum's data type is undefined.
When a text file is opened for PDF I/O, all data read as individual MAINSAIL data type values from the file are read by scanning for the appropriate STRING representation and all data written to the file as individual MAINSAIL data type values are written as STRINGs. The STRINGs are in the PDF character set rather than the host character set. Values read as characters or text are interpreted as PDF characters, and values written as characters or text are written as PDF characters. If the host character set is the same as the PDF character set, then host and PDF text files are treated identically.
When text is interpreted as PDF characters on input, it is translated to the host character set; when text in the host character set is written as PDF characters, a translation is also performed. A program reading and writing PDF files may therefore be written to deal only with the host character set, provided that it uses the subset of the host character set that has PDF equivalents.
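The two-way translation can be sketched in Python as an analogy (this is not MAINSAIL code). The sketch assumes a host whose character set is EBCDIC, modeled here with Python's `cp037` codec, and models the PDF character set as ASCII; both function names are hypothetical.

```python
HOST_CODEC = "cp037"   # assumed host character set (an EBCDIC variant)
PDF_CODEC = "ascii"    # PDF characters are represented as ASCII characters


def host_to_pdf(host_bytes: bytes) -> bytes:
    """Translate host characters to PDF characters (performed on output)."""
    return host_bytes.decode(HOST_CODEC).encode(PDF_CODEC)


def pdf_to_host(pdf_bytes: bytes) -> bytes:
    """Translate PDF characters to host characters (performed on input)."""
    return pdf_bytes.decode(PDF_CODEC).encode(HOST_CODEC)


host_text = "ABC".encode(HOST_CODEC)      # what the program sees on the host
pdf_text = host_to_pdf(host_text)         # what is stored in the PDF file
assert pdf_to_host(pdf_text) == host_text
```

In this sketch a host character with no ASCII equivalent makes `host_to_pdf` raise `UnicodeEncodeError`, which mirrors the restriction to the subset of the host character set that has PDF equivalents.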
Some of the I/O PROCEDUREs provide a way to suppress the translation that normally takes place when using PDF I/O.
The choice of whether or not to perform PDF I/O on a file is made at execution time when the file is opened. By default, the host format is used. PDF I/O is specified by including additional information in the file name as described below.
26.2.1. PDF I/O and $storageUnit/$page I/O
$storageUnit I/O (using the PROCEDUREs $storageUnitRead and $storageUnitWrite) and $page I/O (using the PROCEDUREs $pageRead and $pageWrite) simply copy the bytes at the specified ADDRESS to a file or vice versa, with no modification. These PROCEDUREs do not understand or interpret the data written as MAINSAIL data types: the data read or written are just a sequence of bytes.
Each PROCEDURE performing PDF I/O must know the data type of each value read or written. If a PROCEDURE does not have a way to specify the data type of the datum read or written, it does not know how to do the translation to or from PDF format, and so cannot support PDF I/O.
$pageRead, $pageWrite, $storageUnitRead, and $storageUnitWrite do not have any way of specifying the data type(s) of the data involved. They may be given arbitrary ADDRESSes at which any sort of data could be stored. Therefore, they cannot support PDF translations.
By contrast, read and write are GENERIC PROCEDUREs, so a different instance PROCEDURE is called for each data type, and the appropriate PDF translation for that type can be performed.
The other PROCEDUREs that perform PDF I/O (including $characterRead/$characterWrite) all operate on text, and so perform translations to and from the PDF character set.
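The distinction between type-aware and type-blind output can be sketched in Python (an analogy, not MAINSAIL code). The encoder table is a hypothetical stand-in for the instance PROCEDUREs of a GENERIC PROCEDURE, and the 4-byte widths are assumptions made for the illustration.

```python
import struct

# Hypothetical per-type encoders, standing in for instance PROCEDUREs.
PDF_ENCODERS = {
    int: lambda v: struct.pack(">i", v),     # assumed 4-byte big-endian integer
    float: lambda v: struct.pack(">f", v),   # assumed 4-byte big-endian real
    str: lambda v: v.encode("ascii"),        # text goes to the PDF character set
}


def typed_write(buffer: bytearray, value) -> None:
    """Type-aware write: the value's type selects the translation."""
    buffer.extend(PDF_ENCODERS[type(value)](value))


def raw_write(buffer: bytearray, memory: bytes) -> None:
    """Type-blind write: bytes are copied unchanged, so no translation is possible."""
    buffer.extend(memory)


typed = bytearray()
typed_write(typed, 1)

raw = bytearray()
raw_write(raw, struct.pack("<i", 1))   # a little-endian host image, copied as is
```

`typed_write` can translate because it knows what it was given; `raw_write` receives only bytes, which is the position $pageRead, $pageWrite, $storageUnitRead, and $storageUnitWrite are in.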
26.3. Opening a File for PDF I/O
A file with a name starting with the device prefix PDF is opened for PDF I/O. Alternatively, the bit $pdf may be specified in the openBits parameter to open, which forces a file to be opened for PDF I/O whether or not its name contains the PDF device prefix.
A file opened for PDF I/O is sometimes referred to as a PDF file.
PDF must appear in the file name before any other device prefix specifications. For example, where the $devModBrk character is >, a valid name is PDF>LIB(foo.lib)>/baz.
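The decision described above can be modeled in Python as a rough sketch (an analogy, not MAINSAIL code). The function name and the value of `PDF_BIT` are hypothetical, `>` is assumed to be the $devModBrk character, and the case-insensitive comparison is an assumption based on the lowercase pdf> prefix used in the sample FVIEW session.

```python
DEV_MOD_BRK = ">"   # assumed $devModBrk character, as in the example above
PDF_BIT = 0x1       # hypothetical stand-in for the $pdf bit in openBits


def opened_for_pdf(file_name: str, open_bits: int = 0) -> bool:
    """Return True if the file would be opened for PDF I/O in this model."""
    if open_bits & PDF_BIT:
        return True                      # $pdf forces PDF I/O regardless of the name
    prefix, sep, _rest = file_name.partition(DEV_MOD_BRK)
    # PDF must be the first device prefix in the name.
    return bool(sep) and prefix.upper() == "PDF"


print(opened_for_pdf("PDF>LIB(foo.lib)>/baz"))
print(opened_for_pdf("LIB(foo.lib)>/baz"))
print(opened_for_pdf("LIB(foo.lib)>/baz", PDF_BIT))
```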
PDF I/O is supported for the following PROCEDUREs:
read cRead $characterRead fldRead scan
write cWrite $characterWrite fldWrite
i.e., if the file is opened for PDF I/O, then all forms of these PROCEDUREs access the data in the file as PDF text (for character and STRING operations and for text files) or PDF data (for non-character, non-STRING operations on data files). Character and STRING operations are considered to be cRead, cWrite, fldRead, fldWrite, scan, and the STRING forms of read and write.
By default, $characterRead and $characterWrite translate characters read from or written to PDF files. This translation can be suppressed by setting the $noTranslate bit in ctrlBits (an OPTIONAL BITS parameter to both $characterRead and $characterWrite).
26.4. Positions in a File Opened for PDF I/O
When a file is opened for PDF I/O, all file positions are in terms of character units; i.e., the units used for positioning by relPos and setPos are character units, and the positions returned by getPos and $getEofPos are character positions.
26.5. $ioSize
An application that does not know how data are formatted in a file must be careful when positioning within the file. For example, if a data file has been opened for PDF I/O, positions are character units; otherwise, they are storage units. Even if a storage unit is equal to a character unit, the sizes of host data and PDF data may differ. $ioSize is provided to simplify the writing of such applications.
$ioSize(f,x), where x is a MAINSAIL data type code, returns the size of x based on the format of the data in a data file f. For example, if f contains host data, $ioSize(f,x) returns the same value as size(x), but if f contains PDF data, $ioSize(f,x) returns the same value as pdfChars(x) (see the description of PDFMOD in Chapter 27 of the MAINSAIL Utilities User's Guide).
$ioSize returns 0 if f is a text file since there is no fixed size for the STRING representation of a data type.
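The behavior of $ioSize can be modeled in Python as a sketch (an analogy, not MAINSAIL code). All names are hypothetical, and the sizes in the two tables are invented for the illustration; real values depend on the host and on Table 26–1.

```python
from enum import Enum


class FileFormat(Enum):
    HOST_DATA = 1   # data file in host format
    PDF_DATA = 2    # data file opened for PDF I/O
    TEXT = 3        # text file


# Invented sizes, for illustration only.
HOST_SIZE = {"integer": 1, "long_integer": 2}   # in storage units
PDF_CHARS = {"integer": 4, "long_integer": 8}   # in character units


def io_size(fmt: FileFormat, type_code: str) -> int:
    """Size of one datum of `type_code` in the positioning units of the file."""
    if fmt is FileFormat.TEXT:
        return 0                       # no fixed size for a STRING representation
    if fmt is FileFormat.PDF_DATA:
        return PDF_CHARS[type_code]
    return HOST_SIZE[type_code]


def position_of(fmt: FileFormat, type_code: str, index: int) -> int:
    """Position of the index-th datum (0-based) in a file holding only that type."""
    return index * io_size(fmt, type_code)


print(position_of(FileFormat.HOST_DATA, "integer", 3))
print(position_of(FileFormat.PDF_DATA, "integer", 3))
```

The caller computes positions the same way for both formats; only the value returned by `io_size` changes, which is what lets a program stay independent of the data format.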
Example 26–3 shows a sample program FVIEW that displays data in a file. It is written to work independently of the format of the data in the file. Figure 26–4 shows how to run FVIEW.
Example 26–3. Data-Format-Independent I/O
26.6. PDF Example
PDF is designed so that as many programs as possible can be written to operate on either PDF data or host data without any special logic to handle the two different cases; in other words, as many PROCEDUREs as possible “do the right thing” for each case.
BEGIN "fView"
# fView is an interactive program that lets the user
# view the contents of a file.
#
# The use of $ioSize when positioning makes the program
# independent of the type of data stored in the file.
INTEGER iSize;
PROCEDURE examine (POINTER(dataFile) d);
BEGIN
    INTEGER t;
    LONG INTEGER l;
    BITS b;
    l := getPos(d);
    fldWrite(logFile,cvs(l),8,' '); write(logFile,"/ ");
    read(d,t); b := cvb(t);
    write(logFile,t," '",b," 'H",cvs(b,hex),eol);
    setPos(d,(getPos(d) - cvli(iSize)) MAX 0L);
END;
INITIAL PROCEDURE;
BEGIN
    INTEGER t,lastCmd,cmd;
    LONG INTEGER tt;
    STRING s;
    POINTER(dataFile) d;
    s := $sGet(
        "File viewer (? for help)" & eol &
        "File to view: ");
    open(d,s,random!input);
    iSize := $ioSize(d,integerCode); examine(d); lastCmd := 0;
    DOB s := $sGet("FVIEW>");
        CASE cmd := cvu(first(s)) OFB
            ['Q'] DONE;
            [-1] BEGIN
                IF NOT relPos(d,iSize,errorOK) THEN
                    write(logFile,
                        "Cannot position beyond end of file"
                        & eol);
                examine(d) END;
            ['0' TO '9'] BEGIN
                tt := cvli(s);
                IF NOT setPos(d,cvli(cvlb(tt)),errorOK) THEN
                    write(logFile,
                        "Cannot position beyond end of file"
                        & eol);
                examine(d) END;
            ['^'] BEGIN
                IF getPos(d) THEN relPos(d,- iSize,errorOK);
                examine(d) END;
            ['+']['-'] BEGIN
                IF length(s) = 1 THEN cWrite(s,'1');
                t := cvi(s) * iSize;
                IF NOT relPos(d,t,errorOK) THEN
                    write(logFile,
                        "Cannot position beyond end of file"
                        & eol);
                examine(d) END;
            ['?']
                write(logFile,
                    "n Examine position n" & eol &
                    "eol Step forward through file" & eol &
                    "^ Step backward through file" & eol &
                    "+n Forward n integers" & eol &
                    "-n Backward n integers" & eol &
                    "Q Quit" & eol);
            [ ] write(logFile,"Type ? for help" & eol);
        END;
        lastCmd := cmd END;
    close(d) END;
END "fView"
How to use FVIEW to examine a file that contains host data:
    *fview<eol>
    File viewer (? for help)
    File to view: fview.dat<eol>
    FVIEW displays the contents of fview.dat; fview.dat contains host data
How to use FVIEW to examine a file that contains PDF data:
    *fview<eol>
    File viewer (? for help)
    File to view: pdf>fview.dat<eol>
    FVIEW displays the contents of fview.dat; fview.dat contains PDF data