The Control Program
Although this chapter is entitled 'The Control Program' it
really covers more than what is generally referred to as the
Control Program or 'CP'. In describing the TPF system most
experienced technicians will divide it functionally along similar
lines. Most current TPF installations use approximately the same
lines of demarcation for their system software teams. The key
functional areas are:
1. The Control Program
2. The Database
3. Communications
4. 'Middleware'
Exactly what middleware is will be discussed later, with an
attempt at a definition. The other sections seem self-
explanatory, until you realize that there are numerous areas that
do not obviously belong in any of these categories over any
other.
The first thing to realize, since we are in that chapter, is
that the term 'control program' extends beyond that kernel of
code that is link-edited together, loaded as a single stream of
code and resides permanently in main memory. IBM has christened
that particular segment the Core Resident Control Program. This
is meant to imply that there are a number (a large number) of
segments that are part of the control program, that are DASD
resident, or treated as core resident ECB-controlled programs.
It was when trying to decide on the approach to describing
all the various features of the control program that an important
decision had to be made. One way to lead a reader through the
facilities within TPF would be to follow a sample message through
an imaginary system and explain each feature of TPF as it was
used. This would be an excellent approach for someone new to TPF
who had resolved to read the book from cover to cover to prepare
him or herself for more technical, hands-on, training etc. It
would not be of so much use, however, as any kind of reference
material for someone engaged in work with TPF already. Instead
of being able to group similar areas together in a single section
they would be distributed through the text and more difficult to
refer to. Mainly for that reason I decided to organize
the material in more of a reference format in the detail
chapters. I hope that newcomers do not lose heart when
presented with this myriad of apparently unconnected
descriptions. To help out I have tried, wherever possible, to
use diagrams that still tell the story of a message as it passes
through the TPF system internals. A picture is
certainly worth at least a thousand words in describing some of
the processes within TPF.
So now we begin our apparently haphazard discussions of
topics usually associated with the control program (CP) area.
The Control Program
The control program is the heart of the TPF operating system
and consists of many separate pieces of code called 'csects' that
are all linked together to form one logically complete program.
Csect is a name derived from 'control section', an assembly
language term meaning an area of the program that contains
executable code. A 'dsect', on the other hand, is a 'dummy
section' (often read as 'data section'), which generally holds
definitions of field names for data areas. These may be used only
by the program containing the dsect or, if the dsect is more
general, by other programs as well.
Each csect is further sub-divided into sections of code
called 'copy segments'. These segments are not strictly programs
in their own right but merely pieces of code that are collected
together when the program is assembled. The reasons for
subdividing the code this way are to save space on source
libraries (since if someone were updating a csect it would be
necessary only to have alternate versions of the actual copy
segments being changed instead of the entire csect), collect
logically similar code for increased clarity and to make editing
less cumbersome for the programmer. In these days of object
oriented techniques and the current concentration on the
reusability of code the copy segments of the future will probably
be small pieces of highly optimized code that would be used to
perform common tasks, possibly originally written for use in a
completely different operating environment.
Generally a csect will include code to perform a general
group of functions. For example one csect is called CCSONP and
deals only with pool record handling. The 'CC' prefix is attached
to all control program csects, the 'SONP' refers to the old name
of 'SON' for the addressing format that was used before FARF3
(see Chapter x The TPF Database) and the 'P' implies that the
code is to do with pool handling. Within a csect the copy
segments would further logically subdivide the overall group of
functions into single functions per copy segment or a small
subgroup of routines whichever was most appropriate to prevent
the copy segments from getting too large.
Some functional areas, like SNA support, require multiple
csects to contain all the necessary code. In some cases this is
because code has been separated more completely than normal to
allow certain csects to be left out of the control program
depending on the options chosen for the system at System
Generation time through the definition in SIP. This approach is
particularly to be found in the communications sections of the CP
because there is a need to still support most of the older
communications techniques that were used by the first users of
TPF. Now with more and more systems trying to interact with one
another the scramble for a universal 'language' with which to
communicate is in full swing. We therefore find that many TPF
users are converting to use SNA protocols in as much of their
network as possible and this removes the need to generate the
code to support many of the older protocols. With the size of
the TPF control program continuing to grow as more function is
added any reduction in size is welcomed by the more power hungry
users for whom working storage availability is a continuing
problem.
Low Core
Historically the central csect in TPF was the 'Nucleus' or
to give it its official title: CCNUCL. Originally this csect
contained most of the more general macro service routines, many
crucial low core tables and indicators, but most important of all
it housed the cpu loop. This is no longer true: the cpu loop
now lives in another csect, and under certain special
circumstances there is more than one cpu loop (see Chapter x
Advanced TPF Features). Despite this apparent reduction in importance,
the Nucleus is still critical to the TPF system since it is
loaded first into the system's main memory. For non-assembler
programmers the full significance of this may not be obvious. A
small digression is called for:
Addressing within main memory is accomplished within IBM
assembler using 'base and displacement'. This means that an
address is made up of two parts when referenced in an
instruction, the 'base' is a register name or number and the
'displacement' is 3 hexadecimal digits, which are added to the
contents of the 'base' register.
So far so good. Let us suppose that you didn't use the register
or you had zero in it. The largest address you could have then
would be X'FFF' (decimal 4095). Nowadays this range of addresses,
when starting from zero, is called Page Zero. The importance is
that you can have labelled fields defined within this address
range without needing to have a base register associated with
them and still be able to access them directly, using their
symbolic names. Since each such range spans X'1000' bytes
(displacements X'000' to X'FFF') you can see that a series of
'pages' could be defined throughout the machine's main
memory. In TPF three registers are used by the operating system
to establish this 'paged' addressability to main memory. They are
R11, R12 & R13 (which used to be called RLB, RLC & RLD) and are
initialized during IPL with the values X'1000', X'2000' and
X'3000' respectively.
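
For readers who do not write assembler, the arithmetic can be sketched in a few lines of Python. This is only an illustration of the addressing rule described above, not TPF code: the function name and the dictionary of registers are my own inventions, and only the register values come from the text.

```python
# Illustrative sketch only: models base-and-displacement addressing as
# described in the text. The helper itself is hypothetical.

# Register contents after IPL, as stated in the text.
registers = {"R11": 0x1000, "R12": 0x2000, "R13": 0x3000}


def effective_address(displacement, base=None):
    """Return base register contents plus a 12-bit displacement.

    base=None models an instruction that names no base register,
    in which case only the displacement contributes (Page Zero).
    """
    if not 0 <= displacement <= 0xFFF:
        raise ValueError("displacement must fit in 3 hexadecimal digits")
    base_value = 0 if base is None else registers[base]
    return base_value + displacement


# Highest address reachable without a base register.
print(hex(effective_address(0xFFF)))
# A field at displacement X'010' in the page addressed by R12.
print(hex(effective_address(0x010, "R12")))
```

With a page-aligned value in each register, every field in the first four pages can be reached with a register and a three-digit displacement, which is the point of the 'paged' arrangement.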
I think we can return to our main topic again now that we
can appreciate a little more the significance of the code loaded
in what is called 'low core'. In the same way as pieces of data
may be easily accessed when they reside in low core, so too can
actual code.
Page Zero is called the Prefix Area in TPF. As in all
previous versions of TPF the low core area holds the PSWs. This
allows the interrupt routines that are given control after the
PSW swapping to always know where to pick up the return addresses
to continue processing after completion. The remainder of this
special area (below X'1000') is given over to all the various
special pieces of information required to operate TPF in a multi
I-stream environment. This will be discussed in more detail
later, when a complete review of the basic TPF functions has been
accomplished.
As we mentioned earlier, the low core area is not only for
data fields but also for frequently used pieces of code. At
X'1000' lies the small routine called Fast Link Macro Decoder. We
have mentioned the term 'macro' before, which implies the same in
TPF as it does elsewhere, a routine that is often used and can be
referenced by a unique name. In TPF however there are two kinds
of executable macros, Supervisor Calls and Fast Link Macros. The
original type of macro was the Supervisor Call (or SVC), which
generates an operation code of X'0A' in the assembled code. When
the machine detects this byte value it undergoes an SVC
interruption and gives control to the address specified in the
SVC-New PSW in low core. This address is the address of a
special routine called Macro Decoder (the csect: CCMCDC). This
routine examines the byte value immediately following the X'0A'
and based on this value indexes into a table of addresses and
redirects processing to the required 'macro' routine. This
process does not take very much processing but is still not quite
as fast as simply continuing to execute in-line code, without the
SVC-interrupt. Also, as you will see later in the discussion on
multiple I-streams, the action of causing a SVC interrupt
produces an effect known as serialization, which is not always
desirable in multi-I-stream processing. The Fast Link Macro
therefore does not initiate a SVC interrupt but simply branches
to X'1000' and the routine located there decodes another value in
the macro instruction to locate the required routine address from
within the Fast Link Macro table.
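
The table lookup performed by the Macro Decoder can be pictured with a small Python sketch. Everything in it apart from the X'0A' operation code is assumed for illustration: the macro numbers, the routine names and the use of a dictionary in place of a table of addresses are all hypothetical.

```python
# Hypothetical sketch of SVC macro decoding; not the real CCMCDC logic.

SVC_OPCODE = 0x0A  # operation code generated by an SVC-type macro


def first_service():
    return "first service routine"


def second_service():
    return "second service routine"


# Stand-in for the table of routine addresses, indexed by the byte
# that follows X'0A'. The macro numbers here are made up.
SVC_TABLE = {0x01: first_service, 0x02: second_service}


def macro_decoder(instruction):
    """Decode a two-byte SVC instruction and run the matching routine."""
    opcode, macro_number = instruction[0], instruction[1]
    if opcode != SVC_OPCODE:
        raise ValueError("not an SVC instruction")
    if macro_number not in SVC_TABLE:
        raise ValueError("unknown macro number")
    return SVC_TABLE[macro_number]()


print(macro_decoder(bytes([0x0A, 0x02])))
```

The Fast Link Macro decoder performs the same kind of lookup against its own table; the difference described above is that it is reached by a branch rather than by an SVC interrupt.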
File Resident Programs
The term 'file resident' is not used any more in the IBM
documentation for TPF. They prefer to classify programs as ECB-
controlled and Non-ECB-controlled. In my mind it is easier to
use both classifications while remembering that ECB controlled
segments can be either core or file resident.
A file resident program is, as the name suggests, a program
that resides on 'file' (disk) storage. The majority of TPF system
and application segments reside on disk, and are only accessed as
they are called by an active entry within the system. They are
accessed by means of the ENTER suite of macros and read in from
disk at that time. They are reentrant and are able to use the
full range of macros available within TPF. Some suggested
restrictions are placed on the macros allowed for use by
application versus system programs however. These restrictions
are imposed by IBM in the hope that if users abide by them then
IBM will be able to make changes to certain specialized macros
without significant disruption to user written applications.
Core Resident Programs
In an ideal world whenever there was a need to transfer
control to another program during the processing of an entry it
would simply be a case of branching to another location in the
computer's main memory. Clearly, with the number of programs
involved in a complex system such as TPF, and the limits, however
large, on how much main storage is addressable using even 31-bit
addresses, it is frequently necessary to call in program segments
from remote storage, or disk.
Despite the limits on main memory, however, there are a
sufficient number of programs that are used often enough that it
is worth setting aside a portion of the precious main memory area
to permanently hold these programs. These segments are called
'Core Resident' and are further sub-divided into two more
categories, Core Slow and Core Fast. Core Slow programs are
similar to their file resident cousins except they are faster to
enter since there is no I/O operation involved. They are
reentrant and are simply highly used programs, requiring no
special coding practices. At any time, by monitoring the
performance of the TPF system, an analyst may decide to change a
file resident program to core residency (as a core slow segment)
when he or she notices that this segment is being highly
accessed. The only actions required are a change to the System
Allocator, a reassembly of all segments that enter this program (to
produce the correct expansion for entering a core resident
program) and a new program load.
A Core Fast segment, on the other hand, is a specialized
segment that must not contain any SVC-type macro calls because it
is intended to be non-reentrant and to complete its processing
before any other processing takes place within the TPF system.
Certain standard TPF segments are Core Fast; FACE, the file
address compute program, is one example. It should be
evident that the selection process to determine what processes
need to be Core Fast should be handled carefully. Generally it is
unlikely that an application program would need to be Core Fast;
usually it would only be for specialized system requirements.
Globals
Working Storage
Working Storage is the term applied to the area of main
memory that is divided up during the IPL/Restart process to form
a set of pools of differing memory areas, or blocks. These
blocks are then used by programs during their lifetime in the
system. Broadly they correspond to the sizes for the records
stored on disk but there are also some specialized blocks that
are used only within the control program itself to hold various
temporary data items during system processing.
The sizes of the blocks carved into the main memory area are
127 bytes, 381 bytes, 1055 bytes and 4095 bytes. The exact
distribution of these blocks is a tunable item that is generally
unique to a given installation and depends heavily on the
application usage profiles which can be studied using Data
Collection reports.
The Entry Control Block (ECB)
An Entry Control Block (ECB) is the central data block for
an input message entering a TPF system. Although some other
types of data block, Agents Assembly Area (AAA) in airline
systems, for example, are also crucial to the processing of an
entry the ECB has characteristics unique to it. The ECB is used
by the system macros to anchor records retrieved from the TPF
database, indeed the system processes use many areas of the ECB
to record details about the entry's interaction with the system
facilities.
When TPF 1 was released, in December 1979, it brought with
it a new, expanded, ECB. Up to that point there had been some
restrictions on the use of system facilities due to the
limitations of size. One such limitation was on how many file
records could be accessed by one ECB at a time. The limit was 6
before 1979; it was then increased to 16, and even more by
using a slightly different mechanism. Over the life of ACP and
now TPF the ECB has changed drastically. Those changes have had
to be carefully controlled, however, because the ECB, as the
central core area for an entry, has suffered some abuse of the
credos of good programming. With all system areas documented as to their use
some enterprising application programmers have, over the years,
decided to make direct use of certain fields within the ECB
during their application processing. This has meant that when
IBM needs to change the location of, name of, function of, or
even remove, a field there is an impact to the installed user
community.
Despite all of this change over time the essential
properties and usage of the ECB have changed little. Major
elements of the ECB are:
- User work area
- Core Block Reference Words (CBRWs)
- File Address Reference Words (FARWs)
- Control Program save areas
- Program Nesting Area (PNA)
- Register Save Area
- Miscellaneous CP information:
  - SubSystem (SS)
  - SubSystem User (SSU)
  - Instruction Stream number (IS)
  - Network Address for output (terminal, gateway, etc)
  - etc.
- Application Program save areas
- Register Save Area
The User Work Area in the latest releases of TPF is actually
two areas of 112 bytes each. This is a symptom of a requirement
to maintain backwards compatibility with previous versions of
ACP/TPF. In the earlier days of ACP application coding practices
were not all they might have been and it was not uncommon to see
references made to ECB work area without using the EB0EB data
macro labels:
         ST    RG1,24(REB)
This same tendency also applied to the other fields in the ECB,
like the FARWs and CBRWs, which immediately followed the user
work area. When the need arose to increase the available user
work area it was decided to add another similar area rather than
extend the existing one. History has it that American Airlines
was closely involved with the advent of a larger ECB and that IBM
went as far as to poll their user community on how they would
prefer the new, larger, user area to be implemented. The
majority of the users, led by American, requested an enlarged
single user work area. American even went so far as to create
and implement such a change in their own system. IBM then
decided to split the areas in their next update to the system.
This has caused one or two difficulties for American since then,
as they have migrated to successive releases of TPF, but the
perceived advantages far outweigh the costs in time and effort to
'standardize' on the IBM format.
There are literally dozens of system fields within the ECB,
each deserving an explanation. What follows is a list of what I
consider the most important of these, with the EB0EB label for
it:
CE1CHW 4 This is a chain word field used when ECBs must be
chained together.
CE1BAD 4 This is the ECB Post Interrupt branch address
field. It is used by the system to hold an address
which is then branched to when the system wishes
to activate this ECB after a period of suspension.
CE1WKA 112 This is the first user work area of 112 bytes (it
includes the switch bytes).
CE1SW1 1 This along with the similar fields: CE1SW2,
CE1SW3, CE1RS1 are termed 'Inter-Program' switch
bytes.
CE1CM1 1 This along with similar fields: CE1CM2, CE1CM3 and
CE1ER1 are termed 'Intra-Program' switch bytes.
CE1FA0 4 This is a File Address Reference Word (FARW). This
four byte area is divided further, with two bytes
of Record ID, one byte Record Code Check (RCC) and
one byte for Control bits. In reality the ID is
virtually always used, the RCC is often used in
application programs but the control byte is less
frequently used. This four byte field along with
the next four bytes makes up the entire FARW for
the given level (in this case level '0')
CE1FM0 1 This byte and the next three contain the actual
symbolic file address used by the system to locate
the record on file storage. The naming of the four
individual bytes is a throwback to when the
addressing scheme used was of the form MCHR
(Module, Cylinder, Head, Record). File addressing
is discussed in more detail in the chapter on the
TPF Database. Generally, today, these individual
bytes have no significance, only as a group of
four.
CE1FC0 1
CE1FH0 1
CE1FR0 1
CE1FAP 4 This field and the next make up the special FARW
used to retrieve module resident programs prior to
transfer of control to them for use by the entry
associated with this ECB.
CE1FMP 4
There are 16 FARWs available for data record retrieval and one
used for programs. They each have a counterpart Core Block
Reference Word (CBRW) for each designated 'Level' available in
the ECB. The levels are numbered zero to fifteen ('0' - 'F' in
hexadecimal notation).
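
The eight-byte FARW layout just described can be modelled in Python with the standard struct module. This is a sketch under the assumption that the fields sit in the order given above (record ID, RCC, control byte, then the four-byte file address); the function names and the sample values are hypothetical.

```python
import struct

# Assumed layout, taken from the field descriptions above:
# 2-byte record ID, 1-byte record code check, 1-byte control,
# 4-byte file address. Big-endian, 8 bytes in total.
FARW_FORMAT = ">2sBB4s"


def build_farw(record_id, rcc, control, file_address):
    """Pack the four parts of a FARW into its 8-byte form."""
    return struct.pack(FARW_FORMAT, record_id, rcc, control, file_address)


def parse_farw(farw):
    """Split an 8-byte FARW back into its named parts."""
    record_id, rcc, control, file_address = struct.unpack(FARW_FORMAT, farw)
    return {
        "record_id": record_id,
        "rcc": rcc,
        "control": control,
        "file_address": file_address,
    }


# Made-up values: record ID 'AB', no record code check, no control bits.
farw = build_farw(b"AB", 0x00, 0x00, bytes.fromhex("00012345"))
print(len(farw), parse_farw(farw))
```

Treating the file address as four opaque bytes reflects the remark above that the individual bytes no longer have any significance on their own.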
CE1CR0 4 This field, along with the next four bytes, makes up one
CBRW. These four bytes contain the actual core
address of the record or core block 'attached' to
the ECB on this level. The address is a normal
S/370 31-bit core reference since TPF 2.4, before
this it was limited to 24 bits.
CE1CT0 2 These two bytes denote the type of core block
pointed to by the address in CE1CR0. The possible
values of this field are:
0001 - no block held on this level
0011 - a 127 byte block (usually a msg block)
0021 - a 381 byte block ('small')
0031 - a 1055 byte block ('large' or '1K')
0041 - a 4095 byte block ('4K')
CE1CC0 2 This halfword contains the block size in bytes.
This value may be X'007F',X'017D',X'041F' or
X'0FFF'. The byte count is the USABLE size of the
block and is not the actual physical size. For
all core blocks the TPF system retains one or more
bytes for its own use for such things as Format
Flag identification to assist in detecting core
corruption. (More of such things later...)
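
To show how the three CBRW fields fit together, here is a short Python sketch that unpacks one and checks the byte count against the block type. It assumes the type codes listed above are hexadecimal halfwords and that the fields are contiguous in the order given; the function name and the sample address are invented for the example.

```python
import struct

# Block type codes and usable sizes, read from the field descriptions.
# Treating the type codes as hexadecimal halfwords is an assumption.
BLOCK_TYPES = {
    0x0001: ("no block held", None),
    0x0011: ("127 byte block", 0x007F),
    0x0021: ("381 byte block", 0x017D),
    0x0031: ("1055 byte block", 0x041F),
    0x0041: ("4095 byte block", 0x0FFF),
}


def parse_cbrw(cbrw):
    """Split an 8-byte CBRW into core address, block type and byte count."""
    address, block_type, byte_count = struct.unpack(">IHH", cbrw)
    description, expected = BLOCK_TYPES[block_type]
    # Only held blocks have a size to check against.
    if expected is not None and byte_count != expected:
        raise ValueError("byte count does not match block type")
    return address, description, byte_count


# A made-up CBRW: a 381 byte block at core address X'00123400'.
sample = struct.pack(">IHH", 0x00123400, 0x0021, 0x017D)
print(parse_cbrw(sample))
```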
As we mentioned before there are 16 CBRWs followed by the CBRW
reserved for programs:
CE1CRP 4 Program core address
CE1CTP 2 Program block type
CE1CCP 2 Program length
CE1SUD 17 For each data level there is a byte reserved for
showing the status for that level. They are used
to flag and denote errors occurring on data I/O
operations. With a separate byte for each data
level these bytes are called the Detail Error
Indicators. For each byte the bits within it
represent the following possible problems:
Bit Error
0 Hardware error
This bit is turned on if a non-recoverable
hardware error occurred during the operation.
The contents of the core block would be
unpredictable but it is left to the
application program to take action.
1 Record ID check failure
This bit is turned on if the record found at
the file address specified does not have a
matching ID to that present in the FARW.
2 Record code check failure
This bit is turned on if the record code
check byte does not match between the record
on file and the FARW. If this byte is zero in
the FARW the system will not perform the
check.
3 Short record found
This bit is set when a record shorter than
expected was found (tape only)
4 Long record found
This bit is set when a record longer than
expected was found (tape only)
5 End of file
This signifies that the End Of File condition
was detected, either on a tape or unit record
device.
6 Invalid file address
This bit is set when the system determines
that the file address supplied in the FARW is
invalid after attempting to convert it to a
physical location on the database.
7 Program sharing indicator
CE1SUG 1 This byte is called the Gross Error Indicator and
the value held here after a system operation is
the product of OR'ing all the Detail Error
Indicators together.
CE1PL0 12 This field is the first of the program nesting
levels defined in the ECB. There are nine levels
defined in the ECB and there is a further pointer
to an extension area within the CP for increased
nesting capabilities. The first four bytes of the
nesting area is the program base address which the
entry was running in when it required to enter the
next program. The next four bytes point to the
next sequential instruction (NSI) after the ENTER
macro. This is so the system can return control to
the correct place after the entry returns from the
entered program. The final four bytes include two
bytes of save area for the program base ID and one
byte for the program block type. One byte is
spare.
CE1GLA & CE1GLY These are two four byte core address pointers
that are used in systems having multiple Sub
System Users (SSUs). Each SSU may have
different, user defined, global areas A and
Y. These fields hold the respective base
addresses for each area.
CE1RDA - CE1SVP These fields (11 x 4 bytes) are used by the
CP to save registers R0-R8, R14 and R15
during CP function calls (macros). R9-R13
are reserved for use by the CP and should not
be in use in an application program.
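
The relationship between the Detail Error Indicators (CE1SUD) and the Gross Error Indicator (CE1SUG) in the list above can be sketched in Python. The sketch assumes the usual IBM convention that bit 0 is the leftmost bit of the byte (X'80'); the function names and the sample values are hypothetical.

```python
from functools import reduce
from operator import or_

# Bit meanings as listed for CE1SUD. Bit 0 is assumed to be the
# leftmost (X'80') bit, following IBM bit numbering.
ERROR_BITS = {
    0: "hardware error",
    1: "record ID check failure",
    2: "record code check failure",
    3: "short record found",
    4: "long record found",
    5: "end of file",
    6: "invalid file address",
    7: "program sharing indicator",
}


def bit_mask(bit):
    """Return the mask for an IBM-numbered bit within one byte."""
    return 0x80 >> bit


def gross_indicator(detail_bytes):
    """OR all the detail indicator bytes together, as CE1SUG does."""
    return reduce(or_, detail_bytes, 0)


def describe(indicator):
    """List the conditions flagged in one indicator byte."""
    return [name for bit, name in ERROR_BITS.items()
            if indicator & bit_mask(bit)]


# 17 detail bytes; made-up errors on level 0 and level 3.
details = [0] * 17
details[0] = bit_mask(1)   # record ID check failure
details[3] = bit_mask(5)   # end of file
print(describe(gross_indicator(details)))
```

A program can therefore test the single gross byte first and examine the individual levels only when it is non-zero.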
Keypoints & Keypointing
Virtual File Access (VFA)
Delayed Program Flushing (DPF)
Tape Usage In TPF
System States
Initial Program Load
System Restart
The CPU Loop
Macros
Program Nesting
Program Sharing
Interrupts
System Initialization Process (SIP)
System Allocator (SAL)
The System Allocator (SAL) is an offline program used to
allocate 'slots' in the area of disk storage given over to
holding programs. A table is built in a file offline which lists
the program name, size information and desired residency. This
file is used as input during an offline process that builds the
file that will eventually be used to load the programs onto the
TPF database.
Time Initiated Entries
System Error Handling
If there is one area of the TPF CP that is under almost
constant revision it is the error processing (comprised of the
csects CPSE and CPSF). As each new release of the TPF system has
been unpacked systems programmers have sifted through the new
code and examined the changes from the systems they have learned
to love (or not) over the intervening years since the last
pronouncement from Danbury. Almost without fail the one area
they know will probably have been completely rewritten is
the set of error processing csects. This is not surprising in
the least since with the number of new features normally added
between significant releases of the base software the need for
rationalized error processing is essential.
CPSE is the first stop for all unexpected errors in the TPF
system. On the face of it that sounds a depressing statement,
since if we were thinking positively we would not be expecting
any errors... It is also a well known fact among TPF systems
programmers that if it were not for the application programs in
the system TPF would never go down. However there are times when
the processing program has determined that a system dump of main
memory is required to assist in the resolution of some peculiar
situation it has detected. In such cases a system facility is
invoked that causes an interrupt which transfers control
automatically, via PSW Swapping, to the beginning of CPSE. This
will also happen if the machine should detect a variety of errors
itself, e.g. invalid instruction operation code, access attempted
to a protected core area, etc. CPSE will then examine the source
of the error, evaluate its seriousness (i.e. whether it is a
'Catastrophic' system problem requiring the system to be halted),
move vital information to its own save areas, execute dump
processing which writes system information to a tape for later
analysis and either return control, if the error was not
catastrophic, or force a system IPL if the error would have
jeopardized normal system function.
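
The sequence of steps just described can be summarized as a toy Python model. It is purely a restatement of the paragraph above and makes no claim about how CPSE is actually coded: the function name, the dictionary fields and the list standing in for the dump tape are all assumptions.

```python
# Hypothetical model of the flow described above; not real CPSE logic.

def process_system_error(error, dump_tape):
    """Classify the error, save details, dump, then resume or IPL."""
    # Evaluate seriousness: is this a 'Catastrophic' system problem?
    catastrophic = error["severity"] == "catastrophic"
    # Move vital information to a private save area.
    save_area = dict(error)
    # Dump processing writes system information to tape.
    dump_tape.append(save_area)
    # Return control, or force an IPL if the system was jeopardized.
    return "force IPL" if catastrophic else "return control"


tape = []
print(process_system_error(
    {"source": "program check", "severity": "normal"}, tape))
print(process_system_error(
    {"source": "CP detected", "severity": "catastrophic"}, tape))
```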
Error processing is one of the areas of the TPF system that
has been most enhanced by the various members of the user
community. Since TPF systems by their nature are required to be
available at all times the need to resolve system problems in the
shortest possible time is paramount. Most TPF installations
employ an entire support staff, often called Coverage, to
maintain a watchful eye on the system 24 hours a day. These
heroic people are the first line of defence when the unhappy user
calls in from the field complaining about the performance of the
system or bemoaning lost data when his transaction errored
unaccountably. They are also intended to be capable of returning
the system to operational condition after a system operator
reports to them that the system just IPL'd itself and won't come
up past 1052 state...
In these situations an accurate picture of the system
environment immediately preceding the problem is vital to aid in
the diagnostic process. Most user improvements have focused on
some means of keeping an online record of recent errors (in some
cases that has extended to keeping actual dumps available online)
and improving the display facilities for the key system areas.
Nevertheless there is still often the need for the dumps taken by
CPSE to be post-processed offline and printed for analysis by an
expert, especially in the cases of more serious or recurring
system problems.