UK-based website

The Alternative TPF Homepage
Serving the online TPF community since 1995 : Now receiving >12,000 hits per week


TPF System Overview



      The Transaction Processing Facility (TPF) is intended,  one
would  imagine, to process 'transactions'. So perhaps  the  first
thing to define as we venture into the workings of the system  is
exactly what we mean by a 'transaction'. Here we are already in
some difficulty! For the sake of continuity let us use an
example  from  the  airline  industry  to  try  and  explain  the
potential problem.  Imagine you are planning a vacation  but  you
haven't  decided  whether to head for the sun  and  sand  of  the
Caribbean  or  the snow and slopes of Colorado.   You've  decided
that  you must take your vacation in the third or fourth week  in
March and you'd rather not spend more than $2000 for a five night
stay.

     Armed with this information you call your travel agent. Over
the  next  several  minutes  she (or he)  will  ask  you  several
questions   to   obtain  the  dates  of  your  intended   travel,
destination  etc.  You will have been able to  get  several  fare
quotes  as  well, which should help you to decide which  vacation
best  fits your criteria and perhaps you will actually decide  to
go  ahead  and make reservations.  If you decide to finance  your
vacation  with your favorite credit card you could even  purchase
the tickets and reserve your hotel room all in the same telephone
call.

      Now let us look a little closer at what just happened.  You
have  made reservations and paid for the tickets, so in  a  sense
you  have completed a transaction with the travel agent.  Now let
us  look  at the same sequence of events from the perspective  of
the travel agent.  When she answered your call you probably asked
first  about some flight availability around the time you  wanted
to  travel.   She would have typed in a message on  her  computer
terminal that returns a display of flights that still have  seats
available for sale.  She would then discuss with you the  options
you  have, with flight times and dates.  When you asked about the
fares, having chosen one or two possible flights, the agent would
type another message into her terminal to obtain quotes for the
various  fare possibilities.  As we all know this would  probably
not  be  a small display as the varieties of airline fares  these
days  are anything but small (or simple). This process of  asking
the  system,  by  means of typed commands, for information  would
continue until you had decided on a particular itinerary and were
ready  to make a reservation. The agent would then type in a  new
sequence  of  messages, only this time she would be recording  in
the  system your own personal details, such as name and telephone
number, plus details of your itinerary.  If you chose to purchase
the tickets via credit card she would also input your card number
to the system.

      As you can see, if we regard the transaction as being the
entire telephone call, up to and including the purchase of the
tickets, then that one transaction involved many messages being
sent to the computer system.

     Consider another example of a transaction, where you wish to
purchase  an  item with your credit card.  As the cashier  passes
your card through the card reader a message is sent to the credit
card  company's TPF system and the response is sent back with  an
authorization code (we hope) for the transaction.  In  this  case
there  is  nothing  more  to  be  done.  The  entire  transaction
consisted of just one actual message to the TPF system. So we can
see that a 'transaction' in the TPF world can be one or several
messages.   However  if  there is more  than  one  message  in  a
transaction  then the messages must be logically linked  in  some
way.   In  other words they must be contributing to a single  end
like  selling a seat on an airplane or booking a hotel room  etc.
In fact the link goes deeper than that for TPF, since there is
actually a key on the keyboard designated as EOT
(End-Of-Transaction). When the system receives the message sent
by this key, a host of special processing is activated to tidy up
any resources in use as a result of the previous, associated,
queries.

     We could define a transaction in TPF in this way:

      A  transaction is composed of one or more messages into the
system designed to achieve a particular goal.  The goal and how
it  is  actually  achieved depends on the implementation  of  the
application and the complexity of the information required by the
agent.
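
      The definition above can be illustrated with a minimal
sketch. It assumes that each message arrives tagged with the
terminal that sent it and that an EOT marker closes that
terminal's open transaction. The message texts, terminal names
and the function group_transactions are made up for illustration
and are not TPF formats or interfaces.

```python
from collections import defaultdict

EOT = "EOT"  # hypothetical marker standing in for the End-Of-Transaction key


def group_transactions(messages):
    """Group (terminal, text) messages into per-terminal transactions.

    A transaction is closed when its terminal sends the EOT marker.
    Returns the completed transactions and any still left open.
    """
    open_work = defaultdict(list)  # terminal -> messages in the open transaction
    completed = []                 # (terminal, [messages]) in closing order
    for terminal, text in messages:
        if text == EOT:
            completed.append((terminal, open_work.pop(terminal, [])))
        else:
            open_work[terminal].append(text)
    return completed, dict(open_work)


# Invented message stream from three terminals, interleaved as it might arrive.
stream = [
    ("T1", "availability 21MAR"),
    ("T2", "book hotel room"),
    ("T1", "fare quote"),
    ("T2", EOT),
    ("T1", "name SMITH"),
    ("T1", EOT),
    ("T3", "availability 28MAR"),
]
done, pending = group_transactions(stream)
print(done)     # transactions closed by EOT
print(pending)  # terminal T3 has not yet ended its transaction
```

Terminal T2 completes a one-message transaction while T1 needs
three messages for its transaction; both count as a single
transaction under the definition.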

      So now that we have a definition of a transaction we need to
explore  exactly what a 'message' is and how it travels from  the
agent's terminal to the TPF system complex and what happens after
it gets there.

      We're  in  luck  here, as the definition of  a  message  is
relatively simple.  When the agent is typing at her keyboard  the
terminal  is  simply  echoing her typing to the  computer  screen
until she presses the 'Enter' key.  By pressing this key she  has
told  the  terminal to send the whole contents  of  its  keyboard
buffer (everything the agent has just typed) to the next link  in
the  communications chain, which will be some  sort  of  terminal
controller.  For the communications network to function  properly
some  extra  data must be added to the characters  typed  by  the
agent.   Some means of identifying from which terminal the  input
is  coming is required so that the system knows where to send the
response  back  to.   This 'data stream' is  what  is  eventually
delivered to the TPF system as an input 'message'.  At this point
we  should briefly mention what lies between the terminal and the
TPF system.
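
      As a rough sketch of the idea, an input message can be
pictured as the typed characters bundled with the address of the
sending terminal, which is later used to address the reply. The
class InputMessage, the functions press_enter and address_reply,
and the sample address and text are all hypothetical; real data
streams carry more communications information than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputMessage:
    terminal_address: str  # added by the network, not typed by the agent
    text: str              # everything the agent typed before pressing Enter


def press_enter(terminal_address, keyboard_buffer):
    """Bundle the keyboard buffer with the sending terminal's address."""
    return InputMessage(terminal_address, "".join(keyboard_buffer))


def address_reply(message, reply_text):
    """Return (destination, text); the destination is the saved input address."""
    return (message.terminal_address, reply_text)


# The agent types some characters, then presses Enter.
msg = press_enter("LHR-0042", list("A21MARLHRDEN"))
reply = address_reply(msg, "NO SEATS")
print(msg)
print(reply)
```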

      One  thing  we  already  know is that  TPF  supports  large
networks  that all access a centralized processing  complex.  The
diagram  in  figure 1 shows a sample TPF network  and  processing
complex.   The terminals are attached to the central  complex  by
means of wide area communications facilities.  This name is  used
to  describe transmission facilities that are supplied by  common
communications carriers.  There are typically two different  ways
in  which  the TPF complex is connected to the terminal  network.
One  way  is  to lease communications channels (sometimes  called
'leased  lines'),  connect  terminal  concentrators  and  form  a
private  network. The other is to lease local access lines  to  a
common  communications carrier's data network (this  network  may
also  be  used by other systems). The terminal concentrators  are
also connected to the network via the local access lines.  As you
would  expect  in a system that routinely supports huge  terminal
networks  TPF is no stranger to communications.  Details  of  the
various  techniques used to integrate TPF into existing  networks
or  to  establish a network solely for TPF will be  discussed  in
Chapter (x).

      Now, in our whirlwind tour of a typical TPF network we have
finally reached the TPF system itself. It is likely however  that
we  will  be  forced to wait a very short period of time  as  the
system  is  almost  certainly busy with other work.   At  regular
intervals  TPF is interrupted and takes time out to  accept  more
messages before resuming where it left off with the work  it  had
currently in progress.  As a result of our message being accepted
into  the system it is placed on the input list, one of the  many
TPF work lists. These lists will be discussed in detail later but
are  simply  used much as you or I would use a 'To  Do'  list  to
schedule activities for the system.

      It  is now time to introduce for the first time the central
process  of  the TPF system: the CPU loop.  The  CPU  loop  is  a
section  of  code  within the control program that  accesses  and
processes the various 'To Do' lists maintained by TPF.  After the
initial  set-up of the system, which will be described in  detail
later, the final action of the set-up programs is to pass control
to  the  CPU loop code.  The loop of code then proceeds to  check
each  list in turn to decide if there is any work for the  system
to do.  New input from the communications subsystem is handled by
a slightly different mechanism.  We
have  already said that TPF is an 'interrupt driven'  system  and
one  demonstration of this is the accepting of fresh input to the
main TPF processor(s).  Based on a predetermined interval of time
an  interrupt  is  generated by the system that  causes  the  TPF
system   to   accept   all  input  currently   waiting   in   the
communications subsystem and place the items of work on the input
list.  Once this has been accomplished the system returns control
to  whatever process was active at the time of the interrupt.  In
this  way input from the communications subsystem is periodically
'collected' but TPF retains the basic premise that the work
already within the system is more important, hence the need for
an  interrupt to cause fresh input to be received.  Although  the
process might appear complicated from this brief introduction  it
will be shown in later chapters to be quite simple and effective.
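
      The interplay between the loop and the periodic
interrupt can be sketched in a few lines. This is a toy model
only: the class MiniSystem, the tick-based timer, the two lists
shown and the order in which they are checked are assumptions
made for illustration, not a description of the real control
program.

```python
from collections import deque


class MiniSystem:
    """Toy model: a loop over work lists plus a periodic input interrupt."""

    def __init__(self, poll_interval=3):
        self.comms_pending = deque()  # input waiting in the communications subsystem
        self.ready_list = deque()     # work already inside the system
        self.input_list = deque()     # newly accepted messages
        self.poll_interval = poll_interval
        self.log = []                 # (tick, list name, item) for each item processed

    def timer_interrupt(self):
        """Accept all waiting input and place it on the input list."""
        while self.comms_pending:
            self.input_list.append(self.comms_pending.popleft())

    def cpu_loop(self, max_ticks):
        for tick in range(1, max_ticks + 1):
            if tick % self.poll_interval == 0:
                self.timer_interrupt()
            # Check each list in turn; existing work is assumed to come first.
            for name, work_list in (("ready", self.ready_list),
                                    ("input", self.input_list)):
                if work_list:
                    self.log.append((tick, name, work_list.popleft()))
                    break


system = MiniSystem(poll_interval=3)
system.ready_list.extend(["R1", "R2"])
system.comms_pending.extend(["M1", "M2"])
system.cpu_loop(max_ticks=6)
for record in system.log:
    print(record)
```

In this run the two items already in the system are processed
before the interrupt at tick 3 brings the waiting messages onto
the input list.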

      In  some  documentation about TPF and often in conversation
you  will  come  across  the  term 'entry'.   This  is  an  often
mishandled term that is used to describe anything from an  entire
transaction  to  a  message or something entirely  different  and
equally  imprecise.  Since we are trying to  be  as  accurate  as
possible  the term 'Entry' should really only be applied  to  our
message  after it has been taken off the input list  in  the  TPF
system  and  attached to an Entry Control Block (ECB).   The  ECB
will  remain associated with this entry for its lifetime  in  the
system and holds within it numerous pointers and work areas that
are used by the TPF system as processing proceeds.  The ECB will
be  described  in some detail in a later section  as  it  is  the
central repository for system data associated with the processing
of  the  particular entry.  It is, however, interesting  to  note
here some key features of the ECB:

- ECBs exist only in main memory (not on file)
-  ECBs  contain pointers to main memory areas allocated to  this
entry by the system
- ECBs contain pointers to the file areas that have been accessed
for the processing of this entry
-  ECBs  contain  the  system information necessary  for  nesting
programs  (i.e.  processing in program 1, 'entering'  program  2,
perhaps to perform some small function, then returning to program
1, at the place you left previously)
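
      These features can be pictured with a small data
structure. The sketch below is an illustration only: the field
names and the methods enter_program, back and exit_entry are
hypothetical and do not reflect the real ECB layout or the TPF
macros.

```python
from dataclasses import dataclass, field


@dataclass
class EntryControlBlock:
    """In-memory record for one entry; nothing here is written to file."""
    terminal_address: str
    message: str
    core_blocks: list = field(default_factory=list)     # main memory areas held
    file_addresses: list = field(default_factory=list)  # file areas accessed
    program_stack: list = field(default_factory=list)   # program nesting information

    def enter_program(self, name):
        """Record that processing has moved into another program."""
        self.program_stack.append(name)

    def back(self):
        """Leave the current program; return the one being resumed, if any."""
        self.program_stack.pop()
        return self.program_stack[-1] if self.program_stack else None

    def exit_entry(self):
        """Release everything the entry holds; return how many items were freed."""
        freed = len(self.core_blocks) + len(self.file_addresses)
        self.core_blocks.clear()
        self.file_addresses.clear()
        self.program_stack.clear()
        return freed


ecb = EntryControlBlock("LHR-0042", "A21MARLHRDEN")
ecb.enter_program("PROGRAM1")
ecb.core_blocks.append("work-area-1")
ecb.enter_program("PROGRAM2")       # nested call to perform a small function
ecb.file_addresses.append("record-7")
resumed = ecb.back()                # returns to PROGRAM1
freed = ecb.exit_entry()
print(resumed, freed)
```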

      Based on a number of criteria associated with the type  and
content  of  the  input  message the  TPF  system  decides  which
application  program should process our message and activates  it
when  our  item of work is next in line on the list.  To  achieve
this some standard TPF 'middleware' programs are used. After  the
creation  of  the ECB, using the Control Program routine  OPZERO,
control is passed to the TPF input message editor, UII.  UII uses
a  TPF  utility program, WGR, to locate the Agents Assembly  Area
(AAA)  associated with the terminal that has input  the  message.
UII examines the Primary Action Code (PAC), which is the first
character of the input message (excluding communications
information), and decides, usually from tables, which application
segment should receive the entry for processing.  Processing
within that
program  will  now  continue until some TPF system  function,  or
physical  I/O,  is requested. If it is necessary  to  access  the
database  for data the processing of our entry will be  suspended
by  the system to wait for the successful completion of the  data
retrieval  by  the  system routines. When the I/O  completes  our
entry  is reactivated by being placed on another list called  the
ready  list.  Finally  after performing  whatever  processing  is
necessary  to satisfy the requirements of the input  message  the
application package will issue an 'EXITC' macro which causes  all
system  resources, like main storage areas that might  have  been
used, to be returned to the system.  Normally before issuing  the
EXITC  some  sort of output message will have been sent,  to  the
terminal  supplying the input, which will be either the requested
information or perhaps some sort of error message in the case  of
a mistyped message or some other sort of problem in the process.
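
      The table-driven choice of application can be sketched
as a simple lookup on the first character of the message. The
action codes, program names and the function route below are
invented for illustration; they are not the codes or segment
names used by any real TPF application.

```python
# Hypothetical routing table: Primary Action Code -> application program.
ROUTING_TABLE = {
    "A": "availability_program",
    "F": "fare_quote_program",
    "N": "name_entry_program",
}


def route(message_text, table=ROUTING_TABLE):
    """Pick the program for a message from its first character (the PAC)."""
    if not message_text:
        return "error_program"      # nothing was typed
    pac = message_text[0]
    # An unknown action code is treated here as a mistyped message.
    return table.get(pac, "error_program")


print(route("A21MARLHRDEN"))
print(route("NSMITH/J"))
print(route("?garbage"))
```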

      Since  we still have the original terminal address that  we
saved  as  we  started our journey through the TPF system  (in  a
field  in the ECB) the communications software is able to  locate
the  correct  terminal  and  process the  response.  This  entire
process,   which   has   been  severely   summarized   for   this
introduction, should have taken no more than three seconds.

      As  we can see immediately there are some major differences
here  from  the  more  widely used mainframe  operating  systems.
There is no concept of a 'job' that is submitted by a user nor is
there a need to create a Tele-Processing (TP) monitor to run in
the  framework of a batch environment such as is the case in  the
Customer  Information  Control System  (CICS)  which  runs  under
Multiple Virtual Storage (MVS).  There are drawbacks to  the  TPF
design  strategy; batch style work, or those functions  that  are
computation intensive, are not so easily accommodated in the  TPF
system  with  its philosophy of a quick turnaround  for  messages
into the system.  On the other hand there is no overhead of a
monitoring system such as is found in the batch-oriented
environments.  We will see in a later chapter that some modern
TPF applications are approaching the line between a real-time and
a  batch  application.   For some time this  situation  has  been
helped  by the ever increasing power of the hardware managing  to
stay just ahead of user demands for system functionality.  In the
last five years the need for ever more sophisticated packages
has threatened to swamp even the largest and most powerful TPF
complexes worldwide.

      The  entire TPF system is structured to permit the  maximum
number  of messages to be processed in a unit interval  of  time,
which  is usually expressed as 'messages per second'.  This  must
be  accomplished while retaining the quick response times to  the
agents  in  the  field.  Response time is measured  as  the  time
between  the  user  pressing  the button  on  his  keyboard  that
initiates   the   transmission  of  the   message   through   the
communications network and the display of the first character  of
the  response,  transmitted from the TPF system, on  the  agent's
screen.   We  can  try  to summarize the characteristics  that  a
typical TPF system will have:

- Realtime interaction with widely dispersed 'agents'
- Relatively short message lengths in both directions
- A common, central, database
- The need to update the database during realtime operation
-   The  need  to  perform  database  maintenance  during  system
operation
- Duplicate data records for performance and reliability
-   A   communications  interface  to  support  a  large,  widely
dispersed, network of terminals (of various types)
- The response time must 'match' the application
- The system must operate 24 hours a day
- High system availability is required
- System restart, in the event of a problem must be fast
-  Some form of dynamic monitoring of the system performance must
be available

       To  try  and  put  these  requirements  in  some  sort  of
perspective I would like to use some figures from an  actual  TPF
installation.  The data is taken from a case study of Trans World
Airlines'  (TWA)  system which appeared in the Communications  of
the ACM in 1984.

-  The  system  had  11,000  - 12,000 terminals  attached  to  it
throughout the world.

-  Unlike  some  other businesses that generate vast  amounts  of
paper to archive information in parallel with any computer system
processing the only paper backup for the airline is a passenger's
ticket.  The airline cannot operate if the computer system is not
available.   To improve availability TWA maintains  full  back-up
power systems that can be online in seconds in the event of power
failure.  The interim period is protected by batteries. All  data
is  duplicated online.  In 1976 (the last year that figures  were
available) the TWA system had 131 incidents of system outage, but
the  estimated  losses from such outages in terms of  revenue  (a
notoriously  subjective  figure),  were  considered  very   small
compared  to  1972's  548  outages,  where  revenue  losses  were
estimated  to  be nearly 10 times as great.  To put this  another
way  the system was scheduled to be available 98.7% of the  total
8784 hours in that year.  It was actually available 99.85% of the
scheduled time.  On 270 days there were no system outages at all,
and  the  mean  outage was 6 minutes, though  one  single  outage
lasted 230 minutes.

- So we can see that the system is available to the agents 98% of
the time and TWA guarantees travel agents connected to its system
an availability of 95% or no payment will be required.  A typical
daily  transaction  volume was 7 million and at  peak  times  the
message rate reached 170 messages per second with spikes of  over
200  messages per second.  (Even at that time another airline was
peaking  at  1000  messages per second and  recently  an  airline
system has processed in excess of 3000 messages per second  using
TPF  3.1)  Average response time was 1.5 seconds and the  airline
attempts  to ensure that 90% of the messages will have a response
time of no greater than 3 seconds.

-  The  online  database  contained between  1  and  1.5  million
passenger  records,  each consisting of  1.2  to  1.5K  of  data.
Passenger records constitute the bulk, although by no means  all,
of the system's data.  So passenger records made up a database of
2 billion bytes, fully duplicated.  In addition to this there was
the inventory database etc.  Secondary storage had a capacity for
over  45 Gigabytes (1 billion bytes = 1 Gigabyte).  Using  native
mode  IBM  3350's  with a capacity of 317.5 megabytes  each  this
would  have required 142 such drives, each about the  size  of  a
household washing machine.

-  Reservations are indexed by passenger name, flight number  and
date  (all three being required to retrieve a reservation)  while
inventory can be accessed by date, departure time or city.
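
      The availability and storage figures quoted above can be
cross-checked with a little arithmetic. The sketch uses only the
numbers given in the case study and takes 1 Gigabyte as 1000
megabytes, in line with the definition in the text; the variable
names are our own.

```python
import math

# Availability in 1976 (a leap year, hence 366 days).
hours_in_year = 366 * 24                     # 8784 hours
scheduled_hours = hours_in_year * 0.987      # scheduled to be available
available_hours = scheduled_hours * 0.9985   # actually available
overall_availability = available_hours / hours_in_year

# Outage time estimated two ways: from the percentages and from the
# number of outages multiplied by the mean outage length.
outage_hours_from_percent = scheduled_hours - available_hours
outage_hours_from_mean = 131 * 6 / 60

# Number of 317.5 megabyte drives needed for 45 Gigabytes.
drives_needed = math.ceil(45_000 / 317.5)

print(f"hours in year:        {hours_in_year}")
print(f"overall availability: {overall_availability:.1%}")
print(f"outage (percentages): {outage_hours_from_percent:.1f} hours")
print(f"outage (131 x 6 min): {outage_hours_from_mean:.1f} hours")
print(f"drives needed:        {drives_needed}")
```

The two outage estimates agree at about 13 hours for the year,
and the overall figure of roughly 98.6% is consistent with the
98% quoted for availability to the agents.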


     It should be clear from this, albeit dated, glimpse into the
real  world, if it wasn't already plain, that TPF systems operate
under severe constraints.  Since most TPF systems are what are
known as 'Strategic Systems', corporate assets that are highly
prized by their owners, it is not always possible to get
up to date performance figures.  This is analogous to Formula One
motor  racing  teams  not  wishing to publish  how  their  latest
modifications have improved their car's cornering or speed on the
straight.  They prefer to demonstrate any increases in  power  by
the  number of new users they can sign onto their systems or  how
easily  they  can cope with major activity spikes (e.g.  the  day
after  a  major airline fare reduction...). Perhaps  we  need  to
mention  here  some general ways in which TPF  attempts  to  meet
these stringent requirements.

      TPF  should  be  considered a high  performance,  realtime,
message  driven operating system.  That last term  describes  the
way in which the incoming messages are collected and concentrated
by  terminal concentrators on the network and then polled by  the
TPF  system.  This implies that the messages are  arriving  quite
randomly and hence the term 'message-driven' for TPF.

      From  a  design  perspective there are a  number  of  basic
techniques that can be employed to cope with the TPF constraints.
One  is  that  some decisions that need to be made during  system
operation could be made by either the system software  or  by  an
operator at the system terminal. The decision to be made here  is
whether  it  is  better to spend the time  waiting  for  a  human
operator  to respond to a situation or whether the time saved  by
allowing  the  system  to  make a given  decision  is  worth  the
complexities that would be introduced into the software to handle
it.  What often happens in TPF is that the system requests human
intervention only for decisions that require knowledge the system
could not have, or that could be destructive to data within the
system. In almost
all other cases the system has a 'default' course of action if it
cannot solve the problem any other way.

      In  any  large  computer system there is a  need  for  some
flexibility in the actual configuration of the components of  the
system.   There  is  often  an ongoing need  to  add  and  remove
devices,  perhaps  for  maintenance or possibly  expansion.   The
system's configuration is established by a process called  System
Generation (sysgen) in which the software of the operating system
is  set  up  to reflect the actual configuration of the hardware.
In  some operating environments configuration information can  be
supplied to the system software during actual initialization (the
act of bringing the system to an operable state, perhaps after  a
power  failure or nightly maintenance etc).  In the case of  TPF,
where  the  limiting of any outage is central to its  philosophy,
all  configuration information must be supplied  at  sysgen  time
since  the  path  through initialization  must  be  as  short  as
possible  to  limit any downtime. This is not to say  that  every
INDIVIDUAL device must be known about before the system can  IPL.
It means that every TYPE of device should be known about.

      TPF  is also a highly structured system.  It achieves  this
structure by making each item of work as small as possible, using
as  few  resources  as  possible, and  doing  relatively  trivial
amounts of actual computation work during its lifetime (it  might
be  wise  to  mention  here that the phrase  'trivial  amount  of
computation'  is  widely used in computer science  literature  to
denote   a  small  amount  of  processing,  in  comparison   with
communications delays and I/O operational delays,   and  it  does
not imply 'unimportant' !). Of course it is possible to have many
hundreds  of  tasks active simultaneously within  the  system  so
there is a very high degree of multiprogramming involved.

      To  revisit  the aim of 24 hour availability, there  should
always be a backup mainframe computer standing by in the case  of
a  hardware failure on the main system.  In some cases there will
be  times when a switch over to an alternate machine will be done
for regular maintenance also.  The backup machine must be capable
of being IPL'd as a TPF system almost immediately in the event of
a  disaster so although it may be used for other work  that  work
should either be easy to repeat or easy to suspend in case of  an
emergency.  The failure of a machine is a fairly rare occurrence
these days but nonetheless it does still happen and most TPF
installations will have more than one machine capable of
providing backup to their online TPF machine(s).

      A  final general consideration to help meet the demands  of
the TPF philosophy is the duplication of data within the database
and  its  actual  arrangement within  the  TPF  database.  It  is
possible  to choose to duplicate all data or a selection.   Again
most  operating  TPF  installations choose the  safer  route  and
duplicate  all  data.  The arrangement of  the  application  data
within  the  database  can be crucial in  minimizing  I/O  delays
during  processing.  Response times can be dramatically  improved
by the following:

- Allocating shared data resources prior to online execution

- Organizing the physical structure of the shared data to improve
data accessibility. The TPF users are responsible for customizing
this  aspect  of  the  system to match their  unique  application
requirements.

-  Placing  the  logic  used to access the  data  in  the  system
software   which,  in  some  systems,  is  frequently  found   in
application appendages called 'access methods'.


      It  would  probably be wise at this point  to  revisit  the
concept  of  performance and the way it is customarily summarized
in  the  TPF world, by messages processed per second.  Unless  we
have  a  fairly good idea of what processing is involved  with  a
'message' how can we compare the real performance of one TPF
system with another, or a TPF system with any other system?  It
is common
practice  nowadays to issue benchmark test results  for  Personal
Computers  for  example.  These results show the  performance  of
competing  brands when processing 'standard' test  programs  that
are  designed  to  test  the computer in certain  key  functional
areas.   Wouldn't  it be convenient if such a  set  of  benchmark
programs existed for TPF ?  Well yes it would but because of  the
nature  of the uses of TPF and the different developments in  the
applications  areas  within  various  TPF  installations  no  one
benchmark program exists.

      To  approach this problem somewhat scientifically we  would
need to select a 'typical' input message that we might expect  on
a  given TPF system.  This 'typical' message should be similar to
as  many  other  messages likely to be input  across  the  entire
applications family as possible.  Clearly it is no small task  in
itself to identify such a message. It requires an intimate
knowledge  of the various applications, including the  number  of
database  accesses a given enquiry would generate.   Having  once
identified this typical message one would then write a program to
deliver  a  number of these similar messages into the TPF  system
being  tested and using the monitoring packages supplied it would
be  possible  to  observe how the various  parts  of  the  system
reacted to the message load.  In an ideal world this would be the
preferred  way  to  assess  the  performance  of  a  TPF   system
configuration  for your typical system messages.  The  catch,  of
course,  is that selection of the typical message. There is  also
the  concept of the 'message mix'. Since at any given moment  the
TPF  system is likely to be handling many messages simultaneously
the  types of messages active concurrently can be a factor in the
overall   performance  picture.   If  all  active   entries   are
computation  intensive  relative to their I/O  requirements  they
could  slow each other down. Similarly if several of the messages
each  need  the same record from DASD they could find  themselves
having to wait for it.
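
      The effect of the message mix on record contention can
be illustrated with a very small sketch. It assumes, purely for
illustration, that only one active entry at a time can use a
given record, so every additional entry wanting the same record
must wait. The function waiting_entries and the entry and record
names are hypothetical.

```python
from collections import Counter


def waiting_entries(active_entries):
    """Count entries that must wait because another entry wants their record.

    active_entries is a list of (entry_name, record_id) pairs; one entry
    per record is assumed to proceed while the rest wait.
    """
    demand = Counter(record for _, record in active_entries)
    return sum(count - 1 for count in demand.values())


# Two invented mixes of four concurrent entries each.
contended_mix = [("E1", "R7"), ("E2", "R7"), ("E3", "R9"), ("E4", "R7")]
spread_mix = [("E1", "R1"), ("E2", "R2"), ("E3", "R3"), ("E4", "R4")]

print(waiting_entries(contended_mix))  # three entries want record R7
print(waiting_entries(spread_mix))     # no two entries share a record
```

The same number of active entries produces very different
amounts of waiting depending on which records they need, which is
why the mix matters as much as the message rate.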

      The  way  that most large TPF installations have approached
this problem is to simulate a normal message mix when testing  or
stress testing their systems.  One way to do this is by capturing
incoming messages to their online or programmer test systems and
replaying them into the system being measured.  This is also not
a  trivial  exercise as the database must be closely  coordinated
with  the  system from which the messages were captured otherwise
the  entries  will  not process correctly and the  test  will  be
useless.   The  longer  and more tedious method  is  to  manually
input,  into  a  package  designed  to  transmit  messages  at  a
predetermined  rate  from  an  intelligent  termin
 


Updated: 14/05/02