A Better PDS Load Program

After completing the simple PDS load example I decided to go back and implement some of the improvements I had suggested.  The result is LDPDS, which loads a PDS from a card image stream.  There are several improvements; I will not go through all the code here, but I will try to point out some of the highlights.  The code was written in small increments as time was available and was pretty much designed on the fly.  Now that it is complete, the resulting code reflects that design-on-the-fly construction.  Still, it is not the worst code I have ever written.

It is divided into several source code modules as follows:

  • LDPDS    – Main Routine
  • LDPDSCHN – Directory Entry Chain Builder
  • LDPDSCVT – TTR to MBBCCHHR Conversion
  • LDPDSDIO – Directory Block I/O (Read and Write)
  • LDPDSIOB – I/O Error, Format IOB Fields
  • LDPDSPRM – Program Parameter Processing
  • LDPDSSIO – Sequential I/O (Write)

Chained Save Areas

The Main Routine establishes a set of prechained save areas that are used by the subroutines.  This simplifies the code for the subroutine linkage.  As a subroutine is entered it simply saves the caller’s registers and then points to the next prechained save area.

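SUBRTN   DS    0H
         STM   R14,R12,12(R13)    SAVE CALLER'S REGISTERS
         L     R13,8(,R13)        NEXT PRECHAINED SAVE AREA
             :
             :
             :
         L     R13,4(,R13)        BACK TO CALLER'S SAVE AREA
         LM    R14,R12,12(R13)    RESTORE CALLER'S REGISTERS
         BR    R14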
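
The effect of the prechained save areas can be modelled outside of assembler. The Python sketch below is only an illustration and is not part of LDPDS: the class name, the chain depth, and the use of list indexes in place of addresses are all assumptions made for the example.

```python
class SaveAreaChain:
    """Toy model of prechained save areas (indexes stand in for addresses)."""

    def __init__(self, depth):
        # Each area holds a back pointer (offset 4), a forward pointer
        # (offset 8) and room for the saved registers (offset 12 onward).
        self.areas = [{"back": None, "forward": None, "regs": None}
                      for _ in range(depth)]
        for i in range(depth - 1):
            self.areas[i]["forward"] = i + 1
            self.areas[i + 1]["back"] = i
        self.r13 = 0  # index of the current save area

    def enter(self, caller_regs):
        self.areas[self.r13]["regs"] = caller_regs      # STM R14,R12,12(R13)
        forward = self.areas[self.r13]["forward"]
        if forward is None:
            raise RuntimeError("save area chain exhausted")
        self.r13 = forward                              # L   R13,8(,R13)

    def leave(self):
        self.r13 = self.areas[self.r13]["back"]         # L   R13,4(,R13)
        return self.areas[self.r13]["regs"]             # LM  R14,R12,12(R13)
```

The price of the shortcut is that the maximum nesting depth is fixed when the chain is built. The model raises an error when the chain runs out, a check the short assembler linkage shown above does not make.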

$PRINTF Macro

I am also using my $PRINTF routines for printing to SYSPRINT.  After spending years coding in C I have really become spoiled by the printf( ) function, so I created a simplified version that works with assembler.  By design the $PRINTF routines can be used in reentrant code.  This resulted in the idea of the $PRINTF File, which is actually a pointer to a list of three routines.  A default file exists for $PRINTF that will use TPUT if running in a TSO environment, or else WTO.  To write to the SYSPRINT DCB a $PRINTF file must be supplied.  This is specified at the beginning of each source module by overriding a global SET symbol.

         GBLC  &PRFILE       
&PRFILE  SETC  '=V(LDPDSPRT)'

This specifies that the address of the list of routines is the external symbol LDPDSPRT, which is defined as follows:

         ENTRY LDPDSPRT                      
LDPDSPRT DC    A(PRTFGBUF,PRTFWBUF,PRTFFBUF)

Three routines are specified. The first returns the address of a work buffer that $PRINTF can use to build the output message. The second routine is called to write the formatted buffer. The third routine is called to free the buffer when processing for the current message is complete.

Since LDPDS is not reentrant code I don’t have to dynamically allocate and free the work buffer each time using GETMAIN and FREEMAIN. This makes the GETBUF and FREEBUF routines (PRTFGBUF and PRTFFBUF) pretty simple.

PRTFGBUF DS    0H                 --- RETURN WORK BUFFER ADDRESS ---
         USING *,R15                                                
         L     R1,=A(##PRBUF)     RETURN BUFFER ADDRESS             
         LA    R0,256             RETURN BUFFER SIZE                
         SLR   R15,R15                                              
         BR    R14                                                  
         DROP  R15                                                  
*                                                                   
PRTFFBUF DS    0H                 --- FREE BUFFER ---               
         USING *,R15                                                
         SLR   R15,R15                                              
         BR    R14                                                  
         DROP  R15

The PRINTBUF routine (PRTFWBUF) is not much more complicated.

PRTFWBUF DS    0H                 --- PRINT BUFFER TO SYSPRINT ---
         USING *,R11                                              
         STM   R14,R12,12(R13)    SAVE CALLER'S REGISTERS         
         L     R13,8(,R13)        NEXT CHAINED SAVE AREA          
         LR    R11,R15            LOAD BASE REG                   
*                                                                 
         L     R2,4(,R1)          GET PRINTF BUFFER ADDR          
         L     R3,8(,R1)          GET LENGTH                      
         LTR   R3,R3              CHECK FOR ZERO                  
         BZ    PRTFW900                                           
*                                                                 
         L     R4,=A(##PRLINE)    POINT TO PRINT I/O BUF          
         MVI   0(R4),C' '         CLEAR                           
         MVC   1(131,R4),0(R4)         PRINT BUF                  
*                                                                 
         C     R3,=F'132'         MAX LEN                         
         BL    PRTFW010                                           
*                                                                 
         LA    R3,132             TRUNCATE TO 132 BYTES           
PRTFW010 DS    0H                                                 
         BCTR  R3,0               SUBTRACT ONE FROM LENGTH        
         EX    R3,PRTFWMVC        COPY PRINTF BUFFER
PRTFWMVC MVC   0(1,R4),0(R2)                        
*                                                   
         L     R2,=A(SYSPRINT)    GET DCB ADDRESS   
         PUT   (R2),(R4)          PRINT LINE        
*                                                   
PRTFW900 DS    0H                                   
         L     R13,4(,R13)        UNCHAIN SA        
         LM    R14,R12,12(R13)    RESTORE REGS      
         SLR   R15,R15                              
         BR    R14                                  
*                                                   
         DROP  R11

The format string used by $PRINTF is a simplified subset of what printf( ) accepts. Format specifiers begin with the escape character ‘%’, which may be followed by an optional width specification and is then followed by a format specification character.

C - Print one or more characters
Z - Print a blank terminated string
D - Print a decimal value
N - Print a zero padded decimal value
X - Print a hexadecimal value

Example:  
$PRINTF 'HELLO, %Z NUMBER=%D TIME=%2D:%2N  HEX VALUE=%X',NAME,(R3),*(R4),*(R5)

The parameters may be an assembler label or a register value. If the label or register is preceded by an asterisk then indirect addressing is used. For example, for a decimal value %D, a parameter specification of (R3) indicates that register three contains the numeric value to format. A specification of *(R3) indicates that register three contains the address of a fullword containing the numeric value to format.
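
To make the format specifiers concrete, here is a rough Python model of the expansion. It is not the $PRINTF implementation: the function name is made up, and the padding and truncation rules (right-justified %D, eight-digit default for %X, %C taking the width as a character count) are guesses, since the exact rules are not spelled out here. Direct versus indirect parameters are not modelled; the caller simply passes the values.

```python
import re

SPEC = re.compile(r"%(\d*)([CZDNX])")

def printf_sketch(fmt, *args):
    """Expand %C, %Z, %D, %N and %X specifiers (assumed semantics)."""
    values = iter(args)

    def expand(match):
        width = int(match.group(1)) if match.group(1) else None
        kind = match.group(2)
        value = next(values)
        if kind == "C":                       # one or more characters
            return value[: width or 1]
        if kind == "Z":                       # blank terminated string
            text = value.split(" ", 1)[0]
            return text if width is None else text[:width].ljust(width)
        if kind == "D":                       # decimal, blank padded (assumed)
            text = str(value)
            return text if width is None else text.rjust(width)
        if kind == "N":                       # decimal, zero padded
            return str(value).zfill(width or 0)
        return format(value, "X").zfill(width or 8)   # hexadecimal

    return SPEC.sub(expand, fmt)
```

For example, printf_sketch('TIME=%2D:%2N', 9, 5) yields 'TIME= 9:05' under these assumptions.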

The $PRINTF routines are made up of the $PRINTF macro and three source modules named SSIPRINTF, SSISPRTF, and SSIWPRTF.

LDPDS Main Routine

The three DCBs (SYSPRINT, SYSIN, and SYSUT1) are opened.  A check is then performed to verify that the output dataset for SYSUT1 resides on a 3350 DASD device.  All disk parameters are hardcoded and only 3350 devices are supported.  A check is also done to verify that the output LRECL is 80 bytes.  Next the LDPDSPRM module is called to analyze the parameters supplied on the PARM= keyword in the JCL.  The result is passed back as four bytes in register 1.

Valid parameters are “SPF” to indicate SPF statistics should be reloaded if available in the input stream, “UPDTE(xx)” to indicate the characters xx should be replaced with “./” when they occur in bytes one and two of the input records, and “REPL(ALL)” to indicate that a request for an ADD should replace an existing member.  A parm specification would look something like PARM='SPF,UPDTE(><),REPL(ALL)'.
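
As a rough illustration of the three options, the following Python sketch parses a PARM string and applies the UPDTE substitution to an input record. It is not a transcription of LDPDSPRM: the function names and the dictionary result are invented for the example (the real module returns four bytes in register 1, whose layout is not described here), and a substitute pair containing a comma is not handled.

```python
import re

def parse_parm(parm):
    """Parse a PARM string such as 'SPF,UPDTE(><),REPL(ALL)'."""
    options = {"spf": False, "updte": None, "repl_all": False}
    for token in parm.split(","):
        token = token.strip()
        if not token:
            continue
        updte = re.fullmatch(r"UPDTE\((..)\)", token)
        if token == "SPF":
            options["spf"] = True
        elif token == "REPL(ALL)":
            options["repl_all"] = True
        elif updte:
            options["updte"] = updte.group(1)
        else:
            raise ValueError("unrecognized parameter: " + token)
    return options

def restore_control_card(record, updte):
    """Turn the UPDTE(xx) substitute characters back into './'."""
    if updte and record[:2] == updte:
        return "./" + record[2:]
    return record
```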

Next the existing PDS directory is read and loaded into the in-memory chain maintained by module LDPDSCHN.  The directory I/O is performed in the module LDPDSDIO.  LDPDSDIO is called to read directory blocks and return them one at a time.  Each directory block is parsed and its member entries are added to the in-memory chain.  This continues until the logical end of file is detected (a member name of X'FFFFFFFFFFFFFFFF').  The remaining directory blocks are read (but not parsed) until a physical EOF is detected, so we can get a count of how many directory blocks exist.
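
The parsing step can be sketched in Python as follows. This is an illustration rather than the LDPDS code; it relies on the standard PDS directory layout (a two-byte count of bytes in use at the start of each 256-byte block, then entries made of an 8-byte name, a 3-byte TTR, and a flag byte whose low five bits give the user data length in halfwords). The function name and the dictionary fields are invented, and names are left as raw bytes rather than translated from EBCDIC.

```python
END_OF_DIRECTORY = b"\xff" * 8

def parse_directory_block(block):
    """Return (entries, end_seen) for one 256-byte directory block."""
    used = int.from_bytes(block[0:2], "big")   # bytes in use, count included
    entries = []
    offset = 2
    while offset < used:
        name = block[offset:offset + 8]
        if name == END_OF_DIRECTORY:
            return entries, True               # logical end of directory
        ttr = block[offset + 8:offset + 11]
        c_byte = block[offset + 11]
        user_len = (c_byte & 0x1F) * 2         # halfwords converted to bytes
        user_data = block[offset + 12:offset + 12 + user_len]
        entries.append({"name": name, "ttr": ttr,
                        "alias": bool(c_byte & 0x80), "user_data": user_data})
        offset += 12 + user_len
    return entries, False
```

A caller would keep reading and parsing blocks until end_seen comes back true, then continue reading without parsing just to count the remaining blocks.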

The main processing loop is then entered.  A record is read from SYSIN; it must be a control card, with ./ in bytes one and two.  The control card analysis module LDPDSCTL is called to parse the control card and build a directory entry for the member being added or replaced.  Most of the processing in LDPDSCTL builds the SPF statistics in the directory entry.

The directory chain module LDPDSCHN is called to ADD/REPLACE the directory entry.  Input records from SYSIN are then read and loaded into an I/O buffer.

The I/O buffers are managed in the module LDPDSSIO.  Initially I started writing what I thought was a simple buffering solution, but it quickly became overly complicated.  My initial approach was to have a number of fixed buffers, each the size of one block.  The problem was that short blocks (a multiple of the LRECL) belonging to small members consumed all the buffers before a full track's worth of buffers had been filled.  The worst case is a run of members with only one record each.  I started over and began by allocating a single buffer area large enough to hold the number of full-size blocks that fit on two tracks, plus one extra block.

To solve the problem, the buffer allocate routine always returns a buffer that will hold the maximum block size.  The buffer put routine is called when the block is complete (either a full block or a short block).  The put routine then truncates the current buffer to the actual block length if necessary and updates the pointer to the next buffer.  The area is maintained as a circular buffer: when the end of the area is reached, it loops back to the beginning.
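
The allocate/put scheme can be modelled in a few lines of Python. This is a sketch under assumptions, not the LDPDSSIO code: the class and method names are invented, and the model does not track whether a wrapped-over buffer is still owned by an outstanding I/O request, which the real code has to guarantee.

```python
class BlockRing:
    """Circular buffer area that hands out max-size buffers and truncates them."""

    def __init__(self, max_block, blocks_per_track):
        self.max_block = max_block
        # Room for two tracks of full-size blocks plus one extra block.
        self.area = bytearray(max_block * (2 * blocks_per_track + 1))
        self.next_free = 0

    def get_buffer(self):
        """Return the offset of a buffer with room for a full-size block."""
        if len(self.area) - self.next_free < self.max_block:
            self.next_free = 0                 # wrap to the start of the area
        return self.next_free

    def put_buffer(self, used_len):
        """Close the current block, keeping only the bytes actually used."""
        start = self.next_free
        self.next_free = start + used_len      # the next buffer starts here
        return start, used_len
```

Because a short block only consumes the bytes it actually used, a long run of one-record members no longer exhausts the buffer area the way fixed-size buffers did.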

The put buffer routine determines whether the block will fit on the track currently being written.  This cannot be determined until the block has been filled with input, because there is no way of knowing in advance how many records will make up the block.  The TTR for the block is calculated and the QUEBUF routine in LDPDSSIO is called to add the buffer to the CCW chain being constructed.
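
A minimal Python sketch of the "does it fit on this track" decision follows. The two constants are the commonly quoted 3350 figures for records without keys (19254 bytes of track capacity and 185 bytes of overhead per record); they are assumptions here and should be checked against the device reference before being relied on. The class name and the starting position of track 0 are also hypothetical, since in a real PDS the member data begins after the directory.

```python
TRACK_CAPACITY = 19254     # assumed 3350 track capacity figure
RECORD_OVERHEAD = 185      # assumed per-record overhead, no key

class TrackCursor:
    """Assign a (TT, R) position to each block as it is completed."""

    def __init__(self):
        self.tt = 0        # relative track
        self.r = 0         # last record number written on the track
        self.used = 0      # bytes consumed on the track so far

    def place(self, data_len):
        need = RECORD_OVERHEAD + data_len
        if self.used + need > TRACK_CAPACITY:
            self.tt += 1   # block does not fit: start a new track
            self.r = 0
            self.used = 0
        self.r += 1
        self.used += need
        return self.tt, self.r
```

An EOF record has a data length of zero, so in this model it still costs the fixed overhead via place(0).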

The QUEBUF routine maintains two I/O requests, each consisting of an IOB, a dynamically constructed CCW chain, and a dynamically created COUNT area for the Write CKD commands.  When it is called, the TT portion of the TTR of the block being passed as input is compared to the TT of the current request.  If it is the same, the block is appended to the CCW chain.  If it is different, a new track is started.  The CHECKIO routine is called first to verify that the previous I/O for the IOB is complete before starting a new track.

   IOR
+--------+
| NEXT   |-------->(Next IOR)
+--------+    
| ECB    |-------->(ECB)
+--------+
| IOB    |-------->(IOB)
+--------+
| CTBA   |---------.
+--------+         |
| CCBA   |---------|-------------------.
+--------+         |                   |
| TT     |         |                   |
+--------+         |                   |
                   v  (CTBA)           v  (CCBA)
              +--------+             +--------+
              |CCHHRKDL|<-------.    |SRCH-CC |  (Search ID= - Cmd Chain)
              +--------+        |    +--------+
              |CCHHRKDL|<---.   |    |TIC     |
              +--------+     |  |    +--------+
              |CCHHRKDL|<-.  |  .----|WCKD-DC |  (Write CKD - Data Chain)
              +--------+  |  |       +--------+
              |        |  |  |       |    -CC |  (ptr to I/O Buf - Cmd Chain)
              |        |  |  |       +--------+
              |        |  |  .-------|WCKD-CC |  (Write CKD - Cmd Chain) [EOF DL=0]
              |        |  |          +--------+
              |        |  .----------|WCKD-DC |  (Write CKD - Data Chain)
              |        |             +--------+
              |        |             |        |  (ptr to I/O Buf)
              |        |             +--------+
              |        |             |        |
               \\\\\\\                \\\\\\\\
              |        |             |        | 
              +--------+             +--------+
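
The track-switching logic of QUEBUF can be approximated in Python as shown below. This is a sketch only: the class names and the two callbacks are invented, and the point at which the finished track is actually started (here, when a block for a different track arrives) is an assumption about the flow rather than something taken from the source.

```python
class IoRequest:
    def __init__(self):
        self.tt = None        # relative track this request is building
        self.blocks = []      # (record number, data) pairs, one Write CKD each
        self.busy = False     # True while an I/O is outstanding

class BlockQueue:
    def __init__(self, start_io, check_io):
        self.requests = [IoRequest(), IoRequest()]
        self.current = 0
        self.start_io = start_io    # hypothetical stand-in for EXCP
        self.check_io = check_io    # hypothetical stand-in for CHECKIO

    def que_buf(self, tt, r, data):
        request = self.requests[self.current]
        if request.tt is not None and request.tt != tt:
            self.start_io(request)             # write the finished track
            request.busy = True
            self.current ^= 1                  # switch to the other request
            request = self.requests[self.current]
            if request.busy:
                self.check_io(request)         # wait for its previous I/O
                request.busy = False
            request.blocks = []
        request.tt = tt
        request.blocks.append((r, data))
```

With two requests, one track can be in flight on the channel while the next is being assembled in memory.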
      

If the relative track number (TT) exceeds the extent boundary of the dataset, the EOV macro is called to attempt to add additional extents.  In my initial testing I would encounter some strange S200 abends in EOV processing.  They were not consistent and didn't happen every time.  I finally realized I had failed to call the CHECKIO routine to wait for all outstanding I/O to complete before issuing the EOV macro.

The main processing routine in LDPDS checks after every call to PUTBUF (which calls QUEBUF) to see if the beginning TTR of the member needs to be updated in the directory entry.  This is because the TTR cannot be known until the first block of the member is actually written, since we don't know how big it will be.

When a control card is detected in the input, the current member is completed by writing out the contents of the last block followed by an EOF record.  Processing for the new member is then started.

When EOF is reached on SYSIN the current member is closed out by writing out the last buffer followed by an EOF record.  The directory is then written back to the dataset from the in-memory chain.  This causes a problem if there are not enough directory blocks to hold all the member entries.  To solve this problem a check is done each time an entry is added to the in-memory chain: the number of directory blocks needed to contain the entries in the chain is calculated.  If it exceeds the number of available directory blocks, the last entry added is removed from the in-memory chain and load processing is terminated.  The directory is rewritten and all updates to that point are saved.
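
The directory capacity check amounts to a simple packing calculation, sketched here in Python. It is an illustration, not the LDPDSCHN code. It assumes the standard directory layout (256-byte blocks with a two-byte count, so 254 bytes for entries, and entries that never span blocks) and assumes the X'FF' end marker occupies a 12-byte entry of its own; the function names are invented.

```python
BLOCK_SPACE = 254          # 256-byte block minus the 2-byte used count
END_MARKER_LEN = 12        # assumed size of the X'FF...' end-of-directory entry

def directory_blocks_needed(entry_lengths):
    """Count the blocks needed for the entries, in order, plus the end marker."""
    blocks = 1
    free = BLOCK_SPACE
    for length in list(entry_lengths) + [END_MARKER_LEN]:
        if length > free:                  # entries never span blocks
            blocks += 1
            free = BLOCK_SPACE
        free -= length
    return blocks

def entry_fits(entry_lengths, new_length, available_blocks):
    """True if adding one more entry still fits in the existing directory."""
    needed = directory_blocks_needed(list(entry_lengths) + [new_length])
    return needed <= available_blocks
```

An entry carrying 30 bytes of SPF statistics is 42 bytes long, so six such entries fill a block and push the end marker into the next one.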

An additional approach could be to expand the directory by relocating one or more members to the end of the dataset as necessary. This would allow the number of directory blocks to be expanded.  This would be the most elegant solution but was more than I was willing to take on at this time.

The directory blocks are written by calling the LDPDSDIO module.  Directory blocks are written one at a time, unlike the way the sequential blocks are written.  This is because an update write (Write Data or Write Key & Data) must always follow a Search ID Equal. The Write CKD can be chained from another Write CKD which allows us to easily write out a full track with one EXCP when we are formatting the blocks instead of updating.

A single EXCP could be used to update multiple blocks by placing a Search ID Equal before each write update.

Search ID Eq (R0)
TIC *-8
Write Update (R1)
Search ID Eq (R1)
TIC *-8
Write Update (R2)

The problem with this approach is that we would always have to wait for the disk platter to rotate a complete revolution to satisfy each Search ID Equal after the first. This would introduce a lot of latency into the I/O operation and probably would not be any faster (and possibly even slower) than multiple EXCP requests. The solution is to update every other record, which would allow us to update every block on the track in only two revolutions of the disk.

Search ID Eq (R0)
TIC *-8
Write Update (R1)
Search ID Eq (R2)
TIC *-8
Write Update (R3)
Search ID Eq (R4)
TIC *-8
Write Update (R5)
Search ID Eq (R1)
TIC *-8
Write Update (R2)
Search ID Eq (R3)
TIC *-8
Write Update (R4)

This applies to a "real" 3350, not an emulated 3350 like we have with Hercules. It also doesn't necessarily apply to modern RAID storage devices that may emulate a CKD type device. I suspect that with Hercules the order in which you access the blocks really doesn't matter much, but we are pretending it is 1980 and we have a "real" 370 mainframe connected to "real" 3350 DASD units.
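
The revolution counts for the two orderings can be checked with a toy model in Python. It takes the two CCW listings above at face value (each Search ID Eq on record n is followed by a write of record n+1) and measures rotation in whole-record units, starting with R0 about to pass under the head. Gap timing and channel turnaround are ignored, and the function name is invented, so treat it as a counting aid rather than a description of real 3350 behaviour.

```python
def revolutions(search_order, records_on_track):
    """Count disk revolutions needed to run the searches in the given order."""
    position = 0                           # records passed since the start
    for target in search_order:
        # Wait until the count field of the target record comes around.
        position += (target - position) % records_on_track
        # The search consumes the target; the write consumes the next record.
        position += 2
    return -(-position // records_on_track)    # round up to whole revolutions
```

For a track holding R0 through R5, revolutions([0, 1, 2, 3, 4], 6) comes to 5 while revolutions([0, 2, 4, 1, 3], 6) comes to 2, matching the argument above.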

This is a pretty quick overview of the code, but with a little effort you should be able to make your way through the source code. It does get a little messy in a couple of places, but since I am not getting paid to write production code I have chosen to call it quits for now.