12 Programming Style Suggestions

This chapter deals with a variety of stylistic issues that may be of interest to someone who is just starting out learning and using COBOL. Much of this chapter makes recommendations and suggestions for how to write your own programs. The sample programs in the Sample Programs document were coded using almost all of these recommendations.

There’s no particular order of importance to the topics presented here.

12.1 Marking Changes in Programs

Historically, in the early 1960s, programs were first punched onto paper tape. By the mid-1960s paper tape had been almost entirely replaced by punched cards, although programmers still used it for the odd few changes to sources held on magnetic tape or disk, since a portable paper tape punch could be put in your pocket. The problem with punched cards was that there were 2,000 cards per box, and boxes could, and did, get dropped. So columns (cc) 1 through 6 held the card sequence number; if a box was dropped, the cards could be fed into a card sorter to restore their order. This was done after the cards had been tidied so that they all faced the same direction, which the single cut corner on each card made easier.

In the late 1970s cards were also on their way out, to the point where PCs started being used (and no, they were not made by IBM), so these columns could be used for other purposes. The same applied to cc 73-80, which had held the 8-character program name, the maximum size allowed on an IBM system.

For quite a while now (back to the late 1970s), the “sequence number area” of a COBOL statement (columns 1-6) has come to be used as a change indicator area. Programmers would place a code in columns 1-6 of every line they changed in a program. The author works in a COBOL shop where change indicators of the form “xxmmyy” are required on every altered line of a program, where “xx” is the programmer’s initials and “mmyy” is the month and two-digit year in which the change was made. This is frequently accompanied by a comment block at or near the top of a COBOL program providing general documentation of what changes were made and what change indicator was used to mark each change.

The GCic sample program source listing (see GCic in the Sample Programs document) provides an excellent example of such documentation.

This technique of using columns 1-6 as a change indicator will only work if fixed source-record format is in effect.

Some COBOL shops prefer to use the eight-character Program Name Area (columns 73-80) as a change code area.
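
As a minimal sketch of the columns 1-6 convention described above, the following fixed-format program marks one changed line. The change code JD0324, the programmer it stands for, the program name CHGDEMO and all data names are hypothetical and exist only for illustration.

```cobol
      *> Change log (hypothetical entry):
      *>   JD0324  J. Doe, March 2024: discount threshold 100 -> 250
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CHGDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-Discount-CD              PIC X(1) VALUE 'N'.
       01  WS-Order-QTY                PIC 9(4) VALUE 300.
       PROCEDURE DIVISION.
       000-Main SECTION.
      *> Columns 1-6 of the next line carry the change indicator.
JD0324     IF WS-Order-QTY > 250
               MOVE 'Y' TO WS-Discount-CD
           END-IF
           DISPLAY 'Discount code: ' WS-Discount-CD
           STOP RUN
           .
```

Because the compiler ignores columns 1-6 in fixed format, the tag has no effect on the generated code. A text search for the tag then finds every line touched by that change.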

Marking changes becomes more of a challenge when free-format source code is in effect. Creating a top-of-program comment block to generically describe changes that have been made isn’t difficult, even in free-form. What is difficult, however, is coming up with a scheme for per-statement mark up of changes that doesn’t introduce a ridiculously excessive number of source lines to the program. I’m not sure there is a good answer to this problem (if a reader has one, please let me know). Generally, I’ve noticed that shops using free-format conventions for their COBOL source tend to stick with just the top-of-program comment block combined with minimal comment blocks sprinkled throughout the program noting areas that underwent major changes.

12.2 Data Item Coding and Naming Conventions

When programs get very large, it becomes more and more challenging to keep track of the data items that will be used in the program. Here, in no particular order of importance, are a variety of conventions that can simplify that problem.

Remember that the points described here are intended to make things easier for you, the programmer. No COBOL compiler cares one way or another whether any of these suggestions are followed.

  1. Avoid the use of level 77 data items in new programs. Once (1968 and before) there were valid reasons for creating level-77 data items, but since the 1974 ANSI standard of COBOL there really hasn’t been any reason why an elementary level-01 data item couldn’t have been used instead of a level-77 item.

  2. Allocate level-01 data items in alphabetical sequence in the program source wherever practical. This will make it vastly easier to locate the definitions of 01-level items in the program source without having to resort to a compilation cross-reference listing and/or text editor “find” command to locate them.

  3. Consider prefixing data items with an indication of where in the program structure they were created. For example:

    • Start everything defined in the file section with “F-”

    • Start everything defined in working-storage with “WS-”

    • Start everything defined in local-storage with “LS-”

    • Start everything defined in the linkage section with “L-”

    • Start everything defined in the screen section with “S-”

    • Start everything defined in the report section with “R-”

    A convention such as this makes it simple, when you’re reviewing code in the procedure division, to know which section of the data division to look in when locating the detailed description of a data item. Once you’re in the right section, coding convention #2 will assist in locating the data item definition.

  4. Consider including a trailing descriptor of the nature of all data items in their names. The following chart presents a variety of such descriptors the author has encountered and used through the years.

    • -ADDR

      The data item contains all or a part of an Address (City-ADDR, State-ADDR, Street-ADDR, …)

    • -BOOL

      A level-88 data item (which only has the value TRUE or FALSE)

    • -CD

      A code whose value denotes information content above and beyond that of the mere value itself. Some examples could be Error-CD, Status-CD, Billing-CD

    • -CHR

      A data item containing a single character of data.

    • -CONST

      A constant, specified as either a level-78 data item or a level-01 item with the CONSTANT attribute

    • -DT

      The data item contains a complete or partial date (Birth-DT, Birth-Month-DT, Birth-Year-DT, …)

    • -DTTM

      A data item containing both a date and a time

    • -FILE

      A file name. Note that these items would probably also have a “F-” prefix.

    • -IDX

      A data item used as a table index (see section 12.3)

    • -NM

      All or a portion of a person’s name. These could be extended to include business names, product names, etc.

    • -PTR

      A data item whose USAGE is POINTER

    • -NUM

      A generic numeric data item that doesn’t fit into any of the other categories

    • -QTY

      A count of something

    • -REC

      An 01-level item defined in the FILE SECTION (constituting the layout of a record within a file). Note that these items would probably also have a “F-” prefix.

    • -SCR

      The data item contains a complete or partial screen description (appropriate for SCREEN SECTION 01-level data items).

    • -SUB

      A numeric item used as a table subscript (see section 12.3)

    • -TEL

      All or part of a telephone number

    • -TM

      The data item contains a complete or partial time value

    • -TXT

      The data item contains generic alphanumeric text that doesn’t fit into any of the other categories.

    The above is by no means an exhaustive list, but good programmers will use as few of these descriptors as possible as having too many defeats any benefits of such classification/documentation efforts.

  5. Consider inserting an acronym of the parent 01-level item’s name into the name of every data item that is directly or indirectly subordinate to that item, typically right after any section-level prefix, if you’re using them. For example, consider the names used in the following structure:

    01  WS-File-Status-Message-TXT.
        05 FILLER                     PIC X(13) VALUE 'Status Code: '.
        05 WS-FSM-Status-CD           PIC 9(2).
        05 FILLER                     PIC X(11) VALUE ', Meaning: '.
        05 WS-FSM-Msg-TXT             PIC X(25).
    ....
    01  WS-OI-SUB                     PIC 99  COMP.
    01  WS-OI-IDX                     PIC 99  COMP.
    

    The “-FSM-” acronyms make it easier to locate the description of the 01-item the status code and message text items belong to.
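
To see several of these conventions working together, here is a small sketch that combines the “WS-” section prefix, alphabetical ordering of 01-level items, the parent acronym (“-OL-”) and a few trailing descriptors. The program name NAMEDEMO and every data name in it are hypothetical, chosen only to illustrate the conventions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NAMEDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> 01-level items are kept in alphabetical order (convention 2).
       01  WS-Order-Line-TXT.
           05 FILLER                   PIC X(10) VALUE 'Customer: '.
           05 WS-OL-Customer-NM        PIC X(12) VALUE 'Pat Example'.
           05 FILLER                   PIC X(7)  VALUE ', Qty: '.
           05 WS-OL-Order-QTY          PIC ZZ9.
       01  WS-Status-CD                PIC X(1)  VALUE 'A'.
           88 WS-Active-BOOL           VALUE 'A'.
       PROCEDURE DIVISION.
       000-Main SECTION.
      *> The -BOOL suffix signals a level-88 condition name.
           IF WS-Active-BOOL
               MOVE 12 TO WS-OL-Order-QTY
           ELSE
               MOVE 0 TO WS-OL-Order-QTY
           END-IF
           DISPLAY WS-Order-Line-TXT
           STOP RUN
           .
```

Reading the procedure division alone, you can tell that WS-OL-Order-QTY is a count, lives in working-storage, and belongs to the 01-level item whose name abbreviates to “OL”.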

12.3 Table Subscripting versus Table Indexing

The elements of a table may be referenced using either a subscript or an index. Syntactically, this is coded using parentheses, as per the following three examples, all of which store the letter 'A' into the 17th occurrence of a data item named WS-Output-Image-TXT:

  1. MOVE 'A' TO WS-Output-Image-TXT (17)

  2. MOVE 17 TO WS-OI-SUB
     MOVE 'A' TO WS-Output-Image-TXT (WS-OI-SUB)

  3. SET WS-OI-IDX TO 17
     MOVE 'A' TO WS-Output-Image-TXT (WS-OI-IDX)

The 1st and 2nd examples are referred to as Subscripting while the 3rd is known as Indexing. The distinction is fairly simple.

Indexing is the process of referencing an element of a table utilizing a data item with an explicitly or implicitly defined USAGE ( 6.9.61 USAGE) of INDEX to select the desired occurrence, while …

Subscripting is the process of referencing an element of a table utilizing either a numeric constant or an unedited numeric data item to select the desired occurrence.

Various implementations of COBOL generate object code that is quite different in each of these three situations, and GnuCOBOL is no exception.

In general, table references such as example #1 (constant subscript) generate the smallest, simplest and fastest object code while table references such as example #2 (numeric data item subscript) generate the largest, most-complicated and slowest object code.

Table references such as example #3 (table indexing) generate object code that falls in the middle of the other two but is far closer in efficiency to example #1 than #2.

Some COBOL statements (SEARCH ( 7.8.39 SEARCH), SEARCH ALL ( 7.8.40 SEARCH ALL) and the table-based SORT ( 7.8.42.2 Table SORT)) require you to index the affected table and to utilize that index with those statements. With any other references to tables, the choice is left to the programmer as to which approach should be used. In general, follow these rules:

  1. Use constant subscripts (example #1) wherever possible/practical.

  2. If references to table elements are going to be performed many, many times (tens or hundreds of thousands of times or more) during program execution, you will probably see a noticeable reduction in program execution time if you use indexing versus subscripting.
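
As an illustration of a statement that requires an index, here is a minimal sketch of a serial SEARCH. The table contents, the program name SRCHDEMO and the data names are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SRCHDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-State-Table-TXT.
           05 WS-ST-State-CD           OCCURS 4 TIMES
                                       INDEXED BY WS-ST-IDX
                                       PIC X(2).
       PROCEDURE DIVISION.
       000-Main SECTION.
      *> Load four two-character codes into the table.
           MOVE 'CANYTXWA' TO WS-State-Table-TXT
      *> A serial SEARCH starts at the current index setting,
      *> so the index is set to the first occurrence beforehand.
           SET WS-ST-IDX TO 1
           SEARCH WS-ST-State-CD
               AT END
                   DISPLAY 'TX not found'
               WHEN WS-ST-State-CD (WS-ST-IDX) = 'TX'
                   DISPLAY 'TX found'
           END-SEARCH
           STOP RUN
           .
```

The SEARCH statement advances WS-ST-IDX itself, which is why the table must be declared with INDEXED BY rather than being walked with a numeric subscript.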

It’s impossible to perform any arithmetic operation against an index data item directly (other than simply incrementing or decrementing it via the SET UP/DOWN statement ( 7.8.41.5 SET UP/DOWN)). Situations where any non-trivial computations are required to calculate the effective occurrence number for a table reference will require you to use a conventional unedited numeric data item as the receiving field for the calculation. That calculated value would then need to be saved into the index data item via a SET Index statement.

If you only need to use the computed occurrence number once, you might as well just use the computed occurrence number data item as a subscript. If, however, you will need to use a computed “subscript” many more times than once, the one-time cost of converting that occurrence value to an index (via SET Index) will be repaid by the faster indexed references that follow.
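
The compute-then-SET approach just described could be sketched as follows. The 4-by-5 grid layout, the program name SETIDX and all data names are assumptions made for the sake of the example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SETIDX.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-Col-NUM                  PIC 9(2) VALUE 3.
       01  WS-Grid-TXT.
           05 WS-G-Cell-CHR            OCCURS 20 TIMES
                                       INDEXED BY WS-G-IDX
                                       PIC X(1).
       01  WS-Occurrence-SUB           BINARY-LONG.
       01  WS-Row-NUM                  PIC 9(2) VALUE 4.
       PROCEDURE DIVISION.
       000-Main SECTION.
           MOVE ALL '.' TO WS-Grid-TXT
      *> Non-trivial arithmetic needs an ordinary numeric item ...
           COMPUTE WS-Occurrence-SUB =
               (WS-Row-NUM - 1) * 5 + WS-Col-NUM
      *> ... whose value is then copied into the index once.
           SET WS-G-IDX TO WS-Occurrence-SUB
           MOVE 'X' TO WS-G-Cell-CHR (WS-G-IDX)
      *> Simple stepping can be done on the index directly.
           SET WS-G-IDX UP BY 1
           MOVE 'Y' TO WS-G-Cell-CHR (WS-G-IDX)
           DISPLAY WS-Grid-TXT
           STOP RUN
           .
```

Here the index is used only twice, so the benefit would be negligible; the pattern pays off when the same computed position is referenced many times.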

Whew!

If references to table elements are not going to be performed many, many times it probably won’t make much difference whether you use indexing or subscripting.

If you are comfortable with the C programming language, you might find the following simple GnuCOBOL program useful in exploring the differences between subscripting and indexing:

IDENTIFICATION DIVISION.
PROGRAM-ID.  SUBVSINDEX.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  WS-TABLE-SUB                BINARY-LONG.
01  WS-TABLE.
    05 WS-TABLE-ENTRY           OCCURS 20 TIMES
                                INDEXED BY WS-TABLE-IDX
                                PIC X(1).
PROCEDURE DIVISION.
000-Main SECTION.
E1. MOVE 'A' TO WS-TABLE-ENTRY (17)
    .
E2. MOVE 17 TO WS-TABLE-SUB
    MOVE 'A' TO WS-TABLE-ENTRY (WS-TABLE-SUB)
    .
E3. SET WS-TABLE-IDX TO 17
    MOVE 'A' TO WS-TABLE-ENTRY (WS-TABLE-IDX)
    .

Compile this program as follows (this assumes you are executing the cobc command from the directory in which the above program source code, subvsindex.cbl, exists):

cobc -C -save-temps subvsindex.cbl

After this command is executed, the file subvsindex.c will contain the procedure division C code and subvsindex.c.1.h will contain the working-storage C code. Compare the generated C code for each of the three MOVE statements.

12.4 Copybook Naming Conventions and Usage

Since the intent of a copybook is to introduce COBOL code into a particular spot in a program via the COPY statement ( 3.2 COPY), it is always a good idea to prefix copybook names with a character sequence that identifies where in a program its contents are intended to be COPYed.

For example:

  • IDxxxxxxxx

    Copybooks containing code intended for the identification division. These will be rare as you almost never encounter copied code in the identification division.

  • EDxxxxxxxx

    Copybooks containing code intended for use in the environment division. These copybooks are generally used for predefined SPECIAL-NAMES ( 5.1.3 SPECIAL-NAMES) or FILE-CONTROL ( 5.2 INPUT-OUTPUT SECTION) syntax.

  • DDxxxxxxxx

    Copybooks that contain data definitions.

  • PDxxxxxxxx

    Copybooks that contain executable instructions.

12.5 PROCEDURE DIVISION Sections Versus Paragraphs

The issue of whether to use section and/or paragraph names (collectively referred to as procedure names) within the procedure division is one of near religious significance with many COBOL programmers.

COBOL programming standards used by many organizations that use the language generally call for procedure names to:

  1. Contain a leading numeric component (for example: 2000-Update-Customer), AND…

  2. Be defined in the procedure division in non-decreasing sequence of that numeric component.

When you are looking at or editing any large COBOL program that has been created with programming standards that include these two rules, it is always a simple thing to know whether a reference to a procedure is being made to code that exists before or after your current location in the program, simply by comparing the numeric component of the current procedure’s name with the one in question.

Technically, GnuCOBOL does not require ANY procedure names be defined unless:

  1. You are using the ALTER statement ( 7.8.4 ALTER) (the use of which should be avoided at all costs)

  2. You are using a procedural PERFORM statement ( 7.8.31.1 Procedural PERFORM)

  3. You are using a GO TO statement ( 7.8.22 GO TO)

  4. You are using a MERGE statement ( 7.8.27 MERGE) with an OUTPUT PROCEDURE

  5. You are using a SORT statement ( 7.8.42 SORT) with either (or both) an INPUT PROCEDURE or OUTPUT PROCEDURE

  6. You are using DECLARATIVES ( 7.5 DECLARATIVES)

Since it is difficult to write any non-trivial COBOL program that uses none of the above, let’s assume you will be including at least one section or paragraph in your GnuCOBOL programs.

I like to use procedure division paragraphs and sections as follows:

  1. The very first procedure defined in the procedure division of my programs, assuming no DECLARATIVES ( 7.5 DECLARATIVES) are defined, will be a section named 000-Main. The declaration of this procedure will immediately follow the procedure division header (or END DECLARATIVES if DECLARATIVES are used).

  2. Any procedures referenced by MERGE, PERFORM, or SORT statements will be defined as sections.

  3. Any procedures referenced by GO TO statements will be defined as paragraphs, and those paragraphs will be defined in the same section as the GO TO statements that reference them. In other words, GO TO statements may not be used to transfer control to a point in a different section. This is not a GnuCOBOL rule — this is my own personal programming practice intended to improve the readability and maintainability of my programs.

  4. I always include a numeric prefix to all procedure names I define, for the reasons stated earlier.

  5. I do not use THRU on any MERGE, PERFORM or SORT statement unless the programming standards of the shop in which I am working require it. My reasoning for this is that it is too easy to accidentally introduce a new procedure into the scope of a THRU.
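
A skeleton that follows these five practices might look like the sketch below. The program name PROCDEMO, the procedure names and the data name are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PROCDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-Pass-QTY                 PIC 9(1) VALUE 0.
       PROCEDURE DIVISION.
      *> Practice 1: the first procedure is a section named 000-Main.
       000-Main SECTION.
      *> Practices 2 and 5: PERFORM targets are sections, without THRU.
           PERFORM 100-Initialize
           PERFORM 200-Process
           STOP RUN
           .
       100-Initialize SECTION.
           MOVE 0 TO WS-Pass-QTY
           .
       200-Process SECTION.
      *> Practice 3: a GO TO target is a paragraph in the same section.
       210-Next-Pass.
           ADD 1 TO WS-Pass-QTY
           DISPLAY 'Pass ' WS-Pass-QTY
           IF WS-Pass-QTY < 3
               GO TO 210-Next-Pass
           END-IF
           .
```

The numeric prefixes (practice 4) increase down the source, so a reference to 200-Process from 000-Main is immediately known to point further down the listing.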

12.6 COMPUTE Versus ADD-SUBTRACT-MULTIPLY-DIVIDE

Over the years, there has been much debate over the efficiency and arithmetic accuracy of using the COMPUTE statement ( 7.8.9 COMPUTE) rather than the four basic arithmetic operation statements.

Here are the facts — draw your own conclusions as to which approach is more appropriate under which circumstances.

  1. The COMPUTE statement supports exponentiation (via the '**' operator); there is no equivalent basic arithmetic statement. Although you could simulate integral exponentiation (raising a value to the third power, for example) using MULTIPLY statements, and you may use the SQRT intrinsic function ( 8.1.87 SQRT) to find a square root, there’s just no (easy) way to find the cube root of a value without using the COMPUTE statement.

  2. For non-trivial computations, COMPUTE statements “read” better. Take this, for example:

    COMPUTE R = (A + B * C) / D
    

    As compared to:

    MULTIPLY B BY C GIVING TEMP
    ADD A TO TEMP
    DIVIDE TEMP BY D GIVING R
    

    For non-trivial computations, COMPUTE statements may execute faster than the equivalent chain of basic arithmetic statements. For example, the COMPUTE statement shown above executes about 25% faster on my computer using GnuCOBOL than does the MULTIPLY-ADD-DIVIDE sequence.

  3. For trivial computations, on the other hand, I prefer the inherent readability of a statement such as this:

    ADD 1 TO WS-Input-Trans-QTY
    

    to this:

    COMPUTE WS-Input-Trans-QTY = WS-Input-Trans-QTY + 1
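
The cube-root case mentioned in the first point can be sketched with a fractional exponent. This is a minimal illustration only: the program name CUBEROOT and the data names are hypothetical, and the precision of the result depends on how the runtime evaluates the fractional power.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CUBEROOT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-Root-NUM                 PIC 9(3)V9(4).
       01  WS-Value-NUM                PIC 9(5) VALUE 27.
       PROCEDURE DIVISION.
       000-Main SECTION.
      *> A fractional exponent has no MULTIPLY/DIVIDE equivalent.
           COMPUTE WS-Root-NUM ROUNDED = WS-Value-NUM ** (1 / 3)
           DISPLAY 'Cube root of ' WS-Value-NUM
                   ' is about ' WS-Root-NUM
           STOP RUN
           .
```

The ROUNDED phrase is used because a fractional power is generally not exact, so the last retained decimal place is rounded rather than truncated.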