11 Sub-Programming

11.1 Subprogram Types

Simply stated, a Subprogram is a program that is invoked by another program; the subprogram performs whatever its designed operations are and — when complete — typically returns control back to the program that invoked it. There are two different types of subprograms supported by GnuCOBOL, subroutines and user-defined functions. The distinction between these two subprogram types lies in the manner in which they are executed.

When program A invokes subprogram B as a Subroutine, it does so using a special statement dedicated to that function, the CALL statement ( 7.8.5 CALL), just as if B were one of the built-in system subroutines.

When program A invokes program B as a User-Defined Function, it does so in a manner identical to how B would have been invoked had it been one of the many built-in intrinsic functions.

In either instance, program A is referred to as the Calling Program while program B is known as the Called Program. GnuCOBOL programs may be a calling program, a called program or both.

A program written in the C programming language may serve as either the calling or called program too. A called program may act as a calling program to another called program. When a calling program does not serve as a called program to any program, that calling program is known as a Main Program.

Both subroutines and user-defined functions may return a value. The value they return must be an integer in the range -2147483648 to +2147483647. This value will be available in the RETURN-CODE special register ( 7.7 Special Registers) and also as the value of the data item specified on the RETURNING ( 7.8.5 CALL) clause of a subroutine’s CALL.
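As a minimal sketch of the return-value mechanism, the following pair of contained programs has a subroutine set RETURN-CODE and a caller pick the value up through the RETURNING clause of its CALL. The names RetDemo, RetSub and Status-Val are hypothetical, and the source is assumed to be compiled in free format (for example with cobc -x -free).

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. RetDemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
*> Hypothetical receiving item; a signed 32-bit binary integer.
01  Status-Val USAGE BINARY-LONG.
PROCEDURE DIVISION.
000-Main.
    CALL "RetSub" RETURNING Status-Val END-CALL
    DISPLAY "RetSub returned " Status-Val END-DISPLAY
    MOVE 0 TO RETURN-CODE
    STOP RUN
    .
END PROGRAM RetDemo.
IDENTIFICATION DIVISION.
PROGRAM-ID. RetSub.
PROCEDURE DIVISION.
000-Main.
*>  No RETURNING on the procedure division header, so the
*>  value handed back is whatever RETURN-CODE holds on exit.
    MOVE 4 TO RETURN-CODE
    GOBACK
    .
END PROGRAM RetSub.
```

The caller resets RETURN-CODE before STOP RUN only so that the value set by the subroutine is not also reported to the operating system as the exit status.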

11.2 Independent vs Contained vs Nested Subprograms

Subprograms (either subroutines or user-defined functions) can be implemented in three different ways.

  • Independent Subprograms

    These are subprograms that are coded as the only COBOL program in their Compilation Unit ( Compilation Unit).

  • Contained Subprograms

    These are subprograms which occur in the same Compilation Unit as a main program and/or other subprograms. Each contained subprogram is separated from the next via an END PROGRAM marker line. As an example…

    IDENTIFICATION DIVISION.
    PROGRAM-ID. SUB1.
    ...
    END PROGRAM SUB1.
    IDENTIFICATION DIVISION.
    PROGRAM-ID. SUB2.
    ...
    END PROGRAM SUB2.
    

    Program source code may be concatenated as shown here, provided an END PROGRAM marker naming the PROGRAM-ID of the just-completed program is used to separate one program from another.

    There’s no reason that user-defined functions cannot be included too — they’ll just have FUNCTION-IDs and will be ended by END FUNCTION markers.

    The last program in any GnuCOBOL source file need not have an END marker.

    When multiple programs occur in a source file, it is assumed that the programs are related to one another in that they will be CALLed or executed as functions from the others.

  • Nested Subprograms

    It is also possible to create source files where GnuCOBOL programs are nested inside each other. Take for example these four GnuCOBOL programs:

    IDENTIFICATION DIVISION.
    PROGRAM-ID. PROG1.
    ...
    IDENTIFICATION DIVISION.
    PROGRAM-ID. PROG2.
    ...
    IDENTIFICATION DIVISION.
    PROGRAM-ID. PROG3.
    ...
    END PROGRAM PROG3.
    END PROGRAM PROG2.
    IDENTIFICATION DIVISION.
    PROGRAM-ID. PROG4.
    ...
    END PROGRAM PROG4.
    END PROGRAM PROG1.
    

    Here we see that PROG2 is nested inside of PROG1 because there is no END PROGRAM marker separating them. This means that data items or files defined within PROG1 can be used within PROG2 simply by attaching the GLOBAL ( 6.9.23 GLOBAL) attribute to them back in PROG1 when they are defined.

    Similarly, since there is no END PROGRAM marker separating PROG3 from PROG2, it is possible for PROG3 to access GLOBAL files and data items defined within PROG2. Since PROG2 is nested within PROG1, any GLOBAL resources defined within PROG1 will be available to PROG3 as well.

    The two END PROGRAM markers for PROG3 and PROG2 (note their sequence) mean that PROG4 is nested within PROG1 only. It will not have access to any GLOBAL resources defined within either PROG2 or PROG3.

    The END PROGRAM PROG1. marker, since it is the last line in the source file, is entirely optional.

11.3 Alternate Entry Points

Any subroutine may have multiple entry-points defined within it. This means the subroutine could be called either via a CALL '<program-id>' or a CALL '<entry-point>' statement. There may be any number of alternate entry-points defined within a subroutine.

Alternate entry-points provide multiple ways in which the same subroutine may be called; presumably, each entry-point will provide some different functionality to the calling program. For example, if you wished to write a subroutine that manipulates “student” records in a database, you might have the primary entry-point name retrieve a student record from the database, while the alternate entry points Add-Student, Update-Student and Delete-Student could provide the alternate functions implied by their entry-point names.

The alternative to using multiple entry points in your subroutine, by the way, would be to include an additional argument to the primary (and only) entry point of the subroutine; this new argument might be named STUDENT-FUNCTION and might have values of ‘FETCH’, ‘ADD’, ‘UPDATE’ or ‘DELETE’.

The primary entry-point for any subroutine is always the first executable statement following any DECLARATIVES ( 7.5 DECLARATIVES) in the procedure division. The name of that entry-point (the name that will be called) is the subroutine’s PROGRAM-ID ( 4 IDENTIFICATION DIVISION).

An alternate entry point is added to a subroutine using the ENTRY statement ( 7.8.14 ENTRY).

When an alternate entry-point is called, execution within the subroutine will begin at the first executable statement following the ENTRY statement.
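The student example above might be sketched as follows. This is only an illustration under stated assumptions: the names EntryDemo, Student, AddStudent, Student-Id and Id-Arg are hypothetical, both programs are assumed to live in one free-format source file compiled with cobc -x -free, and the alternate entry-point name is written without a hyphen purely to keep the sketch simple.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. EntryDemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Student-Id PIC 9(6) VALUE 123456.
PROCEDURE DIVISION.
000-Main.
*>  Primary entry point: the name is the subroutine's PROGRAM-ID.
    CALL "Student" USING Student-Id END-CALL
*>  Alternate entry point: the name comes from an ENTRY statement.
    CALL "AddStudent" USING Student-Id END-CALL
    STOP RUN
    .
END PROGRAM EntryDemo.
IDENTIFICATION DIVISION.
PROGRAM-ID. Student.
DATA DIVISION.
LINKAGE SECTION.
01  Id-Arg PIC 9(6).
PROCEDURE DIVISION USING Id-Arg.
000-Fetch.
    DISPLAY "Fetch student " Id-Arg END-DISPLAY
    GOBACK
    .
100-Add.
*>  Execution starts at the statement after ENTRY when the
*>  subroutine is invoked as "AddStudent".
    ENTRY "AddStudent" USING Id-Arg
    DISPLAY "Add student " Id-Arg END-DISPLAY
    GOBACK
    .
END PROGRAM Student.
```

Each entry point ends with its own GOBACK so that a call to the primary entry point does not fall through into the code belonging to the alternate one.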

11.4 Dynamic vs Static Subprograms

Any subprogram may be either statically or dynamically loaded into memory.

A Static Subprogram is one which was in the same Compilation Unit ( Compilation Unit) as the other program(s) which call it, therefore meaning that its executable object code is part of the same executable file as its calling program. The static subprogram was therefore loaded into memory as part of and at the same time as the calling program.

A Dynamic Subprogram is one whose executable object code exists as an executable file separate from that containing the calling program; these two programs were therefore each compiled in their own separate Compilation Group ( Compilation Group). Dynamic subprograms are located and loaded into memory the first time they are executed. Dynamic subprograms may be unloaded from memory via the CANCEL statement ( 7.8.6 CANCEL), if desired.

GnuCOBOL subprograms may be created as either static or dynamic subprograms, as desired by the programmer.

To demonstrate, assume that a GnuCOBOL Main Program (whose code resides in the file M.cbl) will be calling three subprograms, named A, B and C (these are the PROGRAM-IDs of the three subprograms, and their source code may be found in the files A.cbl, B.cbl and C.cbl, respectively).

Here is how these four programs would be compiled if the three subprograms are to be static:

cobc -x M.cbl A.cbl B.cbl C.cbl

This command informs the compiler (cobc) that four programs are to be compiled (the first named on the command must always be the main program), and a single executable file is to be created (due to the -x switch).

Here is how the main program and the three subprograms could be compiled if the three subprograms are to be dynamic:

cobc -x M.cbl
cobc -m A.cbl B.cbl C.cbl

These commands will create an executable file for the main program (-x switch) and three separate dynamically-loadable libraries (see -m switch), one for each of the three subprograms. Had we wished, we could have created a single dynamically-loadable library containing all three subprograms by adding the -b switch to their compilation:

cobc -m -b A.cbl B.cbl C.cbl

Dynamically-loadable libraries are also known by the term dynamically-loadable modules. The two terms are synonymous.

Here are the rules about GnuCOBOL dynamically-loadable modules:

  1. There may be multiple GnuCOBOL subprograms contained within a single dynamically-loadable library if the -b switch is used in addition to -m. If not, each subprogram will be compiled to a separate dynamically-loadable library.

  2. Dynamically-loadable modules will be named <xxxxxxxx>.dll on a Windows system, <xxxxxxxx>.so on a Unix system or <xxxxxxxx>.dylib on an OSX system, where <xxxxxxxx> exactly matches, including the usage of upper- and lower-case letters, the primary entry-point name (PROGRAM-ID or FUNCTION-ID) or an alternate entry point name defined via the ENTRY statement ( 7.8.14 ENTRY) of any one of the GnuCOBOL programs included in that module.

  3. The first time any of the GnuCOBOL subprograms in a dynamically-loadable module are invoked, the entry-point referenced must be the one for which the .dll, .so or .dylib file is named.

  4. When a dynamically-loadable module needs to be loaded (because it is not already in memory from a previous subprogram execution), the dynamically-loadable library will be sought in the same directory from which the main program was loaded. If it cannot be found there, each directory named in the COB_LIBRARY_PATH run-time environment variable ( 10.2.3 Run Time Environment Variables) will be searched. If it was not located in any of those directories, the library specified by the COB_PRE_LOAD run-time environment variable will be searched. Finally, if it still cannot be located, execution will be terminated with an error message (libcob: Cannot find module ‘xxxxxxxx’).

  5. Once the dynamically-loadable module has been successfully loaded, any of the entry-points contained within it are now available for reference.

  6. Dynamically-loadable modules may be removed from memory via the CANCEL statement ( 7.8.6 CANCEL).

  7. Once a dynamically-loadable module is actually loaded into memory, even if it is subsequently unloaded (via the CANCEL statement), its list of entry-points remains available to the GnuCOBOL run-time library and subsequent re-executions of any of those entry points will be able to bypass the search (rule #4) as well as the first-execution rule (rule #3).

Consult the documentation on run-time environment variables ( 10.2.3 Run Time Environment Variables) for additional options when using dynamically-loadable modules.
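From the calling program's side, rules 4 and 6 might look like the sketch below. It assumes free-format source, reuses the program names M and A from the compilation example above, and adds an ON EXCEPTION phrase (not discussed in this section) so that a module that cannot be located is reported by the program instead of terminating it.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. M.
PROCEDURE DIVISION.
000-Main.
*>  The first CALL triggers the module search described in rule 4.
    CALL "A"
        ON EXCEPTION
            DISPLAY "Module A could not be located" END-DISPLAY
        NOT ON EXCEPTION
*>          Rule 6: remove the module from memory once done.
            CANCEL "A"
    END-CALL
    STOP RUN
    .
END PROGRAM M.
```

Without the ON EXCEPTION phrase, a failed search would end the run with the error message quoted in rule 4.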

11.5 Subprogram Execution Flow

When a subprogram is invoked, the flow of execution will differ slightly depending on whether the subprogram is a subroutine or a user-defined function.

11.5.1 Subroutine Execution Flow

When a subroutine is CALLed:

  1. The calling program issues a statement of the form CALL '<entry-point>' USING ... to transfer control to the subroutine.

  2. The executable for the called program will be located and loaded into memory:

    1. If it is a static subroutine, it will already be part of the executable program issuing the CALL ( 7.8.5 CALL).

    2. If it is a dynamic subroutine, the GnuCOBOL run-time system will check to see if a dynamically-loadable module containing the subprogram’s entry point was already located. If it was, no further “location” activity is needed. If not, the dynamically-loadable module will be located ( Locating Dynamically-Loadable Modules).

    3. Once the module has been located (if location was needed), it will be loaded into memory (if not already loaded).

  3. Execution of the calling program is suspended and control will transfer to the called program, as follows:

    1. If the PROGRAM-ID ( 4 IDENTIFICATION DIVISION) clause of the subprogram included the INITIAL clause, the program will be reinitialized back to its compile-time state. This will happen regardless of the INITIAL clause the first time the subprogram is executed.

    2. Local-storage, if any, will be allocated and initialized.

    3. Execution will begin at the first executable statement following the subprogram’s entry-point. The entry point will be either the first executable statement following any DECLARATIVES ( 7.5 DECLARATIVES) that might be present (if the subprogram was invoked using its primary entry-point name) or the first executable statement following the ENTRY statement ( 7.8.14 ENTRY) naming the entry-point specified on the CALL if the subprogram was invoked using an alternate entry point.

  4. The flow of execution will then progress through the coding of the subprogram as it would with any other program.

  5. If the subprogram issues a STOP statement ( 7.8.44 STOP) with the RUN option, program execution ceases and control returns to the operating system or whatever execution shell invoked the main program.

  6. If the subprogram wishes to return control back to the calling program, it will do so using either the GOBACK statement ( 7.8.21 GOBACK) or the EXIT PROGRAM statement ( 7.8.18 EXIT). At this time:

    1. If the subprogram’s procedure division header or ENTRY statement included a RETURNING clause, the value of the data item named on that clause is moved to the RETURN-CODE special register ( 7.7 Special Registers); this behaviour can be altered utilizing the CALL-CONVENTION ( 5.1.3 SPECIAL-NAMES) feature to leave RETURN-CODE unchanged.

    2. Local-storage, if any, is de-allocated.

    3. If the calling program included a RETURNING clause on the CALL statement that invoked the subprogram, the value of the RETURNING data item in the subroutine is moved to that data item. If there was no RETURNING specified in the subroutine, the value of the RETURN-CODE special register is moved to that data item.

    4. Execution will resume back in the calling program with the first executable statement following the CALL that invoked the subprogram.
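The difference between working-storage and local-storage across repeated CALLs can be seen in a small sketch. The names FlowDemo, Counter, WS-Count and LS-Count are hypothetical, and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. FlowDemo.
PROCEDURE DIVISION.
000-Main.
    CALL "Counter" END-CALL
    CALL "Counter" END-CALL
    CALL "Counter" END-CALL
    STOP RUN
    .
END PROGRAM FlowDemo.
IDENTIFICATION DIVISION.
PROGRAM-ID. Counter.
DATA DIVISION.
WORKING-STORAGE SECTION.
*> Initialized once; keeps its value between CALLs.
01  WS-Count PIC 9(3) VALUE 0.
LOCAL-STORAGE SECTION.
*> Allocated and initialized afresh on every CALL.
01  LS-Count PIC 9(3) VALUE 0.
PROCEDURE DIVISION.
000-Main.
    ADD 1 TO WS-Count
    ADD 1 TO LS-Count
    DISPLAY "WS-Count=" WS-Count " LS-Count=" LS-Count END-DISPLAY
    GOBACK
    .
END PROGRAM Counter.
```

WS-Count should climb from one call to the next while LS-Count starts over each time. Adding the INITIAL clause to the PROGRAM-ID of Counter would, per the steps above, return WS-Count to its compile-time state on every call as well.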

11.5.2 User-Defined Function Execution Flow

When a user-defined function is executed:

  1. The object code for the called program (the user-defined function) will be located, as follows:

    1. If it is a static user-defined function, it will already be part of the executable file containing the calling program.

    2. If it is a dynamic user-defined function, the GnuCOBOL run-time system will check to see if a dynamically-loadable module containing the function’s entry point was already located. If it was, no further “location” activity is needed. If not, the dynamically-loadable module will be located ( Locating Dynamically-Loadable Modules).

    3. Once the module has been located (if location was needed), it will be loaded into memory (if not already loaded).

  2. Execution of the calling program is suspended and control will transfer to the called program, as follows:

    1. Local-storage, if any, will be allocated and initialized.

    2. Execution will begin with the first executable statement in the procedure division following any DECLARATIVES ( 7.5 DECLARATIVES) that might be present.

  3. The flow of execution will then progress through the coding of the function as it would with any other program.

  4. If the function issues a STOP statement ( 7.8.44 STOP) with the RUN option, program execution ceases and control returns to the operating system or whatever execution shell invoked the main program.

  5. If the function wishes to return control back to the calling program, it will do so using either the GOBACK statement ( 7.8.21 GOBACK) or the EXIT FUNCTION statement ( 7.8.18 EXIT). At this time:

    1. The value of the data item found on the user-defined function’s PROCEDURE DIVISION RETURNING ( 7.3 PROCEDURE DIVISION RETURNING) clause is moved to the RETURN-CODE special register ( 7.7 Special Registers).

    2. Local-storage, if any, is de-allocated.

    3. Execution will resume back in the calling program at the point where the returned value of the function is needed. At that point, the value in the RETURN-CODE special register will be used for the function’s value.
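A minimal user-defined function and its caller might look like the following sketch. The names UseDoubled, Doubled and N are hypothetical; free-format source is assumed, and the main program is placed first in the file, mirroring the layout of the other examples in this chapter.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. UseDoubled.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
REPOSITORY.
*>  Declares the function so it can be referenced by name.
    FUNCTION Doubled.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  N USAGE BINARY-LONG VALUE 21.
PROCEDURE DIVISION.
000-Main.
*>  The function is executed where its value is needed.
    DISPLAY "Doubled(N) = " Doubled(N) END-DISPLAY
    STOP RUN
    .
END PROGRAM UseDoubled.
IDENTIFICATION DIVISION.
FUNCTION-ID. Doubled.
DATA DIVISION.
LINKAGE SECTION.
01  Arg    USAGE BINARY-LONG.
01  Result USAGE BINARY-LONG SIGNED.
PROCEDURE DIVISION USING Arg RETURNING Result.
000-Main.
    COMPUTE Result = Arg * 2
    GOBACK
    .
END FUNCTION Doubled.
```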

11.6 Sharing Data Between Calling and Called Programs

11.6.1 Subprogram Arguments

11.6.1.1 Calling Program Considerations

Data items defined in a calling program may be passed to either type of called program (subroutine or user-defined function) as arguments.

Arguments must be described in both the calling and called programs, and while they don’t need to have the same names in both programs, they should be described in an identical manner with regard to the following characteristics:

A subroutine may be passed a maximum of 251 arguments; if you build the GnuCOBOL software yourself from the distributed source, you CAN change this value by altering the defined value of COB_MAX_FIELD_PARAMS in the call.h header file but also see 7.8.5.11 for more information. There is no built-in GnuCOBOL limit to how many arguments a user-defined function may be passed.

Whether or not changes made to an argument within a subroutine will be “visible” to the calling program depends on how the argument was passed. There are three ways in which arguments may be passed from a calling program to a subroutine, as defined by the use of optional BY clauses in the CALL ( 7.8.5 CALL) statement’s list of arguments.

As an example, the following statement passes three arguments to a subroutine — each argument is passed differently.

CALL "subroutine" USING BY REFERENCE arg-1
                        BY CONTENT arg-2
                        BY VALUE arg-3
END-CALL

The three ways arguments are passed are as follows.

  • BY REFERENCE

    When a subroutine argument is passed BY REFERENCE, the subroutine is passed the address of the actual data item being passed as an argument. The item may be anything defined within the data division of the program. If the subroutine modifies the contents of this argument, the calling program will “see” the results of that change when the subroutine returns control. This is the default manner in which GnuCOBOL passes arguments to a subroutine, should no BY clauses be included on the CALL.

  • BY CONTENT

    When a subroutine is passed an argument BY CONTENT, the subroutine is passed the address of a copy of the actual data being passed as an argument. The item may be anything defined within the data division of the program. The copy is made each time the CALL statement is executed, immediately before the CALL actually takes place. If the subroutine modifies the contents of this argument, it will be the copy that is modified, not the original data item; the calling program will therefore not “see” the results of that change when the subroutine returns control.

  • BY VALUE

    Passing a subroutine argument BY VALUE passes the actual value of the data being passed as an argument. The item may be any elementary binary numeric item defined within the data division of the program. If the subroutine modifies the contents of this argument, the calling program will not “see” the results of that change when the subroutine returns control.

The first two ways in which arguments may be passed (BY REFERENCE and BY CONTENT) are intended for use when a GnuCOBOL program is being called, while the first and third (BY REFERENCE and BY VALUE) are intended for use when a C program is being called. You can use BY VALUE arguments when calling GnuCOBOL subroutines, but remember that those arguments are limited to being a numeric binary data item.

Arguments to user-defined functions are always passed BY REFERENCE.
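The visibility rules for the three passing modes can be tried out with a sketch like this one. The names PassDemo, PassSub and the data items are hypothetical, and free-format source is assumed. The subroutine changes all three of its arguments; only the change to the BY REFERENCE argument should be visible to the caller afterwards.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. PassDemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Arg-Ref PIC X(5) VALUE "AAAAA".
01  Arg-Con PIC X(5) VALUE "BBBBB".
01  Arg-Val USAGE BINARY-LONG VALUE 1.
PROCEDURE DIVISION.
000-Main.
    CALL "PassSub" USING BY REFERENCE Arg-Ref
                         BY CONTENT   Arg-Con
                         BY VALUE     Arg-Val
    END-CALL
    DISPLAY "Arg-Ref=" Arg-Ref END-DISPLAY
    DISPLAY "Arg-Con=" Arg-Con END-DISPLAY
    DISPLAY "Arg-Val=" Arg-Val END-DISPLAY
    STOP RUN
    .
END PROGRAM PassDemo.
IDENTIFICATION DIVISION.
PROGRAM-ID. PassSub.
DATA DIVISION.
LINKAGE SECTION.
01  P-Ref PIC X(5).
01  P-Con PIC X(5).
01  P-Val USAGE BINARY-LONG.
*> A BY CONTENT argument is received BY REFERENCE: the
*> subroutine gets the address of the caller's temporary copy.
PROCEDURE DIVISION USING BY REFERENCE P-Ref
                         BY REFERENCE P-Con
                         BY VALUE     P-Val.
000-Main.
    MOVE ALL "*" TO P-Ref
    MOVE ALL "*" TO P-Con
    MOVE 99 TO P-Val
    GOBACK
    .
END PROGRAM PassSub.
```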

11.6.1.2 Called Program Considerations

When coding a GnuCOBOL subprogram (a subroutine or user-defined function), all arguments to the subprogram must be defined in the subprogram’s linkage section.

These arguments must be explicitly included on the PROCEDURE DIVISION USING ( 7.1 PROCEDURE DIVISION USING) clause that lists the arguments in the sequence in which they will be passed to the subprogram.

These arguments described in the PROCEDURE DIVISION USING clause may each be defined as either BY REFERENCE, if the calling program is passing them either BY REFERENCE or BY CONTENT, or as BY VALUE if they are being passed BY VALUE.

By default, all arguments are assumed to be BY REFERENCE unless explicitly stated otherwise on the procedure division header.

Arguments to a user-defined function are always to be specified as BY REFERENCE (either explicitly or by not using any BY).

If the subprogram returns a value, the data item in which the value is returned must also be defined in the subprogram’s linkage section, with a USAGE ( 6.9.61 USAGE) of BINARY-LONG SIGNED, or its equivalent.

11.6.2 GLOBAL Data Items

Another way in which a data item may be shared between a calling program (A) and a called program (B) is by defining the data item in the calling program and attaching the GLOBAL ( 6.9.23 GLOBAL) clause to it so that it may be used within the called program. In order for this to work, program B (the one called by program A) must be a nested subprogram within program A.

Here’s a small example:

IDENTIFICATION DIVISION.
PROGRAM-ID. DemoGLOBAL.
ENVIRONMENT DIVISION.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Arg GLOBAL                     PIC X(10).
PROCEDURE DIVISION.
000-Main.
    MOVE ALL "X" TO Arg
    CALL "DemoSub" END-CALL
    DISPLAY "DemoGLOBAL: " Arg END-DISPLAY
    GOBACK
    .
IDENTIFICATION DIVISION.
PROGRAM-ID. DemoSub.
PROCEDURE DIVISION.
000-Main.
    MOVE ALL "*" TO Arg.
    GOBACK
    .
END PROGRAM DemoSub.
END PROGRAM DemoGLOBAL.

When the program runs, it produces the output:

DemoGLOBAL: **********

11.6.3 EXTERNAL Data Items

The final way in which a data item may be shared between a calling program (A) and a called program (B) is by defining the data item (with the same name) in both programs and attaching the EXTERNAL ( 6.9.18 EXTERNAL) clause to it (again, in both programs). This approach works regardless of whether the called program is nested within the calling program or not. It also works even if the two programs are compiled separately.

Here’s a demonstration:

IDENTIFICATION DIVISION.
PROGRAM-ID. DemoEXTERNAL.
ENVIRONMENT DIVISION.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Arg EXTERNAL                PIC X(10).
PROCEDURE DIVISION.
000-Main.
    MOVE ALL "X" TO Arg
    CALL "DemoSub" END-CALL
    DISPLAY "DemoEXTERNAL: " Arg END-DISPLAY
    GOBACK
    .
END PROGRAM DemoEXTERNAL.
IDENTIFICATION DIVISION.
PROGRAM-ID. DemoSub.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Arg EXTERNAL                PIC X(10).
PROCEDURE DIVISION.
000-Main.
    MOVE ALL "*" TO Arg.
    GOBACK
    .
END PROGRAM DemoSub.

When the program runs, it produces the output:

DemoEXTERNAL: **********

11.7 Recursive Subprograms

A subroutine may CALL itself, either directly or indirectly from another subroutine or user-defined function that it CALLs. Any subroutine that indulges in this sort of behaviour (called recursion) is called a Recursive Subprogram.

Any GnuCOBOL subroutine can be recursively invoked only if it is defined to the GnuCOBOL compiler as being a recursive subroutine. This is accomplished by adding the RECURSIVE attribute to its PROGRAM-ID ( 4 IDENTIFICATION DIVISION).

All User-defined functions are automatically capable of being executed recursively.

Here is an example of a main program (DEMOFACT) that calls both a recursive subroutine (RECURSIVESUB) and a recursive user-defined function (RECURSIVEFUNC) to compute the factorial value of a number.

IDENTIFICATION DIVISION.
PROGRAM-ID. DEMOFACT.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
REPOSITORY.
    FUNCTION RECURSIVEFUNC.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Result    USAGE BINARY-LONG.
01  Arg       USAGE BINARY-LONG.
PROCEDURE DIVISION.
000-Main.
    MOVE 6 TO Arg
    CALL "RECURSIVESUB"
        USING BY CONTENT Arg
        RETURNING Result
    DISPLAY Arg "! = "
            Result
    DISPLAY Arg "! = "
            RECURSIVEFUNC(Arg)
    GOBACK
    .
END PROGRAM DEMOFACT.
IDENTIFICATION DIVISION.
PROGRAM-ID. RECURSIVESUB RECURSIVE.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Result      USAGE BINARY-LONG.
01  Next-Arg    USAGE BINARY-LONG.
01  Next-Result USAGE BINARY-LONG.
LINKAGE SECTION.
01  Arg         USAGE BINARY-LONG.
PROCEDURE DIVISION USING Arg
               RETURNING Result.
000-Main.
    DISPLAY "Entering RECURSIVESUB"
            " Arg=" Arg
    IF Arg = 1
      MOVE 1 TO Result
      DISPLAY "Leaving RECURSIVESUB"
              " Returning " Result
    ELSE
      SUBTRACT 1 FROM Arg
          GIVING Next-Arg
      CALL "RECURSIVESUB"
           USING BY CONTENT Next-Arg
           RETURNING Next-Result
      COMPUTE Result =
              Arg * Next-Result
      DISPLAY "Leaving RECURSIVESUB"
              " Returning "
              Result "=" Arg "*"
              Next-Result
    END-IF
    GOBACK
    .
END PROGRAM RECURSIVESUB.
IDENTIFICATION DIVISION.
FUNCTION-ID. RECURSIVEFUNC.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
REPOSITORY.
    FUNCTION RECURSIVEFUNC.
DATA DIVISION.
WORKING-STORAGE SECTION.
LINKAGE SECTION.
01  Arg     USAGE BINARY-LONG.
01  Result  USAGE BINARY-LONG
            SIGNED.
PROCEDURE DIVISION USING Arg
               RETURNING Result.
000-Main.
    DISPLAY "Entering RECURSIVEFUNC"
            " Arg=" Arg
    IF Arg = 1
      MOVE 1 TO Result
    ELSE
      COMPUTE Result = Arg *
              RECURSIVEFUNC(Arg - 1)
    END-IF
    DISPLAY "Leaving RECURSIVEFUNC"
            " Returning " Result
    GOBACK
    .
END FUNCTION RECURSIVEFUNC.

When DEMOFACT is executed, the output shown below is generated.

E:\Programs\Demos>demofact
Entering RECURSIVESUB Arg=+0000000006
Entering RECURSIVESUB Arg=+0000000005
Entering RECURSIVESUB Arg=+0000000004
Entering RECURSIVESUB Arg=+0000000003
Entering RECURSIVESUB Arg=+0000000002
Entering RECURSIVESUB Arg=+0000000001
Leaving RECURSIVESUB Returning +0000000001
Leaving RECURSIVESUB Returning +0000000002=+0000000002*+0000000001
Leaving RECURSIVESUB Returning +0000000006=+0000000003*+0000000002
Leaving RECURSIVESUB Returning +0000000024=+0000000004*+0000000006
Leaving RECURSIVESUB Returning +0000000120=+0000000005*+0000000024
Leaving RECURSIVESUB Returning +0000000720=+0000000006*+0000000120
+0000000006! = +0000000720
Entering RECURSIVEFUNC Arg=+0000000006
Entering RECURSIVEFUNC Arg=+0000000005
Entering RECURSIVEFUNC Arg=+0000000004
Entering RECURSIVEFUNC Arg=+0000000003
Entering RECURSIVEFUNC Arg=+0000000002
Entering RECURSIVEFUNC Arg=+0000000001
Leaving RECURSIVEFUNC Returning +0000000001
Leaving RECURSIVEFUNC Returning +0000000002
Leaving RECURSIVEFUNC Returning +0000000006
Leaving RECURSIVEFUNC Returning +0000000024
Leaving RECURSIVEFUNC Returning +0000000120
Leaving RECURSIVEFUNC Returning +0000000720
+0000000006! = +0000000720

11.8 Combining GnuCOBOL and C Programs

The upcoming sections deal with the issues pertaining to calling C language programs from GnuCOBOL programs, and vice versa. Two additional sections provide samples illustrating specifics as to how those issues are overcome in actual program code.

11.8.1 GnuCOBOL Run-Time Library Requirements

Like most other implementations of the COBOL language, GnuCOBOL utilizes a run-time library. When the first program executed in a given execution sequence is a GnuCOBOL program, any run-time library initialization will be performed by the compiled COBOL code in a manner that is transparent to the C-language programmer. If, however, a C program is the first to execute, the burden of performing GnuCOBOL run-time library initialization falls upon the C program. See 11.8.5 C Main Programs Calling GnuCOBOL Subprograms for an example of how to do this.

11.8.2 String Allocation Differences Between GnuCOBOL and C

Both languages store strings as a fixed-length continuous sequence of characters.

COBOL stores these character sequences up to a specific quantity limit imposed by the PICTURE ( 6.9.37 PICTURE) clause of the data item. For example: 01  LastName   PIC X(15)..

The length of a string contained in a USAGE DISPLAY ( 6.9.61 USAGE) data item is never in doubt: the item always holds exactly as many characters as its PICTURE clause allows. In the example above, LastName will always contain exactly fifteen characters; of course, anywhere from 0 to 15 of them may be trailing SPACES.

C actually has no “string” data type; it stores strings as an array of char data type items where each element of the array is a single character. Being an array, there is an upper limit to how many characters may be stored in a given “string”. For example:

char lastName[15]; /* 15 chars: lastName[0] through lastName[14] */

C provides a robust set of string-manipulation functions to copy strings from one char array to another, search strings for certain characters, compare one char array to another, concatenate char arrays and so forth. To make these functions possible, it was necessary to be able to define the logical end of a string. C accomplishes this via the expectation that all strings (char arrays) will be terminated by a NULL character (x'00'). Of course, no one forces a programmer to do this, but if [s]he ever expects to use any of the C standard functions to manipulate that string they had better be null-terminating their strings!

So, GnuCOBOL programmers expecting to pass strings to or receive strings from C programs had best be prepared to deal with the null-termination issue, as follows:

  1. Pass a quoted literal string from GnuCOBOL to C as a zero-delimited string literal (Z'<string>').

  2. Pass alphanumeric (PIC X) or alphabetic (PIC A) data items to C subroutines by appending an ASCII NUL character (X'00') to them. For example, to pass the 15-character LastName data item described above to a C subroutine:

    01  LastName-Arg-to-C     PIC X(16).
    ...
        MOVE FUNCTION CONCATENATE(LastName,X'00') TO LastName-Arg-to-C
    

    And then pass LastName-Arg-to-C to the C subprogram!

  3. When a COBOL program needs to process string data prepared by a C program, the embedded null character must be accounted for. This can easily be accomplished with an INSPECT statement ( 7.8.26 INSPECT) such as the following:

    INSPECT Data-From-a-C-Program
        REPLACING FIRST X'00' BY SPACE
                  CHARACTERS BY SPACE AFTER INITIAL X'00'
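
    The outbound and inbound steps above can be combined into one small round-trip sketch. No C subprogram is actually called here; the program name NulDemo and the sample value are hypothetical, and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. NulDemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  LastName           PIC X(15) VALUE "Smith".
01  LastName-Arg-to-C  PIC X(16).
PROCEDURE DIVISION.
000-Main.
*>  Outbound: append the NUL terminator (item 2 above).
    MOVE FUNCTION CONCATENATE(LastName, X'00') TO LastName-Arg-to-C
*>  A CALL to a C subprogram would go here.
*>  Inbound: blank out the NUL and anything after it (item 3).
    INSPECT LastName-Arg-to-C
        REPLACING FIRST X'00' BY SPACE
                  CHARACTERS BY SPACE AFTER INITIAL X'00'
*>  The brackets make the trailing spaces visible.
    DISPLAY "[" LastName-Arg-to-C "]" END-DISPLAY
    STOP RUN
    .
END PROGRAM NulDemo.
```

    Note that because LastName is always fifteen characters long, the C side of such a call would see the trailing spaces as part of the string; the terminator only marks where the field ends.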
    

11.8.3 Matching C Data Types with GnuCOBOL USAGE’s

Matching up GnuCOBOL numeric Usage’s with their C language data type equivalents is possible via the following chart:

COBOL                         C
============================  ======================
BINARY-CHAR UNSIGNED          unsigned char
BINARY-CHAR [ SIGNED ]        signed char
BINARY-SHORT UNSIGNED         unsigned
                              unsigned int
                              unsigned short
                              unsigned short int
BINARY-SHORT [ SIGNED ]       int
                              short
                              short int
                              signed int
                              signed short
                              signed short int
BINARY-LONG UNSIGNED          unsigned long
                              unsigned long int
BINARY-LONG [ SIGNED ]        long
BINARY-INT                    long int
                              signed long
                              signed long int
BINARY-C-LONG [ SIGNED ]      long
BINARY-DOUBLE UNSIGNED        unsigned long long
                              unsigned long long int
BINARY-DOUBLE [ SIGNED ]      long long int
BINARY-LONG-LONG              signed long long int
COMPUTATIONAL-1               float
COMPUTATIONAL-2               double
N/A (no GnuCOBOL equivalent)  long double

These sizes conform to the COBOL standard and the minimum sizes of the COBOL types are the same as the minimum sizes of the corresponding C data types. There’s no official compatibility between them. Note that values in square brackets ‘[]’ are the defaults.

11.8.4 GnuCOBOL Main Programs CALLing C Subprograms

Here’s a sample of a GnuCOBOL program that CALLs a C subprogram.

COBOL Calling Program               C Called Program
==================================  ===============================
IDENTIFICATION DIVISION.            #include <stdio.h>
PROGRAM-ID. maincob.                int subc(char *arg1,
DATA DIVISION.                               char *arg2,
WORKING-STORAGE SECTION.                     unsigned long *arg3) {
01  Arg1     PIC X(7).                char nu1[7]="New1";
01  Arg2     PIC X(7).                char nu2[7]="New2";
01  Arg3     USAGE BINARY-LONG.       printf("Starting subc\n");
PROCEDURE DIVISION.                   printf("Arg1=%s\n",arg1);
000-Main.                             printf("Arg2=%s\n",arg2);
    DISPLAY 'Starting maincob'        printf("Arg3=%lu\n",*arg3);
    MOVE Z'Arg1'   TO Arg1            arg1[0]='X';
    MOVE Z'Arg2'   TO Arg2            arg2[0]='Y';
    MOVE 123456789 TO Arg3            *arg3=987654321;
    CALL 'subc'                       return 2;
        USING BY CONTENT   Arg1,    }
              BY REFERENCE Arg2,
              BY REFERENCE Arg3
    DISPLAY 'Back'
    DISPLAY 'Arg1=' Arg1
    DISPLAY 'Arg2=' Arg2
    DISPLAY 'Arg3=' Arg3
    DISPLAY 'Returned value='
            RETURN-CODE
    STOP RUN
    .

The idea is to pass two string arguments and one full-word binary argument to the subprogram, have the subprogram print them out, change all three and pass a return code of 2 back to the caller. The caller then re-displays the three arguments (showing changes only to the two BY REFERENCE arguments), displays the return code and halts.

While simple, these two programs illustrate the techniques required quite nicely.

Note how the COBOL program ensures that a null end-of-string terminator is present on both string arguments.

Since the C program plans to change all three arguments, it declares all three as pointers in the function header and dereferences the third argument in the function body. It actually had no choice for the two string (char array) arguments: they must be defined as pointers in the function, even though the function code references them without the leading ‘*’ that normally signifies a pointer dereference.
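
When the C side cannot rely on the caller having supplied a terminator, one alternative is to copy the fixed-length field into a C string explicitly. A minimal sketch follows. The function name cobol_field_to_cstr and its parameters are hypothetical, and the field length has to be supplied by the C code because a PIC X item carries no length information of its own.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy a fixed-length COBOL PIC X(len) field into
 * out as a null-terminated C string, dropping trailing spaces.
 * Copying stops early at an embedded null character or when out is full.
 * Returns the length of the resulting string. */
size_t cobol_field_to_cstr(const char *field, size_t len,
                           char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0) {
        return 0;               /* no room even for the terminator */
    }
    /* Count characters up to the first null, bounded by both sizes. */
    while (n < len && n < out_size - 1 && field[n] != '\0') {
        n++;
    }
    /* Trim the trailing spaces COBOL uses as padding. */
    while (n > 0 && field[n - 1] == ' ') {
        n--;
    }
    memcpy(out, field, n);
    out[n] = '\0';
    return n;
}
```

With this in place, subc could build a printable copy of arg1 via cobol_field_to_cstr(arg1, 7, buffer, sizeof buffer) whether or not the COBOL caller moved a Z'...' literal into the item.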

These programs are compiled and executed as follows.

$ cobc -x maincob.cbl subc.c
$ maincob
Starting maincob
Starting subc
Arg1=Arg1
Arg2=Arg2
Arg3=123456789
Back
Arg1=Arg1
Arg2=Yrg2
Arg3=+0987654321
Returned value=+000000002
$

Remember that the null characters are actually in the GnuCOBOL Arg1 and Arg2 data items. They don’t appear in the output, but they are there.

Did you notice the output showing the contents of Arg1 after the subroutine was called? Those contents were unchanged! The subroutine definitely changed that argument, but since the COBOL program passed that argument BY CONTENT, the change was made to a copy of the argument, not to the Arg1 data item itself.

11.8.5 C Main Programs Calling GnuCOBOL Subprograms

Now, the roles of the two languages in the previous section will be reversed, having a C main program execute a GnuCOBOL subprogram.

C Calling Program                              GnuCOBOL Called Program
=============================================  =================================
#include <libcob.h> /* COB RUN-TIME */         IDENTIFICATION DIVISION.
#include <stdio.h>                             PROGRAM-ID. subcob.
int main (int argc, char **argv) {             DATA DIVISION.
   int returnCode;                             LINKAGE SECTION.
   char arg1[7] = "Arg1";                      01  Arg1      PIC X(7).
   char arg2[7] = "Arg2";                      01  Arg2      PIC X(7).
   unsigned long arg3 = 123456789;             01  Arg3      USAGE BINARY-LONG.
   printf("Starting mainc...\n");              PROCEDURE DIVISION USING
   cob_init (argc, argv); /* COB RUN-TIME */       BY VALUE     Arg1,
   returnCode = subcob(arg1,arg2,&arg3);           BY REFERENCE Arg2,
   printf("Back\n");                               BY REFERENCE Arg3.
   printf("Arg1=%s\n",arg1);                   000-Main.
   printf("Arg2=%s\n",arg2);                       DISPLAY 'Starting cobsub.cbl'
   printf("Arg3=%lu\n",arg3);                      DISPLAY 'Arg1=' Arg1
   printf("Returned value=%d\n",returnCode);       DISPLAY 'Arg2=' Arg2
   return returnCode;                              DISPLAY 'Arg3=' Arg3
}                                                  MOVE 'X' TO Arg1 (1:1)
                                                   MOVE 'Y' TO Arg2 (1:1)
                                                   MOVE 987654321 TO Arg3
                                                   MOVE 2 TO RETURN-CODE
                                                   GOBACK
                                                   .

Since the C program executes first, before the GnuCOBOL subroutine, the burden of initializing the GnuCOBOL run-time environment lies with that C program; it must invoke the cob_init function, which is part of the libcob library. The two C lines this requires are marked with the comment /* COB RUN-TIME */.

The arguments to the cob_init routine are the argument count and value parameters passed to the main function when the program began execution. Passing them to cob_init makes it possible for the GnuCOBOL subprogram to retrieve the command line or individual command-line arguments. If that won’t be necessary, cob_init(0,NULL); could be specified instead.

Since the C program wants to allow arg3 to be changed by the subprogram, it prefixes it with a ‘&’ so that the argument is passed by reference. Since arg1 and arg2 are strings (char arrays), they are automatically passed by reference.

Here’s the output of the compilation process as well as the program’s execution. The example assumes a Windows system with a GnuCOBOL build that uses the GNU C compiler on that system; the technique works equally well regardless of which C compiler and which operating system you’re using.

C:\Users\Gary\Documents\Programs> cobc -S subcob.cbl
C:\Users\Gary\Documents\Programs> gcc mainc.c subcob.s -o mainc.exe -llibcob
C:\Users\Gary\Documents\Programs> mainc.exe
Starting mainc...
Starting cobsub.cbl
Arg1=Arg1
Arg2=Arg2
Arg3=+0123456789
Back
Arg1=Xrg1
Arg2=Yrg2
Arg3=987654321
Returned value=2
C:\Users\Gary\Documents\Programs>

Note that even though we told GnuCOBOL that the first argument was to be BY VALUE, it was treated as if it were BY REFERENCE anyway. String (char array) arguments passed from C callers to GnuCOBOL subprograms will be modifiable by the subprogram. It’s best to pass a copy of such data if you want to ensure that the subprogram doesn’t change it.
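
The advice about passing a copy can be sketched on the C side as follows. This is an illustration only: call_with_copy, demo_sub and FIELD_LEN are hypothetical names, and demo_sub merely stands in for a called subprogram such as subcob so that the sketch does not depend on libcob.

```c
#include <stddef.h>
#include <string.h>

#define FIELD_LEN 7  /* length of the PIC X(7) items in the example */

/* Stand-in for a called subprogram: alters the first character of the
 * field it receives and returns 2, much as subcob does with Arg1. */
int demo_sub(char *field)
{
    field[0] = 'X';
    return 2;
}

/* Hypothetical wrapper: hand the subprogram a scratch copy of text so
 * that the caller's own buffer cannot be altered. text must point to at
 * least FIELD_LEN bytes. */
int call_with_copy(int (*sub)(char *), const char *text)
{
    char scratch[FIELD_LEN];

    memcpy(scratch, text, FIELD_LEN);  /* copy the whole fixed-length field */
    return sub(scratch);               /* any changes land in scratch only */
}
```

Calling call_with_copy(demo_sub, arg1) returns the subprogram's value while leaving arg1 as it was, which is the effect BY CONTENT had in the COBOL calling program of the previous section.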

The third argument is different, however. Since it’s not an array you have the choice of passing it either BY REFERENCE or BY VALUE.