dos_compilers/Microsoft COBOL v45/DOCS/OPTIMIZE.DOC
2024-07-24 07:18:17 -07:00

331 lines
12 KiB
Plaintext

CREATING OPTIMIZED PROGRAMS
===========================
This document should be regarded as a supplement to the chapter Writing
Programs in your Operating Guide. It gives some guidelines which, if
followed, allow your COBOL system to optimize fully the native code
produced for your programs - resulting in smaller and faster applications.
TABLE OF CONTENTS
INTRODUCTION
DATA DIVISION CONSIDERATIONS
Data Types
Linkage Items
Large Data Divisions
PROCEDURE DIVISION CONSIDERATIONS
Arithmetic Statements
Alphanumeric Data Manipulation
Tables
Conditional Statements
Loops and Subroutines
CALL statements
Large Procedure Divisions
COMPILER DIRECTIVES
EXAMINING THE NATIVE CODE
INTRODUCTION
The guidelines are set out in a "Do this" / "Don't do that" style. Do
remember that these are only guidelines; programs that do not conform to
these guidelines will still run correctly, just less efficiently.
For further information on many of the topics discussed in this document,
see chapter Compiling and chapter Writing Programs in your Operating
Guide.
DATA DIVISION CONSIDERATIONS
Data Types
----------
o Use unsigned COMP-5 or COMP-X numeric data items; preferably COMP-5.
o Use 2-byte COMP-5 items rather than single byte fields for arithmetic.
o Do not use numeric items that occupy more than 4 bytes of storage.
o Do not redefine COMP-5 items to access individual bytes; if access to
individual bytes is required use COMP-X.
o Use edited items only when necessary and use only simple ones such as
ZZ9. If possible use them in a subroutine so the total number of
edited moves in your program is kept as small as possible.
o Do not use items defined as EXTERNAL.
o Align items on even byte boundaries and ensure the stride of a table
is an even number, if possible a power of 2 (pad the table as
necessary). The stride of a table is the size of one element; for
example, the stride of the following table is 2 bytes:
01 a occurs 10.
05 b pic x
05 c pic x.
Linkage Items
-------------
o Don't make many references to items defined in the Linkage Section
(examples: linkage parameters; x"D0" allocated memory). It is more
efficient to move the data to Working-Storage, manipulate it there,
then move it back again.
o If a parameter is optional you can detect its presence using the
following syntax:
IF ADDRESS OF linkage-item NOT = NULL
Large Data Divisions
--------------------
o Ensure the size of your Data Division is less than 64 kilobytes (64K),
keeping the total size as small as possible. In the Procedure Division
the RTS can swap segments, but it cannot page the Data Division.
o Redefine memory that is only used during certain phases of a program's
execution so that it can be shared by several routines.
o If more than 64K of data is required use a heap, or use the x"D0"
COBOL System Library Routine to allocate the space dynamically. But
remember x"D0" items have the disadvantage of being Linkage items.
o Do not use the NOSMALLDD directive unless you have no choice. If
necessary (normally for parameters passed to your program by external
programs), use the SEGCROSS directive so that the performance of most
data accesses is not affected.
PROCEDURE DIVISION CONSIDERATIONS
Arithmetic Statements
---------------------
o Use simple forms (for example, ADD a TO b) of the arithmetic verbs and
do not use the COMPUTE verb.
o Do not use the GIVING form of these verbs. If necessary create
temporary variables and code several simple statements to achieve the
same result. For example, write:
MOVE a TO c
ADD b TO c
rather than:
ADD a TO b giving c
o Do not mix items of different sizes in an arithmetic statement (for
example, try to use all 2-byte items or all 4-byte items).
o Do not use the REMAINDER, ROUNDED, ON SIZE ERROR or CORRESPONDING
phrases.
Alphanumeric Data Manipulation
------------------------------
o Reference modified fields are optimized if coded in one of the
following forms:
item (literal:)
item (literal:literal)
item (literal:variable)
item (variable:variable)
item (variable + literal:literal)
item (variable - literal:literal)
item (variable + literal:variable)
item (variable - literal:variable)
Other forms of reference modification are inefficient.
o If the offset or length of the reference modification is a data item,
use a 2-byte COMP-5 item. Define it in Working-Storage.
o In a MOVE statement, have the source item the same size as or larger
than the target. This prevents space-padding code being generated.
o Do not use the INITIALIZE verb.
o Do not use the CORRESPONDING option of the MOVE verb.
o Do not use the STRING or UNSTRING verbs - they create a lot of code.
For manipulating filenames use the COBOL System Library Routines
CBL_SPLIT_FILENAME and CBL_JOIN_FILENAME. For other purposes, create
your own loops; they will almost always be more efficient.
Tables
------
o The optimal definition for a subscript is a 2-byte COMP-5 item.
o Subscripts for items that have the same stride and are used in
consecutive statements are optimized so that they are only evaluated
once. For example:
01 A PIC XX OCCURS 10.
01 B PIC XX OCCURS 10.
01 C PIC XX OCCURS 10.
01 D PIC XX OCCURS 10.
. . .
MOVE A(I) TO B(I)
IF C(I) = D(I)
DISPLAY "PASS"
END-IF
would result in the subscript I being evaluated only once, although it
is used four times in two statements.
o When compiling your program for use in production, use the NOBOUND
directive so that subscript range checking code is not included. Use
BOUND only when debugging.
Conditional Statements
----------------------
o Do not use large EVALUATE statements. They are compiled into a series
of IF ... ELSE IF ... statements where the value of the expression is
derived separately for each WHEN clause.
o Order an EVALUATE statement so that the most commonly satisfied
condition is tried first. Do not use complex expressions or Linkage
items as conditions in an EVALUATE statement; instead, calculate the
value yourself in a Working-Storage item and use that item.
o Comparing for equality or inequality is more efficient than testing
for "greater than" or "less than", especially with COMP-X or
alphanumeric items.
o In both alphanumeric and numeric comparisons, have the source and
target items the same size.
o Use a GO TO ... DEPENDING statement if the range of possible values is
fairly close. Although this construct has the disadvanage of not being
particularly suited to structured programming, it is efficient.
Loops and Subroutines
---------------------
o When incrementing or decrementing a counter, terminate it with a
literal value rather than a value held in a data item. For example, to
execute a loop n times, set the counter to n and then decrement the
counter until it becomes zero, rather than incrementing the counter
from 0 to n.
o The range of an out-of-line PERFORM statement should not contain the
end of another perform range. (One way to ensure this is to perform
sections only, not paragraphs; however, with carefully structured
programming this should not arise). You can then compile your program
with the generator directive NOTRICKLE, which lets the compiler
produce more efficient code. This coding style generally gives you a
more easily maintained program too.
o Do not use PERFORM a THRU b as this too can lead to trickling code.
o Use PERFORM para N TIMES but not the equivalent in-line perform.
o Put commonly used pieces of code in sections or paragraphs and PERFORM
them. This is saves space for any statement used more than once that
produces more than 4 bytes of generated code (in a NOTRICKLE program).
It is often beneficial even on single statements, for example edited
moves, subprogram calls or file I/O.
CALL statements
---------------
o Try to limit the number of CALL statements a program makes, if
necessary by avoiding splitting it into too many subprograms.
o CALL statements that do not alter the RETURN-CODE special register or
whose effect on RETURN-CODE are of no interest should use a calling
convention of 4 (the checker directive DEFAULTCALLS can be used to set
this globally).
o Calls to the COBOL System Library Routines that carry out logical
operations (CBL_AND, etc) are optimized by the generator to actual
machine logical operations, providing the parameters are 8 bytes long
or less. These too should use a calling convention of 4.
o Use the generator directive NOPARAMCOUNTCHECK if your program is
always called with the correct number of parameters, or if it does not
reference unsupplied parameters. Most programs will fall into this
category.
Large Procedure Divisions
-------------------------
o On DOS or OS/2, code segments are limited in size to 64K. Do not let
the generator decide where to break your program into segments.
Instead segment the program manually, that is, by using the segment
number on a section header, so you can choose where to split it. Avoid
inter-segment PERFORM and GO TO statements.
o If appropriate the memory needed can be reduced by manually segmenting
a procedure division smaller than 64K. As the RTS can segment-swap
procedural code this is rarely necessary.
DIRECTIVES
A number of checker and generator directives can be used to enable the
native code for a program to be better optimized. Some of these directives
must be used with care; ensure that the behavior that you get with these
directives is acceptable.
Use the following directives when checking and generating your programs:
NOALTER
ALIGN(2)
NONESTCALL
OPTSIZE
Use with care:
NOTRICKLE
DEFAULTCALLS(4)
NOPARAMCOUNTCHECK
Use when compiling for production:
NOANIM
NOBOUND
Other suggestions (to help prevent inefficient coding):
REMOVE "UNSTRING"
REMOVE "STRING"
REMOVE "GIVING"
REMOVE "ROUNDED"
REMOVE "COMPUTE"
REMOVE "ERROR"
REMOVE "ALTER"
REMOVE "INITIALIZE"
REMOVE "CORRESPONDING"
REMOVE "TALLYING"
REMOVE "THRU"
REMOVE "THROUGH"
EXAMINING THE NATIVE CODE
You can see the generated code produced for your program, by using the
generator directives ASMLIST() SOURCEASM to produce a .GRP file containing
the assembler and source listing in the same file.
If the generator considers a statement for optimization, but finds that it
does not conform to the necessary guidelines, the word "BADCODE" appears
next to the statement, on the right-hand side of the listing. But some
statements that have generated inefficient code will not have been
identified in this way; you can usually spot these by looking for a single
statement that has generated a lot of assembler code.
Try to eliminate or at least reduce the inefficient statements in your
program. But be aware of the law of diminishing returns; as you improve
the efficiency of your program, you will eventually reach a point where a
lot of extra effort will give only small further gains.
==========================================================================
Copyright (C) 1991 Microsoft Corporation
Copyright (C) 1991 Micro Focus Ltd