Professional programmers appreciate the importance of standards in developing programs that are readable, understandable, and maintainable. The issue of programming style goes beyond any one language, but the introduction of the RPG IV syntax demands that we reexamine standards of RPG style. With that in mind, here are some simple rules of thumb you can use to ensure that bad code doesn't happen to good RPG software construction.
1.0 Comments
Good programming style can serve a documentary purpose in helping others understand the source code. If you practice good code-construction techniques, you'll find that "less is more" when it comes to commenting the source. Too many comments are as bad as too few.
1.1 Use comments to clarify - not echo - your code. Comments that merely repeat the code add to a program's bulk, but not to its value. In general, you should use comments for just three purposes:
1.2 Always include a brief summary at the beginning of a program or procedure. This prologue should include the following information:
A Cooperative Effort
Like many coding standards, the style guidelines presented in this article are the result of collaboration by many people - in this case, by those who participate in NEWS/400's RPG IV Style Forum at http://www.news400.com
B.D.M.
1.3 Use consistent "marker line" comments to divide major sections of code. For example, you should definitely section off with lines of dashes or asterisks the declarations, the main procedure, each subroutine, and any subprocedures. Identify each section for easy reference.
1.4 Use blank lines to group related source lines and make them stand out. In general, you should use completely blank lines instead of blank comment lines to group lines of code, unless you're building a block of comments. Use only one blank line, though; multiple consecutive blank lines make your program hard to read.
1.5 Avoid right-hand "end-line" comments in columns 81-100. Right-hand comments tend simply to echo the code, can be lost during program maintenance, and can easily become "out of synch" with the line they comment. If a source line is important enough to warrant a comment, it's important enough to warrant a comment on a separate line. If the comment merely repeats the code, eliminate it entirely.
2.0 Declarations
With RPG IV, we finally have an area of the program source in which to declare all variables and constants associated with the program. The D-specs organize all your declarations in one place.
2.1 Declare all variables within D-specs. Except for key lists and parameter lists, don't declare variables in C-specs - not even using *LIKE DEFN. Define key lists and parameter lists in the first C-specs of the program, before any executable calculations.
2.2 Whenever a literal has a specific meaning, declare it as a named constant in the D-specs. This practice helps document your code and makes it easier to maintain. One obvious exception to this rule is the allowable use of 0 and 1 when they make perfect sense in the context of a statement. For example, if you're going to initialize an accumulator field or increment a counter, it's fine to use a hard-coded 0 or 1 in the source.
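As a minimal sketch of this guideline, the fragment below gives a page-size limit and a message text their own named constants, while the counter is still reset and incremented with hard-coded 0 and 1. All names and values (MaxLines, OverflowMsg, LineCount, Msg) are hypothetical and chosen only for illustration.

```rpgle
      * Literals with a specific meaning become named constants
     D MaxLines        C                   CONST(60)
     D OverflowMsg     C                   CONST('Page overflow')
      * Work fields (hypothetical)
     D LineCount       S              3  0 INZ(0)
     D Msg             S             20
      *
      * Hard-coded 0 and 1 are fine for resetting and incrementing
     C                   EVAL      LineCount = LineCount + 1
     C                   IF        LineCount > MaxLines
     C                   EVAL      Msg = OverflowMsg
     C                   EVAL      LineCount = 0
     C                   ENDIF
     C                   EVAL      *INLR = *ON
```

If the page size ever changes, only the MaxLines declaration needs to be touched.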
2.3 Indent data item names to improve readability and document data structures. Unlike many other RPG entries, the name of a defined item need not be left-justified in the D-specs; take advantage of this feature to help document your code:
D ErrMsgDSDS      DS
D   ErrPrefix                    3
D   ErrMsgID                     4
D     ErrMajor                   2    OVERLAY(ErrMsgID:1)
D     ErrMinor                   2    OVERLAY(ErrMsgID:3)
2.4 Use length notation instead of positional notation in data structure declarations. D-specs let you code fields either with specific from and to positions or simply with the length of the field. To avoid confusion and to better document the field, use length notation consistently. For example, code
D RtnCode         DS
D   PackedNbr                   15P 5
instead of
D RtnCode         DS
D   PackedNbr             1      8P 5
2.4.1 Use positional notation only when the actual position in a data structure is important. For example, when coding the program status data structure, the file information data structure, or the return data structure from an API, you'd use positional notation if your program ignores certain positions leading up to a field or between fields. Using positional notation is preferable to using unnecessary "filler" variables with length notation:
D APIRtn          DS
D   PackedNbr           145    152P 5
In this example, to better document the variable, consider overlaying the positionally declared variable with another variable declared with length notation:
D APIRtn          DS
D   Pos145              145    152
D     PackNbr                   15P 5 OVERLAY(Pos145)
2.4.2 When defining overlapping fields, use the OVERLAY keyword instead of positional notation. Keyword OVERLAY explicitly ties the declaration of a "child" variable to that of its "parent." Not only does OVERLAY document this relationship, but if the parent moves elsewhere within the program code, the child will follow.
2.5 If your program uses compile-time arrays, use the **CTDATA form to identify the compile-time data. This form effectively documents the identity of the compile-time data, tying the data at the end of the program to the array declaration in the D-specs. The **CTDATA syntax also helps you avoid errors by eliminating the need to code compile-time data in the same order in which you declare multiple arrays.
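The following sketch shows the **CTDATA form with two compile-time arrays. The array names and contents (DayName, MonthAbbr) are hypothetical. Note that the data for MonthAbbr appears first at the end of the source even though DayName is declared first; the name on each **CTDATA record is what ties the data to its array.

```rpgle
      * Two compile-time arrays (hypothetical names and contents)
     D DayName         S              9    DIM(7) CTDATA PERRCD(1)
     D MonthAbbr       S              3    DIM(12) CTDATA PERRCD(12)
      *
     C                   EVAL      *INLR = *ON
**CTDATA MonthAbbr
JanFebMarAprMayJunJulAugSepOctNovDec
**CTDATA DayName
Sunday
Monday
Tuesday
Wednesday
Thursday
Friday
Saturday
```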
3.0 Naming Conventions
Perhaps the most important aspect of programming style deals with the names you give to data items (e.g., variables, named constants) and routines.
3.1 When naming an item, be sure the name fully and accurately describes the item. The name should be unambiguous, easy to read, and obvious. Although you should exploit RPG IV's allowance for long names, don't make your names too long to be useful. Name lengths of 10 to 14 characters are usually sufficient, and longer names may not be practical in many specifications. When naming a data item, describe the item; when naming a subroutine or procedure, use a verb/object syntax (similar to a CL command) to describe the process. Maintain a dictionary of names, verbs, and objects, and use the dictionary to standardize your naming conventions.
3.2 When coding an RPG symbolic name, use mixed case to clarify the named item's meaning and use. RPG IV lets you type your source code in upper- and lowercase characters. Use this feature to clarify named data. For RPG-reserved words and operations, use all uppercase characters.
3.3 Avoid using special characters (e.g., @, #, $) when naming items. Although RPG IV allows an underscore (_) within a name, you can easily avoid using this "noise" character if you use mixed case intelligently.
4.0 Indicators
Historically, indicators have been an identifying characteristic of the RPG syntax, but with RPG IV they are fast becoming relics of an earlier era. To be sure, some operations still require indicators, and indicators remain the only practical means of communicating conditions to DDS-defined displays. But reducing a program's use of indicators may well be the single most important thing you can do to improve the program's readability.
4.1 Use indicators as sparingly as possible; go out of your way to eliminate them. In general, the only indicators present in a program should be resulting indicators for opcodes that absolutely require them (e.g., CHAIN before V4R2) or indicators used to communicate conditions such as display attributes to DDS-defined display files.
4.1.1 Whenever possible, use built-in functions (BIFs) instead of indicators. As of V4R2, you can indicate file exception conditions with error-handling BIFs (e.g., %EOF, %ERROR, %FOUND) and an E operation extender to avoid using indicators.
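A sketch of the idea, assuming V4R2 or later and a hypothetical externally described keyed file named Customer whose key is a 7-digit packed customer number. The field names SearchNbr, IOError, and CustFound are also invented for this example. No resulting indicators appear on the CHAIN.

```rpgle
      * Customer is a hypothetical keyed, externally described file
     FCustomer  IF   E           K DISK
      *
     D SearchNbr       S              7  0 INZ(1001)
     D IOError         S               N   INZ(*OFF)
     D CustFound       S               N   INZ(*OFF)
      *
      * The E extender lets %ERROR report a failed operation;
      * %FOUND reports whether the CHAIN retrieved a record
     C     SearchNbr     CHAIN(E)  Customer
     C                   IF        %ERROR
     C                   EVAL      IOError = *ON
     C                   ELSE
     C                   EVAL      CustFound = %FOUND(Customer)
     C                   ENDIF
     C                   EVAL      *INLR = *ON
```

Storing the results in named indicator-type (N) fields also follows guideline 4.2, because later code can test CustFound instead of a numbered indicator.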
4.2 If you must use indicators, name them. V4R2 supports a Boolean data type (N) that serves the same purpose as an indicator. You can use the INDDS keyword with a display-file specification to associate a data structure with the indicators for a display or printer file; you can then assign meaningful names to the indicators.
4.2.1 Use an array-overlay technique to name indicators before V4R2. Using RPG IV's pointer support, you can overlay the *IN internal indicator array with a data structure. Then you can specify meaningful subfield names for the indicators. This technique lessens your program's dependence on numeric indicators. For example:
D IndicatorPtr    S               *   INZ(%ADDR(*IN))
D                 DS                  BASED(IndicatorPtr)
D   F03Key                3      3
D   F05Key                5      5
D   CustNotFnd           50     50
D   SflClr               91     91
D   SflDsp               92     92
D   SflDspCtl            93     93
C                   IF        F03Key = *ON
C                   EVAL      *INLR = *ON
C                   RETURN
C                   ENDIF
4.3 Use the EVAL opcode with *INxx and *ON or *OFF to set the state of indicators. Do not use SETON or SETOFF, and never use MOVEA to manipulate multiple indicators at once.
4.4 Use indicators only in close proximity to the point where your program sets their condition. For example, it's bad practice to have indicator 71 detect end-of-file in a READ operation and not reference *IN71 until several pages later. If it's not possible to keep the related actions (setting and testing the indicator) together, move the indicator value to a meaningful variable instead.
4.5 Don't use conditioning indicators - ever. If a program must conditionally execute or avoid a block of source, explicitly code the condition with a structured comparison opcode, such as IF. If you're working with old S/36 code, get rid of the blocks of conditioning indicators in the source.
4.6 Include a description of any indicators you use. It's especially important to document indicators whose purpose isn't obvious by reading the program, such as indicators used to communicate with display or printer files or the U1-U8 external indicators, if you must use them.
5.0 Structured Programming Techniques
Give those who follow you a fighting chance to understand how your program works by implementing structured programming techniques at all times.
5.1 Don't use GOTO, CABxx, or COMP. Instead, substitute a structured alternative, such as nested IF statements, or status variables to skip code or to direct a program to a specific location. To compare two values, use the structured opcodes IF, ELSE, and ENDIF. To perform loops, use DO, DOU, and DOW with ENDDO. Never code your loops by comparing and branching with COMP (or even IF) and GOTO. Employ ITER to skip the rest of the current loop iteration and begin the next one, and use LEAVE for premature exits from loops.
5.2 Don't use obsolete IFxx, DOUxx, DOWxx, or WHxx opcodes. The newer forms of these opcodes - IF, DOU, DOW, and WHEN - support free-format expressions, making those alternatives more readable. In general, if an opcode offers a free-format alternative, use it.
5.3 Perform multipath comparisons with SELECT/WHEN/OTHER/ENDSL. Deeply nested IFxx/ELSE/ENDIF code blocks are hard to read and result in an unwieldy accumulation of ENDIFs at the end of the group. Don't use the obsolete CASxx opcode; instead, use the more versatile SELECT/WHEN/OTHER/ENDSL construction.
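For illustration, here is a small SELECT group that replaces what would otherwise be a nested IF/ELSE ladder. The order types and discount percentages are hypothetical values invented for this sketch.

```rpgle
      * Hypothetical order types and discount percentages
     D Wholesale       C                   CONST('W')
     D Retail          C                   CONST('R')
     D OrderType       S              1    INZ('R')
     D DiscountPct     S              3  1 INZ(0)
      *
      * One ENDSL closes the group, however many cases it has
     C                   SELECT
     C                   WHEN      OrderType = Wholesale
     C                   EVAL      DiscountPct = 12.5
     C                   WHEN      OrderType = Retail
     C                   EVAL      DiscountPct = 2.5
     C                   OTHER
     C                   EVAL      DiscountPct = 0
     C                   ENDSL
     C                   EVAL      *INLR = *ON
```

Adding another order type means adding one WHEN clause, with no change to the nesting depth.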
5.4 Always qualify END opcodes. Use ENDIF, ENDDO, ENDSL, or ENDCS as applicable. This practice can be a great help in deciphering complex blocks of source.
5.5 Avoid programming tricks and hidden code. Such maneuvers aren't so clever to someone who doesn't know the trick. If you think you must add comments to explain how a block of code works, consider rewriting the code to clarify its purpose. Use of the obscure "bit-twiddling" opcodes (BITON, BITOFF, MxxZO, TESTB, TESTZ) may be a sign that your source needs updating.
6.0 Modular Programming Techniques
The RPG IV syntax, along with the AS/400's Integrated Language Environment (ILE), encourages a modular approach to application programming. Modularity offers a way to organize an application, facilitate program maintenance, hide complex logic, and efficiently reuse code wherever it applies.
6.1 Use RPG IV's prototyping capabilities to define parameters and procedure interfaces. Prototypes (PR definitions) offer many advantages when you're passing data between modules and programs. For example, they avoid runtime errors by giving the compiler the ability to check the data type and number of parameters. Prototypes also let you code literals and expressions as parameters, declare parameter lists (even the *ENTRY PLIST) in the D-specs, and pass parameters by value and by read-only reference, as well as by reference.
6.2 Store prototypes in /COPY members. For each module, code a /COPY member containing the procedure prototype for each exported procedure in that module. Then, include a reference to that /COPY member in each module that refers to the procedures in the called module. This practice saves you from typing the prototypes each time you need them and reduces errors.
6.2.1 Include constant declarations for a module in the same /COPY member as the prototypes for that module. If you then reference the /COPY member in any module that refers to the called module, you've effectively "globalized" the declaration of those constants.
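The sketch below shows a prototype, a related constant, and a prototyped call in a single source member so that it stands on its own. In practice, following guidelines 6.2 and 6.2.1, the PR definition and the TaxRate constant would sit in a /COPY member. The procedure CalcTax, its parameter, and the flat tax rate are hypothetical. Because the source contains a subprocedure, it is assumed to be compiled for an ILE activation group (for example, CRTBNDRPG with DFTACTGRP(*NO)).

```rpgle
      * Prototype and constant: candidates for a /COPY member
     D CalcTax         PR             9  2
     D  Assessed                     11  2 CONST
     D TaxRate         C                   CONST(0.0125)
      *
     D TaxDue          S              9  2
      *
      * CONST lets the caller pass a literal or an expression
     C                   EVAL      TaxDue = CalcTax(100000)
     C                   EVAL      *INLR = *ON
      *
      * Hypothetical subprocedure: applies a flat rate
     P CalcTax         B
     D CalcTax         PI             9  2
     D  Assessed                     11  2 CONST
     C                   RETURN    Assessed * TaxRate
     P CalcTax         E
```

The compiler checks the call against the prototype, so passing the wrong number or type of parameters is caught at compile time instead of at runtime.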
6.3 Use IMPORT and EXPORT only for global data items. The IMPORT and EXPORT keywords let you share data among the procedures in a program without explicitly passing the data as parameters - in other words, they provide a "hidden interface" between procedures. Limit use of these keywords to data items that are truly global in the program - usually values that are set once and then never changed.
7.0 Character String Manipulation
IBM has greatly enhanced RPG IV's ability to easily manipulate character strings. Many of the tricks you had to use with earlier versions of RPG are now obsolete. Modernize your source by exploiting these new features.
7.1 Use a named constant to declare a string constant instead of storing it in an array or table. Declaring a string (such as a CL command string) as a named constant lets you refer to it directly instead of forcing you to refer to the string through its array name and index. Use a named constant to declare any value that you don't expect to change during program execution.
7.2 Avoid using arrays and data structures to manipulate character strings and text. Use the new string manipulation opcodes and/or built-in functions instead.
7.3 Use EVAL's free-format assignment expressions whenever possible for string manipulation. When used with character strings, EVAL is usually equivalent to a MOVEL(P) opcode. Use MOVE and MOVEL only when you don't want the result to be padded with blanks.
8.0 Avoid Obsolescence
RPG is an old language. After 30 years, many of its original, obsolete features are still available. Don't use them.
8.1 Don't sequence program line numbers in columns 1-5. Chances are you'll never again drop that deck of punched cards, so the program sequence area is unnecessary. In RPG IV, the columns are commentary only. You may use them to identify changed lines in a program or structured indentation levels, but be aware that these columns may be subject to the same hazards as right-hand comments (see style guideline 1.5).
8.2 Avoid program-described files. Instead, use externally defined files whenever possible.
8.3 If an opcode offers a free-format syntax, use it instead of the fixed-format version. Opcodes to avoid include CABxx, CASxx, CAT, DOUxx, DOWxx, IFxx, and WHxx.
8.4 If a BIF offers the same function as an opcode, use the BIF instead of the opcode. With some opcodes, you can substitute a built-in function for the opcode and use the function within an expression. At V4R1, the SCAN and SUBST opcodes have virtually equivalent built-in functions, %SCAN and %SUBST. In addition, you can usually substitute the concatenation operator (+) in combination with the %TRIMx BIFs in place of the CAT opcode. The free-format versions are preferable if they offer the same functionality as the opcodes.
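As a short sketch of this substitution, the fragment below pulls the library name out of a qualified object name with %SCAN and %SUBST, then builds a message with the + operator and %TRIMR instead of CAT. The field names and the sample value 'MYLIB/MYFILE' are hypothetical.

```rpgle
      * Hypothetical work fields
     D QualName        S             21    INZ('MYLIB/MYFILE')
     D LibName         S             10
     D Msg             S             30
     D SlashPos        S              3  0
      *
      * %SCAN returns 0 when the search string is not found
     C                   EVAL      SlashPos = %SCAN('/': QualName)
     C                   IF        SlashPos > 1
     C                   EVAL      LibName = %SUBST(QualName: 1: SlashPos - 1)
     C                   EVAL      Msg = %TRIMR(LibName) + ' found'
     C                   ENDIF
     C                   EVAL      *INLR = *ON
```

Because the BIFs are used inside expressions, no intermediate result fields or resulting indicators are needed.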
8.5 Shun obsolete opcodes. In addition to the opcodes mentioned earlier (style guidelines 5.2 and 5.3), some opcodes are no longer supported or have better alternatives.
8.5.1 CALL, CALLB. The prototyped calls (CALLP or a function call) are just as efficient as CALL and CALLB and offer the advantages of prototyping and parameter passing by value. Neither CALL nor CALLB can accept a return value from a procedure.
8.5.2 DEBUG. With OS/400's advanced debugging facilities, this opcode is no longer supported.
8.5.3 DSPLY. You should use display file I/O to display information or to acquire input.
8.5.4 FREE. This opcode is no longer supported.
8.5.5 PARM, PLIST. If you use prototyped calls, these opcodes are no longer necessary.
9.0 Miscellaneous Guidelines
Here's an assortment of other style guidelines that can help you improve your RPG IV code.
9.1 In all specifications that support keywords, observe a one-keyword-per-line limit. Your program will be easier to read, and you can more easily add or delete specifications, if you limit each line to one keyword, or at least to closely related keywords (e.g., DATFMT and TIMFMT), instead of spreading multiple keywords and values across the entire specification.
9.1.1 Begin all H-spec keywords in column 8, leaving column 7 blank. Separating the keyword from the required H in column 6 improves readability.
9.2 Relegate mysterious code to a well-documented, well-named procedure. Despite your best efforts, on extremely rare occasions you simply will not be able to make the meaning of a chunk of code clear without extensive comments. By separating such heavily documented, well-tested code into a procedure, you'll save future maintenance programmers the trouble of deciphering and dealing with the code unnecessarily.
10.0 Final Advice
Sometimes good style and efficient runtime performance don't mix. Wherever you face a conflict between the two, choose good style. Hard-to-read programs are hard to debug, hard to maintain, and hard to get right. Program correctness must always win out over speed. Keep in mind these admonitions from Brian Kernighan and P.J. Plauger's The Elements of Programming Style:
Bryan Meyers, CCP, is director of information services for KOA Kampgrounds of America and a senior technical editor for NEWS/400. You can contact him at bmeyers@wtp.net.
Style Online
In NEWS/400's RPG IV Style Forum (at http://www.news400.com
Placement of KLISTs and PLISTs
Bradley V. Stone: One thing I have hated from the get-go is the declaration of KLISTs and PARM lists at the top of the code. Put those babies in the initialization subroutine (*INZSR) - they belong out of the way and out of harm's way. They just get cumbersome at the top of the program.
Bryan Meyers: Before the advent of D-specs, I would have agreed with you wholeheartedly. Now, I think KLISTs and PLISTs fit better at the top of the C-specs, just under the D-specs, since they are, after all, declarations. An *INZSR is indeed very useful, but for setting initial states, not for declaring variables or structures.
Denis Robitaille: Currently, I put all KLISTs and PLISTs in the *INZSR, too, but I will switch as soon as we can put them in the D-specs. By the way, I also always code a $INZSR subroutine. It is like *INZSR except that it gets called at every invocation of the program (EXSR $INZSR is the first C-spec).
Barbara Morris: You can't define a key list in D-specs, but you can (and I think you should) define a parameter list in D-specs at V3R2 and above.
Mark Fields: I agree that the top of the C-specs is as good a place as any for these items. I just have personally always preferred to put them in an INIT subroutine. However, if the KLISTs would be the only things in there, I would just put them at the top of the C-specs. I guess the most important thing is just to group them in a logical place.
Naming Conventions
Denis Robitaille: Here are my preferences for naming conventions. Our subroutine names all start with the character $ followed by xxxxxxxxx, where xxxxxxxxx is a meaningful name. Some names are standard for all programs (e.g., $INZSR, which is like *INZSR except that it is called every time the program is called). For variable names, we have three categories:
If two variables are somewhat linked (e.g., one is a copy of the other), only the prefix differs. For example, we use f1CstNam, #CstNam, and @CstNam for the database field, display field, and work field, respectively. I do not like, or find useful, the prefixing of a variable with its type (e.g., int, str). The meaningful part of the name is enough for me. The special characters #, @, and $ do cause problems in other countries (I learned this the hard way).
Njål Fisketjøn: We try to use the following standard. For database fields, we use a unique, two-character prefix starting at a1 for files. Key lists are named starting with the file prefix (e.g., a1Key1, a1Key2). Work fields start with a lowercase w (e.g., wIndex, wName). If one uses subprocedures, there's never more than a handful of variables anyway, and they really don't need a special prefix, but old habits . . . . I don't like the idea of using the name to tell the type of the variable.
Anthony Newstead: Are there any standards concerning the naming conventions for modules, programs, binding directories, and so on? In RPG/400, my standards were that CL programs ended with C (e.g., RTC010AC), display files ended with D (e.g., RTC010AD), and RPG programs ended with R (e.g., RTC010AR). With RPG IV, I started off ending all module names with MD and the actual program with PGM - for example, RTC010PGM, which contains RTC010MD, RTC020MD, and so on. The binding directory became RTCBINDER.
Bryan Meyers: We didn't change naming standards too much when we started using RPG IV because we already did fairly modular programming (although not to the extent that it is possible today). We always name the program entry module the same as the program itself. We also always name the binding directory the same as the program itself (unless we have just one binding directory for an entire library). Our programs/modules are usually named xxxnnn, where xxx identifies the application and nnn numbers the program or module.
We tend to group programs or modules that are frequently called by other programs or modules close together in the numbering scheme (usually in the 900s). Report programs tend to be in the 700s, setup and file maintenance programs in the 100s to 300s, and transaction programs in the 400s to 600s. I'm not fond of using suffixes for the various components, although we do end display files with D (xxxnnnD) and printer files with P (xxxnnnP). We add numbers on to the end as necessary if there's more than one (xxxnnnP1 and so on).
Bradley V. Stone: I name the functions f.xxxxx where xxxxx is the module name. If the module is one that returns static values from a file (e.g., the item description), it is f. plus the file name. So, for ITEMPF the function is f.ITEM and all internal functions are # plus the field name. I also put these in QMODULESRC.
Barbara Morris: I'm curious about the use of numbers instead of something more meaningful to create module names (e.g., xxxnnn). It looks like nnn (three digits) is common. You can convey quite a bit of information with three letters - why not take advantage of this? Even if you used the first letter to indicate the basic function (e.g., P=print, S=setup), you'd still have a couple of letters to work with. Maybe this scheme was forced by the single-procedure modules you've lived with so long. If you really have 999 modules, I can see that it might be tricky to name them in a meaningful way with two or three letters. Once you start combining more procedures into one module, you might rethink your module-naming scheme.
Bryan Meyers: Back in the early 1970s, IBM's CMAS construction accounting package used the number scheme, and I just gravitated toward it. It's very comfortable and it serves the purpose of organizing the source. I am an advocate of using meaningful names (verb/subject for routines, returned value names for functions), but I haven't yet jettisoned the old scheme. To do it properly, one should develop a dictionary of abbreviated terms to use and build the names from that dictionary.
When Should Modules or Procedures Do Their Own Data Access?
Kevin Juenemann: To what degree should a module do its own data access, and to what degree should all the necessary data be passed in as parameters? I like to limit the complexity of the parameter list, so I sometimes pass in a parcel ID to a module, have the module read from the database to get all the data it needs about that parcel, and then pass back only the calculated tax amount. I assume it would be more efficient at runtime to pass all possible variables in the tax-calculation process to the module (or procedure) and have it do only the calculations. But from a maintenance perspective, if the tax calculation changes to encompass more variables, I would prefer not to have to change all the programs that call this module to read and pass the additional parameters. Any thoughts?
Bryan Meyers: Steve McConnell, in his book Code Complete, lists valid reasons to create a routine. Your architecture well fits many of these criteria:
It seems to me that a module should be as self-sufficient and complete as possible, without depending on lots of parameters or lots of prerequisite routines to do its job. The simpler the interface, the better.
Hans Boldt: Let me add a couple of points related to this thread. RPG IV keywords IMPORT and EXPORT let you define a "hidden" interface between modules. As a result, you should limit your use of these keywords to those data items that are global throughout the application. I'd also suggest that this global data be things like global attributes that are set once and never modified elsewhere.
The Power of Variable-Length Fields
Barbara Morris: I just thought of this when I read the style guide's plea to use built-in functions instead of arrays to handle strings. The new (with V4R2) varying-length character type can also simplify string-handling code, as well as make it more efficient. I recommend using varying-length fields as CONST or VALUE parameters to every string-handling subprocedure, as well as for string temporaries. (I recently worked with an RPG programmer here, and we reduced the number of lines in a subprocedure from about 10 icky lines to two simple and straightforward lines, mostly through the use of varying-length parameters.) Here's a little example; not only does it look better, but it's also faster (no %TRIMs).
Instead of
C                   EVAL      Name = %TRIMR(Lib) + '/' +
C                                    %TRIMR(File) + '(' +
C                                    %TRIMR(Mbr) + ')'
with variable-length fields, you can use
C                   EVAL      Name = Lib + '/' +
C                                    File + '(' +
C                                    Mbr + ')'
I thought I should check my claim that this approach is faster, and it turns out it's more than twice as fast to produce "Lib/File(Mbr)" where all variables (except "Name") are 10 characters long.