XTRAN Example — Command-Driven Text Processing Utility
Scenario — you have received a COBOL program listing file created by the Source Entry Utility (SEU) on an IBM iSeries computer, and you want to retrieve the original COBOL code. The problem is that, in creating the listing, SEU has:
- Inserted page headers every so many lines
- Prepended line numbers to each line of the code
- Suffixed update date information to the end of each code line
XTRAN to the rescue!
The following example uses an XTRAN rule set comprising 276 non-comment lines of "meta-code" (XTRAN's rules language) that implements ProcText, a command-driven text processing utility. The rules took 3¼ hours to design, create, and debug. (That's right, only 3¼ hours!)
ProcText supports the following text processing commands, which it reads from a specified command file:
Command syntax | Description |
---|---|
DC,<stcol>,[[<numcol>]],[[<dcrgx>]] | Delete columns from each line, optionally only if the line matches a regular expression |
DL,<numlin>,<dlrgx> | Delete lines, starting with a line that matches a regular expression |
TB,[[<tabsiz>]] | Tabify lines before writing them out, based on a tab size |
TR | Trim trailing spaces on each line, after all DC commands have been applied |
UT,[[<utbsiz>]] | Untabify lines before applying DC commands, based on a tab size |
where:
<dcrgx> | (optional) is a regular expression an input line must match for this DC command to be applied. ProcText first replaces each underscore in <dcrgx> with a space. |
<dlrgx> | is a regular expression an input line must match for this DL command to be applied; ProcText then deletes <numlin> lines starting with the matching line. ProcText first replaces each underscore in <dlrgx> with a space. |
<numcol> | (optional) is the number of columns for this DC command to delete, defaulting to the rest of the line. |
<numlin> | is the number of lines for this DL command to delete. |
<stcol> | is the 1-n starting column for this DC command. |
<tabsiz> | (optional) is the number of columns for each TAB, defaulting to 8. |
<utbsiz> | (optional) is the number of columns for each TAB, defaulting to 8. |
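To make the DC and TR semantics concrete, here is a minimal Python sketch of what they do to a single line. It is not XTRAN meta-code and not ProcText's actual implementation. The function names `delete_columns` and `trim_trailing` are hypothetical, Python's `re` module stands in for XTRAN's regular expression support, and we assume that a match anywhere in the line counts as a match.

```python
import re
from typing import Optional


def delete_columns(line: str, stcol: int, numcol: Optional[int] = None,
                   dcrgx: Optional[str] = None) -> str:
    """Sketch of DC: delete numcol columns starting at 1-based column stcol."""
    if dcrgx is not None:
        # Underscores in the pattern stand for spaces, as described above.
        pattern = dcrgx.replace("_", " ")
        # Assumption: an unanchored match anywhere in the line is enough.
        if re.search(pattern, line) is None:
            return line
    start = stcol - 1  # convert the 1-based column to a 0-based index
    if numcol is None:
        # No column count given: delete the rest of the line.
        return line[:start]
    return line[:start] + line[start + numcol:]


def trim_trailing(line: str) -> str:
    """Sketch of TR: trim trailing spaces (only spaces, by assumption)."""
    return line.rstrip(" ")
```

For instance, `delete_columns("12345678CODE", 1, 8)` returns `"CODE"`, and `delete_columns("abcdefghij", 4)` returns `"abc"`.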
You can specify multiple DC and DL commands, as needed.
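Reading such a command file is straightforward. The following Python sketch shows one way it could be done; it is an illustration only, not the XTRAN rules themselves. The name `parse_commands` is hypothetical. We assume that lines starting with `;` are comments (as in the command file used in this example), that blank lines are ignored, and that any commas beyond the expected number of fields belong to the regular expression.

```python
from typing import List, Tuple

# Maximum number of arguments each command accepts, per the syntax table.
MAX_ARGS = {"DC": 3, "DL": 2, "TB": 1, "TR": 0, "UT": 1}


def parse_commands(text: str) -> List[Tuple[str, List[str]]]:
    """Return (command name, raw argument list) pairs from command file text."""
    commands = []
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and lines starting with ';' are skipped.
        if not line or line.startswith(";"):
            continue
        name = line.split(",", 1)[0]
        if name not in MAX_ARGS:
            raise ValueError(f"unknown command: {line!r}")
        # Limit the split so commas inside a trailing regex stay intact.
        fields = line.split(",", MAX_ARGS[name])
        commands.append((name, fields[1:]))
    return commands
```

An omitted optional argument shows up as an empty string, so `DC,5,,FOO` parses to `("DC", ["5", "", "FOO"])`.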
In the example below, we use the following set of ProcText commands to clean up SEU listings:
    ; Delete SEU's listing headers:
    ;
    DL,4,SEU_SOURCE_LISTING
    ;
    ; Delete SEU's "end of source" line:
    ;
    DL,1,E_N_D__O_F__S_O_U_R_C_E
    ;
    ; Delete line numbers SEU added -- columns 1-8:
    ;
    DC,1,8
    ;
    ; Delete update info SEU added -- columns 88-end:
    ;
    DC,88
    ;
    ; Trim trailing blanks, as in the original code:
    ;
    TR
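The two DL commands do the heavy lifting of removing page headers and the end marker. As a rough illustration of that behavior (in Python, not XTRAN meta-code), the sketch below deletes `numlin` lines starting with each matching line. The name `delete_lines` is hypothetical, Python's `re` stands in for XTRAN's regular expressions, and we assume a match found inside a block that is already being deleted does not restart the count. The sample listing is abbreviated, and its spacing is chosen to fit the patterns.

```python
import re
from typing import Iterable, List


def delete_lines(lines: Iterable[str], numlin: int, dlrgx: str) -> List[str]:
    """Sketch of DL: drop numlin lines beginning at each line matching dlrgx."""
    # Underscores in the pattern stand for spaces.
    pattern = re.compile(dlrgx.replace("_", " "))
    kept = []
    skip = 0  # how many more lines of the current block to delete
    for line in lines:
        if skip == 0 and pattern.search(line):
            skip = numlin
        if skip > 0:
            skip -= 1
            continue
        kept.append(line)
    return kept


# Abbreviated, hypothetical listing: a 4-line page header, one code line,
# and the end-of-source marker.
listing = [
    "XXXXXXX XXXXXX 111111 SEU SOURCE LISTING 00/00/00 00:00:00 XXXXXXXX PAGE 1",
    "SOURCE FILE . . . . . . . XXXXXX/XXXXXXX",
    "MEMBER . . . . . . . . . XXXXXXX",
    "SEQNBR*...+... 1 ...+... 2",
    "100 PROCESS XXXXX 00/00/00",
    "* * * * E N D  O F  S O U R C E * * * *",
]
cleaned = delete_lines(listing, 4, "SEU_SOURCE_LISTING")
cleaned = delete_lines(cleaned, 1, "E_N_D__O_F__S_O_U_R_C_E")
print(cleaned)  # ['100 PROCESS XXXXX 00/00/00']
```

Because the header pattern matches on every page, a single DL command removes the headers from the whole listing.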
The sample input shown below is from actual iSeries COBOL code that has been obfuscated to protect confidentiality and shortened for demonstration purposes. After applying the ProcText commands shown above to the untouched SEU listing, the result was identical, character for character, to the original COBOL code from which the listing was made.
Because of ProcText's generality, you can use it to automate a broad range of text processing. For instance, you can use it to retabify text from one TAB size to another, by simply specifying a UT command and a TB command with different TAB sizes.
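The retabify idea can be sketched in a few lines of Python. This is an illustration of the technique, not ProcText's implementation. The names `untabify`, `tabify`, and `retabify` are hypothetical, and we assume that tabifying replaces any run of spaces ending at a tab stop with a single TAB, including a run of one space.

```python
def untabify(line: str, tabsize: int = 8) -> str:
    """Sketch of UT: expand each TAB to spaces up to the next tab stop."""
    return line.expandtabs(tabsize)


def tabify(line: str, tabsize: int = 8) -> str:
    """Sketch of TB: replace space runs ending at a tab stop with a TAB.

    Assumes the line contains no TABs (for example, it was untabified first).
    """
    out = []
    for i in range(0, len(line), tabsize):
        chunk = line[i:i + tabsize]
        stripped = chunk.rstrip(" ")
        # Only a full-width chunk ends exactly on a tab stop.
        if len(chunk) == tabsize and stripped != chunk:
            out.append(stripped + "\t")
        else:
            out.append(chunk)
    return "".join(out)


def retabify(line: str, old_tabsize: int, new_tabsize: int) -> str:
    """UT with the old TAB size, then TB with the new TAB size."""
    return tabify(untabify(line, old_tabsize), new_tabsize)
```

For example, `retabify("\tMOVE A TO B", 8, 4)` returns `"\t\tMOVE A TO B"`: one 8-column TAB becomes two 4-column TABs, and the text keeps its column position.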
How can such a powerful and versatile command-driven text processing utility be created in only 3¼ hours and only 276 lines of XTRAN rules? Because there is so much capability already available as part of XTRAN's rules language. These rules take advantage of the following functionality:
- Text file input and output
- Text manipulation
- Text formatting
- Regular expression matching
- Environment variable manipulation
- Content-addressable data bases
The input to and output from XTRAN are untouched.
Process Flowchart
Here is a flowchart for this process, in which the elements are color coded:
- BLUE for XTRAN versions (runnable programs)
- ORANGE for XTRAN rules (text files)
- PURPLE for text files
Input to XTRAN ProcText rules:
    XXXXXXX XXXXXX 111111 SEU SOURCE LISTING 00/00/00 00:00:00 XXXXXXXX PAGE 1
    SOURCE FILE . . . . . . . XXXXXX/XXXXXXX
    MEMBER . . . . . . . . . XXXXXXX
    SEQNBR*...+... 1 ...+... 2 ...+... 3 ...+... 4 ...+... 5 ...+... 6 ...+... 7 ...+... 8 ...+... 9 ...+... 0
    100 PROCESS XXXXX 00/00/00
    200 IDENTIFICATION DIVISION. 00/00/00
    300 PROGRAM-ID. XXXXXXX. 00/00/00
    400 AUTHOR. XXXXXXXXXXXXX. 10/00/00
    500 DATE-WRITTEN. 00/00/00. 00/00/00
    600 ********************************************************************** 10/00/00
    700 ENVIRONMENT DIVISION. 00/00/00
    800 CONFIGURATION SECTION. 00/00/00
    900 SOURCE-COMPUTER. IBM-AS400. 00/00/00
    1000 OBJECT-COMPUTER. IBM-AS400. 00/00/00
    1100 SPECIAL-NAMES. XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 00/00/00
    1200 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 10/00/00
    1300 XXXXXXXXXXXXXXXXXXXXXXXXXXXXX. 10/00/00
    1400 INPUT-OUTPUT SECTION. 00/00/00
    1500 FILE-CONTROL. 00/00/00
    1600 00/00/00
    1601 SELECT XXXXXXXXXXX 00/00/00
    1602 ASSIGN TO XXXXXXXXXXXXXXX 00/00/00
    1603 ORGANIZATION IS INDEXED 00/00/00
    1604 ACCESS IS DYNAMIC 00/00/00
    1605 RECORD KEY IS XXXXXXXXXXXXXXXXXXXXXXXX 00/00/00
    1606 FILE STATUS IS XXXXXXXXXXXXXX. 00/00/00
    1607 00/00/00
    XXXXXXX XXXXXX 111111 SEU SOURCE LISTING 00/00/00 00:00:00 XXXXXXXX PAGE 2
    SOURCE FILE . . . . . . . XXXXXX/XXXXXXX
    MEMBER . . . . . . . . . XXXXXXX
    SEQNBR*...+... 1 ...+... 2 ...+... 3 ...+... 4 ...+... 5 ...+... 6 ...+... 7 ...+... 8 ...+... 9 ...+... 0
    2300 DATA DIVISION. 00/00/00
    2400 FILE SECTION. 00/00/00
    2500 00/00/00
    2501 FD XXXXXXXXXXX 00/00/00
    2502 LABEL RECORDS ARE STANDARD. 00/00/00
    2503 01 XXXXXXXXXXX. 00/00/00
    2504 COPY XXXXXXXXXXXXXXX OF XXXXXX. 00/00/00
    2505 00/00/00
    3000 WORKING-STORAGE SECTION. 00/00/00
    3100 *--------------------------------------------------------------* 00/00/00
    3200 * Xxxxxxxxxxxxxxxxxxxxxxxxxxx * 00/00/00
    3300 *--------------------------------------------------------------* 00/00/00
    3400 01 XXXXXXXXXXXXXXXXXXX. 00/00/00
    3500 05 XXXXXXXXXXXXXX. 00/00/00
    3600 10 XXXXXXXXXXXXXXXX PIC X(00) VALUE 'XXXXXXX'. 00/00/00
    3700 10 XXXXXXXXXXXXXXXXXX PIC X(00) VALUE 'XXXXXX'. 00/00/00
    3800 88 XXXXXXXXXXXXXX VALUE 'XXXXX'. 00/00/00
    3900 88 XXXXXXXXXXXXXXX VALUE 'XXXXXX'. 00/00/00
    4000 10 XXXXXXXXXXXXXXXX PIC X(0000). 00/00/00
    4100 10 XXXXXXXXXXXXXXXX PIC X(0000). 00/00/00
    5200 ********************************************************************** 00/00/00
    10900 00/00/00
    * * * * E N D O F S O U R C E * * * *
Output from XTRAN:
    PROCESS XXXXX
    IDENTIFICATION DIVISION.
    PROGRAM-ID. XXXXXXX.
    AUTHOR. XXXXXXXXXXXXX.
    DATE-WRITTEN. 00/00/00.
    **********************************************************************
    ENVIRONMENT DIVISION.
    CONFIGURATION SECTION.
    SOURCE-COMPUTER. IBM-AS400.
    OBJECT-COMPUTER. IBM-AS400.
    SPECIAL-NAMES. XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXX.
    INPUT-OUTPUT SECTION.
    FILE-CONTROL.

    SELECT XXXXXXXXXXX
    ASSIGN TO XXXXXXXXXXXXXXX
    ORGANIZATION IS INDEXED
    ACCESS IS DYNAMIC
    RECORD KEY IS XXXXXXXXXXXXXXXXXXXXXXXX
    FILE STATUS IS XXXXXXXXXXXXXX.

    DATA DIVISION.
    FILE SECTION.

    FD XXXXXXXXXXX
    LABEL RECORDS ARE STANDARD.
    01 XXXXXXXXXXX.
    COPY XXXXXXXXXXXXXXX OF XXXXXX.

    WORKING-STORAGE SECTION.
    *--------------------------------------------------------------*
    * Xxxxxxxxxxxxxxxxxxxxxxxxxxx *
    *--------------------------------------------------------------*
    01 XXXXXXXXXXXXXXXXXXX.
    05 XXXXXXXXXXXXXX.
    10 XXXXXXXXXXXXXXXX PIC X(00) VALUE 'XXXXXXX'.
    10 XXXXXXXXXXXXXXXXXX PIC X(00) VALUE 'XXXXXX'.
    88 XXXXXXXXXXXXXX VALUE 'XXXXX'.
    88 XXXXXXXXXXXXXXX VALUE 'XXXXXX'.
    10 XXXXXXXXXXXXXXXX PIC X(0000).
    10 XXXXXXXXXXXXXXXX PIC X(0000).
    **********************************************************************