XTRAN Example — Restructure C, C++, or C#

The following example uses a rule set written in XTRAN's "meta-code" (rules language) that implements a set of code structuring algorithms to eliminate goto statements in C, C++, or C# by imposing if, else, and for statements.  The algorithms use recursive statement and expression pattern matching and replacement capabilities provided by XTRAN's meta-code, involving "match" statement patterns and "replace" statement patterns.

We have developed additional patterns and replacements to eliminate more gotos by imposing while and do statements as well.

The rules used for this example comprise 496 code lines of XTRAN's rules language.  They are not very specific to C/C++/C#; they can easily be adapted to work for any 3GL language.  In fact, we also have versions for COBOL and PL/I.

How can such powerful and generalized code quality improvement be achieved in less than 500 lines of rules?  Because there is so much capability already available as part of XTRAN's rules language.  The rules used for this example take advantage of the following functionality provided by that rules language:

In the patterns below, this indicates C / C++ / C# elements and <this> indicates meta (rules language) elements.  Also, the matching of each occurrence of <label> in each pattern is constrained by a condition — the matching label must have only one GOTO to it.

Structure with "if"

To impose ifs, we use the following "match" and "replace" patterns.

The first if "match" pattern has the form:

        if (<expr>)
            goto <label>;
        <stmts>
<label>:
        <stmt>

where <expr> matches any conditional expression, <label> matches any label, <stmts> match one or more statements containing no if at the top level and no labels, and <stmt> matches any statement.  Note that the second occurrence of <label> must be the same symbol as the first occurrence in order for the pattern to match.

The second if "match" pattern is the same as the first except that it allows if statements at the top level of <stmts>.

The if "replace" pattern has the form:

        if (!<expr>)
            <stmts>
<label>:
        <stmt>

where <expr>, <stmts>, <label>, and <stmt> are whatever matched them in the "match" patterns.

Structure with "if" and "else"

To impose if/elses, we use the following "match" and "replace" patterns.

The if/else "match" pattern has the form:

        if (<expr>)
            {
            <stmts1>
            goto <label>;
            }
        <stmts2>
<label>:
        <stmt>

The if/else "replace" pattern has the form:

        if (<expr>)
            <stmts1>
        else
            <stmts2>
<label>:
        <stmt>

where <expr>, <stmts1>, <stmts2>, <label>, and <stmt> are whatever matched them in the "match" pattern.

Structure with "for"

To impose fors, we use two pattern and replacement sets:  One for "up-counting" fors and one for "down-counting" fors.

"Up-counting" for

The "up-counting" for "match" pattern has the form:

        <index> = <init>;
<label>:
        <stmts>;
        <index> += <increm>;
        if (<index> < <limit>)
            goto <label>;

where <index> matches the index variable, <init> matches its initial value, <label> matches any label, <stmts> matches one or more statements with no labels, <increm> matches the index increment value, and <limit> matches the index limit value.  Note that the second occurrence of <label> must be the same symbol as the first occurrence in order for the pattern to match.

The "up-counting" for "replace" pattern has the form:

        for (<index> = <init>;
          <index> <= <limit>;
          <index> += <increm>)
            <stmts>;

where <index>, <init>, <limit>, <increm>, and <stmts> are whatever matched them in the "match" pattern.

"Down-counting" for

The "down-counting" for "match" pattern has the form:

        <index> = <init>;
<label>:
        <stmts>;
        <index> -= <decrem>;
        if (<index> >= <limit>)
            goto <label>;

where <index> matches the index variable, <init> matches its initial value, <label> matches any label, <stmts> matches one or more statements with no labels, <decrem> matches the index decrement value, and <limit> matches the index limit value.  Note that the second occurrence of <label> must be the same symbol as the first occurrence in order for the pattern to match.

The "down-counting" for "replace" pattern has the form:

        for (<index> = <init>;
          <index> > <limit>;
          <index> -= <decrem>)
            <stmts>;

where <index>, <init>, <limit>, <decrem>, and <stmts> are whatever matched them in the "match" pattern.

Strategy

For each "match" pattern, the meta-code repeatedly scans the code to be processed, using a facility built into XTRAN's rules language that visits each statement (recursively) once in each pass, looking for matches and replacing them with the corresponding "replace" patterns, until it can find no more matches to be replaced.  The meta-code also tells XTRAN to delete any label that has no references, at the end of each pass, so if the structuring deletes all references to a label, the label will disappear.  This has the effect that a successful match and replacement can trigger more structuring on the next pass.

The meta-code also:

The rules are not very specific to C / C++ / C#; they can easily be adapted for any language.  In fact, we also have versions for COBOL and PL/I.


Process Flowchart

Here is a flowchart for this process, in which the elements are color coded:

data flowchart

C / C++ / C# code before structuring:

        int i, j;                               /*int i, j;*/

/*
 * The following should structure to nested "if"s and an "else":
 */
lbl0:   if (i == 0 || i == 1)                   /*lbl0: if (i == 0 || i == 1)*/
            goto lbl2;                          /*goto lbl2;*/
        i = 1;                                  /*i = 1;*/
        if (i > 1)                              /*if (i > 1)*/
            goto lbl1;                          /*goto lbl1;*/
        i = 2;                                  /*i = 2;*/
lbl1:   i = 3;                                  /*lbl1: i = 3;*/
        goto lbl3;                              /*goto lbl3;*/
lbl2:   i = 4;                                  /*lbl2: i = 4;*/
lbl3:   i = 5;                                  /*lbl3: i = 5;*/
/*
 * The following should structure to an up-counting "for" with "i" as the
 * index:
 */
lbl4:   i = 1;                                  /*lbl4: i = 1;*/
lbl5:   j = i + 1;                              /*lbl5: j = i + 1;*/
        j = i + 2;                              /*j = i + 2;*/
        i += 3;                                 /*i += 3;*/
        if (i < 10)                             /*if (i < 10)*/
            goto lbl5;                          /*goto lbl5;*/
        j = i + 3;                              /*j = i + 3;*/
/*
 * The following should structure to a down-counting "for" with "i" as the
 * index:
 */
lbl6:   i = 10;                                 /*lbl6: i = 10;*/
lbl7:   j = i + 1;                              /*lbl7: j = i + 1;*/
        j = i + 2;                              /*j = i + 2;*/
        i = i - 2;                              /*i = i - 2;*/
        if (i >= 1)                             /*if (i >= 1)*/
            goto lbl7;                          /*goto lbl7;*/
        j = i + 3;                              /*j = i + 3;*/
/*
 * The reference in the following "goto" will prevent XTRAN from deleting lbl0:
 */
        goto lbl0;                              /*goto lbl0;*/


C / C++ / C# code after structuring by XTRAN:

        int i, j;                               /*int i, j;*/

/*
 * The following should structure to nested "if"s and an "else":
 */
lbl0:
        if (i != 0 && i != 1)                   /*(rvsd) lbl0: if (i == 0 || i == 1)*/
                                                /*goto lbl2;*/

            {
            i = 1;                              /*i = 1;*/
            if (i <= 1)                         /*(rvsd) if (i > 1)*/
                                                /*goto lbl1;*/

                i = 2;                          /*i = 2;*/
            i = 3;                              /*lbl1: i = 3;*/
            }
        else                                    /*goto lbl3;*/
            i = 4;                              /*lbl2: i = 4;*/
        i = 5;                                  /*lbl3: i = 5;*/
/*
 * The following should structure to an up-counting "for" with "i" as the
 * index:
 */
        for (i = 1; i < 10; i += 3)             /*lbl4: i = 1;*/
            {
            j = i + 1;                          /*lbl5: j = i + 1;*/
            j = i + 2;                          /*j = i + 2;*/
                                                /*i += 3;*/
                                                /*if (i < 10)*/
                                                /*goto lbl5;*/
            }
        j = i + 3;                              /*j = i + 3;*/
/*
 * The following should structure to a down-counting "for" with "i" as the
 * index:
 */
        for (i = 10; i >= 1; i -= 2)            /*lbl6: i = 10;*/
            {
            j = i + 1;                          /*lbl7: j = i + 1;*/
            j = i + 2;                          /*j = i + 2;*/
                                                /*i = i - 2;*/
                                                /*if (i >= 1)*/
                                                /*goto lbl7;*/
            }
        j = i + 3;                              /*j = i + 3;*/
/*
 * The reference in the following "goto" will prevent XTRAN from deleting lbl0:
 */
        goto lbl0;                              /*goto lbl0;*/