XTRAN Example — Analyze Year 2000 (Y2K) Impact on C, C++, Java, or C#

This example shows how XTRAN rules can analyze C, C++, Java, or C# code to identify symbols and contexts that may be Year 2000 (Y2K) problems.  The approach is not specific to those languages; similar XTRAN rules can easily be written for other computer languages, including assemblers, other 3GLs, and 4GLs.  Assembler and PL/M versions of this example are also available.

Of course, Y2K has come and gone, but this is an excellent example of XTRAN's analysis power, so we decided to keep it on our Web site.

This example comprises 391 non-comment lines of XTRAN's rules language, which we call "meta-code".

To accommodate duplicate symbol names that are local to different modules or scopes, or that occur in different namespaces, the XTRAN meta-code works with fully qualified symbols, as returned by a built-in XTRAN rules language primitive.  As the symbol files listed below show, full qualification includes the module, the procedure, and the name space, for example module(demy2k-a.c), proc(func1), space(vars): yr_no.

We specify, via environment variables, patterns in the form of (case-insensitive) regular expressions for the meta-code to use to identify possible Y2K problem symbols, as well as possible Y2K problems evidenced in the code's text strings and comments.  In each case, the meta-code allows us to specify one or more regular expressions, separated by commas.  Note that this makes the analysis independent of the human language involved.  For this example, we specified the following regular expressions:

Our meta-code reads text files (named via environment variables) containing symbols, both global and local to the module we are analyzing, that are known to be ("yes"), or not to be ("no"), Y2K problems.  The meta-code excludes the latter from its analysis.  These files usually start out empty and are then populated with symbols through the review process, as described below.

Our meta-code creates a similar text file ("maybe") in which it puts symbols it finds in the subject module that may be Y2K problems, along with a reason for each one.  It does not, however, add to this file any symbols that have already been specified as known Y2K problems.  Reasons for a symbol's inclusion in the module's "maybe" file are as follows:

Our meta-code also reports, as analysis output, the following:

Each of these occurrences is noted with the source file name and line number, in a form suitable for the "next error" features of many text editors such as VI and Emacs.  This allows us to conveniently visit each occurrence in the source code during review.

Note that the analysis report for a given module is only complete and accurate when the run produces no "maybes" for the module.  Prior reports for that module should be ignored.



Analysis Sequence of Events

The typical sequence of events in performing this XTRAN analysis on a body of C, C++, Java, or C# code is as follows:

  1. Run XTRAN on each C, C++, Java, or C# module with the meta-code, which will read files containing known problem ("yes") and known non-problem ("no") symbols, both global and local to this module (there will be none of the latter on the first run).  The run will create a "maybe" (.msm) symbol file for the module.  It will also create a .y2k analysis report for the module.  If the resulting "maybe" file has symbols in it, disregard the analysis report, since it didn't take the "maybe" symbols into account.
  2. After running each module, if its "maybe" file has symbols in it, review them, deciding in the case of each symbol whether or not it is a Y2K problem.  If it is, add its line to the appropriate "yes" symbol file (global or module-specific).  If it isn't, add its line to the appropriate "no" symbol file (global or module-specific).  Go to step 1.
  3. When running the module with the meta-code produces no "maybe" symbols, the module's analysis report is complete and accurate.  We can then use the (possibly updated) global "yes" and "no" files for the analysis of subsequent modules.
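
The three steps above amount to a simple loop.  The following is a minimal Python sketch of that loop, not XTRAN meta-code.  The names review_until_clean, analyze, review, and toy_analyze are hypothetical: analyze stands in for one XTRAN run, review stands in for the human decision in step 2, and max_runs is a safety limit added only for illustration.

```python
def review_until_clean(analyze, review, yes, no, max_runs=10):
    """Repeat the analysis until a run produces no "maybe" symbols.

    analyze(yes, no) returns (maybes, report); it stands in for one XTRAN run.
    review(symbol, reason) returns True if a reviewer judges the symbol a problem.
    yes and no map symbols to reasons and are updated in place.
    """
    for run in range(1, max_runs + 1):
        maybes, report = analyze(yes, no)             # step 1
        if not maybes:
            return run, report                        # step 3: report is now usable
        for symbol, reason in maybes.items():         # step 2
            target = yes if review(symbol, reason) else no
            target[symbol] = reason
    raise RuntimeError("analysis still produced maybes after max_runs runs")


def toy_analyze(yes, no):
    """Toy stand-in for an XTRAN run; it knows about three symbols only."""
    known = set(yes) | set(no)
    maybes = {s: "matched pattern" for s in ("yr_num", "yr_no") if s not in known}
    # "i = yr_num" makes i a candidate only once yr_num is a known problem.
    if "yr_num" in yes and "i" not in known:
        maybes["i"] = 'assigned to in "i = yr_num"'
    return maybes, sorted(yes)


if __name__ == "__main__":
    yes, no = {}, {}
    runs, report = review_until_clean(
        toy_analyze, lambda symbol, reason: symbol == "yr_num", yes, no)
    print(runs, report, sorted(no))
```

With the toy analyzer, the second run finds a new "maybe" only because the first review marked yr_num as a problem, which is why the report from any run that still produces "maybes" is disregarded.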

When all modules have been analyzed, we have the following:



Environment Variables and Files

Our XTRAN meta-code retrieves the values of the following environment variables, and assumes the indicated values:

 Environment variable  Description of value
 Y2K_C_GBLNAM  Name of files containing global symbols that are known to be, or not to be, Y2K problems.  Must be a legal file name. File types .ysm and .nsm added.
 Y2K_C_MODNAM  Name of module we are analyzing.  Must be a legal file name. File types .ysm, .nsm, .msm, and .y2k added.
 Y2K_C_SYMPAT  Symbol name patterns (regular expressions) that mean "maybe", separated by commas (no spaces or TABs).
 Y2K_C_TXTPAT  Text patterns (regular expressions) in text literals to report, separated by commas (no spaces or TABs).
 Y2K_C_COMPAT  Text patterns (regular expressions) in comments to report, separated by commas (no spaces or TABs).
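
To illustrate how a comma-separated pattern variable such as Y2K_C_SYMPAT might be consumed, here is a small Python sketch; it is not the meta-code itself.  The function names load_patterns and first_match are hypothetical, and the second pattern in the demonstration ("cent") is invented.  The sketch assumes that a pattern may match anywhere within a name, which is consistent with "y[ae]*r" matching p_year_txt in the listings, and that no pattern contains a comma of its own.

```python
import os
import re


def load_patterns(var_name, environ=os.environ):
    """Return compiled, case-insensitive regexes from a comma-separated variable."""
    raw = environ.get(var_name, "")
    # Patterns are separated by commas with no spaces or TABs, so a plain
    # split is enough here; empty pieces are skipped.
    return [re.compile(piece, re.IGNORECASE) for piece in raw.split(",") if piece]


def first_match(name, patterns):
    """Return the source text of the first pattern found in name, else None."""
    for pattern in patterns:
        if pattern.search(name):
            return pattern.pattern
    return None


if __name__ == "__main__":
    env = {"Y2K_C_SYMPAT": "y[ae]*r,cent"}   # "cent" is an invented second pattern
    patterns = load_patterns("Y2K_C_SYMPAT", env)
    for symbol in ["p_year_txt", "yr_num", "yr_no", "p_txt", "base"]:
        print(symbol, first_match(symbol, patterns))
```

Returning the pattern text rather than a plain true/false makes it easy to record which pattern caused a symbol to be flagged, as the "maybe" file explanations do.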

In addition to reading in the "yes" and "no" files named by Y2K_C_GBLNAM, which must exist (but can be empty), our meta-code also reads files named <mod>.ysm and <mod>.nsm if they exist, where <mod> is the module named by Y2K_C_MODNAM.  These files contain symbols local to the module we're analyzing that are known to be ("yes"), or not to be ("no"), Y2K problems.  They are typically produced by review of the "maybe" file produced by previous analysis of this module.

Our meta-code produces a file named <mod>.msm ("maybe") containing symbols occurring in the named module that may be Y2K problems (if any).

Our meta-code also produces a file named <mod>.y2k containing analysis output for the named module.

Each line of the "yes", "no", and "maybe" symbol files contains a fully qualified symbol, optionally followed by ? and an explanation of why we are interested in the symbol (the meta-code always adds such an explanation).  Exception: Our meta-code regards any line starting with ; in these files as a comment, and ignores it.
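
The symbol file format just described is simple enough to sketch a reader for it.  The Python below is an illustration under stated assumptions, not the meta-code: parse_symbol_lines is a hypothetical name, each entry is assumed to occupy a single line, and a fully qualified symbol is assumed never to contain a ? character.

```python
def parse_symbol_lines(lines):
    """Map each fully qualified symbol to its explanation ('' if none is given)."""
    symbols = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith(";"):
            continue                      # skip blank lines and ; comments
        # Split at the first "?"; anything after it is the explanation.
        symbol, _, reason = line.partition("?")
        symbols[symbol.strip()] = reason.strip()
    return symbols


if __name__ == "__main__":
    sample = [
        "; demy2k-a.nsm\n",
        ";\n",
        'module(demy2k-a.c), proc(func1), space(vars): yr_no?Matched pattern "y[ae]*r"\n',
        "space(vars): base\n",            # explanation omitted, which the format allows
    ]
    for symbol, reason in parse_symbol_lines(sample).items():
        print(repr(symbol), repr(reason))
```

Splitting at the first ? means an explanation may itself contain ? characters, which matters because explanations can quote regular expressions.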



Rules

XTRAN's rules language, "meta-code", is proprietary and requires a Nondisclosure Agreement.  However, the following is an English paraphrase of the meta-code used in this analysis.  It is functionally equivalent to the actual meta-code used.  NOTE that these rules are COPYRIGHT 2001-2018 by XTRAN, LLC and may not be copied or used in any way without our permission.

Read global "yes" and "no" symbol files
Read module-specific "yes" and "no" symbol files if they exist
Create an empty "maybe" symbol file for this module
Create an analysis report file for this module
For each C symbol occurring in the module
    If it matches any of the given symbol regular expressions
        If it's not already in the "yes", "no", or "maybe" symbol lists
            Add it to "maybe" list, with explanation
For each C statement in the module (recursively)
    For each expression in the statement (recursively)
        If this is a printf() or scanf() type function call
            For each 2-digit integer format spec in its format string argument
                If the corresponding argument is a symbol
                    If it's not already in "yes", "no", or "maybe" lists
                        Add it to "maybe" list, with explanation
        Else if this is a text string
            If it matches any of the given text string regular expressions
                Report it to module's report file, with file name & line
        Else if this is a symbol
            If it is a known Y2K problem (in "yes" list)
                Report its occurrence to report file, with file name & line
                For each higher-level expression the symbol is in
                    If higher-level expression is assignment or comparison
                        For each term of higher-level expression (recursively)
                            If term is assignment or comparison
                                Ignore it; it's already been done on the way up
                            Else if term is a symbol
                                If symbol not in "yes", "no", "maybe" lists
                                    Add symbol to "maybe" list, with
                                      explanation
    For each of statement's comments
        If it matches any of the given comment regular expressions
            Report it to module's report file, with file name & line
Close report file
Write out "maybes" to module's "maybe" file
Close "maybe" file
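
As an illustration of the printf() step in the rules above, the following Python sketch finds which arguments of a printf-style call correspond to 2-digit integer format specs.  It is not XTRAN meta-code.  The names SPEC and two_digit_int_args are hypothetical, and we assume that "2-digit" means an integer conversion with a field width of exactly 2 (such as %2d or %02d); the sketch follows printf() conventions only and does not handle the scanf() meaning of *.

```python
import re

# One printf-style conversion: %[flags][width][.precision][length]conversion
SPEC = re.compile(
    r"%([-+ #0]*)(\d+|\*)?(?:\.(\d+|\*))?(hh|h|ll|l|j|z|t|L)?([diouxXeEfFgGaAcspn%])")


def two_digit_int_args(fmt):
    """Return 0-based indexes, counted among the arguments that follow the
    format string, of integer conversions whose field width is exactly 2."""
    hits = []
    arg_index = 0
    for match in SPEC.finditer(fmt):
        _flags, width, precision, _length, conv = match.groups()
        if conv == "%":
            continue                      # "%%" consumes no argument
        # A "*" width or precision consumes an int argument of its own.
        if width == "*":
            arg_index += 1
        if precision == "*":
            arg_index += 1
        if conv in "diouxX" and width == "2":
            hits.append(arg_index)
        arg_index += 1
    return hits


if __name__ == "__main__":
    # Format strings from lines 17 and 19 of demy2k-a.c
    print(two_digit_int_args("19%02d\n"))
    print(two_digit_int_args("Year from text:  %s\n"))
```

For the call on line 17 of demy2k-a.c, index 0 is the first argument after the format string, which is j; a rule like the one above would then check whether that argument is a symbol.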


Questions?

For any questions or comments concerning this example, please contact XTRAN, LLC.



Example Procedure

For this example, we ran the analysis three times, until it produced no "maybe" symbols.  After the first and second runs, we reviewed the "maybes" and moved them to the appropriate "yes" and "no" symbol files.



Input and Output Files, by Run

The following files were input to, or generated by, the XTRAN meta-code in the course of this example.  See listings below for their contents if not empty.

 File name   I/O?   Description 
 Run 1 
 demy2k-a.c  I  Code to be analyzed 
 y2k-gbl.nsm  I  Global symbol "no" file — empty 
 y2k-gbl.ysm  I  Global symbol "yes" file — empty 
 demy2k-a.msm  O  "Maybe" file 
 demy2k-a.y2k  O  Analysis report; ignored
 Run 2 
 demy2k-a.c  I  Code to be analyzed 
 y2k-gbl.nsm  I  Global symbol "no" file — still empty 
 y2k-gbl.ysm  I  Global symbol "yes" file — still empty 
 demy2k-a.nsm  I  Local symbol "no" file — edited from Run 1 "maybe" file 
 demy2k-a.ysm  I  Local symbol "yes" file — edited from Run 1 "maybe" file 
 demy2k-a.msm  O  Updated "maybe" file 
 demy2k-a.y2k  O  Analysis report; ignored 
 Run 3 
 demy2k-a.c  I  Code to be analyzed 
 y2k-gbl.nsm  I  Global symbol "no" file — still empty 
 y2k-gbl.ysm  I  Global symbol "yes" file — edited from Run 2 "maybe" file 
 demy2k-a.nsm  I  Local symbol "no" file — updated from Run 2 "maybe" file 
 demy2k-a.ysm  I  Local symbol "yes" file — updated from Run 2 "maybe" file 
 demy2k-a.msm  O  Updated  "maybe" file — empty! 
 demy2k-a.y2k  O  Analysis report — good! 



Final Results

The file y2k-gbl.ysm documents all global symbols so far that are known to be Y2K problems.

The file y2k-gbl.nsm documents all global symbols so far that are known not to be Y2K problems.

The file demy2k-a.ysm documents all symbols local to demy2k-a.c that are known to be Y2K problems.

The file demy2k-a.nsm documents all symbols local to demy2k-a.c that are known not to be Y2K problems.

The file demy2k-a.y2k documents all occurrences in demy2k-a.c of:


Process Flowchart

Here is a flowchart for this process, in which the elements are color coded:

[Figure: process flowchart]

File listings

demy2k-a.c — C code to be analyzed; line numbers added for reference

 1 #include <stdio.h>                    /*define Posix I/O*/
 2
 3        extern short base;             /*base year (2 digits)*/
 4
 5 void func1(
 6    char *p_year_txt,                  /*1st-rnd "maybe", 2nd-rnd "yes"*/
 7    short yr_num,                      /*1st-rnd "maybe", 2nd-rnd "yes"*/
 8    short yr_no)                       /*1st-rnd "maybe", 2nd-rnd "no"*/
 9 {
10      char *p_txt;                     /*text pointer*/
11      short i, j, k, m;                /*util vars*/
12
13      i = yr_num;                      /*makes i a 2nd-rnd "maybe"*/
14      p_txt = p_year_txt;              /*makes p_txt a 2nd-rnd "maybe"*/
15      if (yr_num > base)               /*makes base a 2nd-rnd "maybe"*/
16          {
17          fprintf(stdout, "19%02d\n",
18            j);                        /*"%02d" makes j a 1st-rnd "maybe"*/
19          printf("Year from text:  %s\n",
20            p_txt);                    /*show year from text*/
21          k = j;                       /*asnmt makes k a 2nd-rnd "maybe"*/
22          }
23      m = yr_no;                       /*asnmt, but from 2nd-rnd "no"*/
24      return;
25 }

demy2k-a.msm — "Maybe" file produced by run 1

; demy2k-a.msm — Symbols that may be Year 2000 problems
; Created 1997-07-14.0845
;
module(demy2k-a.c), proc(func1), space(vars): p_year_txt?Matched pattern "y[ae]*r"
module(demy2k-a.c), proc(func1), space(vars): yr_no?Matched pattern "y[ae]*r"
module(demy2k-a.c), proc(func1), space(vars): yr_num?Matched pattern "y[ae]*r"
module(demy2k-a.c), proc(func1), space(vars): j?Matched 2-digit integer format spec at
  demy2k-a.c(17)

demy2k-a.y2k — Analysis report produced by run 1; ignored

demy2k-a.y2k — Year 2000 analysis of demy2k-a.c
Created 1997-07-14.0845

demy2k-a.c(3):  Comment may be about year:  "base year (2 digits)"
demy2k-a.c(17):  Text string may be about year:  "19%02d"
demy2k-a.c(19):  Text string may be about year:  "Year from text:  %s"
demy2k-a.c(19):  Comment may be about year:  "show year from text"

demy2k-a.nsm — Local symbol "no" file input to run 2, from run 1's demy2k-a.msm

; demy2k-a.nsm — Symbols in demy2k-a.c that aren't Y2K problems
; Created 1997-07-14.0841
;
module(demy2k-a.c), proc(func1), space(vars): yr_no?Matched pattern "y[ae]*r"

demy2k-a.ysm — Local symbol "yes" file input to run 2, from run 1's demy2k-a.msm

; demy2k-a.ysm — Symbols in demy2k-a.c that are Y2K problems
; Created 1997-07-14.0841
;
module(demy2k-a.c), proc(func1), space(vars): p_year_txt?Matched pattern "y[ae]*r"
module(demy2k-a.c), proc(func1), space(vars): yr_num?Matched pattern "y[ae]*r"
module(demy2k-a.c), proc(func1), space(vars): j?Matched 2-digit integer format spec
  at demy2k-a.c(17)

demy2k-a.msm — "Maybe" file produced by run 2

; demy2k-a.msm — Symbols that may be Year 2000 problems
; Created 1997-07-14.1039
;
module(demy2k-a.c), proc(func1), space(vars): i?assigned to in "i = yr_num" at
  demy2k-a.c(13)
module(demy2k-a.c), proc(func1), space(vars): p_txt?assigned to in "p_txt =
  p_year_txt" at demy2k-a.c(14)
space(vars): base?compared in "yr_num > base" at demy2k-a.c(15)
module(demy2k-a.c), proc(func1), space(vars): k?assigned to in "k = j" at
  demy2k-a.c(21)

demy2k-a.y2k — Analysis report produced by run 2; ignored

demy2k-a.y2k — Year 2000 analysis of demy2k-a.c
Created 1997-07-14.1039

demy2k-a.c(3):  Comment may be about year:  "base year (2 digits)"
demy2k-a.c(13):  Problem symbol "yr_num" occurred
demy2k-a.c(14):  Problem symbol "p_year_txt" occurred
demy2k-a.c(15):  Problem symbol "yr_num" occurred
demy2k-a.c(17):  Text string may be about year:  "19%02d"
demy2k-a.c(17):  Problem symbol "j" occurred
demy2k-a.c(19):  Text string may be about year:  "Year from text:  %s"
demy2k-a.c(19):  Comment may be about year:  "show year from text"
demy2k-a.c(21):  Problem symbol "j" occurred

y2k-gbl.ysm — Global symbol "yes" file input to run 3; updated from run 2's demy2k-a.msm

; y2k-gbl.ysm — Global "yes" symbols for analysis of demy2k-a.c
; Revised 1997-07-14.1050 by S. F. Heffner (VMS on Mozart @ XTRAN, LLC)
; Written 1997-07-14 by S. F. Heffner (VMS on Mozart @ XTRAN, LLC)
;
; This file was originally empty.  The current entries were transferred here
; after review of the "maybe" file produced by analysis of demy2k-a.c.
;
space(vars): base?compared in "yr_num > base" at demy2k-a.c(15)

demy2k-a.nsm — Local symbol "no" file input to run 3; updated from run 2's demy2k-a.msm

; demy2k-a.nsm — Symbols in demy2k-a.c that aren't Y2K problems
; Created 1997-07-14.0841
; Revised 1997-07-14.1044 by S. F. Heffner (VMS on Mozart @ XTRAN, LLC)
;
module(demy2k-a.c), proc(func1), space(vars): yr_no?Matched pattern "y[ae]*r"
module(demy2k-a.c), proc(func1), space(vars): i?assigned to in "i = yr_num" at
  demy2k-a.c(13)

demy2k-a.ysm — Local symbol "yes" file input to run 3; updated from run 2's demy2k-a.msm

; demy2k-a.ysm — Symbols in demy2k-a.c that are Y2K problems
; Revised 1997-07-14.1050 by S. F. Heffner (VMS on Mozart @ XTRAN, LLC)
; Created 1997-07-14.0841
;
module(demy2k-a.c), proc(func1), space(vars): p_year_txt?Matched pattern "y[ae]*r"
module(demy2k-a.c), proc(func1), space(vars): yr_num?Matched pattern "y[ae]*r"
module(demy2k-a.c), proc(func1), space(vars): j?Matched 2-digit integer format spec at
  demy2k-a.c(17)
module(demy2k-a.c), proc(func1), space(vars): p_txt?assigned to in "p_txt = p_year_txt" at
  demy2k-a.c(14)
module(demy2k-a.c), proc(func1), space(vars): k?assigned to in "k = j" at
  demy2k-a.c(21)

demy2k-a.y2k — Analysis report produced by run 3; good!

demy2k-a.y2k — Year 2000 analysis of demy2k-a.c
Created 1997-07-14.1045

demy2k-a.c(3):  Comment may be about year:  "base year (2 digits)"
demy2k-a.c(13):  Problem symbol "yr_num" occurred
demy2k-a.c(14):  Problem symbol "p_txt" occurred
demy2k-a.c(14):  Problem symbol "p_year_txt" occurred
demy2k-a.c(15):  Problem symbol "yr_num" occurred
demy2k-a.c(15):  Problem symbol "base" occurred
demy2k-a.c(17):  Text string may be about year:  "19%02d"
demy2k-a.c(17):  Problem symbol "j" occurred
demy2k-a.c(19):  Text string may be about year:  "Year from text:  %s"
demy2k-a.c(19):  Problem symbol "p_txt" occurred
demy2k-a.c(19):  Comment may be about year:  "show year from text"
demy2k-a.c(21):  Problem symbol "k" occurred
demy2k-a.c(21):  Problem symbol "j" occurred