XTRAN Example — Analyze Year 2000 (Y2K) Impact on C, C++, Java, or C#
This example shows how XTRAN rules can analyze C, C++, Java, or C# code to identify symbols and contexts that may be Year 2000 (Y2K) problems. The approach is not specific to those languages; similar XTRAN rules can easily be written for other computer languages, including assemblers, other 3GLs, and 4GLs. Assembler and PL/M versions of this example are also available.
Of course, Y2K has come and gone, but this is an excellent example of XTRAN's analysis power, so we decided to keep it on our Web site.
This example comprises 391 non-comment lines of XTRAN's rules language, which we call "meta-code".
In order to accommodate the occurrence of duplicate symbol names that are local to different modules or scopes, or occur in different namespaces, the XTRAN meta-code is set up to deal with fully qualified symbols, as returned by a built-in XTRAN rules language primitive. Full qualification includes:
- The module file name, if the symbol is not global.
- All levels of lexical scope, such as function, {} block, etc.
- The symbol's namespace (variables, labels, tags, etc.).
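As an illustration only (this is Python, not XTRAN meta-code), full qualification can be sketched as a small record whose text form follows the symbol lines shown in the file listings later on this page. The class name `QualifiedSymbol`, its field names, and the way nested scopes are joined are assumptions made for this sketch; the listings only show a single `proc(...)` scope level.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class QualifiedSymbol:
    """Hypothetical record for a fully qualified symbol."""
    name: str                       # bare identifier, e.g. "yr_num"
    space: str = "vars"             # namespace: variables, labels, tags, ...
    module: Optional[str] = None    # None means the symbol is global
    scopes: Tuple[str, ...] = ()    # lexical scopes, outermost first

    def render(self) -> str:
        """Text form modeled on the symbol lines in the listings."""
        parts = []
        if self.module is not None:
            parts.append(f"module({self.module})")
        parts.extend(self.scopes)
        parts.append(f"space({self.space})")
        return ", ".join(parts) + ": " + self.name


local = QualifiedSymbol("yr_num", module="demy2k-a.c", scopes=("proc(func1)",))
glob = QualifiedSymbol("base")
print(local.render())
print(glob.render())
```

Because the record is frozen, and therefore hashable, two symbols with the same bare name in different modules or scopes remain distinct members of a set or keys of a dictionary, which is the point of qualifying them.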
We specify, via environment variables, patterns in the form of (case-insensitive) regular expressions for the meta-code to use to identify possible Y2K problem symbols, as well as possible Y2K problems evidenced in the code's text strings and comments. In each case, the meta-code allows us to specify one or more regular expressions, separated by commas. Note that this makes the analysis independent of the human language involved. For this example, we specified the following regular expressions:
- For matching symbol names: y[ae]*r
- For searching text strings: y[ae]*r,19
- For searching comments: y[ae]*r,19
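To make the pattern handling concrete, here is a minimal Python sketch (not XTRAN meta-code) that splits a comma-separated list and searches case-insensitively. It assumes that a pattern may match anywhere in the text, which is consistent with p_year_txt matching in the listings below, and that Python's `re` dialect is close enough to XTRAN's for these simple patterns. The function names are hypothetical.

```python
import re


def compile_patterns(spec):
    """Split a comma-separated pattern list; compile each part case-insensitively."""
    return [re.compile(part, re.IGNORECASE) for part in spec.split(",") if part]


def first_match(patterns, text):
    """Return the source of the first pattern found anywhere in text, else None."""
    for pattern in patterns:
        if pattern.search(text):
            return pattern.pattern
    return None


symbol_patterns = compile_patterns("y[ae]*r")
text_patterns = compile_patterns("y[ae]*r,19")

for name in ("p_year_txt", "yr_num", "yr_no", "p_txt", "base"):
    print(name, first_match(symbol_patterns, name))
print(first_match(text_patterns, "19%02d\n"))
print(first_match(text_patterns, "Year from text: %s\n"))
```

Splitting naively on commas means a single pattern cannot itself contain a comma (for example a `{1,2}` repeat count). This sketch accepts that limitation.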
Our meta-code reads text files (named via environment variables) containing symbols, both global and local to the module we are analyzing, that are known to be ("yes"), or not to be ("no"), Y2K problems. The meta-code excludes the latter from its analysis. These files usually start out empty and are then populated with symbols through the review process, as described below.
Our meta-code creates a similar text file ("maybe") in which it puts symbols it finds in the subject module that may be Y2K problems, along with a reason for each one. It does not, however, add to this file any symbols that have already been specified as known Y2K problems. Reasons for a symbol's inclusion in the module's "maybe" file are as follows:
- The symbol matches one of the symbol name patterns specified.
- The symbol's value is assigned to a known problem symbol.
- The symbol has a value assigned to it from a known problem symbol.
- The symbol is compared to a known problem symbol.
- The symbol is the input or output variable for a 2-digit format specification in a printf()/scanf() type function call.
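The last reason in the list above can be illustrated with a short Python sketch (not XTRAN meta-code) that scans a printf()-style format string and reports which arguments are formatted as 2-digit integers. It assumes that "2-digit" means an integer conversion with a field width of exactly 2, as in `%02d`; it covers printf()-style strings only, since scanf()'s `*` has a different meaning. The names `SPEC` and `two_digit_int_args` are hypothetical.

```python
import re

# One printf-style conversion: flags, width, precision, length modifier, conversion.
SPEC = re.compile(
    r"%([-+ #0]*)(\d+|\*)?(?:\.(\d+|\*))?(hh|h|ll|l|j|z|t|L)?([diouxXeEfFgGaAcspn%])"
)


def two_digit_int_args(fmt):
    """Return 0-based indexes (after the format string) of 2-digit integer arguments."""
    hits = []
    arg = 0
    for match in SPEC.finditer(fmt):
        _flags, width, precision, _length, conv = match.groups()
        if conv == "%":
            continue            # "%%" is a literal percent sign; no argument used
        if width == "*":
            arg += 1            # a "*" width takes its value from an extra argument
        if precision == "*":
            arg += 1            # likewise for a "*" precision
        if conv in "diou" and width == "2":
            hits.append(arg)    # assumption: width 2 is what "2-digit" means
        arg += 1
    return hits


print(two_digit_int_args("19%02d\n"))
print(two_digit_int_args("Year from text: %s\n"))
print(two_digit_int_args("%d/%02d/%02d"))
```

For the call `fprintf(stdout, "19%02d\n", j)` in the demonstration code, index 0 identifies `j`, which is why `j` is flagged in the first run.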
Our meta-code also reports, as analysis output, the following:
- Every occurrence in the code of each symbol known to be a Y2K problem.
- Every occurrence of any of the specified text patterns (case-insensitive) in text strings in the module.
- Every occurrence of any of the specified text patterns (case-insensitive) in the first 50 characters of any comment in the module.
Each of these occurrences is noted with the source file name and line number, in a form suitable for the "next error" features of many text editors such as VI and Emacs. This allows us to conveniently visit each occurrence in the source code during review.
Note that the analysis report for a given module is only complete and accurate when the run produces no "maybes" for the module. Prior reports for that module should be ignored.
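The reporting described above can be sketched in Python (not XTRAN meta-code) as follows. The `file(line): message` layout and the message wording are copied from the report listings later on this page; the function names, and the choice to quote the whole comment rather than only its first 50 characters, are assumptions.

```python
import re

COMMENT_PREFIX = 50  # only this many leading characters of a comment are searched


def report_line(path, line, message):
    """Format one occurrence as 'file(line): message'."""
    return f"{path}({line}): {message}"


def comment_report(path, line, comment, patterns):
    """Return a report line if any pattern occurs in the comment's first 50 characters."""
    head = comment[:COMMENT_PREFIX]
    for pattern in patterns:
        if re.search(pattern, head, re.IGNORECASE):
            return report_line(path, line, f'Comment may be about year: "{comment}"')
    return None


patterns = ["y[ae]*r", "19"]
print(comment_report("demy2k-a.c", 3, "base year (2 digits)", patterns))
print(comment_report("demy2k-a.c", 10, "text pointer", patterns))
```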
Analysis Sequence of Events
The typical sequence of events in performing this XTRAN analysis on a body of C, C++, Java, or C# code is as follows:
1. Run XTRAN on each C, C++, Java, or C# module with the meta-code, which will read files containing known problem ("yes") and known non-problem ("no") symbols, both global and local to this module (there will be none of the latter on the first run). The run will create a "maybe" (.msm) symbol file for the module. It will also create a .y2k analysis report for the module. If the resulting "maybe" file has symbols in it, disregard the analysis report, since it didn't take the "maybe" symbols into account.
2. After running each module, if its "maybe" file has symbols in it, review them, deciding in the case of each symbol whether or not it is a Y2K problem. If it is, add its line to the appropriate "yes" symbol file (global or module-specific). If it isn't, add its line to the appropriate "no" symbol file (global or module-specific). Go to step 1.
3. When running the module with the meta-code produces no "maybe" symbols, the module's analysis report is complete and accurate. We can then use the (possibly updated) global "yes" and "no" files for the analysis of subsequent modules.
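The run-and-review cycle above amounts to iterating until the analysis produces no new "maybes". A minimal Python sketch of that control flow follows; it is not XTRAN meta-code, and `analyze`, `review`, `LINKS`, and `toy_analyze` are hypothetical stand-ins for the XTRAN run and the human review.

```python
def review_until_stable(analyze, review, yes=(), no=(), max_runs=20):
    """Rerun the analysis until it reports no "maybe" symbols.

    analyze(yes, no) -> set of "maybe" symbols given the current lists
    review(symbol)   -> True if the reviewer judges the symbol a Y2K problem
    Returns (yes, no, runs); the report from the final run is the one to keep.
    """
    yes, no = set(yes), set(no)
    for run in range(1, max_runs + 1):
        maybes = analyze(yes, no)
        if not maybes:
            return yes, no, run
        for symbol in maybes:
            (yes if review(symbol) else no).add(symbol)
    raise RuntimeError("analysis still produces maybes after max_runs runs")


# Toy stand-ins: "a" is flagged by name; anything linked to a "yes" symbol follows.
LINKS = {"a": {"b"}, "b": {"c"}}


def toy_analyze(yes, no):
    found = {"a"} | {t for s in yes for t in LINKS.get(s, ())}
    return found - yes - no


yes, no, runs = review_until_stable(toy_analyze, lambda s: s != "c")
print(sorted(yes), sorted(no), runs)
```

The loop terminates because every reviewed symbol lands in either the "yes" or the "no" list and is never reported as a "maybe" again, so each run can only shrink the set of undecided symbols.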
When all modules have been analyzed, we have the following:
- The global "yes" and "no" symbol files document all global symbols that are known to be, or not to be, Y2K problems.
- Each module's "yes" and "no" files similarly document symbols local to the module.
- Each module's analysis report documents all occurrences in the module of:
  - Y2K problem symbols, both global and local.
  - Specified text patterns, in text strings in the code.
  - Specified text patterns, in comments in the code.
Environment Variables and Files
Our XTRAN meta-code retrieves the values of the following environment variables and expects each to hold the value described:
Environment variable | Description of value |
---|---|
Y2K_C_GBLNAM | Name of files containing global symbols that are known to be, or not to be, Y2K problems. Must be a legal file name. File types .ysm and .nsm added. |
Y2K_C_MODNAM | Name of module we are analyzing. Must be a legal file name. File types .ysm, .nsm, .msm, and .y2k added. |
Y2K_C_SYMPAT | Symbol name patterns (regular expressions) that mean "maybe", separated by commas (no spaces or TABs). |
Y2K_C_TXTPAT | Text patterns (regular expressions) in text literals to report, separated by commas (no spaces or TABs). |
Y2K_C_COMPAT | Text patterns (regular expressions) in comments to report, separated by commas (no spaces or TABs). |
In addition to reading in the "yes" and "no" files named by Y2K_C_GBLNAM, which must exist (but can be empty), our meta-code also reads files named <mod>.ysm and <mod>.nsm if they exist, where <mod> is the module named by Y2K_C_MODNAM. These files contain symbols local to the module we're analyzing that are known to be ("yes"), or not to be ("no"), Y2K problems. They are typically produced by review of the "maybe" file produced by previous analysis of this module.
Our meta-code produces a file named <mod>.msm ("maybe") containing symbols occurring in the named module that may be Y2K problems (if any).
Our meta-code also produces a file named <mod>.y2k containing analysis output for the named module.
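As a rough Python illustration (not XTRAN meta-code) of how these settings fit together, the sketch below derives the file names and pattern lists from the five environment variables. It assumes that the two name variables hold a base name to which the file type is appended, as the example file names y2k-gbl.ysm and demy2k-a.msm suggest, and that an unset pattern variable simply means "no patterns". The function name and dictionary keys are hypothetical.

```python
import os


def y2k_settings(env=None):
    """Collect file names and pattern lists from environment-style name/value pairs."""
    env = os.environ if env is None else env
    gbl = env["Y2K_C_GBLNAM"]   # base name of the global "yes"/"no" files
    mod = env["Y2K_C_MODNAM"]   # base name of the module being analyzed

    def pattern_list(name):
        # Comma-separated, no spaces or TABs; unset is treated as empty (assumption).
        return [p for p in env.get(name, "").split(",") if p]

    return {
        "global_yes": gbl + ".ysm",
        "global_no": gbl + ".nsm",
        "module_yes": mod + ".ysm",
        "module_no": mod + ".nsm",
        "maybe": mod + ".msm",
        "report": mod + ".y2k",
        "symbol_patterns": pattern_list("Y2K_C_SYMPAT"),
        "text_patterns": pattern_list("Y2K_C_TXTPAT"),
        "comment_patterns": pattern_list("Y2K_C_COMPAT"),
    }


settings = y2k_settings({
    "Y2K_C_GBLNAM": "y2k-gbl",
    "Y2K_C_MODNAM": "demy2k-a",
    "Y2K_C_SYMPAT": "y[ae]*r",
    "Y2K_C_TXTPAT": "y[ae]*r,19",
    "Y2K_C_COMPAT": "y[ae]*r,19",
})
print(settings["global_yes"], settings["maybe"], settings["text_patterns"])
```

Passing a plain dictionary instead of the real environment, as in the usage lines, keeps the sketch easy to try without setting any variables.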
Each line of the "yes", "no", and "maybe" symbol files contains a fully qualified symbol, optionally followed by ? and an explanation of why we are interested in the symbol (the meta-code always adds such an explanation). Exception: Our meta-code regards any line starting with ; in these files as a comment, and ignores it.
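A reader for this line format can be sketched in a few lines of Python (not XTRAN meta-code). The sample lines are taken from the listings later on this page; the function name is hypothetical, and skipping blank lines is an assumption, since only ; comment lines are described above.

```python
def parse_symbol_lines(lines):
    """Map each fully qualified symbol to its explanation ('' if none is given)."""
    symbols = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(";"):
            continue                      # comment line
        if not line.strip():
            continue                      # blank line (assumption: ignored)
        # Split at the first "?"; later "?" characters belong to the explanation.
        symbol, _, reason = line.partition("?")
        symbols[symbol.strip()] = reason.strip()
    return symbols


sample = [
    "; demy2k-a.nsm\n",
    ";\n",
    'module(demy2k-a.c), proc(func1), space(vars): yr_no?Matched pattern "y[ae]*r"\n',
    "space(vars): base\n",
]
parsed = parse_symbol_lines(sample)
for symbol, reason in parsed.items():
    print(repr(symbol), repr(reason))
```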
Rules
XTRAN's rules language, "meta-code", is proprietary and requires a Nondisclosure Agreement. However, the following is an English paraphrase of the meta-code used in this analysis. It is functionally equivalent to the actual meta-code used. NOTE that these rules are COPYRIGHT 2001-2018 by XTRAN, LLC and may not be copied or used in any way without our permission.
    Read global "yes" and "no" symbol files
    Read module-specific "yes" and "no" symbol files if they exist
    Create an empty "maybe" symbol file for this module
    Create an analysis report file for this module
    For each C symbol occurring in the module
        If it matches any of the given symbol regular expressions
            If it's not already in the "yes", "no", or "maybe" symbol lists
                Add it to "maybe" list, with explanation
    For each C statement in the module (recursively)
        For each expression in the statement (recursively)
            If this is a printf() or scanf() type function call
                For each 2-digit integer format spec in its format string argument
                    If the corresponding argument is a symbol
                        If it's not already in "yes", "no", or "maybe" lists
                            Add it to "maybe" list, with explanation
            Else if this is a text string
                If it matches any of the given text string regular expressions
                    Report it to module's report file, with file name & line
            Else if this is a symbol
                If it is a known Y2K problem (in "yes" list)
                    Report its occurrence to report file, with file name & line
                    For each higher-level expression the symbol is in
                        If higher-level expression is assignment or comparison
                            For each term of higher-level expression (recursively)
                                If term is assignment or comparison
                                    Ignore it; it's already been done on the way up
                                Else if term is a symbol
                                    If symbol not in "yes", "no", "maybe" lists
                                        Add symbol to "maybe" list, with explanation
        For each of statement's comments
            If it matches any of the given comment regular expressions
                Report it to module's report file, with file name & line
    Close report file
    Write out "maybes" to module's "maybe" file
    Close "maybe" file
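The propagation part of these rules, in which a symbol becomes a "maybe" because it is assigned to, assigned from, or compared with a known problem symbol, can be sketched in Python (not XTRAN meta-code). For brevity the sketch uses bare names instead of fully qualified symbols and a flat list of (operator, left, right, line) tuples instead of a recursive walk over expressions. The tuples are transcribed by hand from demy2k-a.c, shown in the file listings on this page. The "assigned to" and "compared" wording follows the "maybe" listings; "assigned from" is an assumed wording.

```python
def propagate(relations, yes, no, path):
    """Return {symbol: reason} for symbols tied to a known problem ("yes") symbol."""
    maybes = {}
    for op, left, right, line in relations:
        # Look at the relation from both sides: either operand may be the known one.
        for known, other in ((left, right), (right, left)):
            if known not in yes or other in yes or other in no or other in maybes:
                continue
            if op != "=":
                role = "compared"
            elif other == left:
                role = "assigned to"
            else:
                role = "assigned from"   # wording assumed; not shown in the listings
            maybes[other] = f'{role} in "{left} {op} {right}" at {path}({line})'
    return maybes


# Assignments and comparisons in demy2k-a.c, transcribed by hand.
RELATIONS = [
    ("=", "i", "yr_num", 13),
    ("=", "p_txt", "p_year_txt", 14),
    (">", "yr_num", "base", 15),
    ("=", "k", "j", 21),
    ("=", "m", "yr_no", 23),
]

run2 = propagate(RELATIONS, {"p_year_txt", "yr_num", "j"}, {"yr_no"}, "demy2k-a.c")
for symbol, reason in run2.items():
    print(f"{symbol}?{reason}")
```

With the "yes" and "no" lists that were input to run 2, this flags i, p_txt, base, and k, and leaves m alone because yr_no is a known non-problem.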
Example Procedure
For this example, we ran the analysis three times, until it produced no "maybe" symbols. After the first and second runs, we reviewed the "maybes" and moved them to the appropriate "yes" and "no" symbol files.
Input and Output Files, by Run
The following files were input to, or generated by, the XTRAN meta-code in the course of this example. See listings below for their contents if not empty.
File name | I/O? | Description |
---|---|---|
Run 1 | | |
demy2k-a.c | I | Code to be analyzed |
y2k-gbl.nsm | I | Global symbol "no" file; empty |
y2k-gbl.ysm | I | Global symbol "yes" file; empty |
demy2k-a.msm | O | "Maybe" file |
demy2k-a.y2k | O | Analysis report; ignored |
Run 2 | | |
demy2k-a.c | I | Code to be analyzed |
y2k-gbl.nsm | I | Global symbol "no" file; still empty |
y2k-gbl.ysm | I | Global symbol "yes" file; still empty |
demy2k-a.nsm | I | Local symbol "no" file; edited from Run 1 "maybe" file |
demy2k-a.ysm | I | Local symbol "yes" file; edited from Run 1 "maybe" file |
demy2k-a.msm | O | Updated "maybe" file |
demy2k-a.y2k | O | Analysis report; ignored |
Run 3 | | |
demy2k-a.c | I | Code to be analyzed |
y2k-gbl.nsm | I | Global symbol "no" file; still empty |
y2k-gbl.ysm | I | Global symbol "yes" file; edited from Run 2 "maybe" file |
demy2k-a.nsm | I | Local symbol "no" file; updated from Run 2 "maybe" file |
demy2k-a.ysm | I | Local symbol "yes" file; updated from Run 2 "maybe" file |
demy2k-a.msm | O | Updated "maybe" file; empty! |
demy2k-a.y2k | O | Analysis report; good! |
Final Results
The file y2k-gbl.ysm documents all global symbols so far that are known to be Y2K problems.
The file y2k-gbl.nsm documents all global symbols so far that are known not to be Y2K problems.
The file demy2k-a.ysm documents all symbols local to demy2k-a.c that are known to be Y2K problems.
The file demy2k-a.nsm documents all symbols local to demy2k-a.c that are known not to be Y2K problems.
The file demy2k-a.y2k documents all occurrences in demy2k-a.c of:
- Y2K problem symbols, both global and local.
- Specified text patterns, in both text strings and comments.
Process Flowchart
Here is a flowchart for this process, in which the elements are color coded:
- BLUE for XTRAN versions (runnable programs)
- ORANGE for XTRAN rules (text files)
- RED for code
- PURPLE for text data files
File listings
demy2k-a.c — C code to be analyzed; line numbers added for reference
     1  #include <stdio.h>                  /*define Posix I/O*/
     2
     3  extern short base;                  /*base year (2 digits)*/
     4
     5  void func1(
     6      char *p_year_txt,               /*1st-rnd "maybe", 2nd-rnd "yes"*/
     7      short yr_num,                   /*1st-rnd "maybe", 2nd-rnd "yes"*/
     8      short yr_no)                    /*1st-rnd "maybe", 2nd-rnd "no"*/
     9  {
    10      char *p_txt;                    /*text pointer*/
    11      short i, j, k, m;               /*util vars*/
    12
    13      i = yr_num;                     /*makes i a 2nd-rnd "maybe"*/
    14      p_txt = p_year_txt;             /*makes p_txt a 2nd-rnd "maybe"*/
    15      if (yr_num > base)              /*makes base a 2nd-rnd "maybe"*/
    16      {
    17          fprintf(stdout, "19%02d\n",
    18            j);                       /*"%02d" makes j a 1st-rnd "maybe"*/
    19          printf("Year from text: %s\n",
    20            p_txt);                   /*show year from text*/
    21          k = j;                      /*asnmt makes k a 2nd-rnd "maybe"*/
    22      }
    23      m = yr_no;                      /*asnmt, but from 2nd-rnd "no"*/
    24      return;
    25  }
demy2k-a.msm — "Maybe" file produced by run 1
    ; demy2k-a.msm — Symbols that may be Year 2000 problems
    ; Created 1997-07-14.0845
    ;
    module(demy2k-a.c), proc(func1), space(vars): p_year_txt?Matched pattern "y[ae]*r"
    module(demy2k-a.c), proc(func1), space(vars): yr_no?Matched pattern "y[ae]*r"
    module(demy2k-a.c), proc(func1), space(vars): yr_num?Matched pattern "y[ae]*r"
    module(demy2k-a.c), proc(func1), space(vars): j?Matched 2-digit integer format spec at demy2k-a.c(17)
demy2k-a.y2k — Analysis report produced by run 1; ignored
    demy2k-a.y2k — Year 2000 analysis of demy2k-a.c
    Created 1997-07-14.0845
    demy2k-a.c(3): Comment may be about year: "base year (2 digits)"
    demy2k-a.c(17): Text string may be about year: "19%02d"
    demy2k-a.c(19): Text string may be about year: "Year from text: %s"
    demy2k-a.c(19): Comment may be about year: "show year from text"
demy2k-a.nsm — Local symbol "no" file input to run 2, from run 1's demy2k-a.msm
    ; demy2k-a.nsm — Symbols in demy2k-a.c that aren't Y2K problems
    ; Created 1997-07-14.0841
    ;
    module(demy2k-a.c), proc(func1), space(vars): yr_no?Matched pattern "y[ae]*r"
demy2k-a.ysm — Local symbol "yes" file input to run 2, from run 1's demy2k-a.msm
    ; demy2k-a.ysm — Symbols in demy2k-a.c that are Y2K problems
    ; Created 1997-07-14.0841
    ;
    module(demy2k-a.c), proc(func1), space(vars): p_year_txt?Matched pattern "y[ae]*r"
    module(demy2k-a.c), proc(func1), space(vars): yr_num?Matched pattern "y[ae]*r"
    module(demy2k-a.c), proc(func1), space(vars): j?Matched 2-digit integer format spec at demy2k-a.c(17)
demy2k-a.msm — "Maybe" file produced by run 2
    ; demy2k-a.msm — Symbols that may be Year 2000 problems
    ; Created 1997-07-14.1039
    ;
    module(demy2k-a.c), proc(func1), space(vars): i?assigned to in "i = yr_num" at demy2k-a.c(13)
    module(demy2k-a.c), proc(func1), space(vars): p_txt?assigned to in "p_txt = p_year_txt" at demy2k-a.c(14)
    space(vars): base?compared in "yr_num > base" at demy2k-a.c(15)
    module(demy2k-a.c), proc(func1), space(vars): k?assigned to in "k = j" at demy2k-a.c(21)
demy2k-a.y2k — Analysis report produced by run 2; ignored
    demy2k-a.y2k — Year 2000 analysis of demy2k-a.c
    Created 1997-07-14.1039
    demy2k-a.c(3): Comment may be about year: "base year (2 digits)"
    demy2k-a.c(13): Problem symbol "yr_num" occurred
    demy2k-a.c(14): Problem symbol "p_year_txt" occurred
    demy2k-a.c(15): Problem symbol "yr_num" occurred
    demy2k-a.c(17): Text string may be about year: "19%02d"
    demy2k-a.c(17): Problem symbol "j" occurred
    demy2k-a.c(19): Text string may be about year: "Year from text: %s"
    demy2k-a.c(19): Comment may be about year: "show year from text"
    demy2k-a.c(21): Problem symbol "j" occurred
y2k-gbl.ysm — Global symbol "yes" file input to run 3; updated from run 2's demy2k-a.msm
    ; y2k-gbl.ysm — Global "yes" symbols for analysis of demy2k-a.c
    ; Revised 1997-07-14.1050 by S. F. Heffner (VMS on Mozart @ XTRAN, LLC)
    ; Written 1997-07-14 by S. F. Heffner (VMS on Mozart @ XTRAN, LLC)
    ;
    ; This file was originally empty. The current entries were transferred here
    ; after review of the "maybe" file produced by analysis of demy2k-a.c.
    ;
    space(vars): base?compared in "yr_num > base" at demy2k-a.c(15)
demy2k-a.nsm — Local symbol "no" file input to run 3; updated from run 2's demy2k-a.msm
    ; demy2k-a.nsm — Symbols in demy2k-a.c that aren't Y2K problems
    ; Created 1997-07-14.0841
    ; Revised 1997-07-14.1044 by S. F. Heffner (VMS on Mozart @ XTRAN, LLC)
    ;
    module(demy2k-a.c), proc(func1), space(vars): yr_no?Matched pattern "y[ae]*r"
    module(demy2k-a.c), proc(func1), space(vars): i?assigned to in "i = yr_num" at demy2k-a.c(13)
demy2k-a.ysm — Local symbol "yes" file input to run 3; updated from run 2's demy2k-a.msm
    ; demy2k-a.ysm — Symbols in demy2k-a.c that are Y2K problems
    ; Revised 1997-07-14.1050 by S. F. Heffner (VMS on Mozart @ XTRAN, LLC)
    ; Created 1997-07-14.0841
    ;
    module(demy2k-a.c), proc(func1), space(vars): p_year_txt?Matched pattern "y[ae]*r"
    module(demy2k-a.c), proc(func1), space(vars): yr_num?Matched pattern "y[ae]*r"
    module(demy2k-a.c), proc(func1), space(vars): j?Matched 2-digit integer format spec at demy2k-a.c(17)
    module(demy2k-a.c), proc(func1), space(vars): p_txt?assigned to in "p_txt = p_year_txt" at demy2k-a.c(14)
    module(demy2k-a.c), proc(func1), space(vars): k?assigned to in "k = j" at demy2k-a.c(21)
demy2k-a.y2k — Analysis report produced by run 3; good!
    demy2k-a.y2k — Year 2000 analysis of demy2k-a.c
    Created 1997-07-14.1045
    demy2k-a.c(3): Comment may be about year: "base year (2 digits)"
    demy2k-a.c(13): Problem symbol "yr_num" occurred
    demy2k-a.c(14): Problem symbol "p_txt" occurred
    demy2k-a.c(14): Problem symbol "p_year_txt" occurred
    demy2k-a.c(15): Problem symbol "yr_num" occurred
    demy2k-a.c(15): Problem symbol "base" occurred
    demy2k-a.c(17): Text string may be about year: "19%02d"
    demy2k-a.c(17): Problem symbol "j" occurred
    demy2k-a.c(19): Text string may be about year: "Year from text: %s"
    demy2k-a.c(19): Problem symbol "p_txt" occurred
    demy2k-a.c(19): Comment may be about year: "show year from text"
    demy2k-a.c(21): Problem symbol "k" occurred
    demy2k-a.c(21): Problem symbol "j" occurred