XTRAN Example — Translate Labeled Structures in HP (Digital, Compaq) VAX MACRO to C
One way to declare a structure in VAX MACRO is in the following form:
    <lbl1>:
    [[<lbl>:]]  <opr>   ...
                ...
    <name>=.-<lbl1>         ;OPTIONAL -- LENGTH OF STRUCTURE
where:
- <lbl1> is any legal symbol in the source language
- <lbl> is any legal symbol in the source language
- <opr> is one of: .ASCII, .BLK{{B|W|L}}, .BYTE, .WORD, .LONG, .ADDRESS
An example of this usage is:

    STR:
    MEMB1:  .BLKB   3
            .ASCII  /ABC/
            .ASCII  ?DEF?
    MEMB2:  .LONG   2,3
            .BLKL   1
    STRSIZ = .-STR
(Note the use of "fillers" — members with no names.)
Normally, XTRAN would try to translate this as a structure of unknown type named memb1 plus a long array named memb2, and it would be unable to translate the = statement because of the Program Counter reference.
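To make the Program Counter reference concrete, the following C sketch adds up the bytes that each directive in the example above reserves, which is the value the assembler computes for `STRSIZ = .-STR`. The helper names `directive_bytes` and `example_strsiz` are hypothetical and are not part of XTRAN; the unit sizes assumed are the VAX ones (byte = 1, word = 2, longword = 4).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of XTRAN): bytes reserved by one data
 * directive, given the number of units it declares.  For .ASCII the unit
 * count is the number of characters in the string. */
size_t directive_bytes(const char *opr, size_t units)
{
    if (strcmp(opr, ".ASCII") == 0 || strcmp(opr, ".BYTE") == 0 ||
        strcmp(opr, ".BLKB") == 0)
        return units;               /* 1 byte per unit */
    if (strcmp(opr, ".WORD") == 0 || strcmp(opr, ".BLKW") == 0)
        return 2 * units;           /* VAX word = 2 bytes */
    if (strcmp(opr, ".LONG") == 0 || strcmp(opr, ".BLKL") == 0 ||
        strcmp(opr, ".ADDRESS") == 0)
        return 4 * units;           /* VAX longword = 4 bytes */
    return 0;                       /* not a directive modeled here */
}

/* Location counter at STRSIZ, measured from the label STR. */
size_t example_strsiz(void)
{
    size_t pc = 0;
    pc += directive_bytes(".BLKB", 3);   /* MEMB1: .BLKB  3     */
    pc += directive_bytes(".ASCII", 3);  /*        .ASCII /ABC/ */
    pc += directive_bytes(".ASCII", 3);  /*        .ASCII ?DEF? */
    pc += directive_bytes(".LONG", 2);   /* MEMB2: .LONG  2,3   */
    pc += directive_bytes(".BLKL", 1);   /*        .BLKL  1     */
    return pc;
}
```

The value depends on everything between the label and the `=`, so the statement cannot be translated in isolation; it only has a C equivalent once the whole run of directives is treated as one object.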
We have created a set of XTRAN rules ("meta-code") to translate this kind of structure declaration, which we call an LBLSTR (labeled structure). These rules, which comprise just over 400 non-comment lines of meta-code, are input to XTRAN after the normal VAX MACRO to C translation rules we provide. They implement the following strategy:
- Because we can't reliably determine the start and end of an LBLSTR from the original code, we require a marker at the start of each LBLSTR declaration. If it doesn't end with an =, we also require a marker at its end. These markers are embedded in comments that will be automatically deleted by XTRAN.
- We post rules to be evaluated after reading and parsing all of our input, but before starting translation. These rules will scan all VAX MACRO code to identify LBLSTR struct declarations, based on the markers we require. When they find one, they will change the operator type of each statement of the LBLSTR (including the ending = if any) to "user data". This prevents XTRAN from trying to combine the LBLSTR's instructions, and also activates our special rules for these operators. They will also mark each such statement to suppress automatic passthrough of its labels and comments, since we will be handling them explicitly in our rules.
- We post a translation rule for each of the statement operators that can legally belong to an LBLSTR, overriding that operator's normal rule:
  - If it has been marked as "user data", meaning it is part of an LBLSTR:
    - If it is the first member of the LBLSTR, the rule will record its first (or only) label as the structure's name.
    - The rule will record information we'll need to declare it as a member of its LBLSTR structure. This includes its name, its data type, its dimension if any, its initial values if any, and its comments if any.
    - If the member has no name (or it's the first member and we used its name for the structure), the rule will generate, as the member name, filler<n>, where <n> starts from 0 for each LBLSTR.
    - The rule will generate no target code directly from the statement.
    - If the statement is marked as the end of an LBLSTR structure, the rule will use the previously recorded member information (including the information from this statement) to generate a C declaration and initialization of the structure.
  - If the statement isn't marked as "user data", the rule will translate it normally.
- We post a translation rule for each = statement, overriding the normal rule for =:
  - If it has been marked as "user data", meaning that it is the end of an LBLSTR structure, the rule will use the member information we previously recorded to generate a C declaration and initialization of the structure, and will then translate the = using sizeof().
  - If the statement isn't marked as "user data", the rule will translate it normally.
- Our rules will first create a declaration of the desired C structure with a struct tag, and then an allocation of that struct type, with the given name and with appropriate initial values.
- Our rules will also tell XTRAN that the C structure is an allocation of the declared struct type, so that XTRAN will automatically qualify all member name references properly. The struct tag is required in order for this to work.
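As an illustration only, the record-then-emit part of this strategy can be sketched in C. This is not XTRAN meta-code: the types `struct member` and `struct lblstr` and the functions `lblstr_record` and `lblstr_emit` are hypothetical names invented for the sketch, initial values and comments are left out, and filler names are numbered from 1 to match the sample output shown further down.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_MEMBERS 16

/* One recorded member of a labeled structure (illustrative only). */
struct member {
    char name[32];      /* label, or a generated filler<n> name */
    const char *ctype;  /* C element type: "char", "short", "long" */
    const char *dim;    /* array dimension text, or NULL for a scalar */
};

struct lblstr {
    char tag[32];                   /* structure name, from the first label */
    struct member m[MAX_MEMBERS];
    int count;                      /* members recorded so far */
    int fillers;                    /* filler names generated so far */
};

/* Record one statement; `label` is NULL when the statement has none.
 * Nothing is emitted here, mirroring "no target code directly". */
void lblstr_record(struct lblstr *s, const char *label,
                   const char *ctype, const char *dim)
{
    struct member *mb;

    assert(s->count < MAX_MEMBERS);
    mb = &s->m[s->count];
    if (s->count == 0 && label != NULL) {
        /* First member: its label names the structure, not the member. */
        snprintf(s->tag, sizeof s->tag, "%s", label);
        label = NULL;
    }
    if (label != NULL)
        snprintf(mb->name, sizeof mb->name, "%s", label);
    else
        snprintf(mb->name, sizeof mb->name, "filler%d", ++s->fillers);
    mb->ctype = ctype;
    mb->dim = dim;
    s->count++;
}

/* At the end of the structure, write the struct declaration into `out`. */
void lblstr_emit(const struct lblstr *s, char *out, size_t outsz)
{
    size_t used;
    int i;

    assert(out != NULL && outsz > 0);
    used = (size_t) snprintf(out, outsz, "struct %s_str {\n", s->tag);
    for (i = 0; i < s->count && used < outsz; i++) {
        const struct member *mb = &s->m[i];
        if (mb->dim != NULL)
            used += (size_t) snprintf(out + used, outsz - used,
                                      "    %s %s[%s];\n",
                                      mb->ctype, mb->name, mb->dim);
        else
            used += (size_t) snprintf(out + used, outsz - used,
                                      "    %s %s;\n", mb->ctype, mb->name);
    }
    if (used < outsz)
        snprintf(out + used, outsz - used, "};\n");
}
```

Starting from a zero-initialized `struct lblstr`, recording `("strct2", "long", "2")`, `(NULL, "char", "3")` and `("memb4", "short", "MB4DIM")` and then calling `lblstr_emit` produces a declaration with the members `filler1[2]`, `filler2[3]` and `memb4[MB4DIM]` under the tag `strct2_str`. Deferring all output until the end marker is what lets the first label become the struct name rather than a member name.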
The following input to, and output from, XTRAN are untouched except for added commentary in the input and paraphrasing of the LBLSTR start and end markers.
Process Flowchart
Here is a flowchart for this process, in which the elements are color coded:
- BLUE for XTRAN versions (runnable programs)
- ORANGE for XTRAN rules (text files)
- RED for code
Input to XTRAN:
    MB4DIM = 5                      ;MB4DIM = 5
    ARR:    .ASCII  /XYZ/           ;ARR: .ASCII /XYZ/
    ARRSIZ = .-ARR                  ;ARRSIZ = .-ARR
(Note that the following struct is global. It ends with an equate, so we don't need to mark its end.)
    ;(LBLSTR start marker)
    STRCT1::
            .ASCII  /ABC/<9>/DEF/   ; .ASCII /ABC/<9>/DEF/;
    MEMB1:  .ASCII  /GHI/           ;MEMB1: .ASCII /GHI/;
    MEMB2:  .BLKW   1               ;MEMB2: .BLKW 1
            .ASCII  /JKL/           ; .ASCII /JKL/;
    MEMB3:  .ASCII  /YY/            ;MEMB3: .ASCII /YY/;
            .LONG   2,3             ; .LONG 2,3
    S1SIZ = .-STRCT1                ;S1SIZ = .-STRCT1
    WS1:    .LONG   3               ;WS1: .LONG 3
(Note that the following struct is local. It doesn't end with an equate, so we need to mark its end.)
    ;(LBLSTR start marker)
    STRCT2: .LONG   2,5             ;STRCT2:.LONG 2,5
            .ASCII  ?XYZ?           ; .ASCII ?XYZ?
    MEMB4:  .BLKW   MB4DIM          ;MEMB4: .BLKW MB4DIM
    ;(LBLSTR end marker)
    WS2:    .BLKW   10              ;WS2: .BLKW 10
    CODE:
            MOVAL   STRCT1,R3       ; MOVAL STRCT1,R3
            MOVB    MEMB3+1,R4      ; MOVB MEMB3+1,R4
            MOVW    MEMB4(R1),MEMB2 ; MOVW MEMB4(R1),MEMB2
            .END
Output from XTRAN:
    extern long *r3;
    extern char r4;
    extern long r1;

    #define MB4DIM 5                            /*mb4dim = 5*/
    static char arr[4] = "XYZ";                 /*arr: .ascii /xyz/*/
    #define ARRSIZ sizeof(arr)                  /*arrsiz = .-arr*/
    struct strct1_str {
        char filler1[7];                        /* .ascii /abc/<9>/def/;*/
        char memb1[3];                          /*memb1: .ascii /ghi/;*/
        short memb2;                            /*memb2: .blkw 1*/
        char filler2[3];                        /* .ascii /jkl/;*/
        char memb3[2];                          /*memb3: .ascii /yy/;*/
        long filler3[2];                        /* .long 2,3*/
        };
    struct strct1_str strct1 = {
        "ABC\tDEF",                             /*filler1*/
        "GHI",                                  /*memb1*/
        0,                                      /*memb2*/
        "JKL",                                  /*filler2*/
        "YY",                                   /*memb3*/
        { 2, 3 }                                /*filler3*/
        };
    #define S1SIZ sizeof(strct1)                /*s1siz = .-strct1*/
    static long ws1 = 3;                        /*ws1: .long 3*/
    struct strct2_str {
        long filler1[2];                        /*strct2:.long 2,5*/
        char filler2[3];                        /* .ascii ?xyz?*/
        short memb4[MB4DIM];                    /*memb4: .blkw mb4dim*/
        };
    static struct strct2_str strct2 = {
        { 2, 5 },                               /*filler1*/
        "XYZ",                                  /*filler2*/
        { 0 }                                   /*memb4*/
        };
    static short ws2[10];                       /*ws2: .blkw 10*/

    code:
        r3 = (long *) &strct1;                  /* moval strct1,r3*/
        r4 = strct1.memb3[1];                   /* movb memb3+1,r4*/
        strct1.memb2 = *((short *) ((char *) strct2.memb4 + r1));   /* movw memb4(r1),memb2*/
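The translated declarations and statements can be exercised in a small C harness. The sketch below copies the two structures and the three statements from the output above; everything else is an assumption added for illustration: the registers are defined locally rather than declared `extern`, the statements are wrapped in a hypothetical function `run_translated_code`, and `strct1_size_difference` is a hypothetical helper that compares `sizeof(strct1)` with the 25 bytes the assembler counts for `.-STRCT1`. Whether those two numbers agree depends on the C compiler's member alignment and on the width of `long`, neither of which is discussed here.

```c
#include <stddef.h>

#define MB4DIM 5

struct strct1_str {
    char  filler1[7];
    char  memb1[3];
    short memb2;
    char  filler2[3];
    char  memb3[2];
    long  filler3[2];
};
struct strct1_str strct1 = { "ABC\tDEF", "GHI", 0, "JKL", "YY", { 2, 3 } };

struct strct2_str {
    long  filler1[2];
    char  filler2[3];
    short memb4[MB4DIM];
};
static struct strct2_str strct2 = { { 2, 5 }, "XYZ", { 0 } };

/* Stand-ins for the VAX registers (extern in the XTRAN output). */
long *r3;
char r4;
long r1;

/* The three translated statements, unchanged from the sample output. */
void run_translated_code(void)
{
    r3 = (long *) &strct1;                                     /* moval */
    r4 = strct1.memb3[1];                                      /* movb  */
    strct1.memb2 = *((short *) ((char *) strct2.memb4 + r1));  /* movw  */
}

/* Bytes the assembler counts for .-STRCT1: 7 + 3 + 2 + 3 + 2 + 8. */
enum { S1SIZ_MACRO = 25 };

/* Extra bytes in the C object relative to the assembler's count.
 * Zero only if the compiler adds no padding and long is 4 bytes. */
size_t strct1_size_difference(void)
{
    return sizeof(strct1) - S1SIZ_MACRO;
}
```

Because `r1` is a byte offset in the original `MEMB4(R1)` operand, setting `r1 = 2` before calling `run_translated_code()` reads the second element of `memb4`, which is why the translation casts through `char *` instead of indexing the array directly.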