XTRAN Example — Forensic Analysis of Function Signatures


Scenario 1: as a forensic code analyst, you've been asked to determine whether a large body of code invokes a licensed API in an unauthorized way, and the judge in the case has established a very short discovery period.

Scenario 2: you are shocked to find out that calling a widely used function in your code with the wrong argument data type creates a security hole!  You must rapidly assess the problem and remediate it.

Scenario 3: you want to add a new parameter to a function that is used throughout your application portfolio, and you need to assess the impact of doing so.

XTRAN to the rescue!

Introduction

The following example uses an XTRAN rules file comprising 110 non-comment lines of XTRAN's rules language ("meta-code") to automate a forensic analysis that looks for and logs function call signatures.  A signature is a call to a specified function, optionally with a specified data type for a given argument number.  The occurrence of such a signature might signal an intellectual property problem, exposure to a known bug, or possibly some kind of "back door" that threatens the user's information security.

The rules took one hour to write and about ½ hour to debug.  (That's right, less than two hours total!)

This particular example involves C code, but the XTRAN rules used are completely language-independent; they can be applied unchanged to any language that allows function declarations and calls.

Strategy

The XTRAN rules allow you to specify, in an input text file, the information for each function whose calls you want to check, in the following format (one per line):

<fncnam>[[,<argnum>,<argtyp>[[,<tplvls>]]]]

where:

 <fncnam>  The function's name
 <argnum>  (optional) The 1-n number of the calling argument to check; none means log all calls to <fncnam>
 <argtyp>  The argument type to check for; a call is logged if the specified argument matches it
 <tplvls>  (optional) How many levels of type definitions to go through to determine the calling argument's type; the default is all levels
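
The input format above can be illustrated with a small parser.  This is only a sketch in Python, not XTRAN's rules language; the names SignatureSpec and parse_signature_line are hypothetical.  It assumes that type names contain no commas and that lines starting with ";" are comments, as in the sample input shown later in this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SignatureSpec:
    fncnam: str                    # function name
    argnum: Optional[int] = None   # 1-n argument number; None means log all calls
    argtyp: Optional[str] = None   # argument type to match
    tplvls: Optional[int] = None   # type definition levels; None means all levels


def parse_signature_line(line: str) -> Optional[SignatureSpec]:
    """Parse one input line; return None for blank and ';' comment lines."""
    text = line.strip()
    if not text or text.startswith(";"):
        return None
    # Assumption: fields are comma-separated and contain no commas themselves.
    fields = [field.strip() for field in text.split(",")]
    if not fields[0]:
        raise ValueError(f"missing function name: {line!r}")
    if len(fields) == 1:
        return SignatureSpec(fields[0])
    if len(fields) not in (3, 4):
        raise ValueError(f"expected 1, 3, or 4 fields: {line!r}")
    argnum = int(fields[1])
    if argnum < 1:
        raise ValueError(f"argument number must be 1 or greater: {line!r}")
    if not fields[2]:
        raise ValueError(f"missing argument type: {line!r}")
    tplvls = int(fields[3]) if len(fields) == 4 else None
    if tplvls is not None and tplvls < 0:
        raise ValueError(f"type definition levels must be 0 or greater: {line!r}")
    return SignatureSpec(fields[0], argnum, fields[2], tplvls)


for sample in ("intfnc1", "intfnc2,1,Str *,0", "intfnc3,2,short"):
    print(parse_signature_line(sample))
```

Representing "no argument check" and "all levels" as None keeps the two optional parts of the format distinct from the legitimate value 0 for <tplvls>.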

The rules output a log of matching calls found in the following format, one call per line:

<filnam>,<linnum>,<fncnam>,<argnum>

where:

 <filnam>  The name of the source file in which the call occurs
 <linnum>  The line number at which the call occurs
 <fncnam>  The function's name
 <argnum>  The 1-n number of the calling argument checked, or 0 if no argument was checked
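
Because the log is plain comma-separated text, it is easy to post-process.  The following Python sketch shows one way to write and read a log record; LogRecord, format_log_record, and parse_log_record are hypothetical names, and the code assumes that only the file name field could ever contain a comma.

```python
from typing import NamedTuple


class LogRecord(NamedTuple):
    filnam: str   # source file in which the call occurs
    linnum: int   # line number at which the call occurs
    fncnam: str   # name of the called function
    argnum: int   # 1-n argument number checked, or 0 if none was checked


def format_log_record(rec: LogRecord) -> str:
    """Render a record as <filnam>,<linnum>,<fncnam>,<argnum>."""
    return f"{rec.filnam},{rec.linnum},{rec.fncnam},{rec.argnum}"


def parse_log_record(line: str) -> LogRecord:
    """Parse one log line, splitting from the right so that a comma in the
    file name (an assumption about possible inputs) does not shift fields."""
    filnam, linnum, fncnam, argnum = line.strip().rsplit(",", 3)
    return LogRecord(filnam, int(linnum), fncnam, int(argnum))


record = parse_log_record("demfas-a.c,24,intfnc3,2")
print(record)
print(format_log_record(record))
```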

Note that, by telling the rules to go through all levels of type definition, you can defeat attempts to obfuscate function call signatures by using arbitrary levels of type definition to hide function argument data types.
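
The idea of following type definitions through a limited or unlimited number of levels can be sketched as follows.  This Python fragment is an illustration only, not how XTRAN implements it; resolve_type and the typedefs table are hypothetical, and the sketch handles only plain alias names, not pointers or qualifiers.  The alias Sh2 is invented for the demonstration.

```python
from typing import Dict, Optional


def resolve_type(type_name: str,
                 typedefs: Dict[str, str],
                 max_levels: Optional[int] = None) -> str:
    """Follow type definitions starting at type_name.

    max_levels limits how many definitions are followed; None means
    follow all of them until a name that is not an alias is reached.
    """
    seen = {type_name}
    current = type_name
    levels = 0
    while current in typedefs and (max_levels is None or levels < max_levels):
        current = typedefs[current]
        if current in seen:
            raise ValueError(f"circular type definition involving {current!r}")
        seen.add(current)
        levels += 1
    return current


# "Sh" mirrors the typedef in the sample code; "Sh2" is a hypothetical
# extra layer of the kind that might be used to hide an argument's type.
typedefs = {"Sh": "short", "Sh2": "Sh"}

print(resolve_type("Sh2", typedefs))                # all levels
print(resolve_type("Sh2", typedefs, max_levels=1))  # one level only
print(resolve_type("Sh2", typedefs, max_levels=0))  # no levels
```

With all levels followed, the extra alias layer no longer hides that the argument is ultimately a short.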

Here is an English paraphrase of the XTRAN rules used for this example:

    Read function call signature data and store in an XTRAN
      data base, reporting any errors
    Create output file
    For each statement to be analyzed, excluding declaration statements
        For each expression in this statement
            If expression isn't a call to a function we're checking
                Ignore expression
            If need to check an argument's type
                If specified argument isn't specified type
                    Ignore expression
            Log function call to output file, with source file name and line
    Close output file
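
The filtering logic in the paraphrase can be sketched in Python over a toy model in which each call is already reduced to its file, line, function name, and argument types.  This is an assumption-laden illustration, not the XTRAN rules themselves: Call, log_matching_calls, and the hand-assigned argument types (already resolved through type definitions as each signature requires) are all hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple


class Call(NamedTuple):
    filnam: str
    linnum: int
    fncnam: str
    argtypes: List[str]   # type of each calling argument, already resolved


# Per function: None means log every call; otherwise (argnum, argtyp).
Spec = Optional[Tuple[int, str]]


def log_matching_calls(calls: Iterable[Call], specs: Dict[str, Spec]) -> List[str]:
    """Return log lines in <filnam>,<linnum>,<fncnam>,<argnum> form."""
    log = []
    for call in calls:
        if call.fncnam not in specs:
            continue                      # not a function we're checking
        spec = specs[call.fncnam]
        argnum = 0                        # 0 means no argument was checked
        if spec is not None:
            argnum, argtyp = spec
            if argnum > len(call.argtypes) or call.argtypes[argnum - 1] != argtyp:
                continue                  # specified argument isn't specified type
        log.append(f"{call.filnam},{call.linnum},{call.fncnam},{argnum}")
    return log


specs = {"intfnc1": None, "intfnc2": (1, "Str *"), "intfnc3": (2, "short")}

# Calls from the sample file, with argument types assigned by hand.
calls = [
    Call("demfas-a.c", 22, "intfnc2", ["short", "int"]),
    Call("demfas-a.c", 24, "intfnc3", ["short", "short"]),
    Call("demfas-a.c", 25, "unintf", ["long *", "short"]),
    Call("demfas-a.c", 26, "intfnc1", ["short", "short"]),
    Call("demfas-a.c", 27, "intfnc2", ["Str *"]),
    Call("demfas-a.c", 28, "intfnc2", ["int"]),
]

for line in log_matching_calls(calls, specs):
    print(line)
```

Under these assumptions the sketch prints the same three lines as the XTRAN output shown at the end of this example.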

How can such powerful and generalized forensic analysis be automated in less than two hours and only 110 lines of rules?  Because so much capability is already available as part of XTRAN's rules language, and the rules used for this example take advantage of it.

The input to and output from XTRAN are untouched, except that line numbers have been added to the analyzed code for reference.


Process Flowchart

Here is a flowchart for this process, in which the elements are color coded:

[Figure: data flowchart]

Input to XTRAN — signature descriptions

; Log all calls:
;
intfnc1
;
; Log only calls with matching argument signature, thru no type defn lvls:
;
intfnc2,1,Str *,0
;
; Log only calls with matching argument signature, thru all type defn lvls:
;
intfnc3,2,short


Input to XTRAN — demfas-a.c

 1 #define NULL 0
 2 
 3         typedef struct                          /*struct as poss arg type*/
 4             {
 5             short memb1, memb2;
 6             }
 7             Str;
 8         typedef short Sh;                       /*type name as poss arg type*/
 9 
10         extern int   unintf(long, short);       /*uninteresting function*/
11         extern short intfnc1(long);             /*interesting function*/
12         extern short intfnc2(Str *);            /*interesting function*/
13         extern short intfnc3(short, short);     /*interesting function*/
14 
15 short func(void)
16 {
17         short sh1, sh2;
18         Sh    shtypd;
19         int   int1;
20         Str   str;
21 
22         while (intfnc2(sh1, int1))              /*won't be logged*/
23             {
24             sh2 = intfnc3(sh1, shtypd);         /*will be logged*/
25             sh1 = unintf((long *) sh1, sh2);    /*won't be logged*/
26             sh2 = intfnc1(sh1, sh2);            /*will be logged*/
27             sh2 = intfnc2(&str);                /*will be logged*/
28             sh2 = intfnc2(NULL);                /*won't be logged*/
29             }
30         return;
31 }       


Output from XTRAN:

demfas-a.c,24,intfnc3,2
demfas-a.c,26,intfnc1,0
demfas-a.c,27,intfnc2,1