XTRAN Example — Tally DSV Values by Field Name

Scenario — you want to tally the occurrences of field values in delimiter-separated value (DSV) data.

XTRAN to the rescue!

The following example uses an XTRAN rules file comprising 86 non-comment lines of "meta-code" (XTRAN's rules language) to tally field values by field name in DSV data.

The rules took less than one hour to write and ½ hour to debug.  (That's right, less than 1½ hours total!)

You specify to the rules, via environment variable values:

The rules output the resulting value tallies in the following DSV format (assuming the DSV delimiter defaults to a comma):

<label>,<value>,<tally>

where:

<label>  Field's label, from the label row
<value>  Field's value
<tally>  Number of times <value> occurs for <label>

For instance, given the following DSV data input:

; Label row:
;
make,color,year
;
; Data:
;
Chrysler,red,2013
Cadillac,black,2015
Ford,blue,2015
Cadillac,white,2015
Ford,white,2013
Chrysler,black,2013
Ford,white,2014

The output will be:

color,black,2
color,blue,1
color,red,1
color,white,3
make,Cadillac,2
make,Chrysler,2
make,Ford,3
year,2013,3
year,2014,1
year,2015,3
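
For readers who want to check the tallies above independently, here is a minimal sketch of the same computation in Python. It is not the XTRAN rules themselves, and it does not use XTRAN's rules language. The function name tally_dsv and its parameters are hypothetical. It assumes, based only on the sample input, that lines starting with ";" are comments, that the first non-comment line is the label row, and that output rows are sorted by label and then by value. It splits on the delimiter naively, so it does not handle quoted fields.

```python
from collections import Counter


def tally_dsv(lines, delimiter=",", comment_prefix=";"):
    """Tally field values by label; return sorted '<label>,<value>,<tally>' rows."""
    labels = None
    tallies = Counter()
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and comment lines (assumed here to start with ';').
        if not line or line.startswith(comment_prefix):
            continue
        fields = line.split(delimiter)
        if labels is None:
            # Assumption: the first non-comment line is the label row.
            labels = fields
            continue
        # Count each (label, value) pair once per data row.
        for label, value in zip(labels, fields):
            tallies[(label, value)] += 1
    # Sort by label, then by value, matching the order of the sample output.
    return [
        delimiter.join((label, value, str(count)))
        for (label, value), count in sorted(tallies.items())
    ]


sample = """\
; Label row:
;
make,color,year
;
; Data:
;
Chrysler,red,2013
Cadillac,black,2015
Ford,blue,2015
Cadillac,white,2015
Ford,white,2013
Chrysler,black,2013
Ford,white,2014
""".splitlines()

for row in tally_dsv(sample):
    print(row)
```

Run against the sample input, this sketch prints the same ten rows shown above, from color,black,2 through year,2015,3.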

How can such powerful and generalized data manipulation be automated in less than 1½ hours and only 86 code lines of XTRAN rules?  Because there is so much capability already available as part of XTRAN's rules language.  These rules take advantage of the following functionality:




Process Flowchart

Here is a flowchart for this process, in which the elements are color coded:

[Figure: data flowchart]