XTRAN Example — Insert Anchors into HTML

Scenario: as Webmaster, you maintain some very long Web pages, and you want to let the users of your Web site record and return to positions on each page.  You know there are browser add-ons that show all of the anchors on a Web page and allow capturing each one's URL for future use, e.g. as a bookmark.  So you need to retrofit such anchors into your Web site, but adding them manually would be onerous and error-prone.

XTRAN to the rescue!

The XTRAN rules (which we call meta-code) used for this example allow you to optionally specify a series of one or more regular expressions (regexps) to match against HTML tags.  The default regexp is ^H[1-9], which adds an anchor before each heading, at any level.

This means that you can decide which HTML tags are worthy of added anchors, based on the regexp you provide (if any).  For this example, we specified ^H[12], which adds anchors before only <H1> and <H2> tags.
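To illustrate how the two tag regexps differ, here is a minimal Python sketch (not XTRAN meta-code).  The function name `qualifying_tags` and the case-insensitive matching are assumptions made for illustration; the example does not say how XTRAN treats case.

```python
import re

def qualifying_tags(tag_names, patterns=(r"^H[1-9]",)):
    """Return the tag names matched by at least one of the regexps.

    Assumption: matching is case-insensitive, since HTML tag names are.
    """
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return [t for t in tag_names if any(c.search(t) for c in compiled)]

tags = ["H1", "H2", "H3", "P", "HR", "A"]

default_matches = qualifying_tags(tags)              # default regexp ^H[1-9]
example_matches = qualifying_tags(tags, [r"^H[12]"]) # regexp used in this example

print(default_matches)  # ['H1', 'H2', 'H3']
print(example_matches)  # ['H1', 'H2']
```

Note that neither regexp matches HR, because the character after the H must be a digit.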

The rules insert, before each tag that matches the tag regexp, a sequentially numbered anchor.  Exception — if a qualifying tag is already immediately preceded by an anchor, the rules don't add another one.

You can optionally specify, via an environment variable, the format for each anchor name, as a C printf() format; the default is A%d, which will generate A1, A2, etc.  For this example, we specified A%02d.
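The effect of the two formats can be sketched in Python, whose % operator accepts the same %d and %02d conversions as C printf().  The environment variable name `ANCHOR_FORMAT` is hypothetical; the example does not name the variable XTRAN's rules actually read.

```python
import os

def anchor_name(n, fmt=None):
    """Format the n-th anchor name using a printf-style format."""
    if fmt is None:
        # Hypothetical variable name, used only for illustration.
        fmt = os.environ.get("ANCHOR_FORMAT", "A%d")
    return fmt % n

names_default = [anchor_name(n, "A%d") for n in (1, 2, 3)]
names_padded = [anchor_name(n, "A%02d") for n in (1, 2, 3)]

print(names_default)  # ['A1', 'A2', 'A3']
print(names_padded)   # ['A01', 'A02', 'A03']
```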

The rules accomplish their task using a statement pattern matching and replacement facility provided via XTRAN's rules language.  The match pattern is one that matches any qualifying tag.  The replacement pattern looks like the following, in which uppercase text is HTML and lowercase text is meta-code:

    <A NAME=numbered-name></A>
    statement-that-matched
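The overall behavior can be approximated outside XTRAN with a short Python sketch.  This is not the XTRAN meta-code and does not parse HTML; it assumes each qualifying tag starts a line, and it treats "immediately preceded" as "the previous non-blank line is an empty named anchor".  The function name `insert_anchors` and its parameters are hypothetical.

```python
import re

# Opening tag at the start of a line, e.g. "<H1>" captures "H1".
TAG_AT_LINE_START = re.compile(r"^\s*<([A-Za-z][A-Za-z0-9]*)")
# A line holding only an empty named anchor, e.g. <A NAME="x"></A>.
ANCHOR_LINE = re.compile(r"^\s*<A\s+NAME=[^>]*>\s*</A>\s*$", re.IGNORECASE)

def insert_anchors(html, tag_patterns=(r"^H[1-9]",), name_format="A%d"):
    """Insert a sequentially numbered anchor line before each qualifying tag."""
    compiled = [re.compile(p, re.IGNORECASE) for p in tag_patterns]
    out = []
    counter = 0
    last_nonblank = ""
    for line in html.splitlines():
        m = TAG_AT_LINE_START.match(line)
        if m and any(c.search(m.group(1)) for c in compiled):
            # Exception: skip tags already preceded by an anchor.
            if not ANCHOR_LINE.match(last_nonblank):
                counter += 1
                out.append('<A NAME="%s"></A>' % (name_format % counter))
        if line.strip():
            last_nonblank = line
        out.append(line)
    return "\n".join(out)

sample = """<H1>One</H1>
<P>Text</P>
<A NAME="anchor"></A>
<H2>Two</H2>
<H3>Three</H3>
<H1>Four</H1>"""

result = insert_anchors(sample, [r"^H[12]"], "A%02d")
print(result)
```

With the ^H[12] regexp and the A%02d format, the sketch numbers only the two H1 tags (A01 and A02): the H2 already has an anchor and the H3 does not match.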

The XTRAN rules for this example comprise only 60 non-comment lines of meta-code.  The rules took less than an hour to write and less than an hour to debug.  (That's right, less than 2 hours total!)

How can such powerful and generalized HTML enhancement be automated in less than 2 hours and only 60 lines of rules?  Because there is so much capability already available as part of XTRAN's rules language.  These rules take advantage of the following functionality:

NOTE that XTRAN rendered the HTML shown as XTRAN's output below using default conditions.  XTRAN provides many options for controlling the way it renders code for output.

The input to and output from XTRAN are untouched.



Process Flowchart

Here is a flowchart for this process, in which the elements are color coded:

[Figure: data flowchart]

Input to XTRAN:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">

<HTML>

<H1>Header level 1</H1>

<P>Paragraph following an H1 tag, which should have an anchor added preceding
it.</P>

<H2>Header level 2</H2>

<HR>

<P>Paragraph following an H2 tag and a line; the H2 tag should have an anchor
added preceding it.</P>

<A NAME="anchor"></A>
<H2>Header level 2</H2>

<P>Paragraph following an H2 tag that's preceded by a named anchor, so it
shouldn't get one added.</P>

<H3>Header level 3</H3>

<P>Paragraph following an H3 tag, which shouldn't get an anchor.</P>

<H1>Header level 1</H1>

<P>Paragraph following another H1 tag, which should have an anchor added
preceding it.</P>

</HTML>


Output from XTRAN:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">

<HTML>

<A NAME="A01"></A>
<H1>Header level 1</H1>

<P>Paragraph following an H1 tag, which should have an anchor added preceding
it.</P>

<A NAME="A02"></A>
<H2>Header level 2</H2>

<HR />

<P>Paragraph following an H2 tag and a line; the H2 tag should have an anchor
added preceding it.</P>

<A NAME="anchor"></A>
<H2>Header level 2</H2>

<P>Paragraph following an H2 tag that's preceded by a named anchor, so it
shouldn't get one added.</P>

<H3>Header level 3</H3>

<P>Paragraph following an H3 tag, which shouldn't get an anchor.</P>

<A NAME="A03"></A>
<H1>Header level 1</H1>

<P>Paragraph following another H1 tag, which should have an anchor added
preceding it.</P>

</HTML>