XTRAN Example — Insert Anchors into HTML
Scenario — as Webmaster, you maintain some very long Web pages, and you want to provide, to the users of your Web site, the ability to record and return to positions on each page. You know there are browser add-ons that will show all of the anchors on a Web page and allow capturing each one's URL for future use, e.g. as a bookmark. So you need to retrofit such anchors into your Web site for your users, but adding them manually would be onerous and error-prone.
XTRAN to the rescue!
The XTRAN rules (which we call meta-code) used for this example allow you to optionally specify a series of one or more regular expressions (regexps) to match against HTML tags. The default regexp is ^H[1-9], which adds an anchor before each heading, at any level.
This means that you can decide which HTML tags are worthy of added anchors, based on the regexp you provide (if any). For this example, we specified ^H[12], which adds anchors before only <H1> and <H2> tags.
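The effect of the two regexps can be illustrated outside XTRAN. The following is a minimal Python sketch, not XTRAN meta-code; the function name qualifying_tags and the assumption that patterns are tested against upper-case tag names are ours, chosen only for illustration.

```python
import re

def qualifying_tags(tag_names, patterns=(r"^H[1-9]",)):
    """Return the tag names matched by at least one of the regexps.

    Hypothetical helper: assumes each regexp is tested against the
    bare, upper-case tag name (e.g. "H2"), which may differ from how
    the XTRAN rules apply it.
    """
    compiled = [re.compile(p) for p in patterns]
    return [t for t in tag_names if any(c.search(t) for c in compiled)]

tags = ["HTML", "H1", "P", "H2", "HR", "H3"]
print(qualifying_tags(tags))               # default regexp: all heading levels
print(qualifying_tags(tags, [r"^H[12]"]))  # this example: H1 and H2 only
```

Note that neither regexp matches HTML or HR, because the character after the H must be a digit.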
The rules insert, before each tag that matches the tag regexp, a sequentially numbered anchor. Exception — if a qualifying tag is already immediately preceded by an anchor, the rules don't add another one.
You can optionally specify, via an environment variable, the format for each anchor name, as a C printf() format; the default is A%d, which will generate A1, A2, etc. For this example, we specified A%02d.
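To see how the two formats behave, here is a small Python sketch that relies on Python's printf-style % operator. The environment variable name ANCHOR_FORMAT is hypothetical; the name the XTRAN rules actually read is not given here.

```python
import os

def anchor_name(n, env=os.environ, var="ANCHOR_FORMAT", default="A%d"):
    """Format the n-th anchor name with a printf-style format.

    ANCHOR_FORMAT is an assumed variable name used for illustration only.
    """
    fmt = env.get(var, default)
    return fmt % n

# Default format, then the format used in this example.
print([anchor_name(n, env={}) for n in (1, 2)])
print([anchor_name(n, env={"ANCHOR_FORMAT": "A%02d"}) for n in (1, 2)])
```

With the default format the names are A1 and A2; with A%02d they are zero-padded to A01 and A02, which keeps names the same width up to 99 anchors.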
The rules accomplish their task using a statement pattern matching and replacement facility provided via XTRAN's rules language. The match pattern is one that matches any qualifying tag. The replacement pattern looks like the following, in which THIS is HTML and this is meta-code:
<A NAME=numbered-name></A> statement-that-matched
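The overall behavior (match a qualifying tag, skip it if an anchor already precedes it, otherwise insert a numbered anchor) can be approximated in a few lines of Python. This is only a sketch under stated assumptions: it scans the raw text with regular expressions rather than working on XTRAN's Internal Representation, the function name insert_anchors and its parameters are hypothetical, and placing each inserted anchor on its own line is our choice, not necessarily XTRAN's rendering.

```python
import re

# An opening tag; group 1 is the tag name. Closing tags and <!DOCTYPE> don't match.
OPEN_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\b[^>]*>")
# An empty named anchor sitting immediately before the current position.
PRECEDING_ANCHOR = re.compile(r"<A\s+NAME\s*=[^>]*>\s*</A>\s*$", re.IGNORECASE)

def insert_anchors(html, tag_pattern=r"^H[1-9]", name_format="A%d"):
    """Insert a sequentially numbered anchor before each qualifying tag."""
    tag_re = re.compile(tag_pattern)
    out, pos, count = [], 0, 0
    for m in OPEN_TAG.finditer(html):
        if not tag_re.search(m.group(1).upper()):
            continue  # tag name doesn't match the regexp
        if PRECEDING_ANCHOR.search(html[:m.start()]):
            continue  # already immediately preceded by an anchor
        count += 1
        out.append(html[pos:m.start()])
        out.append('<A NAME="%s"></A>\n' % (name_format % count))
        pos = m.start()
    out.append(html[pos:])
    return "".join(out)

sample = (
    "<H1>One</H1>\n"
    '<A NAME="x"></A>\n'
    "<H2>Two</H2>\n"
    "<H3>Three</H3>\n"
    "<H1>Four</H1>\n"
)
print(insert_anchors(sample, r"^H[12]", "A%02d"))
```

In this sketch the counter advances only when an anchor is actually inserted, so a heading that already has an anchor does not consume a number. That is consistent with the sample output shown later, where the final heading receives A03.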
The XTRAN rules for this example comprise only 60 non-comment lines of meta-code. The rules took less than an hour to write and less than an hour to debug. (That's right, less than 2 hours total!)
How can such powerful and generalized HTML enhancement be automated in less than 2 hours and only 60 lines of rules? Because there is so much capability already available as part of XTRAN's rules language. These rules take advantage of the following functionality:
- Text manipulation
- Text formatting
- Regular expression matching
- Environment variable manipulation
- Statement pattern matching and replacement
- Access to XTRAN's Internal Representation (XIR)
- Navigation in XIR
- Delayed / forced rules evaluation
- Meta-variable pointers
NOTE that XTRAN rendered the HTML shown as XTRAN's output below using default conditions. XTRAN provides many options for controlling the way it renders code for output.
The input to and output from XTRAN are untouched.
Process Flowchart
Here is a flowchart for this process, in which the elements are color coded:
- BLUE for XTRAN versions (runnable programs)
- ORANGE for XTRAN rules (text files)
- RED for code
Input to XTRAN:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<H1>Header level 1</H1>
<P>Paragraph following an H1 tag, which should have an anchor added preceding it.</P>
<H2>Header level 2</H2>
<HR>
<P>Paragraph following an H2 tag and a line; the H2 tag should have an anchor added preceding it.</P>
<A NAME="anchor"></A>
<H2>Header level 2</H2>
<P>Paragraph following an H2 tag that's preceded by a named anchor, so it shouldn't get one added.</P>
<H3>Header level 3</H3>
<P>Paragraph following an H3 tag, which shouldn't get an anchor.</P>
<H1>Header level 1</H1>
<P>Paragraph following another H1 tag, which should have an anchor added preceding it.</P>
</HTML>
Output from XTRAN:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<A NAME="A01"></A>
<H1>Header level 1</H1>
<P>Paragraph following an H1 tag, which should have an anchor added preceding it.</P>
<A NAME="A02"></A>
<H2>Header level 2</H2>
<HR />
<P>Paragraph following an H2 tag and a line; the H2 tag should have an anchor added preceding it.</P>
<A NAME="anchor"></A>
<H2>Header level 2</H2>
<P>Paragraph following an H2 tag that's preceded by a named anchor, so it shouldn't get one added.</P>
<H3>Header level 3</H3>
<P>Paragraph following an H3 tag, which shouldn't get an anchor.</P>
<A NAME="A03"></A>
<H1>Header level 1</H1>
<P>Paragraph following another H1 tag, which should have an anchor added preceding it.</P>
</HTML>