Output Processors
Expanding ustring tags
The Type UString
A ustring is a field type defined by the interpreter that can be used in any unit. It represents a character string with embedded tags. To obtain a plain string from it, a ustring must be evaluated, which expands all of its tags into strings. In a UText script the evaluation is done with the operation out; in a Perl script, with the method call $ut->out($str).
Example:
~webpage =index
~title My amazing Website
~content
~h1 [v title]
If you print the ustring of the role h1 in a script with
lit h1
or in Perl with
print $ut->get_field('h1');
you will get '[v title]'. If you instead evaluate it with
out h1
or in Perl
print $ut->out($ut->get_field('h1'));
you will get 'My amazing Website', which is the value of the field ”title“.
To generate an HTML page, one can write something like this in a script, provided the $ut object is currently positioned at the webpage =index:
out begin
<html>
<title>[v title]</title>
<body>
<h1>[v content.h1]</h1>
[...]
end
Tag Syntax
Overview
A ustring tag is enclosed in balanced square brackets. For example: my own [tag] is great contains a tag named ”tag“. The output processor will expand the tag at evaluation time and leave the rest of the string ”my own [...] is great“ unchanged.
Tag names are case sensitive: [x] and [X] are two different tag names with separate output processors.
A tag can get a parameter: my own [tag by me] is great will be expanded into ”my own [...] is great“ and the bound output processor will receive the tag ”tag“ with the parameter string ”by me“.
A tag can span a substring, as in:
my own tag [tag/](I did it myself)[/tag] is great
This will be expanded as ”my own tag [...] is great“. The bound output processor will receive the content string ”(I did it myself)“. The opening and closing marks can be on separate lines.
A tag can get both a parameter and a content string at the same time:
my own tag [tag/ by me](I did it myself)[/tag] is great
A tag can also be affected by a modifier. A ustring such as my own tag [tag.small] causes the output processor to be called with the tag ”tag“ and the modifier ”small“.
Tags can be embedded in other tags, both as parameter and as contents.
[cite/]
[author/]Immanuel Kant[/author]:
[work/]Critique of Pure Reason[/work]
Our age is, in especial degree, the age of criticism,
and to criticism everything must submit.
[/cite]
The same content could alternatively be expressed as:
[cite/
[author/]Immanuel Kant[/author]:
[work/]Critique of Pure Reason[/work]
]
Our age is, in especial degree, the age of criticism,
and to criticism everything must submit.
[/cite]
The signs [ and ] are always interpreted as tag delimiters. If you want to output a square bracket, you must use the tag [sb]. For example, to output [not a tag] you write [sb not a tag]. You get just an opening bracket [ with [sb.o] and just a closing bracket ] with [sb.c].
Specification
The general syntax of a ustring tag, depending on whether it is a standalone tag or a tag with content, is this:
[name.modifier parameters]
[name.modifier/ parameters]content[/name]
name — A tag name consists of a string of digits and/or letters, no whitespace and no other signs.
modifier — A modifier is a string of digits and/or letters, no whitespace but possibly other signs (including period), which are passed through to the output processor.
parameters and content — These are free ustrings, possibly including whitespace and embedded tags. The content can span multiple lines.
At evaluation time the ustring syntax is checked, and the program aborts if it is not correct. The tags must be balanced: the last tag opened must be the first one closed. Square brackets can only occur as tag delimiters; use [sb] to output them literally. The program also aborts if, at that time, no processor is bound to handle a tag that was found.
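The balancing rule can be illustrated with a small standalone Perl sketch. It is not the interpreter's actual checker: the function name check_balance and its messages are hypothetical, and it assumes that a modifier never contains a slash, since the slash is what marks a tag with content. It only checks bracket and tag nesting, not whether a processor is bound.

```perl
use strict;
use warnings;

# Hypothetical helper: returns '' if the tags in $text are balanced,
# otherwise a short description of the first problem found.
sub check_balance {
    my ($text) = @_;
    my @brackets;   # opening marks whose ']' has not been seen yet
    my @spans;      # tags with content still waiting for [/name]

    # Jump from bracket to bracket, skipping ordinary text in between.
    while ($text =~ m{\G[^\[\]]*([\[\]])}gc) {
        my $bracket = $1;
        if ($bracket eq ']') {
            my $tag = pop @brackets;
            return 'closing bracket without an opening one' unless $tag;
            # A mark written as [name/ ...] now expects a later [/name].
            push @spans, { name => $tag->{name}, depth => scalar(@brackets) }
                if $tag->{spanning};
        }
        elsif ($text =~ m{\G/([A-Za-z0-9]+)\]}gc) {
            # Closing mark: must match the last tag opened at this level.
            my $name = $1;
            my $open = pop @spans;
            return "[/$name] has no opening tag" unless $open;
            return "[/$name] found while [$open->{name}/] is still open"
                if $open->{name} ne $name
                || $open->{depth} != scalar(@brackets);
        }
        elsif ($text =~ m{\G([A-Za-z0-9]+)(?:\.[^\s/\[\]]*)?(/?)}gc) {
            # Opening mark: name, optional .modifier, optional slash.
            push @brackets, { name => $1, spanning => $2 eq '/' };
        }
        else {
            return 'bracket not followed by a tag name';
        }
    }
    return "[$brackets[-1]{name} is never closed by ]" if @brackets;
    return "[$spans[-1]{name}/] has no closing mark" if @spans;
    return '';
}

print check_balance('my own tag [tag/ by me](I did it myself)[/tag]') eq ''
    ? "balanced\n" : "not balanced\n";
```

Keeping two stacks lets the sketch accept tags embedded in parameters, as in the [cite/ ...] example above, while still rejecting a tag that is opened inside a parameter and closed outside of it.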
Tag Semantics
The ustring tags define segments in a string to be expanded. The content of the tag expansion is defined by the bound output processor, allowing the same ustring to produce different output for example for website generation and for LaTeX generation.
The ustring is parsed as follows: tags are processed from left to right. The bound processor is called for the leftmost tag, covering everything up to the very end of that tag, including all embedded tags, and its result is appended to the output string.
When a tag with embedded tags is evaluated, the inner tags are not necessarily evaluated. The output processor must do this itself, if desired. This grants complete flexibility in ustring evaluation and allows an output processor to treat parameters and contents as it wants to, even if they contain substrings that ”look like“ ustring tags.
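The difference between a processor that expands its inner tags and one that does not can be sketched in Perl. The package MockUText is a hypothetical stand-in, not the real UText class: its out() only pretends that [v title] is bound to 'My amazing Website', so that the example runs on its own. The processor names out_heading and out_verbatim are likewise made up for illustration.

```perl
use strict;
use warnings;

package MockUText;   # hypothetical stand-in for the UText object
sub new { return bless {}, shift }
sub out {
    my ($self, $ustring) = @_;
    # Pretend that [v title] is the only bound tag.
    $ustring =~ s/\[v title\]/My amazing Website/g;
    return $ustring;
}

package main;

# Expands the tags embedded in its content before wrapping it.
sub out_heading {
    my ($self, $all, $op, $mod, $param, $str) = @_;
    $str = $self->out($str) if $str;
    return "<h1>$str</h1>";
}

# Deliberately returns its content untouched, tags included.
sub out_verbatim {
    my ($self, $all, $op, $mod, $param, $str) = @_;
    return $str;
}

my $ut = MockUText->new;
print out_heading($ut, '[heading/][v title][/heading]',
                  'heading', '', '', '[v title]'), "\n";
print out_verbatim($ut, '[verbatim/][v title][/verbatim]',
                   'verbatim', '', '', '[v title]'), "\n";
```

The first call prints <h1>My amazing Website</h1>, the second prints [v title] unchanged, because only out_heading hands its content back to out().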
Output Processing
Some tags with basic functionality are bound by the base system through the module Tags.pm, others are provided by modules such as ”script“ and ”cms“. You can add your own tags with a script or even override the predefined ones. A Perl script and a custom add-in module can define tags, too.
Let us see how to set output processors in UText script. Each tag must initially be declared.
declare tag <name>
To define how a tag should expand, one uses:
set tag <name> to <value>
or
set tag <name> do <script instruction>
or
set tag <name> begin <multiline script instructions> end
The keyword tag can be omitted in the set instruction if there is no other current setting with the same name.
A declare and a set can be joined in a single instruction:
declare tag <name> to <value>
If a value is given, the tag expands as this value. If there are tags inside the value, they are expanded, too.
If some script instructions are given, these are executed by the UText script interpreter, and the tag expands as their output.
Examples:
declare tag x to hello [v name]
declare tag x1 do out hello [v name]
declare tag x2 do lit hello [v name]
declare tag y to (\%param)
declare tag z to <em>\%str</em>
Here [x] and [x1] expand as ”hello“ plus the binary data of the child unit with role name, whereas [x2] expands as the literal string ”hello [v name]“. [y something] expands as ”(something)“, and [z/]some important thing[/z] expands as ”<em>some important thing</em>“.
The arguments available for tag expansion are these:
- %op — the tag name, i.e. ”x“ in [x.i/ p]c[/x]
- %mod — the modifier, i.e. ”i“
- %param — the parameters, i.e. ”p“
- %str — the contents, i.e. ”c“
- %all — the whole string including tag marks, i.e. ”[x.i/ p]c[/x]“
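How one tag maps onto these arguments can be sketched with a standalone Perl function. The name split_tag is hypothetical and the regular expressions are an assumption about the syntax described above; the sketch handles only a single tag without embedded tags and is not how the interpreter itself parses ustrings.

```perl
use strict;
use warnings;

# Hypothetical helper: splits one non-nested tag into the pieces the
# text calls %op, %mod, %param, %str and %all. Returns undef if the
# string is not a single well-formed tag.
sub split_tag {
    my ($tag) = @_;

    # Form with content: [name.modifier/ parameters]content[/name]
    if ($tag =~ m{\A\[([A-Za-z0-9]+)(?:\.([^\s/\]]*))?/\s*([^\[\]]*)\]([^\[\]]*)\[/\1\]\z}s) {
        return { all => $tag, op => $1, mod => $2 // '',
                 param => $3, str => $4 };
    }
    # Standalone form: [name.modifier parameters]
    if ($tag =~ m{\A\[([A-Za-z0-9]+)(?:\.([^\s/\]]*))?(?:\s+([^\[\]]*))?\]\z}s) {
        return { all => $tag, op => $1, mod => $2 // '',
                 param => $3 // '', str => '' };
    }
    return;
}

my $parts = split_tag('[x.i/ p]c[/x]');
printf "op=%s mod=%s param=%s str=%s\n",
    @{$parts}{qw(op mod param str)};
```

For [x.i/ p]c[/x] this prints op=x mod=i param=p str=c, matching the list above.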
Let us see an example. Suppose we are generating a website and want to define a tag [wp] to set links to Wikipedia articles.
We build on the tag [url <url>]<title>[/url] from the ”cms“ add-in module and write:
declare tag wp do url en.wikipedia.org/wiki/\%param begin \%param end
Now we can write things such as "I was born in [wp Barcelona] and moved to [wp Germany] ten years ago." and this will render as: I was born in Barcelona and moved to Germany ten years ago.
The tags bound with declare are set by default under the module ”main“. If you want to get them under a specific module name, you can set the variable current module before declaring the tag:
set current module to HTML
declare tag b to <b>\%str</b>
declare tag i to <i>\%str</i>
This way you can remove all ”HTML“ bindings at once with
unbind HTML
A Perl Processor Function
An output processor can also be defined in Perl with a function that looks like this:
sub out_html
{
my ($self,$all,$op,$mod,$param,$str) = @_;
my $results;
[...]
return $results;
}
The output processor is called each time the ustring evaluation reaches a tag. The parameters of the call have these meanings:
$op - operation
This is the tag name. For example for [mytag] it has the value "mytag".
$mod - the tag modifier
For example on [mytag.a] it is "a". If no modifier is present, its value is an empty string "".
$param - the tag parameters
The string contained in the opening mark of the tag, between the tag name and the closing square bracket. Example: "alfa" when expanding [tag alfa], or even [tug/ beta][fox][/tug] if the ustring being processed is [tag/ [tug/ beta][fox][/tug]]something[/tag].
If $param admits tagged contents that are to be expanded, the processor function should do:
$param = $self->out($param) if $param;
Otherwise the tags contained in $param will remain unexpanded.
$str - the content enclosed between opening and closing tag
For example "hello" when processing [tag/]hello[/tag].
If $str admits tagged contents that are to be expanded, the processor function should do:
$str = $self->out($str) if $str;
$self - the current UText object
This object can be used to get context-sensitive information about the current position and to trigger the evaluation of the parameter and/or the content string.
$all - the whole string being expanded including tags
This contains the whole tag with parameters and contents. This parameter is normally not needed.
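To see the argument order in action, one can write a processor that merely reports what it received. The function out_debug below is a hypothetical example, not part of the system. It is called by hand here with the values the descriptions above give for each tag; the assumption that a missing parameter or content arrives as an empty string (like a missing modifier) is mine, so the sketch also tolerates undef.

```perl
use strict;
use warnings;

# Hypothetical processor: returns a one-line report of its arguments
# instead of real output. $self is not used, so undef is passed below.
sub out_debug {
    my ($self, $all, $op, $mod, $param, $str) = @_;
    return sprintf 'op=%s mod=%s param=%s str=%s',
        map { (defined($_) && length($_)) ? $_ : '(empty)' }
            $op, $mod, $param, $str;
}

# Values corresponding to [mytag.a/ alfa]hello[/mytag]
print out_debug(undef, '[mytag.a/ alfa]hello[/mytag]',
                'mytag', 'a', 'alfa', 'hello'), "\n";

# Values corresponding to the bare tag [mytag]
print out_debug(undef, '[mytag]', 'mytag', '', '', ''), "\n";
```

The two calls print op=mytag mod=a param=alfa str=hello and op=mytag mod=(empty) param=(empty) str=(empty).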
Binding the processor
In Perl one can bind output processors to certain tags with this method from the UText object:
set_binding(<tag name>,<processor function>)
For example, after $ut->set_binding('n',\&out_name); all embedded tags [n name] or [n/ name]contents[/n] will be resolved by out_name().
There are two pseudotags, ”.PRE“ and ”.POST“, whose output processors are called before and after each ustring parsing, respectively.
To unset the bindings there are two functions:
$ut->remove_binding(<tag name>)
$ut->remove_bindings()
The first one deletes the binding for a particular tag, the second one deletes all bindings of the current module.
The bindings are set and removed for the current module, that is, the Perl package from which these functions are called.
To operate on the bindings of a specific module, the following functions are available:
$ut->set_out_binding(<module name>,<tag name>,<processor function>)
$ut->remove_out_binding(<module name>,<tag name>)
$ut->remove_out_bindings(<module name>)
Sample: Wikipedia Links
To create a tag for Wikipedia links one can use a single UText script instruction:
declare tag wp to \
[sb url/ en.wikipedia.org/wiki/\%param \
\%param on Wikipedia]\%str \
[sb z/ \%str]\%param[sb /z][sb /url]
The escape characters \ and the ”square bracket“ tags [sb] above are required so that the tags and parameters are evaluated when the tag is expanded rather than when it is declared. The tag gets expanded as:
[url/ en.wikipedia.org/wiki/%param
%param on Wikipedia]%str
[z/ %str]%param[/z][/url]
For example, [wp Computer] expands to:
[url/ en.wikipedia.org/wiki/Computer]Computer[/url]
and [wp/ Computer]Computation[/wp] expands to:
[url/ en.wikipedia.org/wiki/Computer]Computation[/url]
We can implement the same tag in Perl, too. In this case we write a function, say out_wikipedia, that returns a link:
sub out_wikipedia
{
my ($self,$all,$op,$mod,$param,$str) = @_;
return $self->out("[url/ en.wikipedia.org/wiki/$param]${param}[/url]");
}
and then bind it to our UText object before the website generation begins.
$ut->set_binding('wp',\&out_wikipedia);
With one more line of code, the tag can optionally display one word while linking to a different one:
sub out_wikipedia
{
my ($self,$all,$op,$mod,$param,$str) = @_;
my $caption = $str || $param;
return $self->out("[url/ en.wikipedia.org/wiki/$param]${caption}[/url]");
}
This does not alter the previous syntax but adds a new form: writing "My [wp/ Barcelona]hometown[/wp] is great" you get "My hometown is great".
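The caption fallback can be tried without a full UText setup. In the sketch below, EchoUText is a hypothetical stand-in whose out() returns its argument unchanged, so the output is the intermediate ustring that out_wikipedia would hand to the interpreter, not the final HTML produced by the ”cms“ module.

```perl
use strict;
use warnings;

package EchoUText;   # hypothetical stand-in for the UText object
sub new { return bless {}, shift }
sub out { my ($self, $ustring) = @_; return $ustring }   # no expansion

package main;

sub out_wikipedia {
    my ($self, $all, $op, $mod, $param, $str) = @_;
    # Use the content as caption if there is one, else the parameter.
    my $caption = $str || $param;
    return $self->out("[url/ en.wikipedia.org/wiki/$param]${caption}[/url]");
}

my $ut = EchoUText->new;
print out_wikipedia($ut, '[wp Barcelona]', 'wp', '', 'Barcelona', ''), "\n";
print out_wikipedia($ut, '[wp/ Barcelona]hometown[/wp]',
                    'wp', '', 'Barcelona', 'hometown'), "\n";
```

The first call prints [url/ en.wikipedia.org/wiki/Barcelona]Barcelona[/url] and the second prints [url/ en.wikipedia.org/wiki/Barcelona]hometown[/url]. Note that $str || $param also falls back to the parameter when the content is the string "0", because Perl treats that string as false.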
Tag List
For a complete list of the tags supported out of the box by the interpreter see the predefined tag list.

