Computing Pages

by Francesc Hervada-Sala


Output Processors

Expanding ustring tags

The Type UString

A ustring is a field type defined by the interpreter that can be used in any unit. It represents a character string with embedded tags. To obtain a plain string from a ustring, it must be evaluated, which expands all of its tags into strings. The evaluation takes place in a UText script with the operation out, and in a Perl script with the method call $ut->out($str).

Example:

~webpage =index
	~title My amazing Website
	~content
		~h1 [v title]

If you print the ustring of the role h1 in a script with

lit h1

or in Perl with

print $ut->get_field('h1');

you will get '[v title]'. If you instead evaluate it with

out h1

or in Perl

print $ut->out($ut->get_field('h1'));

you will get 'My amazing Website', which is the value of the field "title".

To generate an HTML page, one can write something like this in a script, provided the $ut object is currently positioned at the webpage =index:

out begin
<html>
<title>[v title]</title>
<body>
	<h1>[v content.h1]</h1>
[...]
end

Tag Syntax

Overview

A ustring tag is enclosed in balanced square brackets. For example, my own [tag] is great contains a tag named "tag". The output processor expands the tag at evaluation time and leaves the rest of the string, "my own [...] is great", unchanged.

Tag names are case sensitive, so [x] and [X] are two different tag names with separate output processors.

A tag can take a parameter: my own [tag by me] is great is expanded into "my own [...] is great", and the bound output processor receives the tag "tag" with the parameter string "by me".

A tag can span a substring, as in:

my own tag [tag/](I did it myself)[/tag] is great

This is expanded as "my own tag [...] is great". The bound output processor receives the content string "(I did it myself)". The opening and closing clauses can be on separate lines.

A tag can get both a parameter and a content string at the same time:

my own tag [tag/ by me](I did it myself)[/tag] is great

A tag can also be affected by a modifier. A ustring such as my own tag [tag.small] causes the output processor to be called with the tag "tag" and the modifier "small".

Tags can be embedded in other tags, both in the parameters and in the content.

[cite/]
	[author/]Immanuel Kant[/author]:
	[work/]Critique of Pure Reason[/work]
	Our age is, in especial degree, the age of criticism,
	and to criticism everything must submit.
[/cite]

The same content could alternatively be expressed as:

[cite/
	[author/]Immanuel Kant[/author]:
	[work/]Critique of Pure Reason[/work]
	]
	Our age is, in especial degree, the age of criticism,
	and to criticism everything must submit.
[/cite]

The signs [ and ] are always interpreted as tag delimiters. If you want to output a square bracket, you must use the tag [sb]. For example, to output [not a tag] you write [sb not a tag]. You get just an opening bracket [ with [sb.o] and a closing bracket ] with [sb.c].
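When text comes from an outside source, its literal brackets have to be rewritten before the text is placed into a ustring. The following is a minimal sketch of such a rewrite. The helper name escape_brackets is hypothetical and not part of UText; it relies only on the [sb.o] and [sb.c] tags described above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of UText): replace each literal square
# bracket with the corresponding [sb.o] / [sb.c] tag.
sub escape_brackets {
    my ($text) = @_;
    my %tag_for = ('[' => '[sb.o]', ']' => '[sb.c]');
    # Single pass: the inserted tags are not scanned again.
    $text =~ s/([\[\]])/$tag_for{$1}/g;
    return $text;
}

print escape_brackets('see [not a tag] here'), "\n";
```

The substitution runs in one pass, so the brackets of the inserted [sb.o] and [sb.c] tags are not escaped a second time.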

Specification

The general syntax of a ustring tag, depending on whether it is a standalone tag or a tag with content, is:

[name.modifier parameters]
[name.modifier/ parameters]content[/name]

name - A tag name consists of digits and/or letters, with no whitespace and no other signs.

modifier - A modifier consists of digits and/or letters, with no whitespace but possibly other signs (including the period), which are passed through to the output processor.

parameters and content - These are free ustrings, possibly including whitespace and embedded tags. The content can span multiple lines.

At evaluation time the ustring syntax is checked, and the program aborts if it is not correct. The tags must be balanced: the last opened tag must be the first one to be closed. Square brackets can only occur as tag delimiters; use [sb] to output them literally. The program also aborts if no processor is bound to handle a tag that is found.
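The syntax form above can be illustrated with a small Perl sketch that splits the text between "[" and its matching "]" into its parts. This is not the interpreter's own parser: the function name parse_tag_head is hypothetical, and the sketch assumes ASCII letters in names and treats a "/" directly before the parameters as the marker of a tag with content.

```perl
use strict;
use warnings;

# Hypothetical helper: split the text between "[" and its matching "]"
# into name, modifier, content marker and parameters.
sub parse_tag_head {
    my ($inner) = @_;
    my ($name, $mod, $slash, $params) =
        $inner =~ m{\A ([A-Za-z0-9]+) (?:\.(\S*?))? (/)? (?:\s+(.*))? \z}xs
        or return;                      # not a well-formed opening clause
    return {
        name        => $name,
        modifier    => $mod // '',      # empty string if absent
        has_content => defined $slash ? 1 : 0,
        parameters  => $params // '',
    };
}

my $tag = parse_tag_head('tag.small/ by me');
print join('|', @{$tag}{qw(name modifier has_content parameters)}), "\n";
```

Under these assumptions, the clause tag.small/ by me yields the name "tag", the modifier "small", the content marker, and the parameter string "by me".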

Tag Semantics

The ustring tags define segments in a string to be expanded. The content of the tag expansion is defined by the bound output processor, allowing the same ustring to produce different output, for example for website generation and for LaTeX generation.

A ustring is parsed as follows: the tags are processed from left to right. For the leftmost tag, the bound processor is called with the whole tag, up to its very end and including all embedded tags, and the result is appended to the output string.

When a tag with embedded tags is evaluated, the inner tags are not necessarily evaluated. The output processor must do this itself, if desired. This grants complete flexibility in ustring evaluation and allows an output processor to treat parameters and contents as it wants to, even if they contain substrings that "look like" ustring tags.
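To make these semantics concrete, here is a toy expander in Perl. It is a sketch under simplifying assumptions, not the UText implementation: the names %binding, match_bracket, find_close and expand are hypothetical, and the processors receive only ($op, $mod, $param, $str) instead of the full argument list of a real processor. It shows the left-to-right scan and the fact that inner tags are expanded only if the processor asks for it.

```perl
use strict;
use warnings;

my %binding;    # tag name => processor (hypothetical registry)

# Index just past the "]" that matches the "[" at position $start.
sub match_bracket {
    my ($s, $start) = @_;
    my $depth = 0;
    for my $i ($start .. length($s) - 1) {
        my $c = substr($s, $i, 1);
        $depth++ if $c eq '[';
        $depth-- if $c eq ']';
        return $i + 1 if $depth == 0;
    }
    die "unbalanced square brackets\n";
}

# Start and end positions of the closing clause that matches $name.
sub find_close {
    my ($s, $pos, $name) = @_;
    my $depth = 0;
    while ((my $open = index($s, '[', $pos)) >= 0) {
        my $end   = match_bracket($s, $open);
        my $inner = substr($s, $open + 1, $end - $open - 2);
        if ($inner =~ m{\A/(\S+)\z}) {              # a closing clause
            if ($depth == 0) {
                die "tags are not balanced at $inner\n" if $1 ne $name;
                return ($open, $end);
            }
            $depth--;
        }
        elsif ($inner =~ m{\A[A-Za-z0-9]+(?:\.\S*?)?/(?:\s|\z)}) {
            $depth++;                               # a nested tag with content
        }
        $pos = $end;
    }
    die "missing closing clause for $name\n";
}

# Expand all top-level tags from left to right.
sub expand {
    my ($s) = @_;
    my ($out, $pos) = ('', 0);
    while ((my $open = index($s, '[', $pos)) >= 0) {
        $out .= substr($s, $pos, $open - $pos);     # literal text before the tag
        my $end   = match_bracket($s, $open);
        my $inner = substr($s, $open + 1, $end - $open - 2);
        my ($name, $mod, $slash, $param) =
            $inner =~ m{\A ([A-Za-z0-9]+) (?:\.(\S*?))? (/)? (?:\s+(.*))? \z}xs
            or die "malformed tag: $inner\n";
        my $str = '';
        if ($slash) {                               # tag with content
            my ($c_open, $c_end) = find_close($s, $end, $name);
            $str = substr($s, $end, $c_open - $end);
            $end = $c_end;
        }
        my $proc = $binding{$name} or die "no processor bound to tag $name\n";
        $out .= $proc->($name, $mod // '', $param // '', $str);
        $pos = $end;
    }
    return $out . substr($s, $pos);
}

# "em" expands its content; "code" deliberately leaves inner tags alone.
$binding{em}   = sub { my ($op, $mod, $param, $str) = @_; return '<em>' . expand($str) . '</em>' };
$binding{code} = sub { my ($op, $mod, $param, $str) = @_; return "<code>$str</code>" };
$binding{v}    = sub { my ($op, $mod, $param) = @_; return uc $param };  # stand-in for a field lookup

print expand('a [em/]b [v c][/em] d [code/][v c][/code]'), "\n";
```

In this sketch the processor for "em" calls expand() on its content, so the inner [v c] is evaluated, while the processor for "code" returns its content untouched, so the same inner tag stays literal.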

Output Processing

Some tags with basic functionality are bound by the base system through the module Tags.pm; others are provided by modules such as "script" and "cms". You can add your own tags with a script, or even override the predefined ones. A Perl script and a custom add-in module can define tags, too.

Let us see how to set output processors in UText script. Each tag must initially be declared.

declare tag <name>

To define how a tag should expand, one uses:

set tag <name> to <value>

or

set tag <name> do <script instruction>

or

set tag <name> begin <multiline script instructions> end

The keyword tag can be omitted in the set instruction if there is no other current setting with the same name.

A declare and a set can be joined in a single instruction:

declare tag <name> to <value>

If a value is given, the tag expands to this value; if there are tags inside, they are expanded, too.

If script instructions are given, they are executed by the UText script interpreter, and the tag expands to their output.

Examples:

declare tag x to hello [v name]
declare tag x1 do out hello [v name]
declare tag x2 do lit hello [v name]
declare tag y to (\%str)
declare tag z to <em>\%param</em>

Here [x] and [x1] expand as "hello" plus the binary data of the child unit with role name, whereas [x2] expands as the literal string "hello [v name]". [y something] expands as "(something)", and [z/]some important thing[/z] expands as "<em>some important thing</em>". As these examples show, the arguments \%str and \%param are available for tag expansion.

Let us see an example. Suppose we are generating a website and want to define a tag [wp] that links to Wikipedia articles.

We build on the tag [url/ <url>]<title>[/url] from the "cms" add-in module and write:

declare tag wp do url en.wikipedia.org/wiki/\%param begin \%param end

Now we can write things such as "I was born in [wp Barcelona] and moved to [wp Germany] ten years ago." and this will render as: I was born in Barcelona and moved to Germany ten years ago.

The tags bound with declare are set by default under the module "main". If you want to bind them under a specific module name, you can set the variable current module before declaring the tag:

set current module to HTML
declare tag b to <b>\%str</b>
declare tag i to <i>\%str</i>

This way you can remove all "HTML" bindings at once with

unbind HTML

A Perl Processor Function

An output processor can also be defined in Perl with a function that looks like this:

sub out_html
{
	my ($self,$all,$op,$mod,$param,$str) = @_;
	my $results;
	[...]
	return $results;
}

The output processor is called each time the ustring evaluation comes to a tag. The parameters of the call have these meanings:

$op - operation

This is the tag name. For example for [mytag] it has the value "mytag".

$mod - the tag modifier

For example on [mytag.a] it is "a". If no modifier is present, its value is an empty string "".

$param - the tag parameters

The string contained in the opening clause of the tag, between the tag name and the closing square bracket. Example: "alfa" when expanding [tag alfa], or even "[tug/ beta][fox][/tug]" if the ustring being processed looks like [tag/ [tug/ beta][fox][/tug]]something[/tag].

If $param may contain tags that are to be expanded, the processor function should do:

$param = $self->out($param) if $param;

Otherwise the tags contained in $param will remain unexpanded.

$str - the content enclosed between opening and closing tag

For example "hello" when processing [tag/]hello[/tag].

If $str may contain tags that are to be expanded, the processor function should do:

$str = $self->out($str) if $str;

$self - the current UText object

This object can be used to get context-sensitive information about the current position and to trigger the evaluation of the parameter and/or the content string.

$all - the whole string being expanded, including tags

This contains the whole tag with parameters and contents. This parameter is normally not needed.
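As an illustration of how these arguments fit together, the following sketch shows a processor for a hypothetical heading tag such as [h.2/ intro]Hello[/h]. The tag, the function name out_heading and the package StubUText are assumptions made for this example only. StubUText merely stands in for the real UText object so that the sketch runs on its own, and its out() returns the string unchanged, whereas the real method would expand the embedded tags.

```perl
use strict;
use warnings;

# Stand-in for the real UText object, for illustration only.
package StubUText;
sub new { return bless {}, shift }
sub out { my ($self, $s) = @_; return $s }   # the real out() would expand tags

package main;

# Hypothetical processor: the modifier selects the heading level,
# the parameter becomes an id attribute, the content is the heading text.
sub out_heading {
    my ($self, $all, $op, $mod, $param, $str) = @_;
    my $level = $mod =~ /\A[1-6]\z/ ? $mod : 1;    # default to level 1
    $str = '' unless defined $str;
    $str = $self->out($str) if $str;               # expand tags in the content
    my $id = (defined $param && length $param) ? qq{ id="$param"} : '';
    return "<h$level$id>$str</h$level>";
}

print out_heading(StubUText->new, '[h.2/ intro]Hello[/h]', 'h', '2', 'intro', 'Hello'), "\n";
```

Here the parameter is used verbatim and only the content is passed through out(), which is one possible choice; a processor is free to expand both, or neither.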

Binding the processor

In Perl one can bind output processors to certain tags with this method of the UText object:

set_binding(<tag name>,<processor function>)

For example, after $ut->set_binding('n',\&out_name); all embedded tags [n name] or [n/ name]contents[/n] will be resolved by out_name().

There are two pseudotags, ".PRE" and ".POST", whose output processors are called respectively before and after each ustring parsing.

To unset the bindings there are two functions:

$ut->remove_binding(<tag name>)
$ut->remove_bindings()

The first one deletes the binding for a particular tag; the second one deletes all bindings of the current module.

The bindings are set and removed for the current module, that is, the Perl package from which these functions are called.

To operate on bindings for a specific module, the following functions are available:

$ut->set_out_binding(<module name>,<tag name>,<processor function>)
$ut->remove_out_binding(<module name>,<tag name>)
$ut->remove_out_bindings(<module name>)
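The idea of grouping bindings by module can be pictured with a toy registry. This sketch does not show how UText stores its bindings: the hash %registry is an assumption, and the three functions are plain stand-ins that only borrow the names of the methods above.

```perl
use strict;
use warnings;

my %registry;   # hypothetical layout: module name => { tag name => processor }

# Stand-ins named after the UText methods; not the real implementation.
sub set_out_binding {
    my ($module, $tag, $proc) = @_;
    $registry{$module}{$tag} = $proc;
    return;
}

sub remove_out_binding {
    my ($module, $tag) = @_;
    delete $registry{$module}{$tag} if exists $registry{$module};
    return;
}

sub remove_out_bindings {
    my ($module) = @_;
    delete $registry{$module};      # drops every tag of that module at once
    return;
}

set_out_binding('HTML', 'b',  sub { return "<b>$_[5]</b>" });
set_out_binding('HTML', 'i',  sub { return "<i>$_[5]</i>" });
set_out_binding('main', 'wp', sub { return $_[4] });

remove_out_bindings('HTML');
print join(',', sort keys %registry), "\n";
```

Removing the module "HTML" discards both of its tags in one step and leaves the bindings of "main" untouched, which corresponds to the effect of unbind HTML in a script.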

Sample: Wikipedia Links

To create a tag for Wikipedia links one can use a single UText script instruction:

declare tag wp to                         \
  [sb url/ en.wikipedia.org/wiki/\%param  \
  \%param on Wikipedia]\%str              \
  [sb z/ \%str]\%param[sb /z][sb /url]

The escape characters \ and the "square bracket" tags [sb] above are required so that the tags and parameters are evaluated not when the tag is declared, but when it is expanded. The tag gets expanded as:

  [url/ en.wikipedia.org/wiki/%param 
  %param on Wikipedia]%str              
  [z/ %str]%param[/z][/url]

For example, [wp Computer] expands to:

[url/ en.wikipedia.org/wiki/Computer]Computer[/url]

and [wp/ Computer]Computation[/wp] expands to:

[url/ en.wikipedia.org/wiki/Computer]Computation[/url]

We can implement the same tag in Perl, too. In this case we write a function, say out_wikipedia, that returns a link:

sub out_wikipedia
{
	my ($self,$all,$op,$mod,$param,$str) = @_;
	# Build a [url/] tag from the parameter and let UText expand it.
	return $self->out("[url/ en.wikipedia.org/wiki/$param]${param}[/url]");
}

and then bind it to our UText object before the website generation begins:

$ut->set_binding('wp',\&out_wikipedia);

With one more line of code, you can display one word and optionally link to a different one:

sub out_wikipedia
{
	my ($self,$all,$op,$mod,$param,$str) = @_;
	# Use the tag content as caption if present, else the article name.
	my $caption = $str || $param;
	return $self->out("[url/ en.wikipedia.org/wiki/$param]${caption}[/url]");
}

This does not alter the previous syntax, but adds a new form: writing "My [wp/ Barcelona]hometown[/wp] is great" yields "My hometown is great".
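Article names with spaces, such as [wp New York], would put a space into the link target. Wikipedia article URLs use underscores in place of spaces, so a processor could normalize the parameter first. The helper wikipedia_url below is a hypothetical addition for illustration and is not part of UText or of the cms add-in.

```perl
use strict;
use warnings;

# Hypothetical helper: build the link target for an article title,
# replacing spaces with underscores as Wikipedia URLs do.
sub wikipedia_url {
    my ($title) = @_;
    (my $path = $title) =~ tr/ /_/;
    return "en.wikipedia.org/wiki/$path";
}

print wikipedia_url('New York'), "\n";
```

Inside out_wikipedia the result of this helper could take the place of the hard-coded prefix plus $param, while the caption keeps the original spelling with spaces.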

Tag List

For a complete list of the tags supported out of the box by the interpreter see the predefined tag list.
