Documentation for the AscToRTF conversion utility

This documentation can be downloaded as part of the documentation set in .zip format.




7 Using the preprocessor

The preprocessor was introduced in version V1.05 to give users more flexibility in the output they generate. It moves AscToRTF towards being an RTF authoring tool, as opposed to a simple text conversion or migration tool. Although this wasn't AscToRTF's original intention, it is increasingly the use to which AscToRTF's "power users" are putting it, so this is a rapidly growing area of functionality within the product.

The preprocessor looks for lines that begin with a special character sequence. Presently this is "$_$_", but this will become configurable in later versions.

Preprocessor lines are not normally copied to the output generated. Instead they are used to modify AscToRTF's behaviour in a number of ways.
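
The recognition rule above can be pictured with a short Python sketch. This is only an illustration of the described behaviour, not AscToRTF's own code: the function names are hypothetical, and the sketch assumes the marker must sit at the very start of the line and that a directive's name is separated from its argument by a space.

```python
PREFIX = "$_$_"  # the marker quoted above; the text says it may become configurable


def split_directive(line, prefix=PREFIX):
    """Return (name, argument) for a preprocessor line, or None for ordinary text."""
    if not line.startswith(prefix):
        return None
    # Assumption: the directive name ends at the first space.
    name, _, argument = line[len(prefix):].strip().partition(" ")
    return name, argument.strip()


def strip_directives(lines):
    """Drop preprocessor lines, since they are not normally copied to the output."""
    return [line for line in lines if split_directive(line) is None]
```

For instance, split_directive("$_$_SECTION Private") yields the pair ("SECTION", "Private"), while an ordinary line of text yields None.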


7.1 Marking up sections of text

The pre-processor can be used to mark sections in your document so that AscToRTF will process them as you wish.

Note:
AscToRTF does attempt to spot much user-formatted text automatically, but this is a difficult area and prone to error. Hence the use of these directives can reduce the error rate on such occasions.

7.1.1 User SECTIONS

This directive is used to divide the document up into named section types. Section type names can be repeated through the document, and by default text is assumed to belong to a section called "all", indicating that this text is always copied to the output file.

Section type names must contain no white space, but may contain underscores.

This has no effect unless the user supplies a policy file indicating that they wish to select only certain section types for output.

For example, if the text document looks like this

                Some text that'll always get copied, because it is in an
                "all" section type by default.

        $_$_SECTION Private

                Some text that will be copied either when the preprocessor
                is switched off, or when the user's policy file indicates
                that "private" section types are to be included.

        $_$_SECTION Other

                Likewise, this is an "other" section type.

        $_$_SECTION Private

                And here's some more "private" text.

        $_$_SECTION all

                Some text that will always get copied because it is explicitly
                in an "all" section type.

If the user then supplies a document policy file which includes the lines (see 6.3.5)

[Preprocessor]
Use Preprocessor : Yes

then the two sections marked "private" won't be copied into the converted file unless the line

Include document section : Private

is added to the policy file. The same applies to the "other" section.

Note_1:
Strictly speaking the "use preprocessor" line above isn't needed as this is set to "yes" by default. This means that any $_$_SECTION lines will cause text to be omitted unless you supply an appropriate policy file.
Note_2:
Be aware that any sections omitted are also omitted from the analysis pass. This may have unexpected results as AscToRTF responds only to the input text that is to be included in the output.
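
The selection behaviour in the example can be modelled in a few lines of Python. This is a hedged sketch of the rules as described here, with the preprocessor switched on; it is not AscToRTF's implementation. The function name select_sections is hypothetical, and matching section names without regard to case is an assumption suggested by the "Private"/"private" usage in the example.

```python
def select_sections(lines, included=()):
    """Keep text in "all" sections plus any section types named in `included`."""
    # Assumption: section type names are compared case-insensitively.
    wanted = {"all"} | {name.lower() for name in included}
    current = "all"  # text before any SECTION line belongs to "all"
    kept = []
    for line in lines:
        parts = line.split()
        if parts and parts[0] == "$_$_SECTION":
            # A SECTION line switches the current section type and is not copied.
            current = parts[1].lower() if len(parts) > 1 else "all"
            continue
        if current in wanted:
            kept.append(line)
    return kept
```

Calling select_sections(lines) on the example document keeps only the two "all" passages; passing included=["Private"] plays the role of the "Include document section : Private" policy line and keeps both "private" passages as well.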

7.1.2 TABLE, COMMA_DELIMITED_TABLE and DELIMITED_TABLE sections

The BEGIN_TABLE ... END_TABLE directives are used to bracket a table in the source text. AscToRTF will then attempt to analyse this table as best it can.

The BEGIN_DELIMITED_TABLE ... END_DELIMITED_TABLE directives can be used to delimit a series of tab-delimited data values that should be interpreted as a table (e.g. data originally exported from a spreadsheet such as Excel).

Similarly BEGIN_COMMA_DELIMITED_TABLE ... END_COMMA_DELIMITED_TABLE directives can be used to delimit a series of comma-delimited data values that should be interpreted as a table.

Inside these sections you can use other TABLE pre-processor commands to further tailor the output generated (see 7.5).

The presence of these directives overrides any value set in the "Attempt table generation" policy.
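
To make the delimited variants concrete, the following Python sketch shows how a bracketed block of delimited values maps onto table rows and cells. It is an illustration only: the function name is hypothetical, trimming spaces around each cell is an assumption, and AscToRTF's real handling (quoting rules, for example) may differ.

```python
def delimited_table_rows(lines, kind="DELIMITED_TABLE", delimiter="\t"):
    """Collect the rows found between $_$_BEGIN_<kind> and $_$_END_<kind>."""
    begin, end = f"$_$_BEGIN_{kind}", f"$_$_END_{kind}"
    rows, inside = [], False
    for line in lines:
        marker = line.strip()
        if marker == begin:
            inside = True
        elif marker == end:
            inside = False
        elif inside and marker:
            # Assumption: surrounding spaces are not part of a cell's value.
            rows.append([cell.strip() for cell in line.split(delimiter)])
    return rows
```

With kind="COMMA_DELIMITED_TABLE" and delimiter="," the same function covers the comma-delimited case.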


7.1.3 CONTENT sections

The BEGIN_CONTENTS ... END_CONTENTS directives are used to bracket a contents list in the source document.

AscToRTF will attempt to automatically detect the presence and location of any contents list in the document, but the algorithm can be problematic.

Use this markup only when the document contains a contents list that AscToRTF fails to detect correctly.

See the discussion in 5.6.2.


7.1.4 CODE sections

The BEGIN_CODE ... END_CODE directives are used to bracket a piece of sample code in the source text.

AscToRTF will render this using the "Code" style, which uses a non-proportional font. See "the use of RTF stylesheets".


7.1.5 DIAGRAM sections

The BEGIN_DIAGRAM ... END_DIAGRAM directives are used to bracket a piece of ASCII art or text diagram in the source text.

AscToRTF will render this using the "Diagram" style, which uses a non-proportional font. See "the use of RTF stylesheets".


7.1.6 PRE (pre-formatted text) sections

The BEGIN_PRE ... END_PRE directives are largely replaced by the TABLE, CODE and DIAGRAM directives, which give a more accurate description of the type of pre-formatted text being dealt with.

AscToRTF will render this using the "Preformatted" style, which uses a non-proportional font. See "the use of RTF stylesheets".


7.1.7 IGNORE sections

The BEGIN_IGNORE ... END_IGNORE directives delimit a section of text that should be ignored. This could be used to place comments in the source file, or to mark text that shouldn't be converted when the file is being generated by some third party software package.


7.2 Commands that influence the document's properties header

RTF documents can have details placed in the header. These details are not normally displayed in the document, but can be seen separately. For example in Word these can be seen under the File -> Properties menu.

On some systems document management tools can use these properties in searches, e.g. to find files authored by a particular person.


7.2.1 The TITLE command

This directive allows you to specify the title of the document.

The presence of a TITLE command overrides any title specified in a policy file (see 6.3.1).

To fully understand how titles are calculated, see the discussion in 5.6.1.


7.2.2 The DESCRIPTION command

This directive allows you to specify a description of your document that is added to the subject property in the document header.

The presence of a DESCRIPTION pre-processor command overrides any description specified via either a "Document subject" or "Document description" policy line.


7.2.3 The KEYWORDS command

This directive allows you to specify keywords that are added to the document properties header.

The presence of a KEYWORDS pre-processor command overrides any keywords specified via a "Document keywords" policy line.


7.3 One line pre-processor commands

7.3.1 The INCLUDE command

This directive allows you to specify the name of a source file to be included at this point. This is useful if you wish some standard text inserted into many related documents, or into the same documents at many locations.

The included file will be treated as though it were part of the original file during both the analysis and output passes.

The include will fail if the file cannot be found, and a test for recursive include files will be made.

You may need to include a path name to the source file if it isn't in the same directory as the original source file. Relative paths are supported (and encouraged for portability).
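
The two failure cases mentioned above (a missing file and a recursive include) can be illustrated with a small Python sketch. It is a model of the described behaviour rather than AscToRTF's code. The function name is hypothetical, a dictionary of file name to lines stands in for the file system, and the form "$_$_INCLUDE <filename>" is assumed from the general directive syntax.

```python
def expand_includes(name, files, _active=()):
    """Return the lines of `name` with each INCLUDE line replaced by that file's lines."""
    if name in _active:
        # The file is already being expanded further up the chain.
        chain = " -> ".join(_active + (name,))
        raise ValueError(f"recursive include: {chain}")
    if name not in files:
        raise FileNotFoundError(name)
    result = []
    for line in files[name]:
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == "$_$_INCLUDE":
            # Included text is spliced in as though it were part of the original file.
            result.extend(expand_includes(parts[1].strip(), files, _active + (name,)))
        else:
            result.append(line)
    return result
```

Because the included lines are spliced in before anything else happens, later processing sees a single combined document, which matches the statement that included text takes part in both the analysis and output passes.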


7.3.2 The CONTENTS_LIST command

This command allows you to insert a contents list at a point of your choosing.

The CONTENTS_LIST directive may also be supplied as an in-line tag (see 8.2.3). The same user arguments as described there apply.


7.3.3 The LINERULE command

The LINERULE directive allows you to insert a horizontal line into your text. It has the syntax:-

LINERULE <length>,<thickness>

where

        <length>        length of line in pixels/pts
        <thickness>     thickness of line in pixels/pts

Currently the <thickness> is ignored when generating RTF.
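
As a reading aid for the syntax above, here is a minimal Python sketch that pulls the two values out of a LINERULE line. The function name is hypothetical, and the sketch assumes both values are whole numbers and both are present; whether AscToRTF accepts other forms is not stated here.

```python
def parse_linerule(line):
    """Return (length, thickness) from a '$_$_LINERULE <length>,<thickness>' line."""
    keyword, _, args = line.strip().partition(" ")
    if keyword != "$_$_LINERULE":
        raise ValueError("not a LINERULE line")
    # Assumption: exactly two comma-separated integers follow the keyword.
    length, thickness = (int(part) for part in args.split(","))
    return length, thickness
```

A line such as "$_$_LINERULE 400,2" would be read as a length of 400 and a thickness of 2, with the thickness currently having no effect on the RTF generated.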


7.4 The TOC command

Not currently supported for RTF generation.


7.5 The TABLE commands

These directives are used to tailor the output generated for any tables AscToRTF creates. They are placed either

  1. At the top of the file

     Directives placed here become defaults for the whole file, and will replace any policies that have been set (see 6.3.7).

  2. Inside a BEGIN_TABLE ... END_TABLE section

     Directives placed here will apply only to the table marked up by these commands (see 7.1.2).

The table commands are described (naturally enough) in the following table.

Directive                     Value    Effect
----------------------------  -------  ----------------------------------------
TABLE_CELL_ALIGN              Align    Specifies the default alignment of
                                       cells. Left, right or center.

TABLE_CONVERT_XREFS           (none)   If present, indicates that any section
                                       cross-references in the table may be
                                       converted to hyperlinks (see also the
                                       policy line "Convert TABLE X-refs to
                                       links").

TABLE_HEADER_ROWS             Number   Number of header rows. These will be
                                       placed in <TH> .. </TH> markup.

TABLE_HEADER_COLS             Number   Number of header columns. These will
                                       be marked up in bold.

TABLE_MAY_BE_SPARSE           (none)   If present, indicates that the TABLE
                                       may be sparse (see also the policy
                                       "Expect sparse tables").

TABLE_MIN_COLUMN_SEPARATION   Number   Number of spaces to be taken as a
                                       column separator when analysing the
                                       table (see also the policy "Minimum
                                       TABLE column separation").

TABLE_WIDTH                   Text     The width of the table (see also the
                                       policy "Default TABLE width").


7.6 The CHANGE_POLICY command

NOTE:
This feature has the potential to cause mayhem, and as such is offered to users on an "as is" basis. That is, we offer no support for getting this feature to have the effect a user may desire.

This directive allows you to change a particular policy in part of a document. This is a potentially powerful feature, allowing you to tailor the conversion of your file in different sections of that file, or to embed the policy particular to a file in commands inserted at the top of the file itself.

The syntax of the command line is

$_$_CHANGE_POLICY <Policy_line>

where <Policy_line> is a policy line as it would appear in a policy file, and (usually) as it appears in the Policy manual.

For example, the following would be a valid directive

        $_$_CHANGE_POLICY Create mailto links : No

although how and when such a directive takes effect will depend on the policy.

In this example any email addresses after the CHANGE_POLICY line will not be converted to hyperlinks. This particular policy can be switched on and off several times. This is not true for all policies; for example, the document title can only be set once.
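
The on/off behaviour of the mailto example can be modelled with a short Python sketch. It is illustrative only and covers just the simple case of a policy that may be changed repeatedly; the function names are hypothetical, and splitting the policy line at its first colon is an assumption based on the "Name : Value" form shown above.

```python
DIRECTIVE = "$_$_CHANGE_POLICY"


def parse_change_policy(line):
    """Return (policy_name, value) for a CHANGE_POLICY line, or None otherwise."""
    text = line.strip()
    if not text.startswith(DIRECTIVE + " "):
        return None
    # Assumption: the policy name ends at the first colon.
    name, colon, value = text[len(DIRECTIVE):].partition(":")
    if not colon:
        return None
    return name.strip(), value.strip()


def policies_in_effect(lines, initial):
    """Pair each ordinary line with a copy of the policy values in force when it is read."""
    current = dict(initial)
    result = []
    for line in lines:
        change = parse_change_policy(line)
        if change is None:
            result.append((line, dict(current)))
        else:
            current[change[0]] = change[1]
    return result
```

In this model a change only affects lines read after the directive, which is the same reason a policy consulted before the directive is reached cannot be influenced by it.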

There are many caveats to this behaviour :-

Not all policies may be changed in this way. In particular, policies that open other policy files are not supported. Even if a policy is "changed", it does not follow that changing the policy will have an effect.

It is unlikely that this feature can be sensibly used to influence the analysis of a file, other than when placed at the top of the file. Used in such a manner it is simply an alternative to using a separate policy file.

Output policies are referenced at different times. Only those that are referenced after the line is read from the source file may be influenced, so changing things like the output file name may have no effect.

Not all policies, once changed, can be changed back. This is particularly true of policies that contain values to be added to a list. This is an issue that may be addressed in later versions.

Messing with policies can cause unpredictable behaviour. For example, if you alter the section splitting parameters, then the chances of a section cross-reference elsewhere in the document being calculated as a correct hyperlink diminish.

That's why this feature is offered UNSUPPORTED.

For more details see the Policy manual.





Converted from a single text file by AscToHTM
© 1997-2000 JafSoft Limited