Documentation for the JafSoft text conversion APIs


Using JafSoft Conversion API

This document describes the APIs that are available for the text conversion products produced by JafSoft Limited. These include:

AscToHTM    Text-to-HTML conversion
AscToRTF    Text-to-RTF conversion
AscToTab    Text-to-table (HTML and RTF) conversion
Detagger    HTML-to-text conversion and tag removal

Although these converters are written in C++, the API is exported as "C"-like methods, and can be called from C/C++, C#, Visual Basic or Java. The standard distribution is supplied under Windows, but customers with access to the source have successfully compiled and integrated the API into systems running under OpenVMS, Linux and Solaris.

If you have any particular enquiries, contact info<at>jafsoft.com (replace "<at>" by "@").

Table of Contents

Overview
Integrating the API into your software
Calling the API from C/C++
Using the DLL
Using static linking
C++ example
Calling the API from .NET
Calling the API from C# (C sharp)
Calling the API from Visual Basic
Passing text data into and out of the API
Defining the API
Calling the API
Visual Basic example
Calling the API from Java
Calling the API from inside Lotus Notes
LotusNotes example
Using the API on non-Windows platforms
Using The API
Allocating and releasing the API
Customising the conversion using policies
Policy files
Policy types
More documentation on policies
Specifying the conversion types
Performing the conversion
Setting up the input and output destinations
Performing conversion between files
Performing conversion between string buffers
Performing mixed conversions
Testing for success using the API return values and the "Result" argument
API return values
API result codes
Passing character data to and from the converter
When to use string or (char *) "pointers" to pass character data
Sample code using C++ strings
Sample code using (char *) pointers
Checking the conversion results when using (char *) pointers
Passing Unicode data to the converter
The various Unicode implementations
How the API handles Unicode internally
How the API detects the presence of Unicode
Doing file-to-file conversions
Doing string-to-string conversions
Using the "input text encoding" policy
Using the "output text encoding" policy
Summary of Unicode usage
Capturing error messages
The API demonstration package
API methods
Initialise and release methods
CONVERTER_Allocate
CONVERTER_Free
Policy manipulation methods
CONVERTER_ResetPolicies
CONVERTER_ReadPolicyFile
CONVERTER_WritePolicyFile
CONVERTER_SetPolicyValue
CONVERTER_GetPolicyValue
Input and output specification methods
CONVERTER_ResetSources
CONVERTER_ResetInputSource
CONVERTER_ResetOutputSource
CONVERTER_SetInputString
CONVERTER_SetOutputString
CONVERTER_SetInputFilename
CONVERTER_SetOutputFilename
CONVERTER_GetOutCharArraySize
CONVERTER_GetOutCharArray_Ptr
Conversion methods
CONVERTER_DoConversion
CONVERTER_DoFileConvert
CONVERTER_DoStringConvert
Error reporting methods
CONVERTER_SetErrorFn
CONVERTER_SetOutFn
Debugging methods
CONVERTER_DebugAPI
CONVERTER_DebugAPILogMessage
CONVERTER_GetLastMessage

Overview

The typical calling sequence when using the API is as follows

  1. Call CONVERTER_Allocate to allocate API resources.
    See Allocating and releasing the API

  2. (optional) Specify conversion options, by supplying a policy file or by setting individual policies.
    See Customising the conversion using policies

  3. (optional) Set up the input sources and output targets for the conversion.
    See Setting up the input and output destinations

  4. Execute the conversion itself.
    See Performing the conversion

  5. Repeat steps (2)-(4) as required.

  6. Call CONVERTER_Free to free the API resources.
    See Allocating and releasing the API

For example, a small C++ program might look as follows:

    #include "converter.h"
    #include "api_defines.h"

    ...

    string inputFile = "input.txt";
    string outputFile = "output.html";

    long Result = R_SUCCESS;
    long APIResult = CONV_OK;
 
    // Allocate the API resource
    long Handle = CONVERTER_Allocate();

    // do a file conversion
    APIResult = CONVERTER_DoFileConvert (Handle,
                                         CT_NORMAL,
                                         inputFile,
                                         outputFile,
                                         Result);
 
    // test for success
    if (APIResult == CONV_OK && Result == R_SUCCESS) {
        cout << "Conversion worked okay!" << endl;
    }
 
    // free the API resource
    CONVERTER_Free(Handle);
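
Step 5 of the calling sequence (repeating a conversion with the same handle) can be sketched as follows. This is a minimal illustration only: the stand-in constants and stub functions exist so the fragment compiles without the JafSoft headers, and convertAll is a hypothetical helper name, not part of the API.

```cpp
#include <string>
#include <vector>

// --- Stand-ins so the sketch compiles without the JafSoft headers. ---
// In real code, remove this section and #include "converter.h" and
// "api_defines.h" instead. The stub bodies are hypothetical.
const long CONV_OK        = 0;
const long R_SUCCESS      = 0;
const long R_CANTFINDFILE = 5;
const long CT_NORMAL      = 1;
long CONVERTER_Allocate() { return 1; }
long CONVERTER_Free(long /*Handle*/) { return CONV_OK; }
long CONVERTER_DoFileConvert(long /*Handle*/, long /*ConvType*/,
                             std::string inFile, std::string /*outFile*/,
                             long &Result) {
    // The stub pretends that an empty input name is a missing file.
    Result = inFile.empty() ? R_CANTFINDFILE : R_SUCCESS;
    return CONV_OK;
}
// --- End of stand-ins. ---

// Converts every input file using a single handle and returns the
// number of conversions that succeeded.
int convertAll(const std::vector<std::string> &inputs) {
    long Handle = CONVERTER_Allocate();            // step 1
    int succeeded = 0;

    for (const std::string &inFile : inputs) {     // step 5: repeat step 4
        long Result = R_SUCCESS;
        // A blank output name leaves the choice of name to the converter.
        long APIStatus = CONVERTER_DoFileConvert(Handle, CT_NORMAL,
                                                 inFile, "", Result);
        if (APIStatus == CONV_OK && Result == R_SUCCESS) ++succeeded;
    }

    CONVERTER_Free(Handle);                        // step 6
    return succeeded;
}
```

Allocating once and freeing once keeps the cost of setting up the API out of the loop; whether policies need to be re-applied between files depends on the converter.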


Integrating the API into your software

Calling the API from C/C++

The API software is itself written in C++, and so in principle you can link to either the library or DLL forms of the API. This means that all things being equal you can call the string-based versions of the API methods, which are easier to use. Linking statically will make your .exe larger, but will avoid the need to manage the delivery and installation of the DLL.

The API will be delivered including the following files

<API_name>.dll          The API in DLL form
<API_name>.lib          The library file for the DLL version of
                        the API. This will be a comparatively
                        small file, just a few Kb in size
<API_name>_nodll.lib    The library file for the non-DLL version
                        of the API. This will be a large file,
                        typically a few Mb in size.


Using the DLL

To use the DLL, include "converter.h" and "api_defines.h" in your source code, and then link your software against the <API_name>.lib file. This will be the smaller of the two .lib files, as it only contains wrappers for the DLL methods.

Once linked, you will need to ensure the DLL is either in the same folder as your executable file, or in your system directory.

Note, when using the DLL version, using string objects can become a problem as the implementation of the string objects varies from one C++ implementation to the next. In particular C++ inside .NET projects cannot access the string objects inside the supplied DLL because of a binary incompatibility.

(See Passing character data to and from the converter)


Using static linking

To use static linking, include "converter.h" and "api_defines.h" in your source code, and then link your software against the <API_name>_nodll.lib file. This will be the larger of the two .lib files.

Once linked you will be able to run your program independently of the DLL.


C++ example

An example program TestAPI.cxx is included in the Demonstration package, together with the converter.h and api_defines.h header files to define how the converter should be accessed in C++.

Here's an example of calling the Detagger API to convert a HTML file to text using the string version of the API methods.

#include "api_defines.h"
#include "converter.h"

    long ConvertType = CT_CONVERT_TO_TEXT;      // convert file to plain text
 
    // backslashes must be doubled inside C++ string literals
    string inFile  = "c:\\temp\\input.html";
    string outFile = "c:\\temp\\output.txt";

    long Result, APIStatus;
 
    // Allocate the API and get a handle used in subsequent calls
    long APIHandle = CONVERTER_Allocate();

    Result = R_SUCCESS;
    APIStatus = CONVERTER_DoFileConvert (APIHandle, ConvertType, inFile, outFile, Result);
 
    if (APIStatus != CONV_OK || Result != R_SUCCESS) {
        // you could test the value of Result to see what went wrong
        CONVERTER_Free (APIHandle);
        return EXIT_FAILED;
    }
 
    // Free up the converter (CONVERTER_Free has no Result argument to test)
    APIStatus = CONVERTER_Free (APIHandle);
    if (APIStatus != CONV_OK) return EXIT_FAILED;


Calling the API from .NET

In principle C/C++ code should be callable from .NET projects but, as discussed above, the implementation of the string object varies under .NET, leading to binary incompatibilities. Furthermore, it seems the implementation of string within .NET changed between versions, causing yet another binary incompatibility. As a result, unless you get a library or DLL that specifically matches your version of .NET you will get link and/or runtime errors.

For this reason I would advise .NET developers to use the "_Ptr" variants and pass arguments as (char *) values (see Passing character data to and from the converter).

See calling the API from C/C++.


Calling the API from C# (C sharp)

Some API users have managed to call the DLL version of the API from inside C#. To do this you need to create a wrapper class that contains the API and exposes its methods. In this class you need to declare a method for each API method you wish to expose, and use a DllImport attribute to associate it with the matching method inside the DLL itself.

In the Demonstration package the folder "C# demos" contains the file DetaggerAPI.cs as an example kindly provided by a user who got this working.

Once you have a wrapper class, you can then use this to invoke the API as required.

When calling the DLL from C#, the "_Ptr" variant of the API methods must be used (see Passing character data to and from the converter).

NOTE:
The samples provided in the Demonstration package are "as is" and may not be current. In particular you should check that they are compatible with the current API as defined by the C++ header file converter.h.

Calling the API from Visual Basic

Visual Basic can only call the DLL version of the API, and has to pass text data as character pointers (see Passing character data to and from the converter).

Sample VB applications are available in the API demonstration package.


Passing text data into and out of the API

Visual Basic String variables cannot be mapped onto C++ string variables, so instead the VB code has to call the (char *) variants of the API methods (those whose name has "_Ptr" appended).

See When to use string or (char *) pointers to pass character data.


Defining the API

In order to use the API methods, they must first be correctly declared. This is done with declarations such as this:

        Public Declare Function CONVERTER_ReadPolicyFile_Ptr  _
                    Lib "h:\DemoAPI\DLLs\rtfconv_eval"        _
                        (ByVal handle As Long,                _
                         ByVal policyfilename As String,      _
                         ByRef result As Long) As Long

In this example the DLL location "h:\DemoAPI\DLLs\rtfconv_eval" is given explicitly (in this case for the AscToRTF demo DLL). If you copy the DLL to your system directory, the path can be omitted and only the DLL name "rtfconv_eval" need be used.

Note: The actual DLL name will depend on which API you are working with.

Full declarations for the API are contained in the API demonstration package. These contain files such as RTFConv.bas, which is effectively a translation into VB of the C++ header file "converter.h". Only the "_Ptr" variants are defined, as VB has to use these.

Should you want to install the DLL in a non-system folder, you will need to edit this VB file to change all the references to the correct location.


Calling the API

To call the API you must first make sure it is properly defined (see defining the API) and include the API definition in your project.

Once this is done, you are free to call most of the API methods, using the "_Ptr" variants to pass text data where they exist.


Visual Basic example

Here is a snippet of Visual Basic code that calls the API methods. In this case there is an RTFConv object, which is the AscToRTF API converter object, declared in a separate module. Converter declaration files are available in the API demonstration package.

'-- initialise some data values

    ll_on = 1
    ll_off = 0

'-- Allocate new RTFConverter resources to get a handle (needed in subsequent calls)

    ConverterHandle = RTFConv.CONVERTER_Allocate()
 
'-- switch the various API debug modes on/off
 
    ' we don't want the call-by-call reporting
    RTFConv.CONVERTER_DebugAPI ll_off
 
    ' ... but we will have a log file, thanks.
    ls_logfile = "c:\temp\debug_API.log"
    RTFConv.CONVERTER_DebugAPILogMessage ll_on, ls_logfile
 
'-- set any policies

    ls_policyname1 = "default font"
    ls_policyvalue1 = "Verdana, regular, 12"
 
    retval = RTFConv.CONVERTER_SetPolicyValue_Ptr(ConverterHandle, ls_policyname1, _
                                            ls_policyvalue1, result)
 
'-- now execute file conversion

    On Error GoTo ShowResult

    ' Do a NORMAL conversion
    Dim il_ConvType As Long
    il_ConvType = RTFConv.CT_NORMAL
    result = 0
    retval = RTFConv.CONVERTER_DoFileConvert_Ptr(ConverterHandle, il_ConvType, _
                                        ls_inputfilename, ls_outputfilename, result)
 
    Status.Caption = "Output file is " + ls_outputfilename
 
'-- fetch the last API message (only useful if there's an error - and not always then)

    Dim message As String
    Dim messagesize As Long
 
    messagesize = 150
    message = Space(messagesize)
 
    retval = RTFConv.CONVERTER_GetLastMessage_Ptr(message, messagesize, result)
    Status.Caption = "<" + message + ">"

'-- release the API resources

    retval = RTFConv.CONVERTER_Free(ConverterHandle)
 


Calling the API from Java

Some API users have managed to invoke the DLL version of the API from inside Java programs. To do this it is necessary to create a C++ class that uses JNIEXPORT to expose its methods in a way that is accessible from Java. This class can then be called from inside Java to access the functionality of the API.

In the Demonstration package, some samples of this, kindly supplied by an API user, are provided in the "JNI demo" folder.

Because Java Strings are not compatible with C++ string objects, the "_Ptr" variant of the API methods must be used inside the wrapper class (see Passing character data to and from the converter).

NOTE:
The samples provided in the Demonstration package are "as is" and may not be current. In particular you should check that they are compatible with the current API as defined by the C++ header file converter.h.

Calling the API from inside Lotus Notes

Some users of the API have managed to invoke the DLL version of the API from inside LotusScript. They have kindly provided some sample code which is included in the "LotusNotes demo" folder of the Demonstration Package.

As with most other languages, Lotus Notes has to use the "_Ptr" variants of the API methods (see Passing character data to and from the converter).

NOTE:
The samples provided in the Demonstration package are "as is" and may not be current. In particular you should check that they are compatible with the current API as defined by the C++ header file converter.h.

LotusNotes example

This example was supplied by a user who got the API to work inside Lotus Notes. Note the comment about declaring the result as Long to avoid a type mismatch error.

        Sub ConvertToText
          Dim ConverterHandle As Long
          Dim ls_inputfilename As String
          Dim ls_outputfilename As String

          ls_inputfilename$  = "C:\\Documents and Settings\\user\\Desktop\\table_to_unhtml.htm"
          ls_outputfilename$ = "C:\\Documents and Settings\\user\\Desktop\\table_to_unhtml.txt"

          ConverterHandle = CONVERTER_Allocate()

          Dim result As Long  '  <----- added to eliminate 'type mismatch' error on "result"
          Dim il_ConvType As Long

          il_ConvType = CT_CONVERT_TO_TEXT
          result = 0
          retval = CONVERTER_DoFileConvert_Ptr ( _
          ConverterHandle, _
          il_ConvType, _
          ls_inputfilename, _
          ls_outputfilename, _
          result )

          retval = CONVERTER_Free(ConverterHandle)
        End Sub


Using the API on non-Windows platforms

At present the API is only readily available under Windows. However, the core code has been successfully built and run under OpenVMS, Linux and Solaris, and could probably be ported to other platforms easily as it is relatively OS-neutral.

JafSoft Limited can currently only offer to support Windows and OpenVMS versions. To build a version on any other platform, you will need to sign a special agreement to get the source code. This is normally more expensive than the usual API cost, and in some cases may not be granted.

Email JafSoft Limited (info<at>jafsoft.com) with your requirements in this case (replace "<at>" by "@").


Using The API

Allocating and releasing the API

When using the API it is necessary to first allocate some API resources. You do this by calling CONVERTER_Allocate which returns a "handle". This is an ID that tells the API which resources are being used. You need to pass this handle into all subsequent API calls.

Once you are finished with this API handle, you should call CONVERTER_Free to release the API resource. Once you've done this you won't be able to continue using the same handle.

Inside the API the CONVERTER_Allocate call creates a new API object. As the conversion proceeds, this object will allocate memory. For example, the output of the last conversion is usually held in memory. Calling CONVERTER_Free releases all of this by causing the API object to be deleted and all its memory released.

If you don't call CONVERTER_Free, you will have a memory leak that will consume an amount of memory comparable to the size of the data converted.

So a typical use of the API would be as follows :-

    // Allocate the API resource
    long Handle = CONVERTER_Allocate();

    // ... use the converter as you wish
 
    // free the API resource
    CONVERTER_Free(Handle);
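
In C++ the allocate/free pairing can also be tied to a scope, so that the handle is released even on an early return. The sketch below is one possible way to do this and is not part of the API: ScopedConverter is a hypothetical class name, and the stub functions are stand-ins so the fragment compiles without converter.h.

```cpp
// --- Stand-ins so the sketch compiles without converter.h. ---
// In real code, remove this section and #include "converter.h" and
// "api_defines.h" instead. The stub bodies are hypothetical.
static long g_liveHandles = 0;            // handles allocated but not freed
long CONVERTER_Allocate()      { ++g_liveHandles; return 1; }
long CONVERTER_Free(long)      { --g_liveHandles; return 0; /* CONV_OK */ }
// --- End of stand-ins. ---

// Hypothetical helper: allocates a handle on construction and frees it
// automatically when the object goes out of scope.
class ScopedConverter {
public:
    ScopedConverter() : handle_(CONVERTER_Allocate()) {}
    ~ScopedConverter() { CONVERTER_Free(handle_); }

    // A handle identifies one set of API resources, so forbid copies
    // that would otherwise free the same handle twice.
    ScopedConverter(const ScopedConverter &) = delete;
    ScopedConverter &operator=(const ScopedConverter &) = delete;

    long get() const { return handle_; }

private:
    long handle_;
};

// Returns the number of handles still allocated after a guarded scope.
long liveHandlesAfterScope() {
    {
        ScopedConverter api;
        // ... pass api.get() to the CONVERTER_ methods here ...
        (void)api.get();
    }   // the destructor calls CONVERTER_Free at this point
    return g_liveHandles;
}
```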


Customising the conversion using policies

Each converter will accept options that can influence the analysis, or alter the output from the conversion process.

These options are known as "policies" and they may be saved in text files known as policy files.

The API offers several Policy manipulation methods which allow you to load a policy file, or to set individual policies before the conversion.

The API also includes methods which allow you to interrogate the value of a policy, or to dump all current policy values to file. You wouldn't normally want to do that unless you wanted to see how certain policies had been changed during the conversion. For example you might want to check the policy "expect underlined headings" to see if the converter had automatically detected underlined headings. If it hadn't, you might choose to explicitly set this policy before conversion in future.

Policies consist of a "policy name" - basically a text description - and a value. You should read the documentation for the converter you are interested in for more details.

See also the Policy Manual, but be aware that not all policies apply to all converters.


Policy files

Policy files are plain text files with a .pol extension. They contain one policy per line (i.e. no hard breaks within a policy) as follows

<policy_name> : <policy_value>

Blank lines and comments (lines beginning with "!") are allowed, and there are a number of recognised headings enclosed in brackets that are ignored. The headings are used for convenience to group policies together and to make the file easier to read. In general the order in which policies appear in the file doesn't matter.

The following is a sample fragment of a policy file

        [Added HTML]
        Document Title           : User manual for AscToHTM
        Document Keywords        : ASCII, text, HTML, conversion, utility, shareware

        [Contents]
        Add contents list        : Yes

        [Frames]
        Header frame depth       : 110
        Footer frame depth       : 90
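
If you need to inspect a policy file in your own code, the line format described above is simple to split. The following is a minimal sketch, assuming that the first ":" on a line separates the name from the value; parsePolicyLine is a hypothetical helper and not the converter's own parser (normally you would just pass the file to CONVERTER_ReadPolicyFile).

```cpp
#include <string>

// Removes leading and trailing whitespace.
static std::string trim(const std::string &s) {
    const char *ws = " \t\r\n";
    std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: splits one policy-file line into name and value.
// Returns false for blank lines, "!" comments, [headings] and lines
// that contain no ":" separator.
bool parsePolicyLine(const std::string &line,
                     std::string &name, std::string &value) {
    std::string text = trim(line);
    if (text.empty() || text[0] == '!' || text[0] == '[') return false;

    // Assumption: the first ":" is the separator, so a value (a URL,
    // for example) may itself contain further colons.
    std::string::size_type colon = text.find(':');
    if (colon == std::string::npos) return false;

    name  = trim(text.substr(0, colon));
    value = trim(text.substr(colon + 1));
    return !name.empty();
}
```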


Policy types

Policies come in a number of types, with the value formatted accordingly

Integer      integer value
Boolean      "yes", "no"
Text         any free text
Alignment    "left", "right", "centered", "justified"
Colour       any valid HTML colour hex value, or one of
             the 16 standard colour names
Font         Format liable to change, but currently
             compatible with the MFC FontDialog control
             using
                 "font name, weight, point size"
             e.g.
                 Arial, regular, 12
                 Verdana, bold, 10

The special value "(none)" can be taken to mean "not set". See the converter documentation and the Policy Manual for details of individual policies.
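
When policy values are built in code rather than typed into a policy file, small helpers can keep the formatting consistent. This is a sketch only; makeFontValue and makeBoolValue are hypothetical names, and since the Font format is described above as liable to change, the output should be checked against the current Policy Manual.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: builds a Font policy value in the
// "font name, weight, point size" form.
std::string makeFontValue(const std::string &fontName,
                          const std::string &weight, int pointSize) {
    std::ostringstream out;
    out << fontName << ", " << weight << ", " << pointSize;
    return out.str();
}

// Hypothetical helper: maps a C++ bool onto the Boolean policy values.
std::string makeBoolValue(bool on) {
    return on ? "yes" : "no";
}
```

The resulting strings can then be passed as the policy value argument of CONVERTER_SetPolicyValue or its "_Ptr" variant.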


More documentation on policies

The use of "policies" is the same for all converters, but the actual policies supported will vary from converter to converter.

You should download and check the documentation for the converter you are interested in.

You should also review the Policy Manual. If you've downloaded the Windows version of the converter, this was probably included in the download. If not, you can find it online at

http://www.jafsoft.com/doco/policy_manual.html

Some useful policies, common to most converters, are below

Diagnostics
  Generate diagnostics files            Yes/No

Error messages
  Display messages                      Yes/No
  Error reporting level                 1-10 (10 is high, shows only important messages)
  Suppress INFO messages                Yes/No
  Suppress TAG ERROR messages           Yes/No
  Suppress URL messages                 Yes/No
  Suppress WARNING messages             Yes/No
  Suppress program ERROR messages       Yes/No

Contents List
  Add contents list                     Yes/No

Fonts
  Default Font                          "Times New Roman, regular, 10"
  Fixed Font                            "Courier, regular, 8"
  Heading Font                          "Arial, bold, 10"

Analysis (headings)
  Expect Capitalised Headings           Yes/No
  Expect Embedded Headings              Yes/No
  Expect Numbered Headings              Yes/No
  Expect Underlined Headings            Yes/No

Analysis (various)
  Attempt TABLE generation              Yes/No
  Look for MAIL and USENET headers      Yes/No
  Look for bullets                      Yes/No
  Look for character encoding           Yes/No
  Look for diagrams                     Yes/No
  Look for horizontal rulers            Yes/No
  Look for hanging paragraphs           Yes/No
  Look for indentation                  Yes/No
  Look for preformatted text            Yes/No
  Look for quoted text                  Yes/No
  Look for short lines                  Yes/No
  Look for white space                  Yes/No

Line/paragraph formatting
  Preserve file structure using <PRE>   Yes/No
  Preserve line structure               Yes/No
  Preserve new paragraph offset         Yes/No

Specifying the conversion types

The ConvType argument passed into the various Conversion methods is interpreted as follows. The default conversion type for most converters is CT_NORMAL.

Conversion type                 Value  Meaning

  CT_NORMAL                     1      input is normal ASCII text
  CT_TEXT_WITH_TAGS             2      input contains added HTML hyperlinks that
                                       should be preserved if possible (HTML
                                       conversion only)
Table types
  CT_TEXT_TABLE                 3      input is a plain text table. The converter
                                       will attempt to analyse the text into tables
                                       and rows
  CT_TAB_DELIMITED_TABLE        4      input is tab-delimited text in a table. Each
                                       line will be treated as a table row, and
                                       each value placed in a cell by itself
  CT_COMMA_DELIMITED_TABLE      5      input is comma-delimited text in a table.
                                       Each line will be treated as a table row, and
                                       each value placed in a cell by itself
Detagger types
  CT_REMOVE_MARKUP              6      Detagger option. Markup will be selectively
                                       removed from a markup file
  CT_CONVERT_TO_TEXT            7      Detagger option. Markup file will be converted
                                       to text
AscToTab types (output to RTF)
  CT_TEXT_TABLE_RTF             8      Same as CT_TEXT_TABLE, but specifies RTF
                                       output (instead of HTML)
  CT_TAB_DELIMITED_TABLE_RTF    9      Same as CT_TAB_DELIMITED_TABLE, but specifies
                                       RTF output (instead of HTML)
  CT_COMMA_DELIMITED_TABLE_RTF  10     Same as CT_COMMA_DELIMITED_TABLE, but specifies
                                       RTF output (instead of HTML)
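
Because each AscToTab table type has a matching RTF-output type, code that lets the user choose the output format can map one onto the other. The sketch below copies the values from the table above so that it compiles on its own; with the real API the constants come from api_defines.h and should not be redefined. toRtfTableType is a hypothetical helper name.

```cpp
// Values copied from the conversion type table. With the real API,
// #include "api_defines.h" instead of defining these.
const long CT_TEXT_TABLE                = 3;
const long CT_TAB_DELIMITED_TABLE       = 4;
const long CT_COMMA_DELIMITED_TABLE     = 5;
const long CT_TEXT_TABLE_RTF            = 8;
const long CT_TAB_DELIMITED_TABLE_RTF   = 9;
const long CT_COMMA_DELIMITED_TABLE_RTF = 10;

// Hypothetical helper: returns the RTF-output variant of a table
// conversion type. Any other type is returned unchanged.
long toRtfTableType(long convType) {
    switch (convType) {
        case CT_TEXT_TABLE:            return CT_TEXT_TABLE_RTF;
        case CT_TAB_DELIMITED_TABLE:   return CT_TAB_DELIMITED_TABLE_RTF;
        case CT_COMMA_DELIMITED_TABLE: return CT_COMMA_DELIMITED_TABLE_RTF;
        default:                       return convType;
    }
}
```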


Performing the conversion

Setting up the input and output destinations

The API can support both external files and internal string buffers as input sources and output targets. If you are converting a file into a file, or a buffer into a buffer, then you can do so directly by calling the correct conversion method (CONVERTER_DoFileConvert and CONVERTER_DoStringConvert respectively).

If you want to convert mixed types (file to buffer or vice versa) then you will need to call the Input and output specification methods to set up the input source and output target before calling the general purpose CONVERTER_DoConversion method.

See Conversion methods


Performing conversion between files

You can convert files by calling the CONVERTER_DoFileConvert method.

The input filespec may include wildcards, and the output filespec may be just a directory name (or even blank). When converting files, by default the output file will be placed in the same folder, with the same name but with an extension suited to the output format.
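
If your own code needs to know in advance where the output will be written, the default naming rule described above can be imitated as follows. This is only a sketch of the documented behaviour (same folder, same name, new extension), not the converter's actual logic; defaultOutputName is a hypothetical helper, and the extension to use for each output format is an assumption you supply.

```cpp
#include <string>

// Hypothetical helper: keeps the folder and base name of inputPath and
// swaps the extension for newExt (for example ".html" or ".rtf").
std::string defaultOutputName(const std::string &inputPath,
                              const std::string &newExt) {
    std::string::size_type slash = inputPath.find_last_of("\\/");
    std::string::size_type dot   = inputPath.find_last_of('.');

    // Ignore dots that belong to a folder name rather than the file name.
    bool hasExt = dot != std::string::npos &&
                  (slash == std::string::npos || dot > slash);

    std::string stem = hasExt ? inputPath.substr(0, dot) : inputPath;
    return stem + newExt;
}
```

Comparing the computed name with the input name is one way to spot, before converting, the case where output and input would be the same file.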

NOTE:
When calling the Detagger API to remove markup, the output file may have the same name as the input file. This may lead to an error being reported.

Performing conversion between string buffers

You can convert between string buffers by calling the CONVERTER_DoStringConvert method.

If you're calling the DLL version of the API (e.g. from Visual Basic), then you'll need to call the "_Ptr" variant. If you do this, make sure you test the Result to check that the output buffer you supplied was large enough.

See comments in "Passing character data to and from the converter"


Performing mixed conversions

It's possible to convert from source files to string buffers, or to convert a string buffer into an output file. To do this you must first make calls to the desired Input and output specification methods and then call the general purpose method CONVERTER_DoConversion.

You should test the Result to ensure adequate inputs and outputs have been supplied.

If you want to do multiple conversions you may need to reset the input and output between calls.


Testing for success using the API return values and the "Result" argument

When using the API an initial call must be made to CONVERTER_Allocate. This returns a new handle that is required to be passed to all subsequent API calls.

All calls to subsequent API methods return a success code (see API return values). This code indicates only whether or not the call to the API is valid. Normally you would expect this to return the value CONV_OK (i.e. 0).

For those API methods that could fail, the argument list contains a writable Result field. On exit the value of the Result will be set to one of the API result codes. When no error is encountered, this will be returned as R_SUCCESS (i.e. 0). The possible error values vary from method to method.

So calling software should first test the return value to check the API call was okay, and then test the Result code variable to see what error (if any) has occurred.

e.g.

        long Result = R_SUCCESS;
        long APIStatus = CONV_OK;
        long Handle = 0;
        ...
 
        Handle = CONVERTER_Allocate();
 
        APIStatus = CONVERTER_<method> (Handle, ..., Result, ...);
        if (APIStatus == CONV_OK && Result == R_SUCCESS) {
            cout << "It worked!" << endl;
        }
 
        ...
 
        APIStatus = CONVERTER_Free(Handle);


API return values

All of the API methods (except CONVERTER_Allocate which returns a handle) return a code indicating success or failure as follows

Status code       Value   Meaning
CONV_OK           0       Call to API was made. Check any
                          API result codes to see whether it worked
                          or not.
CONV_FAILED       1       Call to API failed
CONV_INVHANDLE    2       Invalid API handle passed in


API result codes

Several of the API methods (especially the conversion methods) accept a "Result" variable, into which a result code is written. This result value is set as follows :-


Result code           Value   Meaning
R_SUCCESS             0       API call succeeded
R_NOTEXECUTED         1       API call not made. Usually indicates
                              CONVERTER call was bad (e.g. invalid
                              handle passed in)
R_NULLARG             2       Null or empty argument passed where
                              not expected
R_BUFFERTOOSMALL      3       Write-back buffer is too small to
                              receive result
R_POLICYLOADERROR     4       Failed to load policy
R_CANTFINDFILE        5       Can't find input file
R_CANTOPENFILE        6       Can't open output file
R_CONVERSIONFAILED    7       Error during conversion
R_NOINPUTDEFINED      8       No input file or data buffer supplied
R_NOOUTPUTDEFINED     9       No output file or data buffer supplied


Here are some suggestions on how to handle the various error codes :-

R_NOTEXECUTED
The API call was not executed. This usually indicates that the converter has detected that some or all of the calling arguments were passed incorrectly. Try using CONVERTER_DebugAPI and CONVERTER_DebugAPILogMessage to identify the error.

If calling from Visual Basic, check that the correct argument types are defined and passed.

R_NULLARG
A NULL or empty argument has been passed where a value was expected. Treat as for R_NOTEXECUTED above.

R_BUFFERTOOSMALL
The supplied string buffer is too small to receive the requested data. Try again with a larger buffer. If you are attempting to read back the results of a conversion, see Checking the conversion results when using (char *) pointers.

R_POLICYLOADERROR
Failed to load policy value. Either the policy name was incorrect (check with the documentation), or the value was invalid. Check for any error messages generated by the converter - see Capturing error messages

R_CANTFINDFILE
The specified file couldn't be found

R_CANTOPENFILE
The specified file couldn't be opened. For output files this could be because the directory doesn't exist, or because the output file already exists and is currently open in another application. This last error is quite common with RTF files if you are looking at the previous results in Word.

R_CONVERSIONFAILED
Some major error has been detected during conversion. Check for any error messages generated by the converter - see Capturing error messages

R_NOINPUTDEFINED
You haven't yet specified an input file or supplied an input string buffer for the conversion.
See Setting up the input and output destinations

R_NOOUTPUTDEFINED
You haven't yet specified an output file or supplied an output string buffer for the conversion.
See Setting up the input and output destinations
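
For logging, it can help to turn a Result code into readable text. The following is a minimal sketch using the values and meanings from the result code table; the constants are copied here only so the fragment compiles on its own (with the real API they come from api_defines.h), and describeResult and shouldRetryWithLargerBuffer are hypothetical helper names.

```cpp
#include <string>

// Values copied from the tables above. With the real API,
// #include "api_defines.h" instead of defining these.
const long CONV_OK            = 0;
const long R_SUCCESS          = 0;
const long R_NOTEXECUTED      = 1;
const long R_NULLARG          = 2;
const long R_BUFFERTOOSMALL   = 3;
const long R_POLICYLOADERROR  = 4;
const long R_CANTFINDFILE     = 5;
const long R_CANTOPENFILE     = 6;
const long R_CONVERSIONFAILED = 7;
const long R_NOINPUTDEFINED   = 8;
const long R_NOOUTPUTDEFINED  = 9;

// Hypothetical helper: short description of a Result code.
std::string describeResult(long result) {
    switch (result) {
        case R_SUCCESS:          return "API call succeeded";
        case R_NOTEXECUTED:      return "API call not made";
        case R_NULLARG:          return "Null or empty argument passed";
        case R_BUFFERTOOSMALL:   return "Write-back buffer is too small";
        case R_POLICYLOADERROR:  return "Failed to load policy";
        case R_CANTFINDFILE:     return "Can't find input file";
        case R_CANTOPENFILE:     return "Can't open output file";
        case R_CONVERSIONFAILED: return "Error during conversion";
        case R_NOINPUTDEFINED:   return "No input file or data buffer supplied";
        case R_NOOUTPUTDEFINED:  return "No output file or data buffer supplied";
        default:                 return "Unknown result code";
    }
}

// True when the call itself was valid and the only problem was the size
// of the caller's buffer, so a retry with a larger buffer is worthwhile.
bool shouldRetryWithLargerBuffer(long apiStatus, long result) {
    return apiStatus == CONV_OK && result == R_BUFFERTOOSMALL;
}
```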


Passing character data to and from the converter

When to use string or (char *) "pointers" to pass character data

Many of the API methods require character data to be passed into and out of the methods. The converter code has been written in C++ and so using C++ string variables is the most natural and easy way to pass this data.

Unfortunately there are a number of situations in which using C++ string variables is not possible.

In these cases it is not possible to call API methods that have string arguments. To get round this, the API provides two variants of any method that passes text data. The alternative variant has the same name with "_Ptr" appended, and uses character pointers instead of string arguments, as follows :-

   Example:-
 
      DLL_DECLARE CONVERTER_ReadPolicyFile      (long   Handle,
                                                 string PolicyFileName,
                                                 long   &Result);

   becomes
 
      DLL_DECLARE CONVERTER_ReadPolicyFile_Ptr  (long   Handle,
                                                 char   *pPolicyFileName,
                                                 long   &Result);


    Example
 
      DLL_DECLARE CONVERTER_GetPolicyValue      (long   Handle,
                                                 string PolicyName,
                                                 string &PolicyValue,
                                                 long   &Result);

    becomes
 
      DLL_DECLARE CONVERTER_GetPolicyValue_Ptr  (long   Handle,
                                                 char   *pPolicyName,
                                                 char   *pPolicyValue,
                                                 long   &ValueBufferSize,
                                                 long   &Result);

Note in the above example that PolicyName is a read-only argument, while PolicyValue is an output argument, and so requires a buffer size passed.

Sample code using C++ strings

The following code fragment shows how to set a policy value, and how to interrogate it again, using string variables.

    string PolicyName, PolicyValue;
    long APIStatus, Result;

    PolicyName  = "Default font";
    PolicyValue = "Arial, regular, 10";

    // set the policy value
    APIStatus = CONVERTER_SetPolicyValue
        (APIHandle, PolicyName, PolicyValue, Result);

    ...
 
    // read back a policy value
    string Value;
    APIStatus = CONVERTER_GetPolicyValue
                                (APIHandle, "Page width", Value, Result);

    cout << "Page Width = " << Value.c_str() << endl;


Sample code using (char *) pointers

Here's the same code using (char *) pointers and the "_Ptr" variants

    #define MAX_POLICYNAME_LEN    255
    #define MAX_POLICYVALUE_LEN   255

    char *pPolicyName   = new char [MAX_POLICYNAME_LEN];
    char *pPolicyValue  = new char [MAX_POLICYVALUE_LEN];

    long APIStatus, Result;
 
    strcpy (pPolicyName,  "Default font");
    strcpy (pPolicyValue, "Arial, regular, 10");

    // set the policy value
    APIStatus = CONVERTER_SetPolicyValue_Ptr
        (APIHandle, pPolicyName, pPolicyValue, Result);
 
    ...
 
    // read back a policy value
    strcpy (pPolicyName, "Page width");
    strcpy (pPolicyValue, "");

    long PolicyBufferSize = MAX_POLICYVALUE_LEN;
    APIStatus = CONVERTER_GetPolicyValue_Ptr
        (APIHandle, pPolicyName, pPolicyValue, PolicyBufferSize, Result);

    // need to add extra checks on Result to see if buffer was big
    // enough

    cout << "Page Width = " << pPolicyValue << endl;

    // release the buffers once they are no longer needed
    delete [] pPolicyName;
    delete [] pPolicyValue;



Checking the conversion results when using (char *) pointers

When using the "_Ptr" variant of the method to set up an output buffer, there is the possibility that the buffer you supply will turn out to be too small when you come to do the conversion.

When this situation arises, the Result returned by the conversion method will be R_BUFFERTOOSMALL.

Rather than requiring you to do the conversion a second time, with a bigger buffer, the API will hold onto an internal copy of the results, which you can retrieve any time up until you start on the next conversion.

To access this you first make a call to CONVERTER_GetOutCharArraySize to find out how large a buffer is required to receive this data. Create a buffer of the required size, and then call CONVERTER_GetOutCharArray_Ptr to actually retrieve the conversion results.

    APIStatus = CONVERTER_DoConversion (APIHandle, ConvertType, Result);
    if (APIStatus != CONV_OK) return EXIT_FAILED;

    // Conversion worked, but the output buffer may be too small.  Check
    // this, and if necessary re-allocate the buffer.  The converter will
    // internally still hold onto a copy of the output until you call the
    // free function, so you will be able to simply ask for the result once
    // you supply a big enough buffer

    if (Result == R_BUFFERTOOSMALL) {

      long Length = 0;
 
      // Find out what size buffer is required
      APIStatus = CONVERTER_GetOutCharArraySize (APIHandle, Length, Result);

      if (Result == R_SUCCESS) {

        char *pBigBuffer = new char [Length];

        // read back the result into the new, big enough, buffer
        APIStatus = CONVERTER_GetOutCharArray_Ptr
                                (APIHandle, pBigBuffer, Length, Result);

        if (Result == R_SUCCESS) cout << pBigBuffer << endl;

        delete [] pBigBuffer;

      }

    } // if buffer was too small
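The retry pattern can also be wrapped in a helper that returns a C++ string, so that the calling code never has to manage the buffers itself. The sketch below is self-contained and therefore uses a stand-in, DemoConverter, rather than the real API. The DEMO_ constants are placeholder values, and the assumption that the reported size includes the terminating null is ours, not something this manual states.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Placeholder result codes for this sketch; the real values come from the
// API headers.
const long DEMO_R_SUCCESS        = 0;
const long DEMO_R_BUFFERTOOSMALL = 1;

// Stand-in for the converter.  Like the real one it keeps an internal copy
// of the last output, which is what the retry pattern relies on.
struct DemoConverter {
    std::string LastOutput;

    // Plays the role of a "_Ptr" conversion writing into a caller buffer.
    void Convert(const std::string &In, char *pOut, long OutSize, long &Result) {
        LastOutput = "<p>" + In + "</p>";
        CopyOut(pOut, OutSize, Result);
    }
    // Plays the role of CONVERTER_GetOutCharArraySize.
    void GetOutSize(long &Size, long &Result) const {
        Size   = static_cast<long>(LastOutput.size()) + 1;  // room for the null
        Result = DEMO_R_SUCCESS;
    }
    // Plays the role of CONVERTER_GetOutCharArray_Ptr.
    void GetOut(char *pOut, long &OutSize, long &Result) const {
        CopyOut(pOut, OutSize, Result);
    }
private:
    void CopyOut(char *pOut, long OutSize, long &Result) const {
        if (OutSize < static_cast<long>(LastOutput.size()) + 1) {
            Result = DEMO_R_BUFFERTOOSMALL;
            return;
        }
        std::memcpy(pOut, LastOutput.c_str(), LastOutput.size() + 1);
        Result = DEMO_R_SUCCESS;
    }
};

// Converts In, starting with a buffer of FirstGuess chars and fetching the
// retained output into a larger buffer if the first one was too small.
// Returns false if the output could not be retrieved.
bool ConvertWithRetry(DemoConverter &Conv, const std::string &In,
                      long FirstGuess, std::string &Out)
{
    std::vector<char> Buffer(static_cast<std::size_t>(FirstGuess > 0 ? FirstGuess : 1));
    long Result = DEMO_R_SUCCESS;
    Conv.Convert(In, Buffer.data(), static_cast<long>(Buffer.size()), Result);

    if (Result == DEMO_R_BUFFERTOOSMALL) {
        long Needed = 0;
        Conv.GetOutSize(Needed, Result);
        if (Result != DEMO_R_SUCCESS) return false;

        Buffer.assign(static_cast<std::size_t>(Needed), '\0');
        Conv.GetOut(Buffer.data(), Needed, Result);
    }
    if (Result != DEMO_R_SUCCESS) return false;

    Out.assign(Buffer.data());   // copies up to the terminating null
    return true;
}
```

Using a std::vector for the buffer means there is no delete [] to forget on an early return.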

Passing Unicode data to the converter

New in version 2.3.2
The API was not originally designed with Unicode in mind, and support for Unicode text has been added gradually over time. As a result, earlier versions of the API may not support all the features described in this manual. If in doubt, please contact JafSoft for details.

The various Unicode implementations

New in version 2.3.2
Traditional single-byte character sets interpret the 8-bit character values (128-255) as special characters, and which characters these are depends on the machine. On a Russian machine they would be interpreted as Cyrillic, but on a different machine the same values could be read (wrongly) as Arabic (and vice versa). On most English-based PCs, the 8-bit characters are used for the accented characters found in certain European languages, so a Russian text would appear to have lots of accented 'i's, 'e's and 'a's.

Unicode is a way of implementing text that supports multiple types of character sets at the same time so that - for example - it is possible to display Chinese and Cyrillic on the same page unambiguously. It does this by allocating each character in each language a unique code value, so that codes used for Cyrillic characters no longer overlap and conflict with those assigned to Arabic.

However, these code values are in most cases larger than can be represented in a single byte. As a result a way has to be chosen to represent each character by one or more bytes.

The following Unicode representations are commonly used

UTF-8
Each character is represented by 1 to 4 bytes, depending on which range the Unicode code value falls into (characters in the most commonly used range need at most 3). This has the advantage that all ASCII characters are a single byte, so for example every character of the HTML tags in a document is represented by a single byte. It also means there are no null bytes contained in the text, which can make programming software to work with this text easier.

UTF-16
Each character is represented by a 2-byte pair (characters outside the basic range require two such pairs). The 2-byte pair is just the numerical representation of the Unicode value of the character. This makes the files easier to interpret, but also means that the byte order depends on how the machine stores its bytes - i.e. is the machine big-endian or little-endian. Because ASCII characters have a Unicode value less than 128, they map onto byte pairs in which one of the bytes is null. Because each character requires two bytes, a single byte wrongly inserted into a UTF-16 stream will render all the text that follows as gibberish.
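The difference between the two representations can be seen by encoding a few code values by hand. The following is a generic sketch of the standard UTF-8 and UTF-16 encoding rules, not code taken from the converters; EncodeUtf8 and EncodeUtf16 are names chosen for this illustration, and no validation of the input value is attempted.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Encodes one Unicode code value (0 to 0x10FFFF) as UTF-8 bytes.
std::string EncodeUtf8(std::uint32_t cp)
{
    std::string out;
    if (cp < 0x80) {                       // ASCII: a single byte
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // two bytes
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // three bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // four bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Encodes one Unicode code value as UTF-16 two-byte units.  Values of
// 0x10000 and above need two units (a "surrogate pair").
std::vector<std::uint16_t> EncodeUtf16(std::uint32_t cp)
{
    std::vector<std::uint16_t> out;
    if (cp < 0x10000) {
        out.push_back(static_cast<std::uint16_t>(cp));
    } else {
        cp -= 0x10000;
        out.push_back(static_cast<std::uint16_t>(0xD800 | (cp >> 10)));
        out.push_back(static_cast<std::uint16_t>(0xDC00 | (cp & 0x3FF)));
    }
    return out;
}
```

For example the letter 'A' (0x41) is the single byte 0x41 in UTF-8 but the unit 0x0041 in UTF-16, whose upper byte is null, while the Cyrillic letter with code value 0x416 becomes the two bytes 0xD0 0x96 in UTF-8.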

Files that contain Unicode identify themselves by inserting a "Byte Order Mark" (BOM) at the top of the file. This is a two-byte marker for UTF-16 files and a three-byte marker for UTF-8 files. Modern applications will test for this byte marker and if present will then know how to interpret the contents of the file. For example Notepad as supplied with Windows XP can do this, whereas Notepad as supplied with Windows 98 could not.

In UTF-16 each character is represented by two bytes, and computers can store a two-byte value in different ways (known as "big-endian" and "little-endian"). Each operating system uses one method or another and it isn't usually an issue, but when Unicode files get passed from one machine to another, this becomes important. The BOM allows the two forms of UTF-16 (known as "UTF-16BE" and "UTF-16LE") to be distinguished.
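Testing for a BOM only requires looking at the first two or three bytes of the data. The sketch below uses the standard marker values (EF BB BF for UTF-8, FE FF for UTF-16BE and FF FE for UTF-16LE). DetectBom and BomType are names invented for this illustration and are not part of the API, and the sketch makes no attempt to recognise other encodings.

```cpp
#include <cstddef>
#include <string>

enum class BomType { None, Utf8, Utf16BE, Utf16LE };

// Inspects the first bytes of a buffer and reports which Byte Order Mark,
// if any, it starts with.
BomType DetectBom(const std::string &data)
{
    // read byte i as an unsigned value so it can be compared with 0xEF etc.
    auto byte = [&](std::size_t i) {
        return static_cast<unsigned char>(data[i]);
    };

    if (data.size() >= 3 && byte(0) == 0xEF && byte(1) == 0xBB && byte(2) == 0xBF)
        return BomType::Utf8;
    if (data.size() >= 2 && byte(0) == 0xFE && byte(1) == 0xFF)
        return BomType::Utf16BE;
    if (data.size() >= 2 && byte(0) == 0xFF && byte(1) == 0xFE)
        return BomType::Utf16LE;
    return BomType::None;
}
```

A result of BomType::None does not prove the text is not Unicode, only that it does not identify itself as such.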


How the API handles Unicode internally

New in version 2.3.2
Internally the API makes extensive use of the C runtime library, and so effectively assumes that the text it is processing is free from null characters. This means that the API cannot handle UTF-16 internally in its native form, as the two-byte implementation contains nulls in one of the bytes for each ASCII character present.

This means that the API will convert any detected Unicode characters into UTF-8.
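The API's own conversion code is not reproduced in this manual. As an illustration of what such a conversion involves, here is a generic sketch that turns raw UTF-16 bytes into UTF-8 using the standard encoding rules. Utf16BytesToUtf8 and AppendUtf8 are hypothetical names, and replacing malformed input with the replacement character U+FFFD is a choice made for this sketch only.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Appends one Unicode code value to out as UTF-8.
void AppendUtf8(std::uint32_t cp, std::string &out)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Converts raw UTF-16 bytes (with any BOM already removed) into UTF-8.
// Unpaired surrogates and an odd trailing byte become U+FFFD.
std::string Utf16BytesToUtf8(const std::string &bytes, bool bigEndian)
{
    const std::size_t n = bytes.size();

    // reads the two-byte unit starting at offset i in the requested order
    auto unit = [&](std::size_t i) -> std::uint32_t {
        const std::uint32_t b0 = static_cast<unsigned char>(bytes[i]);
        const std::uint32_t b1 = static_cast<unsigned char>(bytes[i + 1]);
        return bigEndian ? ((b0 << 8) | b1) : ((b1 << 8) | b0);
    };

    std::string out;
    std::size_t i = 0;
    while (i + 1 < n) {
        std::uint32_t cp = unit(i);
        i += 2;
        if (cp >= 0xD800 && cp <= 0xDBFF) {              // high surrogate
            const std::uint32_t lo = (i + 1 < n) ? unit(i) : 0;
            if (lo >= 0xDC00 && lo <= 0xDFFF) {          // valid pair
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                i += 2;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {       // stray low surrogate
            cp = 0xFFFD;
        }
        AppendUtf8(cp, out);
    }
    if (i < n) AppendUtf8(0xFFFD, out);                  // odd trailing byte
    return out;
}
```

Note that the null bytes which accompany ASCII characters in UTF-16 disappear in the UTF-8 result, which is why the UTF-8 form is safe to pass to C runtime string functions.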

How the API detects the presence of Unicode

New in version 2.3.2
The API considers that the input text is Unicode under the following circumstances

Doing file-to-file conversions

New in version 2.3.2
For file-to-file conversion, the API will normally detect the presence of Unicode by spotting the Byte Order Marks (BOM) at the top of the input file.

Alternatively, if the input file is an HTML file, any HTML entities that map onto Unicode characters will mark the input as being Unicode.

Internally the output text will be calculated as UTF-8 encoded text. When this is output to file, the UTF-8 BOM is added to the output file.

Thus any type of properly identified Unicode file on input will result in a valid UTF-8 file being created as output.


Doing string-to-string conversions

New in version 2.3.2
When calling the API to do string-to-string conversions, it is unlikely that the Byte Order Marks (BOM) that identify files as being Unicode will be present. This means you will probably have to "tell" the API that the text is Unicode. How you do this depends on the way the text is encoded.

See Using the "input text encoding" policy and Using the "output text encoding" policy


Using the "input text encoding" policy

New in version 2.3.2
The program has the ability to detect Unicode files on input if a Byte Order Mark (BOM) is present. The Detagger API also has the ability - under some circumstances - to detect when Unicode HTML entities are present in the input text.

However in files without the BOMs, or when passed string data as input, the software may fail to detect the input is Unicode.

In such circumstances this policy allows you to tell the software that the input should be treated as Unicode. The possible values for this policy are

auto        automatic detection (the default)
UTF8        UTF-8
UTF16-BE    UTF-16 "Big Endian"
UTF16-LE    UTF-16 "Little Endian"
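If your own code already knows the encoding of the text it holds, a small helper can turn that knowledge into the policy value. This is only a sketch: TextEncoding and InputEncodingPolicyValue are hypothetical names, and the strings are spelled as in the list of possible values above. The summary table later in this manual writes the values with the hyphen in a different place, so check which spelling your version accepts.

```cpp
#include <string>

enum class TextEncoding { Auto, Utf8, Utf16BigEndian, Utf16LittleEndian };

// Returns the "input text encoding" policy value for a known encoding,
// using the spellings from the list of possible values.
std::string InputEncodingPolicyValue(TextEncoding enc)
{
    switch (enc) {
        case TextEncoding::Utf8:              return "UTF8";
        case TextEncoding::Utf16BigEndian:    return "UTF16-BE";
        case TextEncoding::Utf16LittleEndian: return "UTF16-LE";
        case TextEncoding::Auto:              break;
    }
    return "auto";   // leave detection to the converter
}
```

The returned string would then be supplied as the value when setting the policy, for example through CONVERTER_SetPolicyValue, before the conversion is run.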


Using the "output text encoding" policy

New in version 2.3.2
When outputting to file the API will create a Unicode (UTF8) file whenever it detects (or is told) that the input contains Unicode.

However under some circumstances it may be necessary to use the API to output to a UTF16 string, as opposed to a UTF8 or ASCII string.

In those circumstances this policy - which is only meant for use with APIs - allows you to specify the output encoding of the text returned by the API. As with the "input text encoding" policy the possible values are

auto        automatic detection (the default)
UTF8        UTF-8
UTF16-BE    UTF-16 "Big Endian"
UTF16-LE    UTF-16 "Little Endian"


Summary of Unicode usage

New in version 2.3.2
This table summarises how you should use the API when specifying the input and/or output locations of Unicode text.




                  UTF-8                     UTF-16

Input is file     Just pass in the file     Just pass in the file
with BOM          name.                     name.

Input is file     Pass in file name and     Pass in file name and
without BOM       set the "input text       set the "input text
                  encoding" policy to       encoding" policy to be
                  be "UTF-8".               either "UTF-16LE" or
                                            "UTF-16BE" according
                                            to the endian-ness.

Input is a        Call string or "_Ptr"     Call the "_utf16"
string            method and set the        method and set the
                  "input text encoding"     "input text encoding"
                  policy to "UTF-8".        policy to "UTF-16LE" or
                                            "UTF-16BE" according
                                            to the endian-ness.

Output to file    Just pass in the file     Just pass in the file
                  name. Output will be      name. Output will be
                  a UTF-8 file.             a UTF-8 file.

Output to         Call string or "_Ptr"     Call the "_utf16"
string            method to get the         method to get the
                  result. Output will       result. Output will be
                  be UTF-8 text.            UTF-16 with the
                                            endian-ness you
                                            requested.





Capturing error messages

The API can generate a number of progress messages, as well as error messages that will help diagnose any problems.

When calling the API from C++, it is possible to establish some callback routines that get called each time a message would be output to the output or error streams.
See Error reporting methods

When calling the API from other languages, such as Visual Basic, this level of integration isn't possible. In that situation you might want to use the debug options to switch on logging. In this way the output can be diverted into a log file.
See Debugging methods.

Finally, after the conversion is complete, you can fetch the last error message displayed. This isn't always useful as the last error message isn't always the most significant, but it may help.
See CONVERTER_GetLastMessage


The API demonstration package

Evaluation copies of all of the APIs are available online at http://www.jafsoft.com/developers/api_demos.html

There you can download an evaluation copy, which also contains a demonstration kit (DemoAPI.zip). The demonstration kit includes sample code for C++ and Visual Basic, showing how the converter can be called from your code. It also contains example files for other languages supplied by users who have managed to integrate the APIs into their systems. These other files are supplied on an "as is" basis, and may not always be up to date with the current API implementation.

These evaluation copies include DLLs that are not time-limited, but which have other limitations, e.g. limits on how many files can be converted in a wildcard operation, watermarking of the output data, and conversion of occasional words or lines into UPPER case. It is hoped that these limitations will not overly interfere with your evaluation of the API. If you feel they do, please email info<at>jafsoft.com indicating your reasons, and we will see what we can do (replace "<at>" by "@").

Should you decide to register the API, you will be supplied with full versions of the .DLL and .LIB files which do not have these built-in restrictions.


API methods

Initialise and release methods

Before the converter is used, a call must be made to CONVERTER_Allocate. This will create a new converter object and return a Handle that must be passed to all subsequent API calls, so that they know which converter object is to be used.

Once you have finished, you should call CONVERTER_Free to release the converter object. This will free the memory and other resources allocated to the API object.

Note:
Failure to call this method will cause a memory leak. Since the converter will often hold onto a copy of the results of the last conversion, this memory could be similar in size to the last file converted, and hence quite large.

CONVERTER_Allocate

    DLL_DECLARE CONVERTER_Allocate ();

This method must be called first to allocate an API resource. It should return a non-zero Handle if it succeeds, and that Handle should be passed in to all remaining API calls.


CONVERTER_Free

    DLL_DECLARE CONVERTER_Free          (long &Handle);

After all conversions are complete, this method should be called to release the resource. The resource is freed, and all memory allocated during the conversion will be released. Since the API typically keeps a copy of the last conversion, this can be a variable amount of memory, comparable to the size of the largest file converted.

On exit the Handle will have been reset to 0, preventing its reuse in later API calls.

Note:
For this reason the Handle is passed by reference (unlike in most other API calls), so that it can be reset to 0.
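In C++ one way to make sure the free call is never missed is to tie it to the lifetime of an object. The sketch below shows the idea. To keep it self-contained it calls two stand-in functions, Demo_Allocate and Demo_Free, which are NOT the real API. ConverterHandle is a hypothetical class name, and the sketch assumes that the real allocate method returns the Handle as a long.

```cpp
// Stand-ins for CONVERTER_Allocate and CONVERTER_Free so that the sketch
// compiles on its own.  They only count how many handles are outstanding.
long g_LiveHandles = 0;

long Demo_Allocate()
{
    ++g_LiveHandles;
    return 42;                      // any non-zero value means success
}

void Demo_Free(long &Handle)
{
    if (Handle != 0) {
        --g_LiveHandles;
        Handle = 0;                 // reset, as the real free method does
    }
}

// Owns one converter handle and frees it when it goes out of scope, so an
// early return or an exception cannot leak the converter object.
class ConverterHandle {
public:
    ConverterHandle() : m_Handle(Demo_Allocate()) {}
    ~ConverterHandle() { Demo_Free(m_Handle); }

    // a handle must have exactly one owner, so copying is disabled
    ConverterHandle(const ConverterHandle &) = delete;
    ConverterHandle &operator=(const ConverterHandle &) = delete;

    bool IsValid() const { return m_Handle != 0; }
    long Get() const     { return m_Handle; }

private:
    long m_Handle;
};
```

In real code the constructor and destructor would call CONVERTER_Allocate and CONVERTER_Free, and Get() would supply the Handle argument for the other API calls.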

Policy manipulation methods

The conversion process can be fine tuned using "policies". Policies are program options that can be used to influence the conversion. Which policies are available varies from converter to converter, although some policies are supported by multiple converters.

You should see the program's documentation and the Policy Manual for details of individual policies.

In each case a policy consists of a "policy phrase" and a value. Policies can be placed in a text file, one per line, known as a policy file. The API supports the loading of existing policy files, and/or the setting of individual policies.


CONVERTER_ResetPolicies

    DLL_DECLARE CONVERTER_ResetPolicies (long Handle);

When called this will reset all policies back to default values. You might want to call this between conversions using the same API if you wanted to apply different policies each time. It wouldn't be necessary if you wanted to apply the same policies each time.


CONVERTER_ReadPolicyFile

    DLL_DECLARE CONVERTER_ReadPolicyFile        (long   Handle,
                                                 string PolicyFileName,
                                                 long   &Result);

    DLL_DECLARE CONVERTER_ReadPolicyFile_Ptr    (long   Handle,
                                                 char   *pPolicyFileName,
                                                 long   &Result);

These methods accept the name of a policy file, and will load the policies in that file into the API object. You should test the Result to check that the file was found okay.


CONVERTER_WritePolicyFile

    DLL_DECLARE CONVERTER_WritePolicyFile       (long   Handle,
                                                 string PolicyFileName,
                                                 long   ShowAllPolicies,
                                                 long   &Result);

    DLL_DECLARE CONVERTER_WritePolicyFile_Ptr   (long   Handle,
                                                 char   *pPolicyFileName,
                                                 long   ShowAllPolicies,
                                                 long   &Result);

These methods allow you to dump the actual policies used during a conversion. This can be used to check that the policies you set were indeed used, or to see what values the analysis policies (such as page width) were set to by the API. Sometimes looking at post-conversion policies helps diagnose problematic conversions.

The ShowAllPolicies value should be set as follows

Symbol                    Value   Explanation
INCREMENTAL_POLICY_FILE   0       save only those policies that were
                                  loaded or changed to file
FULL_POLICY_FILE          1       save all policies to file. Only
                                  recommended for diagnostic and
                                  documentation purposes


You can elect to show (almost) all policy values, or only those which have been "Loaded" or "Edited". The "almost" refers to the fact that only policies which may be meaningfully re-loaded from file are saved.


CONVERTER_SetPolicyValue

    DLL_DECLARE CONVERTER_SetPolicyValue        (long   Handle,
                                                 string PolicyName,
                                                 string TextValue,
                                                 long   &Result);

    DLL_DECLARE CONVERTER_SetPolicyValue_Ptr    (long   Handle,
                                                 char   *pPolicyName,
                                                 char   *pTextValue,
                                                 long   &Result);

Sets an individual policy by name. You should test the value of Result to see if the call worked. The commonest cause of failure would be a typo in the policy name.

See Customising the conversion using policies


CONVERTER_GetPolicyValue

    DLL_DECLARE CONVERTER_GetPolicyValue        (long   Handle,
                                                 string PolicyName,
                                                 string &PolicyValue,
                                                 long   &Result);

    DLL_DECLARE CONVERTER_GetPolicyValue_Ptr    (long   Handle,
                                                 char   *pPolicyName,
                                                 char   *pValue,
                                                 long   &ValueBufferSize,
                                                 long   &Result);

Interrogates the current value of a named policy. You might use this, for instance, to ask the program what it calculated the page width to be after the conversion.

Check the value of Result to ensure the PolicyName was valid. If an error is detected the value is set to

"*** GetPolicyValue Error ***";

to distinguish it from any other value.
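As an extra safeguard alongside the test on Result, calling code can also compare the returned value with this marker text. A minimal sketch, in which IsPolicyValueError is a hypothetical helper name and the comparison assumes the marker is returned exactly as quoted above:

```cpp
#include <string>

// The marker text that GetPolicyValue places in the value on error.
const std::string kGetPolicyValueError = "*** GetPolicyValue Error ***";

// Returns true if a value read back from the API is the error marker
// rather than a genuine policy value.
bool IsPolicyValueError(const std::string &Value)
{
    return Value == kGetPolicyValueError;
}
```

This is no substitute for checking Result, which remains the primary indication of whether the call worked.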

See Customising the conversion using policies


Input and output specification methods

The API can accept input from either file or a passed string, and can output the results to either a file or a string buffer. You can use any combination you wish, but as a special case if you only supply an input filename, the converter will default to creating an output file in the same folder, with the same name, but a different extension (one more suited to the output format).

Depending on the conversion method called, you can either pass in filenames or string buffers to the conversion method directly, or you can set these up before conversion.

If you want to do mixed conversion, (e.g. from file into string), then you'll need to call these methods first to set up the input and output options.

If you are doing multiple conversions with the same API object, you may need to reset the input source and output targets between conversions.

See Conversion methods


CONVERTER_ResetSources

    DLL_DECLARE CONVERTER_ResetSources          (long   Handle,
                                                 long   &Result);

When called this will reset to null the input source and output targets for the API. This means you will either have to set up new locations before the next conversion, or choose a conversion method which allows you to specify those sources. Failure to do so will result in an error message.


CONVERTER_ResetInputSource

    DLL_DECLARE CONVERTER_ResetInputSource      (long   Handle,
                                                 long   &Result);

When called this will nullify the input source. You will need to specify a new source before the next conversion, or choose a conversion method that allows you to specify a source.


CONVERTER_ResetOutputSource

    DLL_DECLARE CONVERTER_ResetOutputSource     (long   Handle,
                                                 long   &Result);

When called this will nullify the output target. You will need to specify a new target before the next conversion, or choose a conversion method that allows you to specify a target.

The exception is file conversion, where a default output file can be inferred from the input file name (same folder, same name, different extension).
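If you want to know in advance where such a default output file will end up, the rule of "same folder, same name, different extension" can be mimicked in your own code. The following is an illustrative sketch only: DefaultOutputName is a hypothetical helper, the extension is supplied by the caller, and the converter's own logic for choosing the name is not documented here and may differ in detail.

```cpp
#include <string>

// Builds an output name from an input name: same folder, same base name,
// new extension.  NewExt is given without the leading dot, e.g. "html".
std::string DefaultOutputName(const std::string &InputName,
                              const std::string &NewExt)
{
    const std::string::size_type Slash = InputName.find_last_of("/\\");
    const std::string::size_type Dot   = InputName.find_last_of('.');

    // Only treat the dot as an extension separator if it lies inside the
    // file name itself (after the last path separator) and is not the
    // first character of that name.
    const bool HasExt =
        Dot != std::string::npos &&
        (Slash == std::string::npos ? Dot > 0 : Dot > Slash + 1);

    const std::string Stem = HasExt ? InputName.substr(0, Dot) : InputName;
    return Stem + "." + NewExt;
}
```

For example an input of C:\docs\report.txt with the extension "html" gives C:\docs\report.html.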


CONVERTER_SetInputString

    DLL_DECLARE CONVERTER_SetInputString        (long   Handle,
                                                 string Instring,
                                                 long   &Result);

    DLL_DECLARE CONVERTER_SetInputString_Ptr    (long   Handle,
                                                 char   *pInstring,
                                                 long   &Result);

When called this sets up the input for the next conversion to be the passed string data. Once you have also set up the output target, you can then call CONVERTER_DoConversion.

If the output is also a string buffer, you should consider calling CONVERTER_DoStringConvert which negates the need to call this method first.


CONVERTER_SetOutputString

    DLL_DECLARE CONVERTER_SetOutputString       (long   Handle,
                                                 string &Outstring,
                                                 long   &Result);

    DLL_DECLARE CONVERTER_SetOutputString_Ptr   (long   Handle,
                                                 char   *pOutputString,
                                                 long   OutputBufferSize,
                                                 long   &Result);

When called this sets up the output target for the next conversion to be the passed string buffer. Once you have also set up the input source, you can then call CONVERTER_DoConversion.

If the input is also a string buffer, you should consider calling CONVERTER_DoStringConvert which negates the need to call this method first.

When calling the "_Ptr" version of this method, be aware that the passed buffer may end up being too small.
See the discussion in Passing character data to and from the converter


CONVERTER_SetInputFilename

    DLL_DECLARE CONVERTER_SetInputFilename      (long   Handle,
                                                 string Filename,
                                                 long   &Result);

    DLL_DECLARE CONVERTER_SetInputFilename_Ptr  (long   Handle,
                                                 char   *pFilename,
                                                 long   &Result);

When called this sets up the input for the next conversion to be the specified file. Once you have also set up the output target, you can then call CONVERTER_DoConversion.

If the output is also a file, you should consider calling CONVERTER_DoFileConvert which negates the need to call this method first.


CONVERTER_SetOutputFilename

    DLL_DECLARE CONVERTER_SetOutputFilename     (long   Handle,
                                                 string Filename,
                                                 long   &Result);

    DLL_DECLARE CONVERTER_SetOutputFilename_Ptr (long   Handle,
                                                 char   *pFilename,
                                                 long   &Result);

When called this sets up the output target for the next conversion to be the specified file. Once you have also set up the input source, you can then call CONVERTER_DoConversion.

If the input is also a file, you should consider calling CONVERTER_DoFileConvert which negates the need to call this method first.

Note:
If an input file is specified the output "filename" needn't be complete. By default the output file is placed in the same folder, and with the same name but different extension as the input file.

See discussion in performing conversion between files


CONVERTER_GetOutCharArraySize

    DLL_DECLARE CONVERTER_GetOutCharArraySize ( long    Handle,
                                                long    &Size,
                                                long    &Result );

When using (char *) buffers with the API there is the possibility that the buffer passed to the API may be too small. This method can be called after the conversion to determine the size of buffer required to receive the results.

See Checking the conversion results when using (char *) pointers


CONVERTER_GetOutCharArray_Ptr

    DLL_DECLARE CONVERTER_GetOutCharArray_Ptr ( long    Handle,
                                                char    *pArray,
                                                long    &OutArraySize,
                                                long    &Result);

When using (char *) buffers with the API there is the possibility that the buffer passed to the API may be too small. This method can be called after the conversion to retrieve the results of the last conversion. A call should first be made to CONVERTER_GetOutCharArraySize to determine how large the buffer passed into this method should be, otherwise the Result may again be R_BUFFERTOOSMALL.

See Checking the conversion results when using (char *) pointers


Conversion methods

There are a number of methods to actually perform the conversion, depending on whether or not you want to set up the input source and output destination before calling the execution method.

CONVERTER_DoFileConvert     Call this method if you want to
                            convert an input file into an
                            output file
CONVERTER_DoStringConvert   Call this method if you want to
                            convert from one string buffer into
                            another
CONVERTER_DoConversion      For all other conversions, use this
                            method. You will need to set up
                            the input and output locations by
                            calling other methods before calling
                            this one.
See Setting up the input and output destinations

In each case you should test that the API method returns the value CONV_OK (see API return values), and that the Result argument is returned as R_SUCCESS (see API result codes).

If you are using a string buffer as the output location, and are using the "_Ptr" variants of methods, then bear in mind that the buffer may prove to be too small.
See the discussion in Checking the conversion results when using (char *) pointers

Bear in mind that while the conversion may appear to work, the converter may still have encountered problems with parts of the text. These will be reported as errors and warnings via the error reporting methods. To see those messages you will either need to establish error reporting callback functions (available via C++ only), or enable some debugging.

See Error reporting methods and Debugging methods.


CONVERTER_DoConversion

    DLL_DECLARE CONVERTER_DoConversion  (       long    Handle,
                                                long    ConvType,
                                                long    &Result);

This method should be called to execute a "mixed mode" conversion, i.e. one in which the input is a file, and the output is a string buffer or vice versa.

See also the discussion in Conversion methods.


CONVERTER_DoFileConvert

    DLL_DECLARE CONVERTER_DoFileConvert (       long    Handle,
                                                long    ConvType,
                                                string  InFilename,
                                                string  OutFilename,
                                                long    &Result);

    DLL_DECLARE CONVERTER_DoFileConvert_Ptr (   long    Handle,
                                                long    ConvType,
                                                char    *pInFilename,
                                                char    *pOutFilename,
                                                long    &Result);

This method should be called to execute a file conversion, i.e. one in which both the input and output are files.
See performing conversion between files

See also the discussion in Conversion methods.


CONVERTER_DoStringConvert

    DLL_DECLARE CONVERTER_DoStringConvert(      long    Handle,
                                                long    ConvType,
                                                string  InText,
                                                string  OutText,
                                                long    &Result);

    DLL_DECLARE CONVERTER_DoStringConvert_Ptr ( long    Handle,
                                                long    ConvType,
                                                char    *pInText,
                                                char    *pOutText,
                                                long    &OutTextSize,
                                                long    &Result);

This method should be called to execute a string conversion, i.e. one in which both the input and output are string buffers. If you are using the (char *) method for passing text (using the "_Ptr" variant), you'll need to check the output buffer was large enough (see Checking the conversion results when using (char *) pointers).

See also the discussion in Conversion methods.


Error reporting methods

During the conversion the API will generate a number of messages indicating progress and problems with the conversion itself. These messages won't normally represent a total failure of conversion, but may act as warnings that some aspects of the conversion may not have proceeded as expected.

In C++, it is possible to establish callback functions to capture and report these messages.

When calling the API from other programming languages these techniques cannot be used, and you would need to use the various Debugging methods that are available instead.


CONVERTER_SetErrorFn

    DLL_DECLARE CONVERTER_SetErrorFn            (long   Handle,
                                                 void (*pErrorFn) (const char *));

This method can be used when calling the API from C++ to capture messages that would be sent to the "error" stream. The supplied callback routine will be called each time that an "error" message is generated.

NOTE:
The argument has been changed from (char *) to (const char *) since an earlier version of the API.

CONVERTER_SetOutFn

    DLL_DECLARE CONVERTER_SetOutFn              (long   Handle,
                                                 void (*pErrorFn) (const char *));

This method can be used when calling the API from C++ to capture messages that would be sent to the "output" stream. The supplied callback routine will be called each time that an "informational" message is generated.

NOTE:
The argument has been changed from (char *) to (const char *) since an earlier version of the API.
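A pair of callbacks that simply store each message for later inspection might look like the sketch below. CollectError, CollectInfo and the two message lists are names invented for this example. The functions match the void (*)(const char *) type shown in the declarations above, but whether a particular calling convention is also required depends on how DLL_DECLARE is defined in your build.

```cpp
#include <string>
#include <vector>

// Messages captured so far.  The callbacks are plain function pointers and
// cannot carry state of their own, so this sketch uses file-scope storage.
std::vector<std::string> g_ErrorMessages;
std::vector<std::string> g_InfoMessages;

// Has the callback type expected by CONVERTER_SetErrorFn.
void CollectError(const char *pMessage)
{
    // guard against a null pointer before building the string
    g_ErrorMessages.push_back(pMessage ? pMessage : "");
}

// Has the callback type expected by CONVERTER_SetOutFn.
void CollectInfo(const char *pMessage)
{
    g_InfoMessages.push_back(pMessage ? pMessage : "");
}
```

These would be registered by passing CollectError to CONVERTER_SetErrorFn and CollectInfo to CONVERTER_SetOutFn before running the conversion, after which the two lists can be examined or written to your own log.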

Debugging methods

A number of methods exist to help you debug your use of the API, and to direct the output of the API to a log file.


CONVERTER_DebugAPI

    DLL_DECLARE CONVERTER_DebugAPI              (long Value);

This method is used to switch on/off the generation of debug messages each time an API method is called. These messages will show calls to the API, and the arguments passed. Some API calls will produce multiple entries, for example the "_Ptr" variants of methods often call their string based equivalents.

This call can be useful to help diagnose problems with the API, often caused by the incorrect passing of data, especially text arguments.

A Value of 1 switches on the messages, 0 switches them off. They are off by default.


CONVERTER_DebugAPILogMessage

    DLL_DECLARE CONVERTER_DebugAPILogMessage    (long Value, char *pLogName);

This method can be used to direct messages generated by the API into a log file. If enabled all messages generated by the API (including any Debug messages if CONVERTER_DebugAPI has been called) will be output to a log file. You may need to specify a complete directory path in the filename, as relative filenames may not work.

A Value of 1 switches on the logging, 0 switches it off. It is off by default.


CONVERTER_GetLastMessage

    DLL_DECLARE CONVERTER_GetLastMessage        (string &Message,
                                                 long   &Result);

    DLL_DECLARE CONVERTER_GetLastMessage_Ptr    (char   *pMessage,
                                                 long   &MessageSize,
                                                 long   &Result);

These methods may be used to retrieve the last message generated by the API. This can be useful in diagnosing problems, although sometimes the last message may not be the most important, and you may need to use some other techniques to capture all error messages generated during the conversion.


Converted from a single text file by AscToHTM
© 2001-2004 John A Fotheringham