Application Programming Interface

Command line, JavaScript, C and REST APIs that integrate parsers into systems.


The Nets parser has two kinds of API: application APIs and library APIs. Application APIs run Nets from the command line, as a web service or from C. Library APIs allow compiled grammars to be built using a set of C library functions. This section covers both kinds of API.

Command Line Interface

The Nets Command Line Interface (CLI) allows the parser to be executed from an operating system terminal, either on its own or as part of a script or batch process. The nets-parser command is followed by the command line parameters listed below. By convention, the same parameters are used in the RESTful API and the C API.

Name Description
input Defines the input file for the parser. Defaults to default.in.
output Defines the output file for the parser. Defaults to default.out.
error Defines the log file for the parser. Defaults to default.log.
loglevel Sets the initial loglevel for the parser.
grammar Defines the grammars for the parser. Defaults to default.g. May include multiple files separated by either '#' or ';'.
grammar_xml Defines the grammar XML output file for the parser. Defaults to default.g.xml.
nologo Omits the copyright and version information from the display.
encoding The input and output stream encoding in the form encoding="SOURCE/TARGET". The default encoding is "ASCII/ASCII".
grammar_encoding The grammar encoding in the form grammar_encoding="SOURCE/TARGET". The default grammar encoding is "ASCII/ASCII".
loadenv Load the system environment variables as context grammar entities.
grammar_libpath Load path for compiled grammar libraries and the 'command.g' grammar. (Not used in cloud environments for security reasons).
init Initialise the parser only and do not start. Can be used to check for configuration errors.
help Displays help information about command line parameters.
jobid Adds the job ID (instead of the thread ID) to each line in the error stream.
xxxxxx Any unknown parameter is interpreted as a context grammar entity.
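
The grammar parameter above may name several files separated by '#' or ';'. The following is a minimal sketch of how such a value could be split in C. The helper names count_grammar_files and get_grammar_file are hypothetical and are not part of the documented API, and skipping empty segments is an assumption about how the parser treats them.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper, not part of the documented API: counts the grammar
   file names in a 'grammar' parameter value, where names are separated by
   '#' or ';'. Empty segments (for example "a.g;;b.g") are skipped, which
   is an assumption. */
size_t count_grammar_files(const wchar_t *value)
{
    size_t count = 0;
    assert(value != NULL);
    while (*value != L'\0') {
        size_t len = wcscspn(value, L"#;");  /* length of the next name */
        if (len > 0)
            count++;
        value += len;
        if (*value != L'\0')
            value++;                          /* step over the separator */
    }
    return count;
}

/* Hypothetical helper: copies the index-th file name into out, which holds
   out_size wide characters. The result is always null-terminated.
   Returns 1 on success, or 0 if there is no such name or it does not fit. */
int get_grammar_file(const wchar_t *value, size_t index,
                     wchar_t *out, size_t out_size)
{
    size_t seen = 0;
    assert(value != NULL && out != NULL);
    while (*value != L'\0') {
        size_t len = wcscspn(value, L"#;");
        if (len > 0) {
            if (seen == index) {
                if (len + 1 > out_size)
                    return 0;                 /* name does not fit */
                wmemcpy(out, value, len);
                out[len] = L'\0';
                return 1;
            }
            seen++;
        }
        value += len;
        if (*value != L'\0')
            value++;
    }
    return 0;
}
```

For a value such as L"a.g#b.g;c.g", count_grammar_files reports three names and get_grammar_file with index 1 copies L"b.g".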

RESTful Interface

TBD

'C' API

The 'C' API includes the following structure and function definitions.

Parser Library

Parser Structure

The PARSER structure holds the state of the parser. It contains the following members.

Name Type Description
input STREAM The input stream for the parser.
output STREAM The output stream for the parser.
error STREAM * The error stream (pointer) for the parser.
mode pmode The current parser mode.
saved_mode pmode The saved parser mode.
doc NODE * The root node of the grammar DOM.
proc_node NODE * The current processing node within the grammar DOM.
state state_t enum uninitialised_parser, initialised_parser, started_parser, stopped_parser, paused_parser, ended_parser, errored_parser The current state of the parser.
result bool The result of the last rule in the parser.

The pmode structure holds the current mode of the parser. It contains the following members.

Name Type Description
echo bool If true the parser echoes input to output.
predicate enum predicate_t no_predicate, and_predicate, not_predicate, again_predicate The predicate state.
ignorecase bool If true ignores case sensitivity when comparing strings.
loglevel enum log_level_t MESSAGE=0, ERROR_=1, WARNING=2, INFO=3, DEBUG_=4 The current log level of the parser.
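
The pmode members above could be declared in C roughly as follows. This is a sketch reconstructed from the table, so the real header may order or name things differently. The mode_pair structure and the save_mode and restore_mode helpers are hypothetical. They only illustrate one plausible use of the mode and saved_mode members of PARSER, namely restoring the mode after a temporary change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Enumerations as listed in the pmode table. */
enum predicate_t { no_predicate, and_predicate, not_predicate, again_predicate };
enum log_level_t { MESSAGE = 0, ERROR_ = 1, WARNING = 2, INFO = 3, DEBUG_ = 4 };

/* Sketch of the pmode structure; member order is an assumption. */
typedef struct {
    bool echo;                   /* echo input to output */
    enum predicate_t predicate;  /* the predicate state */
    bool ignorecase;             /* ignore case when comparing strings */
    enum log_level_t loglevel;   /* current log level */
} pmode;

/* Hypothetical holder for the two mode members of PARSER. */
struct mode_pair {
    pmode mode;
    pmode saved_mode;
};

/* Hypothetical helper: remember the current mode. */
void save_mode(struct mode_pair *p)
{
    assert(p != NULL);
    p->saved_mode = p->mode;
}

/* Hypothetical helper: return to the remembered mode. */
void restore_mode(struct mode_pair *p)
{
    assert(p != NULL);
    p->mode = p->saved_mode;
}
```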

Parser Functions

The parser functions create, initialise, start and stop a parser instance, and move it between the states listed for the state member of the PARSER structure. The following table gives the prototype of each function and describes its effect.

Name Description
new_parser PARSER *new_parser();
Creates a new parser instance and returns a pointer to a PARSER type. Sets parser.state to uninitialised_parser.
delete_parser void delete_parser(PARSER *app);
Deletes the parser instance and frees memory.
init_parser2 int init_parser2(wchar_t *context);
Initialises the parser using name=value pairs. Sets parser.state to initialised_parser.
init_parser void init_parser(PARSER *app);
Initialises the parser using the context grammar defined in app->doc. Sets parser.state to initialised_parser.
start_parser2 int start_parser2(wchar_t *context);
Initialises and starts the parser using name=value pairs. Sets parser.state to started_parser.
start_parser void start_parser(PARSER *app);
Initialises and starts the parser using the context grammar defined in app->doc. Sets parser.state to started_parser.
error_parser void error_parser(PARSER *app);
Stops parsing with an error from the started_parser state. Sets parser.state to errored_parser.
end_parser void end_parser(PARSER *app);
Ends parsing normally from the started_parser state. Sets parser.state to ended_parser.
stop_parser void stop_parser(PARSER *app);
Stops parsing in response to a user request (SIGINT) in the started_parser state. Sets parser.state to stopped_parser.
pause_parser void pause_parser(PARSER *app);
Pauses parsing in response to a user request (SIGTSTP) or a breakpoint from the started_parser state. Sets parser.state to paused_parser.
resume_parser void resume_parser(PARSER *app);
Resumes parsing from the paused_parser state. Sets parser.state to started_parser.
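
The state changes described in the table above can be summarised as a small transition function. The sketch below is not the library's implementation. The parser_event type and the next_state function are hypothetical names. The allowed starting states for the init and start events are assumptions, since the table only states the starting state for the error, end, stop, pause and resume functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Parser states as listed for the state member of PARSER. */
typedef enum {
    uninitialised_parser, initialised_parser, started_parser, stopped_parser,
    paused_parser, ended_parser, errored_parser
} state_t;

/* Hypothetical event names, one per family of parser functions. */
typedef enum {
    ev_init, ev_start, ev_error, ev_end, ev_stop, ev_pause, ev_resume
} parser_event;

/* Hypothetical helper: writes the state that follows 'current' for event
   'ev' to *next and returns true, or returns false and leaves *next
   unchanged when the table does not describe that transition. */
bool next_state(state_t current, parser_event ev, state_t *next)
{
    assert(next != NULL);
    switch (ev) {
    case ev_init:    /* assumed to apply to an uninitialised parser only */
        if (current != uninitialised_parser) return false;
        *next = initialised_parser;
        return true;
    case ev_start:   /* "initialise and start": assumed valid before or after init */
        if (current != uninitialised_parser && current != initialised_parser)
            return false;
        *next = started_parser;
        return true;
    case ev_error:
        if (current != started_parser) return false;
        *next = errored_parser;
        return true;
    case ev_end:
        if (current != started_parser) return false;
        *next = ended_parser;
        return true;
    case ev_stop:
        if (current != started_parser) return false;
        *next = stopped_parser;
        return true;
    case ev_pause:
        if (current != started_parser) return false;
        *next = paused_parser;
        return true;
    case ev_resume:
        if (current != paused_parser) return false;
        *next = started_parser;
        return true;
    }
    return false;
}
```

For example, a paused parser cannot be ended directly in this sketch. It has to be resumed first, because end_parser is documented as acting from the started_parser state.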

Parser Example


Grammar Library

Grammar Structures

Grammar libraries can be either internal or external. External libraries are compiled as DLLs or shared libraries and must include the following grammar_rule_def structure, which is used for defining the grammar's rules and entities.

Name Type Description
type wchar_t* May be either 'rule' or 'entity'.
name wchar_t* The name of the rule or entity.
funcptr bool (*funcptr) (struct parser *st) The function pointer for the rule or NULL for an entity.
value wchar_t* The value of the entity.

Grammar Functions

Grammar libraries - both internal and external - are written to the grammar library interface.

Name Description
init_grammar struct node_t *init_grammar(struct node_t *root, wchar_t *name, struct grammar_rule_def gp[]);
Initialise a grammar with 'name' and add it to the grammar DOM root.
init_grammar_library __declspec(dllexport) void init_grammar_library(struct node_t *root)
Each external grammar must define this function as the means of declaring the grammar and adding it to the grammar DOM root. This function usually calls init_grammar.
rule bool rule(PARSER *st);
Each rule defined for a grammar takes the PARSER state and returns true or false depending on whether the rule was successfully evaluated or not.
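
As an illustration of the grammar library interface, the sketch below declares one rule and one entity and registers them from init_grammar_library. It compiles on its own only because it supplies stand-ins for struct parser, struct node_t and init_grammar. Those stand-ins, the GRAMMAR_EXPORT macro, the rule and entity names, and the convention of ending the array with an all-NULL entry are all assumptions, not documented behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>

/* Stand-ins so the sketch is self-contained. The real definitions come
   from the nets-parser headers and will differ. */
struct parser { int unused; };
typedef struct parser PARSER;
struct node_t { const wchar_t *name; size_t rule_count; size_t entity_count; };

/* Structure as described in the Grammar Structures table. */
struct grammar_rule_def {
    wchar_t *type;                       /* L"rule" or L"entity" */
    wchar_t *name;                       /* name of the rule or entity */
    bool (*funcptr)(struct parser *st);  /* rule function, NULL for an entity */
    wchar_t *value;                      /* value of the entity */
};

/* Stand-in for init_grammar: it only counts the entries it is given.
   Assumes the array ends with an entry whose type is NULL. */
struct node_t *init_grammar(struct node_t *root, wchar_t *name,
                            struct grammar_rule_def gp[])
{
    size_t i;
    assert(root != NULL && gp != NULL);
    root->name = name;
    for (i = 0; gp[i].type != NULL; i++) {
        if (wcscmp(gp[i].type, L"rule") == 0)
            root->rule_count++;
        else if (wcscmp(gp[i].type, L"entity") == 0)
            root->entity_count++;
    }
    return root;
}

/* A trivial rule: ignores the parser state and always succeeds. */
static bool always_true(PARSER *st)
{
    (void)st;
    return true;
}

/* Hypothetical rule table for a grammar called "demo". */
static struct grammar_rule_def demo_rules[] = {
    { L"rule",   L"always_true", always_true, NULL     },
    { L"entity", L"greeting",    NULL,        L"hello" },
    { NULL,      NULL,           NULL,        NULL     }
};

/* Export only where the compiler supports __declspec (Windows). */
#if defined(_WIN32)
#define GRAMMAR_EXPORT __declspec(dllexport)
#else
#define GRAMMAR_EXPORT
#endif

/* Entry point that each external grammar must define. */
GRAMMAR_EXPORT void init_grammar_library(struct node_t *root)
{
    init_grammar(root, L"demo", demo_rules);
}
```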

Stream Library

Stream Structure

The STREAM structure holds the state of input and output streams. It contains the following members.

Name Type Description
type enum s_none, s_false, s_null, s_file, s_memory, s_pipe, s_property, s_element, s_attribute, s_text, s_CDATA, s_document, s_processinginstruction, s_entity, s_entityref, s_comment The type of the stream.
name wchar_t * The name of the stream.
fd int The file descriptor.
buffer struct stream_buffer A pointer to the stream_buffer structure. Used for stdin, stdout, stderr, pipes and buffered files.
pos off_t The current position in the stream.
line size_t The current line number in the stream.
nlpos off_t The position in the stream where the current line started.
parent off_t The position in the stream of the parent node in the SDOM.
sibling off_t The position in the stream of the sibling node in the SDOM.
size off_t The size of the stream for those streams that have an upper limit, or 0 otherwise.
mutex pthread_mutex_t The thread mutex that prevents multiple threads from writing simultaneously to a stream.
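
The pos, line and nlpos members together allow a position to be reported as a line and column. The sketch below shows one way such bookkeeping could work. The stream_pos structure, the advance and column_of helpers, and the choice to start line numbering at 1 are assumptions for illustration, not the library's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>   /* off_t */

/* Only the position-tracking members of STREAM, as a stand-in. */
struct stream_pos {
    off_t  pos;    /* current position in the stream */
    size_t line;   /* current line number */
    off_t  nlpos;  /* position where the current line started */
};

/* Hypothetical helper: accounts for n bytes of data that have just been
   consumed, updating pos, line and nlpos. */
void advance(struct stream_pos *s, const char *data, size_t n)
{
    size_t i;
    assert(s != NULL && data != NULL);
    for (i = 0; i < n; i++) {
        s->pos++;
        if (data[i] == '\n') {
            s->line++;
            s->nlpos = s->pos;   /* the next line starts after the newline */
        }
    }
}

/* Hypothetical helper: zero-based column of the current position. */
off_t column_of(const struct stream_pos *s)
{
    assert(s != NULL);
    return s->pos - s->nlpos;
}
```

Starting from pos 0, line 1 and nlpos 0, consuming the five bytes "ab\ncd" leaves pos at 5, line at 2 and nlpos at 3, so the column is 2.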

Stream Functions

The stream library is designed to be familiar to users of POSIX compatible file systems and uses the unique s_* prefix to differentiate nets-parser stream functions from standard C stream functions. The following table shows the POSIX standard functions and the equivalent nets-parser functions, and describes any differences in their operation.

POSIX Name Nets-parser Name Description
fwopen_s s_wopen_s errno_t s_wopen_s(STREAM *stream,const wchar_t *filename,const wchar_t *mode);
Opens a file whose name is given as a Unicode string. POSIX compliant, except the function returns 0 for failure, which is non-standard POSIX behaviour.
dup s_dup void s_dup(STREAM *sourcestream, STREAM *targetstream);
Duplicates a file stream copying from source to target. Non standard implementation.
pipe s_pipeopen void s_pipeopen(STREAM *inputstream, STREAM *outputstream);
Opens a pipe and returns two streams - one for output and one for input. Non standard implementation.
close s_close bool s_close(wchar_t *name, STREAM *s);
Closes a stream. Non standard implementation with the addition of the name parameter.
setvbuf s_setvbuf int s_setvbuf (STREAM *s, char *buffer, size_t size);
If buffer is NULL this function mallocs 'size' bytes for the buffer. If buffer is not NULL it sets the stream's buffer to 'buffer' and the stream's size to 'size'. The buffer should be at least 'size' bytes long. Non standard implementation.
fflush s_flush bool s_flush(STREAM *s);
This function causes any buffered output on stream to be output. POSIX compliant, except the return code.
fread s_read2, s_read, s_readpos size_t s_read2(void *buffer, size_t size, size_t count, STREAM *stream);
Reads size * count bytes from the stream to buffer. POSIX compliant.
size_t s_read(STREAM *s, size_t buffer_size, void *buffer);
Reads buffer_size bytes from the stream to buffer. Non standard implementation.
size_t s_readpos(STREAM *s, size_t buffer_size, void *buffer, off_t pos);
Reads buffer_size bytes from the stream to buffer, starting at a particular offset. Non standard implementation.
N/A s_readnode size_t s_readnode(STREAM *s, struct snode_t *snode);
Reads a DOM node from the stream. Non standard implementation.
N/A s_readnodepos size_t s_readnodepos(STREAM *s, struct snode_t *snode, off_t pos);
Reads a DOM node from the stream from a particular offset. Non standard implementation.
fgetws s_readwcs size_t s_readwcs(STREAM *s, size_t buffer_size, wchar_t *buffer, off_t pos);
Reads a wide character string from the stream, starting at a particular offset. Stops reading when a null character is encountered. Non standard implementation.
wchar_t* s_getws(wchar_t *str, int count, STREAM *stream);
POSIX. Not implemented.
fgetwc s_readwc size_t s_readwc(STREAM *s, wchar_t *output_buffer);
Reads a wide character from the stream. Non standard implementation.
wint_t s_getwc(STREAM *stream);
POSIX. Not implemented.
fgetc s_readc size_t s_readc(STREAM *s, char *output_buffer);
Reads a character from the stream. Non standard implementation.
extern int s_getc(STREAM *stream);
POSIX. Not implemented.
fwrite s_write2, s_write, s_writepos size_t s_write2(void *buffer, size_t size, size_t count, STREAM *stream);
Write size * count bytes to the stream from buffer. POSIX compliant.
size_t s_write(STREAM *s, size_t buffer_size, void *buffer);
Writes buffer_size bytes to the stream from buffer. Non standard implementation.
size_t s_writepos(STREAM *s, size_t buffer_size, void *buffer, off_t pos);
Writes buffer_size bytes to the stream from buffer, starting at a particular offset. Non standard implementation.
N/A s_writenode size_t s_writenode(STREAM *s, struct snode_t *snode);
Write a DOM node to the stream. Non standard implementation.
N/A s_writenodepos size_t s_writenodepos(STREAM *s, struct snode_t *snode, off_t pos);
Writes a DOM node to the stream at a particular offset. Non standard implementation.
fseeko s_seeko int s_seeko(STREAM *stream, off_t offset, int origin);
Seeks to a particular offset in the stream. POSIX compliant.
ftello s_tello off_t s_tello(STREAM *stream);
Returns the current read/write position in the stream. POSIX compliant.
N/A s_size off_t s_size(STREAM *s);
Returns the size of the stream.
ftruncate s_truncate int s_truncate(STREAM *s, off_t size);
Truncates the stream at offset size. POSIX compliant.
fwprintf s_wprintf void s_wprintf(struct stream *s, wchar_t *message, ...);
Print a formatted wide character string to stream. POSIX compliant.
fprintf s_printf void s_printf(struct stream *s, char *message, ...);
Print a formatted string to stream. POSIX compliant.
N/A e_wfprintf void e_wfprintf(struct stream *s, wchar_t *message, va_list argp);
Prints a formatted string, converted to multibyte characters, to the error stream. Non standard implementation.
N/A _log void _log(int level, wchar_t *message, ...);
Print a formatted string to the error stream using log level MESSAGE=0, ERROR_=1, WARNING=2, INFO=3 or DEBUG_=4. Non standard implementation.
fopen, fwopen, fopen_s N/A Not supported.
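
The s_setvbuf entry in the table above describes two cases: allocating a buffer when none is supplied, and adopting the caller's buffer otherwise. The sketch below illustrates that behaviour on a stand-in structure. The demo_stream type, the owns_buffer flag, the demo_setvbuf and demo_release names and the return values (0 for success, -1 for failure) are assumptions and are not taken from the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the buffer-related part of a stream. */
struct demo_stream {
    char  *buffer;       /* current buffer, or NULL */
    size_t size;         /* size of the buffer in bytes */
    int    owns_buffer;  /* 1 if the buffer was allocated by demo_setvbuf */
};

/* Hypothetical helper: drops the current buffer, freeing it only if it
   was allocated by demo_setvbuf. */
void demo_release(struct demo_stream *s)
{
    assert(s != NULL);
    if (s->owns_buffer)
        free(s->buffer);
    s->buffer = NULL;
    s->size = 0;
    s->owns_buffer = 0;
}

/* Hypothetical sketch of the documented behaviour: allocate 'size' bytes
   when buffer is NULL, otherwise use the caller's buffer as given. */
int demo_setvbuf(struct demo_stream *s, char *buffer, size_t size)
{
    int owns = 0;
    assert(s != NULL);
    if (buffer == NULL) {
        buffer = malloc(size);
        if (buffer == NULL)
            return -1;           /* allocation failed, stream unchanged */
        owns = 1;
    }
    demo_release(s);             /* do not leak a previously owned buffer */
    s->buffer = buffer;
    s->size = size;
    s->owns_buffer = owns;
    return 0;
}
```

Tracking ownership is a design choice of this sketch: it lets a caller-supplied buffer be replaced or released without being freed by mistake.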