Volksdata 1.0b7
RDF library

Data Structures

struct  VOLK_Codec
 Prototype for initializing a dataset decoding loop.

Macros

#define CHUNK_SIZE   4096

Typedefs

typedef VOLK_rc(* decode_ds_fn_t) (FILE *fh, const char *sh, VOLK_Store *store, void *udata, size_t *ct, char **err_p)
 Prototype for decoding a dataset.
typedef VOLK_rc(* decode_gr_fn_t) (FILE *fh, const char *sh, VOLK_Graph *gr, size_t *ct, char **err)
 Prototype for decoding a complete RDF document file into a graph.
typedef VOLK_rc(* decode_term_fn_t) (const char *rep, VOLK_Term **term)
 Prototype for decoding a string into a VOLK_Term.
typedef void(* encode_ds_done_fn_t) (void *it)
 Finalize a dataset encoding operation.
typedef void *(* encode_ds_init_fn_t) (const VOLK_Graph **dset)
 Initialize a dataset encoding loop.
typedef VOLK_rc(* encode_ds_iter_fn_t) (void *it, const VOLK_Graph *gr, char **res)
 Perform one dataset encoding iteration.
typedef void(* encode_gr_done_fn_t) (void *it)
 Finalize a graph encoding operation.
typedef void *(* encode_gr_init_fn_t) (const VOLK_Graph *gr, void *udata)
 Initialize a graph encoding loop.
typedef VOLK_rc(* encode_gr_iter_fn_t) (void *it, char **res)
 Perform one graph encoding iteration.
typedef void(* encode_store_done_fn_t) (void *it)
 Finalize a store encoding operation.
typedef void *(* encode_store_init_fn_t) (VOLK_Store *store)
 Initialize a store encoding loop.
typedef VOLK_rc(* encode_store_iter_fn_t) (void *it, char **res)
 Perform one store encoding iteration.
typedef VOLK_rc(* encode_term_fn_t) (const VOLK_Term *term, char **rep)
 Term encoder prototype.

Enumerations

enum  VOLK_CodecFeatures {
  VOLK_CODEC_FEAT_ENCODE_TERM = 1<<0 , VOLK_CODEC_FEAT_DECODE_TERM = 1<<1 , VOLK_CODEC_FEAT_ENCODE_GR = 1<<2 , VOLK_CODEC_FEAT_DECODE_GR = 1<<3 ,
  VOLK_CODEC_FEAT_ENCODE_DS = 1<<4 , VOLK_CODEC_FEAT_DECODE_DS = 1<<5 , VOLK_CODEC_FEAT_ENCODE_STORE = 1<<6 , VOLK_CODEC_FEAT_DECODE_STORE = 1<<7
}
 Feature flags applicable to a codec.
enum  VOLK_CodecFlags { VOLK_CODEC_NO_PROLOG = 1<<0 , CODEC_EXT_TXN = 1<<1 , VOLK_CODEC_INDENT = 1<<2 }
 Parse error information.

Functions

VOLK_rc escape_lit (const char *in, char **out)
 Add escape character (backslash) to illegal literal characters.
char * fmt_header (char *pfx)
 Format an informational header.
uint8_t * uint8_dup (const uint8_t *str)
 strdup() for unsigned char.
uint8_t * uint8_ndup (const uint8_t *str, size_t size)
 strndup() for unsigned char.
char unescape_char (const char c)
 Unescape a single character.
uint8_t * unescape_unicode (const uint8_t *esc_str, size_t size)
 Replace \uxxxx and \Uxxxxxxxx with Unicode bytes.

Detailed Description

Macro Definition Documentation

◆ CHUNK_SIZE

#define CHUNK_SIZE   4096

Max data size passed to the scanner and parser at each iteration.

Definition at line 78 of file codec_interface.h.
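
As a rough illustration of what a per-iteration size limit implies, the following sketch reads a file handle in pieces of at most CHUNK_SIZE bytes and hands each piece to a consumer. The names `chunk_fn_t`, `feed_in_chunks` and `count_chunks` are hypothetical and not part of the library; the real scanner and parser entry points are not shown on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define CHUNK_SIZE 4096

/* Hypothetical consumer: receives one chunk of at most CHUNK_SIZE bytes. */
typedef void (*chunk_fn_t)(const char *buf, size_t len, void *udata);

/* Read fh to EOF in CHUNK_SIZE pieces and return the total number of bytes
 * delivered. The file handle is left open for the caller to close. */
static size_t feed_in_chunks(FILE *fh, chunk_fn_t consume, void *udata)
{
    assert(fh != NULL && consume != NULL);

    char buf[CHUNK_SIZE];
    size_t total = 0, n;

    while ((n = fread(buf, 1, sizeof buf, fh)) > 0) {
        consume(buf, n, udata);
        total += n;
    }
    return total;
}

/* Example consumer: counts how many chunks were delivered. */
static void count_chunks(const char *buf, size_t len, void *udata)
{
    (void) buf;
    (void) len;
    ++*(size_t *) udata;
}
```

With a 10000-byte regular file, this loop would deliver three chunks (4096, 4096 and 1808 bytes).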

Typedef Documentation

◆ decode_ds_fn_t

typedef VOLK_rc(* decode_ds_fn_t) (FILE *fh, const char *sh, VOLK_Store *store, void *udata, size_t *ct, char **err_p)

Prototype for decoding a dataset.

Parameters
[in] fh: Open file handle pointing to the RDF data. Implementations MUST NOT close the file handle. This is mutually exclusive with sh.
[in] sh: String handle for the RDF data. This is mutually exclusive with fh. If both are specified, fh has precedence.
[in] store: Store handle to be populated. The store MUST be initialized and MAY be non-empty.
[in] udata: User data that may be used by implementations.
[out] ct: Handle to be populated with the count of parsed triples. It MAY be NULL.
[out] err_p: Handle to be populated with the error message in case of error.
Returns
A VOLK_rc status code, as declared in the typedef.

Definition at line 462 of file codec_interface.h.

◆ decode_gr_fn_t

typedef VOLK_rc(* decode_gr_fn_t) (FILE *fh, const char *sh, VOLK_Graph *gr, size_t *ct, char **err)

Prototype for decoding a complete RDF document file into a graph.

Implementations SHOULD consume data from the file handle in chunks and MAY read a string handle.

Parameters
[in] fh: Open file handle pointing to the RDF data. Implementations MUST NOT close the file handle. This is mutually exclusive with sh.
[in] sh: String handle for the RDF data. This is mutually exclusive with fh. If both are specified, fh has precedence.
[out] gr: Graph handle to be populated with decoded data. The graph MUST be initialized and MAY be non-empty.
[out] ct: If not NULL, it MAY be populated with the number of triples parsed (which may be different from the resulting graph size). Implementations MAY choose not to use this.
[out] err: Pointer to error info string. If no error occurs, it yields NULL.
Returns
Implementations MUST return VOLK_OK on success and a negative value on parsing error.

Definition at line 436 of file codec_interface.h.
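
The source-selection rules above (fh takes precedence over sh, fh is never closed, ct is optional, err is NULL on success) can be sketched with a toy function of the same shape. Everything here is a stand-in for illustration: the `VOLK_rc`, `VOLK_OK` and `VOLK_Graph` definitions are placeholders for the library's own, `TOY_ERR` and `toy_decode_gr` are hypothetical, and the "decoding" merely counts newline characters. Handing the caller a heap-allocated error string is also an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for library definitions; assumptions for this sketch only. */
typedef int VOLK_rc;
#define VOLK_OK 0
typedef struct VOLK_Graph VOLK_Graph;   /* opaque, never dereferenced here */

#define TOY_ERR (-1)                    /* hypothetical negative error code */

/* Toy function with the decode_gr_fn_t shape: counts newlines as "triples". */
static VOLK_rc toy_decode_gr(
        FILE *fh, const char *sh, VOLK_Graph *gr, size_t *ct, char **err)
{
    assert(err != NULL);
    (void) gr;          /* a real codec would add triples to the graph */
    *err = NULL;        /* stays NULL unless an error occurs */
    size_t lines = 0;

    if (fh) {
        /* fh has precedence over sh; read in chunks, never close it. */
        char buf[4096];
        size_t n;
        while ((n = fread(buf, 1, sizeof buf, fh)) > 0)
            for (size_t i = 0; i < n; i++)
                if (buf[i] == '\n') lines++;
    } else if (sh) {
        for (const char *p = sh; *p; p++)
            if (*p == '\n') lines++;
    } else {
        static const char msg[] = "no input source";
        *err = malloc(sizeof msg);
        if (*err) memcpy(*err, msg, sizeof msg);
        return TOY_ERR;
    }

    if (ct) *ct = lines;   /* ct may be NULL */
    return VOLK_OK;
}
```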

◆ decode_term_fn_t

typedef VOLK_rc(* decode_term_fn_t) (const char *rep, VOLK_Term **term)

Prototype for decoding a string into a VOLK_Term.

Implementations MAY ignore any other tokens after finding the first one.

Parameters
[in] rep: NT representation of the term.
[out] term: Pointer to the term handle to be created. Implementations SHOULD return NULL on a parse error.
Returns
Implementations MUST return VOLK_OK on success and a negative value on parsing error.

Definition at line 409 of file codec_interface.h.

◆ encode_ds_done_fn_t

typedef void(* encode_ds_done_fn_t) (void *it)

Finalize a dataset encoding operation.

TODO not implemented in any of the current codecs.

Implementations SHOULD use this function to perform all necessary steps to clean up memory and free the iterator handle after a dataset has been completely encoded.

Parameters
[in] it: Iterator handle.

Definition at line 346 of file codec_interface.h.

◆ encode_ds_init_fn_t

typedef void *(* encode_ds_init_fn_t) (const VOLK_Graph **dset)

Initialize a dataset encoding loop.

TODO not implemented in any of the current codecs.

This prototype is to be implemented by dataset encoding loops. A dataset source is defined as a number of named graphs that get passed one at a time to encode_ds_iter_fn_t after this function is called. Each graph is encoded into the same document that gets output in chunks as a single stream.

Returns
A void pointer to be passed to an encode_ds_iter_fn_t function and, eventually, to an encode_ds_done_fn_t function. The data structure of the pointer is defined by each codec according to its own needs to keep state across iterations.

Definition at line 310 of file codec_interface.h.

◆ encode_ds_iter_fn_t

typedef VOLK_rc(* encode_ds_iter_fn_t) (void *it, const VOLK_Graph *gr, char **res)

Perform one dataset encoding iteration.

Implementations of this prototype MUST perform all the steps to encode one complete graph into an RDF fragment representing that graph.

TODO not implemented in any of the current codecs.

Parameters
[in] it: Iterator handle.
[out] res: Handle to be populated with a string obtained from encoding. The output data SHOULD be UTF-8 encoded. This pointer must be initialized (even to NULL) and should be eventually freed by the caller at the end of the loop. Implementations MAY reallocate this memory at each iteration, and users SHOULD expect that memory from a previous iteration may be overwritten with new data.
Returns
VOLK_OK if a new graph was processed; VOLK_END if the end of the loop was reached.

Definition at line 332 of file codec_interface.h.

◆ encode_gr_done_fn_t

typedef void(* encode_gr_done_fn_t) (void *it)

Finalize a graph encoding operation.

Implementations SHOULD use this function to perform all necessary steps to clean up memory and free the iterator handle after a graph has been completely encoded.

Parameters
[in] it: Iterator handle.

Definition at line 293 of file codec_interface.h.

◆ encode_gr_init_fn_t

typedef void *(* encode_gr_init_fn_t) (const VOLK_Graph *gr, void *udata)

Initialize a graph encoding loop.

This prototype is to be implemented by graph encoding loops. It should create an iterator and perform all initial setup for finding triples.

Parameters
[in] gr: The graph to be encoded. The graph's namespace map is used by the codec for namespace prefixing. The graph may only be freed after the loop is finalized.
[in] udata: Additional data passed to the iterator. Implementations MAY decide whether and how to use this.
Returns
A void pointer to be passed to an encode_gr_iter_fn_t function and, eventually, to an encode_gr_done_fn_t function. The data structure behind the pointer is defined by each codec according to its own needs to keep state across iterations.

Definition at line 260 of file codec_interface.h.

◆ encode_gr_iter_fn_t

typedef VOLK_rc(* encode_gr_iter_fn_t) (void *it, char **res)

Perform one graph encoding iteration.

Implementations of this prototype MUST perform all the steps to encode one or more complete triples into an RDF fragment representing those triples. The input and output units are up to the implementation and a caller SHOULD assume that multiple lines may be yielded at each iteration.

Parameters
[in] it: Iterator handle.
[out] res: Handle to be populated with a string obtained from encoding. The output data should be UTF-8 encoded. This pointer must be initialized (even to NULL) and should be eventually freed manually at the end of the loop. Implementations MAY reallocate this memory at each iteration, and users SHOULD expect that memory from a previous iteration may be overwritten with new data.
Returns
VOLK_OK if a new token was processed; VOLK_END if the end of the loop was reached.

Definition at line 282 of file codec_interface.h.
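
The init / iter / done life cycle described by the three graph encoding prototypes can be sketched with a toy codec. This is only an illustration under stated assumptions: `VOLK_rc`, `VOLK_OK`, `VOLK_END` and `VOLK_Graph` are placeholder definitions (the real values come from the library headers), and `ToyIt`, `toy_lines` and the `toy_*` functions are hypothetical. The toy iterator ignores the graph and yields fixed lines; the point is the caller-side handling of `res`, which starts as NULL, may be reallocated on every iteration, and is freed once after the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for library definitions; assumptions for this sketch only. */
typedef int VOLK_rc;
#define VOLK_OK  0
#define VOLK_END 1
typedef struct VOLK_Graph VOLK_Graph;

/* Hypothetical iterator state: position in a fixed list of lines. */
typedef struct { size_t cur; } ToyIt;

static const char *toy_lines[] = {
    "<urn:s> <urn:p> <urn:o1> .\n",
    "<urn:s> <urn:p> <urn:o2> .\n",
};
#define TOY_LINE_CT (sizeof toy_lines / sizeof *toy_lines)

/* Shape of encode_gr_init_fn_t: allocate and return the opaque state. */
static void *toy_encode_gr_init(const VOLK_Graph *gr, void *udata)
{
    (void) gr;
    (void) udata;
    return calloc(1, sizeof(ToyIt));
}

/* Shape of encode_gr_iter_fn_t: reallocate *res and fill it with one chunk. */
static VOLK_rc toy_encode_gr_iter(void *it_p, char **res)
{
    assert(it_p != NULL && res != NULL);
    ToyIt *it = it_p;
    if (it->cur >= TOY_LINE_CT) return VOLK_END;

    size_t len = strlen(toy_lines[it->cur]) + 1;
    char *tmp = realloc(*res, len);
    if (!tmp) return -1;
    memcpy(tmp, toy_lines[it->cur++], len);
    *res = tmp;
    return VOLK_OK;
}

/* Shape of encode_gr_done_fn_t: release the iterator state. */
static void toy_encode_gr_done(void *it) { free(it); }

/* Caller-side loop: write each chunk to out and return the chunk count. */
static size_t toy_encode_all(const VOLK_Graph *gr, FILE *out)
{
    void *it = toy_encode_gr_init(gr, NULL);
    if (!it) return 0;

    char *res = NULL;       /* must be initialized before the first call */
    size_t chunks = 0;
    while (toy_encode_gr_iter(it, &res) == VOLK_OK) {
        fputs(res, out);    /* use the chunk before the next iteration */
        chunks++;
    }
    toy_encode_gr_done(it);
    free(res);              /* freed once, after the loop */
    return chunks;
}
```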

◆ encode_store_done_fn_t

typedef void(* encode_store_done_fn_t) (void *it)

Finalize a store encoding operation.

Parameters
[in] it: Iterator handle.

Definition at line 389 of file codec_interface.h.

◆ encode_store_init_fn_t

typedef void *(* encode_store_init_fn_t) (VOLK_Store *store)

Initialize a store encoding loop.

This prototype is to be implemented by codecs featuring language to encode named graphs, such as TriG. Its scope is to encode a whole store.

Parameters
[in] store: Store to encode. The store MUST implement the VOLK_STORE_CTX feature.
Returns
A void pointer to be passed to an encode_store_iter_fn_t function and, eventually, to an encode_store_done_fn_t function. The data structure of the pointer is defined by each codec according to its own needs to keep state across iterations.

Definition at line 362 of file codec_interface.h.

◆ encode_store_iter_fn_t

typedef VOLK_rc(* encode_store_iter_fn_t) (void *it, char **res)

Perform one store encoding iteration.

This function generates a chunk of RDF as a string. The scope of the RDF contained in the string is left to the implementation.

Parameters
[in] it: Iterator handle created with encode_store_init_fn_t().
[out] res: Handle to be populated with a string obtained from encoding. The output data MUST be UTF-8 encoded. This pointer MUST be initialized (even to NULL) and should be eventually freed by the caller at the end of the loop. Implementations MAY reallocate this memory at each iteration, and users SHOULD expect that memory from a previous iteration may be overwritten with new data.
Returns
VOLK_OK if a new graph was processed; VOLK_END if the end of the loop was reached.

Definition at line 382 of file codec_interface.h.

◆ encode_term_fn_t

typedef VOLK_rc(* encode_term_fn_t) (const VOLK_Term *term, char **rep)

Term encoder prototype.

Parameters
[in] term: Single term handle.
[out] rep: Pointer to a string to be filled with the encoded term. The string is reallocated and, if reused for multiple calls to this function, it only needs to be freed after the last call. It MUST be initialized to NULL at the beginning.
Returns
VOLK_OK on successful encoding; <0 for other errors.

Definition at line 240 of file codec_interface.h.
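
The reuse rule for rep (initialize to NULL, pass the same pointer to every call, free once at the end) might look like the sketch below. `ToyTerm` and `toy_encode_term` are hypothetical: the real VOLK_Term layout is not shown on this page, so the toy term holds only an IRI string, and the `VOLK_rc` / `VOLK_OK` definitions are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for library definitions; assumptions for this sketch only. */
typedef int VOLK_rc;
#define VOLK_OK 0

/* Hypothetical term: just an IRI string, unlike the real VOLK_Term. */
typedef struct { const char *iri; } ToyTerm;

/* Toy encoder with the encode_term_fn_t shape: writes "<iri>" into *rep,
 * reallocating the buffer so it can be reused across calls. */
static VOLK_rc toy_encode_term(const ToyTerm *term, char **rep)
{
    assert(term != NULL && rep != NULL);

    size_t len = strlen(term->iri) + 3;   /* '<' + iri + '>' + NUL */
    char *tmp = realloc(*rep, len);
    if (!tmp) return -1;                  /* negative value on error */
    snprintf(tmp, len, "<%s>", term->iri);
    *rep = tmp;
    return VOLK_OK;
}
```

A caller would declare `char *rep = NULL;`, call the encoder as many times as needed with `&rep`, and call `free(rep)` only after the last call.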

Enumeration Type Documentation

◆ VOLK_CodecFeatures

Feature flags applicable to a codec.

Enumerator
VOLK_CODEC_FEAT_ENCODE_TERM 

Supports encoding a single term.

VOLK_CODEC_FEAT_DECODE_TERM 

Supports decoding a single term.

VOLK_CODEC_FEAT_ENCODE_GR 

Supports encoding a graph.

VOLK_CODEC_FEAT_DECODE_GR 

Supports decoding a graph.

VOLK_CODEC_FEAT_ENCODE_DS 

Supports encoding a data set.

VOLK_CODEC_FEAT_DECODE_DS 

Supports decoding a data set.

VOLK_CODEC_FEAT_ENCODE_STORE 

Supports encoding a whole store.

VOLK_CODEC_FEAT_DECODE_STORE 

Supports decoding a whole store.

Definition at line 104 of file codec_interface.h.
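
Because each feature is a distinct bit, a set of features can be stored in one integer and tested with a mask. The sketch below copies the enumerator values listed above; `codec_supports` is a hypothetical helper, and how a VOLK_Codec actually stores its feature mask is not shown on this page.

```c
#include <assert.h>
#include <stdbool.h>

/* Values as listed in the enumeration above. */
enum VOLK_CodecFeatures {
    VOLK_CODEC_FEAT_ENCODE_TERM  = 1<<0,
    VOLK_CODEC_FEAT_DECODE_TERM  = 1<<1,
    VOLK_CODEC_FEAT_ENCODE_GR    = 1<<2,
    VOLK_CODEC_FEAT_DECODE_GR    = 1<<3,
    VOLK_CODEC_FEAT_ENCODE_DS    = 1<<4,
    VOLK_CODEC_FEAT_DECODE_DS    = 1<<5,
    VOLK_CODEC_FEAT_ENCODE_STORE = 1<<6,
    VOLK_CODEC_FEAT_DECODE_STORE = 1<<7
};

/* Hypothetical helper: true only if every bit in wanted is set in features. */
static bool codec_supports(unsigned features, unsigned wanted)
{
    assert(wanted != 0);
    return (features & wanted) == wanted;
}
```

For example, a mask built from the four term and graph flags would satisfy a check for VOLK_CODEC_FEAT_ENCODE_GR but not one for VOLK_CODEC_FEAT_ENCODE_STORE.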

◆ VOLK_CodecFlags

Parse error information.

Option flags passed to codec function calls.

Enumerator
VOLK_CODEC_NO_PROLOG 

Do not generate prolog.

CODEC_EXT_TXN 

The iterator is using an externally provided transaction that must not be freed. This is set internally.

VOLK_CODEC_INDENT 

Indent each output line by four spaces. Used by TTL writer when encoding a graph for TriG.

Definition at line 93 of file codec_interface.h.

Function Documentation

◆ escape_lit()

VOLK_rc escape_lit ( const char * in,
char ** out )

Add escape character (backslash) to illegal literal characters.

Parameters
[in] in: Input string.
[out] out: Output string.
Returns
VOLK_OK on success; VOLK_MEM_ERR on memory error.

Definition at line 81 of file codec_interface.c.
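
A minimal sketch of this kind of escaping follows. It is not the library's implementation: which characters count as illegal is an assumption here (tab, newline, carriage return, double quote and backslash), the return code values are placeholders, and the sketch always allocates a fresh output buffer that the caller frees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for library definitions; assumptions for this sketch only. */
typedef int VOLK_rc;
#define VOLK_OK      0
#define VOLK_MEM_ERR (-1)

/* Hypothetical escaper: prefixes an assumed set of characters with a
 * backslash and stores a newly allocated string in *out. */
static VOLK_rc sketch_escape_lit(const char *in, char **out)
{
    assert(in != NULL && out != NULL);

    /* Worst case: every input character becomes two output characters. */
    char *buf = malloc(2 * strlen(in) + 1);
    if (!buf) return VOLK_MEM_ERR;

    char *w = buf;
    for (const char *r = in; *r; r++) {
        char e = 0;
        switch (*r) {
            case '\t': e = 't';  break;
            case '\n': e = 'n';  break;
            case '\r': e = 'r';  break;
            case '"':  e = '"';  break;
            case '\\': e = '\\'; break;
        }
        if (e) {
            *w++ = '\\';
            *w++ = e;
        } else {
            *w++ = *r;
        }
    }
    *w = '\0';
    *out = buf;
    return VOLK_OK;
}
```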

◆ fmt_header()

char * fmt_header ( char * pfx)

Format an informational header.

The information includes software version and current date. It is terminated by a newline + NUL and prefixed with the string specified in pfx. It is NOT prefixed by any comment characters.

Parameters
[in] pfx: Prefix to add to the string. It may be a comment starter, such as #.

Definition at line 116 of file codec_interface.c.

◆ uint8_dup()

uint8_t * uint8_dup ( const uint8_t * str)
inline

strdup() for unsigned char.

This is to be used with uint8_t sequences considered to be UTF-8 sequences, as required by re2c (it won't work with byte sequences containing NUL).

Definition at line 128 of file codec_interface.h.

◆ uint8_ndup()

uint8_t * uint8_ndup ( const uint8_t * str,
size_t size )
inline

strndup() for unsigned char.

This is to be used with uint8_t sequences considered to be UTF-8 sequences, as required by re2c (it won't work with byte sequences containing NUL).

Definition at line 138 of file codec_interface.h.
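
For readers unfamiliar with the pattern, this is roughly what strdup()/strndup() equivalents for uint8_t look like. The `sketch_` functions are hypothetical and written with malloc() and memcpy() only; the library's inline definitions may differ. As the description notes, copying stops at the first NUL byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical strndup() analogue: copy at most size bytes, stopping at NUL,
 * and always NUL-terminate the result. Returns NULL on allocation failure. */
static uint8_t *sketch_uint8_ndup(const uint8_t *str, size_t size)
{
    assert(str != NULL);

    size_t len = 0;
    while (len < size && str[len] != 0) len++;

    uint8_t *dup = malloc(len + 1);
    if (!dup) return NULL;
    memcpy(dup, str, len);
    dup[len] = 0;
    return dup;
}

/* Hypothetical strdup() analogue built on the bounded version. */
static uint8_t *sketch_uint8_dup(const uint8_t *str)
{
    assert(str != NULL);
    return sketch_uint8_ndup(str, strlen((const char *) str));
}
```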

◆ unescape_char()

char unescape_char ( const char c)
inline

Unescape a single character.

Convert escaped special characters such as \t, \n, etc. into their corresponding code points.

Non-special characters are returned unchanged.

Parameters
[in] c: Character to unescape. Note that this is the single character after the backslash.
Returns
Code point corresponding to the escaped character.

Definition at line 184 of file codec_interface.h.
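
The contract above (the caller passes the character that follows the backslash, and non-special characters come back unchanged) can be sketched as follows. Both functions are hypothetical; the set of special characters handled here (t, b, n, r, f) is an assumption, and the in-place string helper is only there to show how a caller would skip the backslash.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-character unescape: c is the character after '\'. */
static char sketch_unescape_char(const char c)
{
    switch (c) {
        case 't': return '\t';
        case 'b': return '\b';
        case 'n': return '\n';
        case 'r': return '\r';
        case 'f': return '\f';
        default:  return c;     /* e.g. '"' and '\\' map to themselves */
    }
}

/* Unescape a NUL-terminated string in place and return the new length.
 * A trailing lone backslash is copied as-is. */
static size_t sketch_unescape_inplace(char *s)
{
    assert(s != NULL);

    char *w = s;
    for (const char *r = s; *r; r++) {
        if (*r == '\\' && r[1] != '\0')
            *w++ = sketch_unescape_char(*++r);   /* skip the backslash */
        else
            *w++ = *r;
    }
    *w = '\0';
    return (size_t) (w - s);
}
```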

◆ unescape_unicode()

uint8_t * unescape_unicode ( const uint8_t * esc_str,
size_t size )

Replace \uxxxx and \Uxxxxxxxx with Unicode bytes.

Parameters
[in] esc_str: Escaped string.
[in] size: Maximum number of characters to scan, à la strncpy().
Returns
String with escape sequences replaced by Unicode bytes.

Definition at line 11 of file codec_interface.c.
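
The replacement step amounts to parsing four or eight hexadecimal digits and emitting the code point as UTF-8. The sketch below shows one way to do that; it is not the library's code. All function names are hypothetical, and copying malformed or out-of-range escapes through unchanged is an assumption of this sketch (it also does not treat surrogate code points specially).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Encode cp (<= 0x10FFFF) as UTF-8 into out; return the byte count (1-4). */
static size_t sketch_utf8_encode(uint32_t cp, uint8_t *out)
{
    if (cp < 0x80) {
        out[0] = (uint8_t) cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (uint8_t) (0xC0 | (cp >> 6));
        out[1] = (uint8_t) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (uint8_t) (0xE0 | (cp >> 12));
        out[1] = (uint8_t) (0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t) (0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (uint8_t) (0xF0 | (cp >> 18));
    out[1] = (uint8_t) (0x80 | ((cp >> 12) & 0x3F));
    out[2] = (uint8_t) (0x80 | ((cp >> 6) & 0x3F));
    out[3] = (uint8_t) (0x80 | (cp & 0x3F));
    return 4;
}

/* Parse exactly n hex digits at s into *cp; return 0 on success, -1 if a
 * non-hex byte (including NUL) is found first. */
static int sketch_parse_hex(const uint8_t *s, size_t n, uint32_t *cp)
{
    uint32_t v = 0;
    for (size_t i = 0; i < n; i++) {
        int c = s[i], d;
        if (c >= '0' && c <= '9') d = c - '0';
        else if (c >= 'a' && c <= 'f') d = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') d = c - 'A' + 10;
        else return -1;
        v = (v << 4) | (uint32_t) d;
    }
    *cp = v;
    return 0;
}

/* Scan at most size bytes of esc (stopping at NUL) and return a newly
 * allocated, NUL-terminated copy with valid escapes replaced by UTF-8. */
static uint8_t *sketch_unescape_unicode(const uint8_t *esc, size_t size)
{
    assert(esc != NULL);

    /* An escape of 6 or 10 bytes yields at most 4, so size + 1 suffices. */
    uint8_t *out = malloc(size + 1);
    if (!out) return NULL;

    size_t i = 0, j = 0;
    while (i < size && esc[i] != 0) {
        size_t digits = 0;
        uint32_t cp = 0;
        if (esc[i] == '\\' && i + 1 < size) {
            if (esc[i + 1] == 'u') digits = 4;
            else if (esc[i + 1] == 'U') digits = 8;
        }
        if (digits && i + 2 + digits <= size
                && sketch_parse_hex(esc + i + 2, digits, &cp) == 0
                && cp <= 0x10FFFF) {
            j += sketch_utf8_encode(cp, out + j);
            i += 2 + digits;
        } else {
            out[j++] = esc[i++];   /* ordinary byte or malformed escape */
        }
    }
    out[j] = 0;
    return out;
}
```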