As explained in the following links, the static application-independent functions are used for eXtremeDB runtime control, database control (opening, connecting to and closing databases), transaction management and cursor navigation. But to create and modify individual database objects, C applications use the strongly typed object interfaces generated by the mcocomp schema compiler.

The schema compiler generates the following types of C interface functions depending on the specific DDL class definitions: new and delete methods for object creation and removal, put and get methods for storing and retrieving object field data, oid and autoid access methods, find and search methods for looking up objects by indexes, and event methods for responding to database events.

Some functions operate on entire database objects, while others operate on fields within the object. To generate function names for object-action functions, the compiler uses the class name followed by _new(), _delete() or _delete_all(). To generate function names for field-action functions, the compiler uses the class or structure name, followed by the field name and then the action word, all separated by underscores. Action words can be any of the following: put, get, at, put_range, get_range, alloc, erase, pack and size.

New and Delete
In C applications, objects are created and deleted by calling the generated _new() and _delete() functions. The _new() function reserves initial space for an object and returns a reference (handle) to the freshly created data object. These functions are generated only for classes; no _new() or _delete() functions are generated for structures because structures are never instantiated by themselves in the database; they always belong to some class.

For classes declared without an oid:

    MCO_RET classname_new ( /*IN*/ mco_trans_h t, /*OUT*/ classname *handle);

For classes declared with an oid, the oid must be passed:

    MCO_RET classname_new ( /*IN*/ mco_trans_h t, /*IN*/ databasename_oid * oid, /*OUT*/ classname *handle);

The _delete() function permanently removes the object whose handle is passed, while the _delete_all() function removes all objects of the class from the database. Storage pages occupied by the object(s) are returned to the storage manager for reuse.

    MCO_RET classname_delete (/*IN*/ classname *handle);
    MCO_RET classname_delete_all ( /*IN*/ mco_trans_h t );

Put and Get
For each field of an object and for each element of a structure declared in the schema file, _put() and _get() functions are generated. The _put() functions are called to update specific field values. Depending on the type of field, the generated _put() function will take one of the following forms.

For scalar type fields it will be of the form:

    MCO_RET classname_fieldname_put( /*IN*/ classname *handle, /*IN*/ <type> value);

For char and string fields a pointer and a length argument are required:

    MCO_RET classname_fieldname_put( /*IN*/ classname *handle, /*IN*/ const char *value, /*IN*/ uint2 len);

It is important to understand how the _put() operation copies data to the database field and any associated indexes. Consider the following schema:

    persistent class MyClass
    {
        char<10> buf;
        tree<buf> idx;
    }

Now the following code snippet puts “abc” into the char<10> field buf:

    MyClass cls;
    char buf[10] = "abc";
    ...
    MyClass_new(t, &cls);
    MyClass_buf_put(&cls, buf, strlen(buf));

The eXtremeDB runtime will copy the specified number of characters (3) into the field and fill the remaining 7 bytes with MCO_SPACE_CHAR. The default value of MCO_SPACE_CHAR is \0. This normalizes the value of the unused part of the field for later sort operations. The runtime will not copy the extra null terminator from the input string.

If there is an index on this field, the index node is handled differently depending on whether this class is transient or persistent. For transient classes, no data is copied to the index node. For persistent classes, the entire contents of the field are copied from the field value (not from the input variable) to the index node. So for the example above the bytes [abc\0\0\0\0\0\0\0] will be copied into the field buf first, and then (when the transaction is committed) from the field buf to the index node (because this is a persistent class).

Note that if a character array of length 10 is copied into field buf, there is no null terminator in the field. If a character array of more than 10 characters is used as the argument to _put(), only the specified number of characters (obviously <= 10) is copied.

The
_get() functions are called to bind a field of an object to an application variable. The function will take one of the following forms depending on the type of field.

For scalar type fields it will be of the form:

    MCO_RET classname_fieldname_get( /*IN*/ classname *handle, /*OUT*/ <type> *value);

For fixed-length char fields the length must be specified:

    MCO_RET classname_fieldname_get( /*IN*/ classname *handle, /*OUT*/ char *dest, /*IN*/ uint2 dest_size);

If the field is a string then the function takes two extra parameters: the size of the buffer to receive the string, and an OUT parameter to receive the actual number of bytes returned. So the generated function will have the form:

    MCO_RET classname_fieldname_get( /*IN*/ classname *handle, /*OUT*/ char *dest, /*IN*/ uint2 dest_size, /*OUT*/ uint2 *len);

Some things to note about the behavior of _get() on character and string fields (using a 10-character field value as the example):

- It puts the entire value of the char or string field plus a null terminator if the destination buffer is big enough. For example, for a field value of "1234567890" the call _get(..., buf, sizeof(buf), ... ) into a char buf[100] variable will extract "1234567890\0".
- It puts the whole value of the char or string field without the null terminator if the destination buffer is exactly the size to fit the whole value. For example, for the value "1234567890" the call _get(..., buf, sizeof(buf), ... ) into a char buf[10] variable will extract "1234567890" (no null terminator).
- It puts as much as will fit of the value of the char or string field without the null terminator if the destination buffer is smaller than the whole value. For example, for the value "1234567890" the call _get(..., buf, sizeof(buf), ... ) into a char buf[5] variable will extract "12345" (no null terminator).

Numeric and Decimal generated functions
As stated in the Base Data Types page, the values for database fields of type decimal or numeric are stored internally as integers of a size determined by the specified width:

    Width    Storage type
    1-2      signed<1>
    3-4      signed<2>
    5-9      signed<4>
    10-19    signed<8>

For these fields, the standard _put() and _get() functions described above are generated, and the argument passed to _put() and _get() will be an integer value or pointer of the corresponding size.

In addition to _put() and _get(), the following functions are generated to allow specifying the field value as a character string:

    MCO_RET CLASS_FIELD_put_chars( CLASS *handle, char const* buf);
    MCO_RET CLASS_FIELD_get_chars( CLASS *handle, char* buf, int buf_size);

The _put_chars() function converts the input string of characters to an integer value and stores it in the database field. The _get_chars() function extracts the value from the database field and represents it as a string of characters. To facilitate conversion of integer values to character strings and vice versa, two helper functions are also generated:

    MCO_RET CLASS_FIELD_to_chars( TYPE scaled_num, char* buf, int buf_size);
    MCO_RET CLASS_FIELD_from_chars( TYPE* scaled_num, char const* buf);

Consider a schema defining a decimal like:

    class A
    {
        ...
        decimal<10,3> dec;
        ...
    };

The following code snippet demonstrates how these functions might be used in practice:

    A a;
    int8 i8;
    /* Allocate an object */
    A_new ( t, &a );
    ...
    /* put int8 value to numeric field */
    A_dec_from_chars( &i8, "123456");
    A_dec_put( &a, i8 );
    ...
    /* put char array value to decimal field */
    A_dec_to_chars( 987654321, buf, sizeof(buf));
    A_dec_put_chars( &a, buf );
    ...
    A_from_cursor(t, &csr, &a);
    printf("\n\tContents of first record A: \n");
    ...
    /* get values from numeric/decimal fields */
    A_dec_get( &a, &i8);
    A_dec_get_chars( &a, buf, sizeof(buf));
    printf("\t\tdec=%lld, chars(%s)\n", i8, buf );

Fixed _put() and _get()
Database classes often contain many fields, so fetching and storing these objects requires a long series of _get() and _put() calls, one per field. To simplify this coding work, the schema compiler generates a C-language structure for all scalar fields and arrays of fixed length, along with <classname>_fixed_get() and <classname>_fixed_put() functions that can significantly reduce the number of function calls required. But, as the name indicates, these functions can only be generated for the fixed-size fields of a given class. If a class contains fields of variable length (e.g. string, vector or blob fields) then these fields must be accessed with their individual _get() and _put() functions.

For example, the following schema:

    struct B
    {
        signed<1> i1;
        signed<2> i2;
        signed<4> i4;
        char<10> c10;
        float f;
    };

    class A
    {
        unsigned<1> ui1;
        unsigned<2> ui2;
        unsigned<4> ui4;
        double d;
        string s;
        B b;
        list;
    };

would cause the following “C” structures to be generated:

    /* Structures for fixed part of the classes */
    typedef struct B_fixed_
    {
        int1 i1;
        int2 i2;
        int4 i4;
        char c10[10];
        float f;
    } B_fixed;

    typedef struct A_fixed_
    {
        uint1 ui1;
        uint2 ui2;
        uint4 ui4;
        double d;
        B_fixed b;
    } A_fixed;

with the following access functions:

    MCO_RET B_fixed_get( B *handle_, B_fixed* dst_ );
    MCO_RET B_fixed_put( B *handle_, B_fixed const* src_ );
    MCO_RET A_fixed_get( A *handle_, A_fixed* dst_ );
    MCO_RET A_fixed_put( A *handle_, A_fixed const* src_ );

Using these functions, objects of the A class can be written with two function calls: A_fixed_put() for the fixed-size portion and A_s_put() for the variable-length field s of type string. Similarly, the objects of this class can be read with two function calls: A_fixed_get() and A_s_get().

The following code snippet illustrates the use of the fixed-length structures and the _fixed_get() function:

    int main(int argc, char* argv[])
    {
        MCO_RET rc;
        ...
        mco_trans_h t;
        mco_cursor_t csr; /* cursor to navigate database contents */
        A a;              /* object handle */
        A_fixed _a;       /* fixed size part of class A */
        B b;              /* struct handle */
        B_fixed _b;       /* fixed size part of class B */
        uint1 ui1;        /* value place holders */
        uint2 ui2;
        ...
        /* Open a READ_ONLY transaction, read object A and display its contents */
        rc = mco_trans_start(connection, MCO_READ_ONLY, MCO_TRANS_FOREGROUND, &t);
        if ( MCO_S_OK == rc )
        {
            rc = A_list_cursor(t, &csr);
            if ( MCO_S_OK == rc )
            {
                A_from_cursor(t, &csr, &a);
                A_fixed_get(&a, &_a);
                A_s_get( &a, buf, sizeof(buf), &ui2 );
                printf("\n\tContents of record A: s (%s), ui1 = %d, b.i1 = %d\n",
                       buf, _a.ui1, _a.b.i1 );
            }
            /* Close the transaction only if it was actually started */
            rc = mco_trans_commit( t );
        }
    }

Nullable Fields
If a scalar class element (int, float, double) has been declared nullable in the database schema, the following interfaces are generated:

    MCO_RET classname_fieldname_indicator_get( classname *handle, uint1 *result );

The argument result will have a value of 1 upon return if the field is null, otherwise result will be 0.

    MCO_RET classname_fieldname_indicator_put( classname *handle, uint1 value);

Pass a value of 1 to set the null indicator, 0 to clear the null indicator.

Note that setting or clearing the null indicator has no effect on the underlying field’s value. In other words, if a nullable uint2 field has a value of 5 and <classname_fieldname>_indicator_put( h, 1 ) is called for the field, it will still have a value of 5 after the call.

For fields of all types, the respective forms of <classname_fieldname>_get() can return MCO_S_NULL_VALUE. <classname>_fixed_get() can also return MCO_S_NULL_VALUE, indicating that one or more constituent fields are null; a further examination with <classname_fieldname>_indicator_get() will be necessary to determine which field(s) are null.

Please see the Nullable Fields and Indexes page for an explanation of the behavior of nullable fields included in indexes.
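The indicator behavior described in this section can be modeled in a few lines of plain C. This is only a sketch under stated assumptions: the NullableU2 type, the ST_* status codes and all four helper functions are hypothetical stand-ins written for illustration, not the generated eXtremeDB API. In particular, the choice that the modeled get still copies the stored value for a null field is an assumption of this model. The sketch shows two points: setting the indicator leaves the stored value untouched, and after a combined read reports a null, each field's indicator must be checked to find out which field it was.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes standing in for MCO_S_OK / MCO_S_NULL_VALUE. */
typedef enum { ST_OK = 0, ST_NULL_VALUE = 1 } Status;

/* Hypothetical model of a nullable uint2 field: a value plus an indicator. */
typedef struct {
    uint16_t value;
    uint8_t  is_null;   /* 1 = null, 0 = not null */
} NullableU2;

/* Models <field>_indicator_put(): changes the indicator only. */
static void indicator_put(NullableU2 *f, uint8_t flag) {
    assert(f != NULL);
    f->is_null = flag ? 1 : 0;   /* the stored value is left as it was */
}

/* Models <field>_indicator_get(). */
static uint8_t indicator_get(const NullableU2 *f) {
    assert(f != NULL);
    return f->is_null;
}

/* Models <field>_get(): reports a null status when the indicator is set.
 * Copying the value even for a null field is an assumption of this model. */
static Status field_get(const NullableU2 *f, uint16_t *out) {
    assert(f != NULL && out != NULL);
    *out = f->value;
    return f->is_null ? ST_NULL_VALUE : ST_OK;
}

/* Models a combined read: returns ST_NULL_VALUE if any field is null,
 * without saying which one. */
static Status fixed_get(const NullableU2 *fields, size_t n, uint16_t *out) {
    Status rc = ST_OK;
    for (size_t i = 0; i < n; i++) {
        if (field_get(&fields[i], &out[i]) == ST_NULL_VALUE)
            rc = ST_NULL_VALUE;
    }
    return rc;
}
```

In this model, a caller that receives ST_NULL_VALUE from fixed_get() would loop over the fields and call indicator_get() on each one, which mirrors the per-field check with <classname_fieldname>_indicator_get() described above.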
Checkpoint and Size
When an indexed field is modified by a transaction, the object is removed from all indexes defined for that class, regardless of whether the modified field is part of a given index. Once the object is removed from the indexes, it cannot be located by any search function. The object is put back into the indexes when the transaction commits.
The _checkpoint() API is the only way to put the object back into indexes within the transaction and thus make it visible through search functions.

    MCO_RET classname_checkpoint ( /*IN*/ classname *handle);

If a unique index constraint is violated, _checkpoint() will return status code MCO_S_DUPLICATE.

For fields of type string and vector, an additional _size() function is generated to return the actual length of the string value or the number of elements in the vector, so that you can allocate space for it:

    MCO_RET classname_fieldname_size( /*IN*/ classname *handle, /*OUT*/ uint2 *size);

Autocompaction of dynamic objects
If a class contains dynamically extended components (fields of type vector or string), an object of this class that is frequently updated can develop memory holes. To prevent this kind of fragmentation, an autocompaction feature is provided. To enable autocompaction, specify a non-zero value for the database parameter autocompact_threshold in the mco_db_params_t passed into mco_db_open_dev().

If the size of an object exceeds this autocompact_threshold value, then the autocompaction algorithm reallocates objects, eliminating any internal fragmentation. However, note that object compaction is not a cheap operation and should not be performed frequently. A recommended value for this threshold is therefore a number of bytes several kilobytes larger than a normal object’s expected size.

Vectors and Fixed-length Arrays
eXtremeDB vectors are by definition of variable length, whereas arrays are fixed length. For C applications, vectors and fixed-length arrays require a number of special functions. Fixed-length arrays are given the specified number of bytes of static memory in the record layout, but vector fields are initially just references that must be allocated storage at runtime. The _alloc() function reserves space for the vector field’s elements within a data layout. The application must call the _alloc() function to supply the size of the vector before values can be stored in the vector field; otherwise the vector reference will remain null. Invoking the _alloc() function for a vector field of an existing object will resize the vector. If the new size is less than the current size, the vector is truncated to the new size.

    MCO_RET classname_fieldname_alloc( /*IN*/ classname *handle, /*IN*/ uint2 size);

The functions that operate on
vector and array fields require an index argument but are otherwise functionally equivalent to their scalar counterparts. The _put() functions for fields declared as vector or fixed-size array have the form:

    MCO_RET classname_fieldname_put( /*IN*/ classname *handle, /*IN*/ uint2 index, /*IN*/ <type> value);

For fields declared as vectors of strings they have the form:

    MCO_RET classname_fieldname_put( /*IN*/ classname *handle, /*IN*/ uint2 index, /*IN*/ const char * value, /*IN*/ uint2 len);

For convenience, _put_range() methods are generated to assign an array of values to a vector or array. (Note that the size of the IN array should be less than or equal to the size of the vector as specified in the vector’s _alloc() function call, or the size of the array as defined in the <classname_fieldname>_size constant in the generated header file.)

    MCO_RET classname_fieldname_put_range( /*IN*/ classname *handle, /*IN*/ uint2 start_index, /*IN*/ uint2 num, /*IN*/ const <type> *src );

Note that _put_range() methods are only generated for vectors that consist of simple scalar elements. For vectors of structures and vectors of strings this method is not generated. The reason is that for simple type vector elements the schema compiler can generate optimized methods to assign values to them. This optimization is only possible if the size of the vector element is known at compile time. Also note that it is never necessary to use a _put_range() method to set the vector; the _put() function can always be iterated to assign individual vector element values for the desired range.

To access a specific element of a vector the
_at() function is generated. The form of the _at() function will vary depending on the type of elements stored in the vector. For vectors of fixed-length fields it will have the form:

    MCO_RET classname_fieldname_at( /*IN*/ classname *handle, /*IN*/ uint2 index, /*OUT*/ <type> *result );

If the vector consists of strings or fixed-length byte arrays (char<n>), the _at() function takes two extra parameters: the maximum size of the buffer to receive the string and the actual length of the string returned:

    MCO_RET classname_fieldname_at( /*IN*/ classname *handle, /*IN*/ uint2 index, /*OUT*/ char *result, /*IN*/ uint2 bufsize, /*OUT*/ uint2 *len);

When allocating memory (for host variables) for vectors of variable-length elements, it may be necessary to first determine the actual size of the vector element. The _at_len() functions are generated for vectors of strings for this purpose:

    MCO_RET classname_fieldname_at_len( /*IN*/ classname *handle, /*IN*/ uint2 pos, /*OUT*/ uint2 *retlen);

The _get_range() function returns a range of vector elements, for vectors of scalar elements:

    MCO_RET classname_fieldname_get_range( /*IN*/ classname *handle, /*IN*/ uint2 startIndex, /*IN*/ uint2 num, /*OUT*/ const <type> *dest);

The
_erase() function is generated for vectors of structures, vectors of strings, as well as for optional struct fields. The _erase() function removes an element of a vector from the layout and from all indexes the element is included in. Note that the vector size remains unchanged. If an attempt is made to get the erased element, the runtime returns a null pointer and MCO_S_OK. (Also note that the _erase() function is not generated for vectors of basic types. For vectors of basic types, the application should _put() a recognizable value in the vector element that it can interpret as null.)

    MCO_RET classname_fieldname_erase( /*IN*/ classname *handle, /*IN*/ uint2 index);

The use of the _erase() function can leave unused elements (“holes”) in vector fields. For this reason, the _pack() function is generated for vector fields to remove “holes” so that the space occupied by the deleted element is returned to the free database memory pool.

Likewise, if an application had a non-empty string allocated and then modified the string value to null, the _size() function would return 0, but the actual space for the string would not be automatically reclaimed. The application needs to call the generated _pack() function to return that space to the storage pool.

    MCO_RET classname_pack ( /*IN*/ classname *handle, /*OUT*/ uint4 pages_released );

Character string collation
The eXtremeDB core and UDA programming interfaces for C applications include support for collations. A collation, as defined in Wikipedia, “is the assembly of written information into a standard order. One common type of collation is called alphabetization, though collation is not limited to ordering letters of the alphabet.”
Collation is implemented as a set of rules for comparing characters in a character set. A character set is a set of symbols with assigned ordinals that determine precise ordering. For example, in the segment of the Italian alphabet consisting of the letters “a, a`, b, c, d, e, e`, f” the letters could be assigned the following ordinals: a=0, a`=1, b=2, c=3, d=4, e=5, e`=6, f=7. This mapping will assure that the letter “a`” (“accented a”) will be sorted after “a” but before “b”, and “e`” will follow “e” but precede “f”.
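The ordinal mapping above can be sketched as a small table-driven comparison in plain C. This is an illustration only: the function names are hypothetical, and the code assumes a single-byte ISO-8859-1 encoding in which a` is byte 0xE0 and e` is byte 0xE8. Any byte outside the eight-letter segment is given a placeholder ordinal after the segment, which is also an assumption made just to keep the function total.

```c
#include <assert.h>
#include <stddef.h>

/* Ordinals for the sample segment: a=0, a`=1, b=2, c=3, d=4, e=5, e`=6, f=7.
 * Assumes ISO-8859-1 bytes: a` is 0xE0, e` is 0xE8. */
static int ordinal_of(unsigned char ch) {
    switch (ch) {
        case 'a':  return 0;
        case 0xE0: return 1;
        case 'b':  return 2;
        case 'c':  return 3;
        case 'd':  return 4;
        case 'e':  return 5;
        case 0xE8: return 6;
        case 'f':  return 7;
        default:   return 8 + ch;   /* placeholder: sorts after the segment */
    }
}

/* Compares two length-delimited byte strings by ordinal, not by byte value.
 * Returns -1, 0 or 1. */
static int ordinal_compare(const unsigned char *s1, size_t len1,
                           const unsigned char *s2, size_t len2) {
    assert(s1 != NULL && s2 != NULL);
    size_t n = len1 < len2 ? len1 : len2;
    for (size_t i = 0; i < n; i++) {
        int o1 = ordinal_of(s1[i]);
        int o2 = ordinal_of(s2[i]);
        if (o1 != o2)
            return o1 < o2 ? -1 : 1;
    }
    if (len1 == len2)
        return 0;
    return len1 < len2 ? -1 : 1;    /* the shorter string sorts first */
}
```

A plain byte comparison would place 0xE0 after every unaccented letter, whereas the ordinal table places a` between a and b, as the text describes.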
In some character sets, multiple-character combinations like “AE” (“labor lapsus” in the Danish and Norwegian alphabets or “ash” in Old-English) and “OE” (an alternate form of “Ö” or “O-umlaut” in the German alphabet) are treated as single letters. This poses a collation problem when strings containing these character combinations need to be ordered. Clearly, a collation algorithm to sort strings of these character sets must compare more than a single character at a time.
“Capitalization” is also a collation issue. In some cases strings will be compared in a “case sensitive” manner where for example the letters “a-z” will follow the (uppercase) letter “Z”, while more often strings will be compared in a “case insensitive” manner where “a” follows “A”, “b” follows “B”, etc. This can be easily accomplished by treating uppercase and lowercase versions of each letter as equivalent, by converting upper to lower or vice versa before comparing strings, or by assigning them the same ordinal in a case-insensitive character set.
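The difference between the two comparison styles can be sketched in plain C as follows. The function names compare_case() and compare_nocase() are hypothetical, and the code assumes ASCII text in the default "C" locale. The case-sensitive version compares raw byte values, so every uppercase letter sorts before every lowercase letter; the case-insensitive version folds both sides to lowercase before comparing, which is one of the approaches mentioned above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-sensitive: compares raw byte values, so 'Z' (0x5A) precedes 'a' (0x61). */
static int compare_case(const char *s1, size_t len1,
                        const char *s2, size_t len2) {
    assert(s1 != NULL && s2 != NULL);
    size_t n = len1 < len2 ? len1 : len2;
    int r = memcmp(s1, s2, n);
    if (r != 0)
        return r < 0 ? -1 : 1;
    if (len1 == len2)
        return 0;
    return len1 < len2 ? -1 : 1;
}

/* Case-insensitive: folds each character to lowercase before comparing. */
static int compare_nocase(const char *s1, size_t len1,
                          const char *s2, size_t len2) {
    assert(s1 != NULL && s2 != NULL);
    size_t n = len1 < len2 ? len1 : len2;
    for (size_t i = 0; i < n; i++) {
        int c1 = tolower((unsigned char)s1[i]);
        int c2 = tolower((unsigned char)s2[i]);
        if (c1 != c2)
            return c1 < c2 ? -1 : 1;
    }
    if (len1 == len2)
        return 0;
    return len1 < len2 ? -1 : 1;
}
```

With these two functions, "Zebra" sorts before "apple" under compare_case() but after it under compare_nocase(), and "Apple" and "aPPLE" compare as equal only under compare_nocase().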
eXtremeDB enables comparison of strings using a variety of collations, and allows strings and character arrays with different character sets or collations to be mixed in the same database; character sets and collations are specified at the application level.
Collation Data Definition Language and API function definitions
As explained on the Custom Collations page, the eXtremeDB DDL provides a collation declaration for tree and hash indexes on string-type fields as follows:

    [unique] tree<string_field_name_1 [collate C1] [, string_field_name_2 [collate C2]], …> index_name;
    hash<string_field_name_1 [collate C1] [, string_field_name_2 [collate C2]], …> index_name;

If a collation is not explicitly specified for an index component, the default collation is used. Based on the DDL declaration, for each collation the DDL compiler will generate the following compare function placeholders for tree indexes and/or hash indexes using this collation:

    int2 collation_name_collation_compare( mco_collate_h c1, uint2 len1, mco_collate_h c2, uint2 len2 )
    {
        /* TODO: add your implementation here */
        return 0;
    }

    uint4 collation_name_collation_hash (mco_collate_h c, uint2 len)
    {
        /* TODO: add your implementation here */
        return 0;
    }

For each defined collation, a separate API is generated. The actual implementation of the compare functions, including the definition of character sets, is the application’s responsibility. To facilitate compare function implementation, eXtremeDB provides the following set of functions:

    mco_collate_get_char(mco_collate_h s, char *buf, uint2 len);
    mco_collate_get_nchar(mco_collate_h s, nchar_t *buf, uint2 len);
    mco_collate_get_wchar(mco_collate_h s, wchar_t *buf, uint2 len);
    mco_collate_get_char_range(mco_collate_h s, char *buf, uint2 from, uint2 len);
    mco_collate_get_nchar_range(mco_collate_h s, nchar_t *buf, uint2 from, uint2 len);
    mco_collate_get_wchar_range(mco_collate_h s, wchar_t *buf, uint2 from, uint2 len);

Note that three different versions of the mco_collate_get_*char() and mco_collate_get_*char_range() functions are required because, in order to use the same collation, the arguments must be of the corresponding type for the field being accessed. In other words: for fields of type string and char<n>, the *char version (mco_collate_get_char()) will be called; for fields of type nstring and nchar<n>, the *nchar version; and for fields of type wstring and wchar<n>, the *wchar version.

The C application registers user-defined collations via the following function:
    mco_db_register_collations(dbname, mydb_get_collations());

This function must be called prior to mco_db_connect() or mco_db_connect_ctx() and must be called once for each process that accesses a shared memory database. The second argument, mydb_get_collations(), is a database-specific function similar to mydb_get_dictionary() that is generated by the DDL compiler in the files mydb.h and mydb.c. In addition, the DDL compiler generates the collation compare function stubs in mydb_coll.c. (Note that if the file mydb_coll.c already exists, the DDL compiler will display a warning and generate mydb_coll.c.new instead.)

Please see page Custom Collations for further details and examples using custom collations.
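When filling in a generated hash stub for a collation, the hash has to agree with the compare function: two strings that the collation treats as equal must produce the same hash value, otherwise a hash index lookup can miss them. The following is a minimal sketch in plain C for a case-insensitive collation. It works on an ordinary character buffer, such as one that might be filled with mco_collate_get_char(). The function name hash_nocase() is hypothetical, and the choice of the 32-bit FNV-1a hash is an assumption made for illustration, not something prescribed by eXtremeDB.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Case-insensitive 32-bit FNV-1a hash over a length-delimited buffer.
 * Each byte is folded to lowercase first, so strings that differ only
 * in case hash to the same value. Assumes ASCII text in the "C" locale. */
static uint32_t hash_nocase(const char *s, size_t len) {
    assert(s != NULL);
    uint32_t h = 2166136261u;               /* FNV-1a offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= (uint32_t)tolower((unsigned char)s[i]);
        h *= 16777619u;                     /* FNV-1a prime */
    }
    return h;
}
```

The same folding step (here tolower()) should be used in both the compare and the hash implementation of a given collation so that the two stay consistent.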
Blob Support
BLOB fields are useful when it is necessary to keep streaming data, with no known size limits. C applications use the generated _get() function to copy BLOB data to an application’s buffer; it allows specification of a starting offset within the BLOB.

    MCO_RET classname_fieldname_get( /*IN*/ classname *handle, /*IN*/ uint4 startOffset, /*OUT*/ char *buf, /*IN*/ uint4 bufsz, /*OUT*/ uint4 *len);

The bufsz parameter is the size of the buffer passed by the application in the buf parameter. The len output parameter is the actual number of bytes copied to the buffer (which will be <= bufsz).

The _size() function returns the size of a BLOB field. This value can be used to allocate sufficient memory to hold the BLOB, prior to calling the _get() function.

    MCO_RET classname_fieldname_size( /*IN*/ classname *handle, /*OUT*/ uint4 * result);

The _put() function populates a BLOB field, possibly overwriting prior contents. It allocates space and copies data from the application’s buffer; the size of the BLOB must be specified.

    MCO_RET classname_fieldname_put( /*IN*/ classname *handle, /*IN*/ const void *from, /*IN*/ uint4 nbytes);

The _append() function is used to append data to an existing BLOB. This method is provided so an application does not have to allocate a single buffer large enough to hold the entire BLOB, but rather can conserve memory by writing the BLOB in manageable pieces.

    MCO_RET classname_fieldname_append(/*IN*/ classname *handle, /*IN*/ const void * from, /*IN*/ uint4 nbytes );

To erase (truncate) a BLOB, pass a size of 0 to the _put() method.

Binary Data
While blob fields are useful for large binary data, they are intended only for large data fields (greater than 1 Kb). It is recommended to use string fields for character or binary data (less than 64 Kb). String fields can hold arbitrary binary data when not used for indexes (because index comparisons require a null terminator). But unlike blob fields, binary and varbinary fields can be added to both simple and compound indexes.

Date, Time and Datetime Fields
Please refer to the Datetime Fields page for a detailed description of the eXtremeDB date, time and datetime database field types. The C APIs for determining precision and accessing date, time and datetime fields are described in the Managing Datetime Fields in C page.