eXtremeSQL User Defined Functions

eXtremeSQL supports “stored procedures” by allowing developers to create their own Lua User-Defined Functions (UDF). An eXtremeSQL UDF can also be written in Python or C/C++. These methods are described in the sections below. The eXtremeSQL server provides a service (an internal function) that lists all UDFs through a meta table called Metafunction which can be queried like any SQL table:

 
    select * from Metafunction;
    Name	Profile	ReturnType	nArguments	Language
    -------------------------------------------
    format	String(...)	15	2	         C++
    choice	String(...)	15	3	         C++
    trim	String(...)    15	1	         C++
    ...
     

Note that the Metafunction pseudo-table returns information about statically registered functions (written in C++) as well as dynamically created functions which were created by the current connection via the CREATE FUNCTION statement. A Complete list of all user defined function can be found in the Functions table (if it exists).

If the preloadUDF parameter of the SqlOptimizerParameters is set to true, then all of these functions are loaded at the database open time and the Metafunction table will contain the complete list of all functions (dynamically-loaded and statically-linked).

Also note that if the number of arguments is listed as -1, this indicates a variable number of arguments.

Lua UDFs

Lua is a very elegant, popular and easy-to-adopt scripting language, with extensive grammar that supports operator overloading, encapsulation, inheritance, polymorphism and more. Lua's sophisticated and blazingly fast dynamic Just-in-time compiler LuaJIT and small footprint makes it a great procedural language to develop complex database user-defined functions and stored procedures for the eXtremeSQL server. Lua-based UDFs run in the context of the SQL server eliminating inherent client-server IPC and network overhead and fully utilizing the multi-core nature of modern hardware.

(Please see the Using Lua with eXtremeSQL for further details.)

Python UDFs

It is also possible to create UDFs in the Python language if the Python class SqlEngine is used to create the eXtremeDB database. The following steps are involved:

1. The supported python version is 2.7, and there is a requirement to have shared library on the machine. (For Linux this is libpython2.7.so.)

By default, this library is searched in the standard system library load path. This can be altered using the environment variable PYTHONLIB; for example:

 
    export PYTHONLIB=/opt/build/my/python/libpython2.7.so
     

2. Declare the python function that performs the desired operations. The arguments of this function will be the arguments of the UDF and the return value will be translated to an SQL value. For example;

 
    def gen_quotes(Bid, BidSize):
    ...
    Bid.append(some_bids)
    BidSize.append(some_bid_sizes)
 
    return len(some_bids)
     

The code indicated by the ellipsis (…) would use sequence iterators to generate a list of bid values and sizes from fields Bid and BidSz into the variables some_bids and some_bid_sizes.

2. Register the UDF in the SQL engine using a SqlEngine object created (by default) for the local connection:

 
    conn.engine.registerFunction(int, "gen_quotes", gen_quotes, -1)
     

3. Call the UDF as part of a SQL SELECT statement. Consider that we have a table with columns Bid and BidSz and table is populated with some records. We can then generate bids and return the total count of bids generated as follows:

 
    SELECT sum(gen_quotes(Bid, BidSz)) FROM Bids FOR UPDATE;
     

(Note that the FOR UPDATE clause in this example is required because the UDF will append values to sequences Bid and BidSz using sequence iterators.

C/C++ UDFs

A C/C++ UDF must be defined as returning type static String* and can accept arguments of type Value*. Or a user-defined aggregate function must be defined as returning type static Value* and can accept arguments of type Value*. (Please see page Dynamic Memory Allocation in SQL for details on processing queries in UDFs.)

The UDF can be statically linked by “registering” it with the eXtremeSQL runtime using the C/C++ API. This is done in the eXtremeSQL application by declaring a function of type static SqlFunctionDeclaration udf() with arguments specifying the type returned, the name used to call it, a pointer to the function and the number of arguments required. (Note that eXtremeDB has another concept of user defined functions for indexes described in eXtremeDB User Guide. This section describes only SQL UDFs.) For example consider the following definition:

 
    extern "C" 
    {
        McoSql::Value* toggleTrace(McoSql::SqlEngine* engine,
        McoSql::Vector<McoSql::Value>* args)
        {
            McoSql::Value *enable = args->getAt(0);
            return BoolValue::create(func->engine->trace(enable->isTrue()));
        }
    }
     

(Note that the extern "C" declaration is necessary in order to give the function a de-mangled name.)

The UDF can also be loaded dynamically using the SQL CREATE FUNCTION statement which has the following form:

 
    create function NAME ( param1_name param1_type, ...,
    paramN_name paramN_type)
    returns result_type as 'LIBRARY-NAME' [, 'C_FUNCTION-NAME'];
     

For example:

 
    create function toggleTrace(enable bit) returns bit as 'mysharedlib';
     

(Note that if the name of the library the function belongs to, suppose it is 'xxx', doesn't include a suffix, then on UNIX systems it is transformed to the libxxx.so, and on Windows systems it is called xxx.dll.)

Now this UDF can be invoked via the SELECT statement as follows:

 
    select toggleTrace(true);
     

Unlike functions declared in C statically, the functions declared dynamically internally keep information about their arguments types. The types are checked by the SQL compiler. The SQL engine internally represents all integer types as a 64-bit integer (through the int8 type), and the floating-point types as doubles through the real8 type. When a function argument is declared as any other than an int8 or real8 scalar type, implicit conversion is applied. Thus the function can be declared as a float with a float argument, but it will actually receive a double value (represented using a RealValue object) and should return a RealValue as well. (there is no way to return a float value in the SQL API).

Modifying Data Using Dynamically Loaded UDFs

Using UDFs it is possible to change data in the database, execute SQL statements and/or queries, work with sequence iterators, etc. For example consider the following UDF definition:

 
    McoSql::Value *load_cme_trades(McoSql::SqlEngine *engine,
                McoSql::Vector<McoSql::Value> *args)
    {
        McoSql::DataSource *res =  engine->executeQuery(
            engine->getCurrentTransaction(), "SELECT TradeDateTime, 
                SeqNum, GroupCode, ProductCode, InstrumentName,
                EntryPrice, EntryQuantity, AggressorSide, TickDirection
                FROM TOBSeq
                WHERE Book = %s AND
                SecurityID = %i AND
                EntryType = %s",
                Book.c_str(), SecurityID,
                EntryType.c_str() );
        ...
    }
     

Subsequent operations are performed in the context of the transaction calling this UDF, so if changing data in the database within the UDF, this transaction must be READ_WRITE. For INSERT and UPDATE statements the transaction will be upgraded automatically, but if it is necessary to change sequence data using a sequence iterator, the transaction must be upgraded explicitly in UDF code. For example the following could be a continuation of the previous code fragment:

 
        ...
        McoSql::Cursor *cursor = res->records();
        McoSql::Record *rec = cursor->next();
        McoSequence<int8> *fldEntryPrice = dynamic_cast
            <McoSequence<int8>*>(rec->get(5));
 
        engine->getCurrentTransaction()->upgrade();
        EntryPrice->append(values, 1000);
        ...
         

where EntryPrice is a sequence field in the database object TOBSeq.

Notes on Using Sequences with UDFs

As seen above, UDF code can access sequence iterators and perform operations on them. A sequence iterator is represented as an object of class McoGenericSequence or one of its descendants. Please note the following when using sequences within UDFs:

Appending to a sequence

It's possible to append values to a sequence using the McoGenericSequence::append() method only to a materialized sequence. A materialized sequence is represented with an object of type McoSequence<T> which has the following method:

 
    virtual void append(T const* items, mco_size_t nItems);
     

Here T is the actual sequence data type. Again, it is not possible to append items to a sequence that is not materialized. Note that a transaction must be opened in READ_WRITE mode when appending elements. If using automatic transactions and the SELECT statement to access sequence fields, it should be in form of SELECT ... FOR UPDATE as in the example above.

Iterating over sequences

Sequences may be iterated:

1. using low-level iterators obtained with method McoGenericSequence::getIterator()

2. using method McoSql::Value* McoGenericSequence::next(); This method will return an IntValue, RealValue or String representing the sequence element value or NULL if there are no more elements.

Note that in different scenarios the iterator may be not initialized, so before calling next() it is a good practice to call getIterator() which performs initialization. For example:

 
    {
        McoGenericSequence *time_it = dynamic_cast
            <McoGenericSequence *>(args->get(0));
        if (time_it == NULL) 
        {
            process_error();
        }
        time_it->getIterator();
        for (McoSql::Value *vt; vt != NULL; vt = time_it->next()) 
        {
            ...
        }
    }
     

Sequence classes

As mentioned in the "Appending to a sequence" section above, sequences may be represented either as McoGenericSequence class or McoSequence<T> which are inherited in differing scenarios. The best practice is to use McoSequence<T> in UDFs only when required, i.e. for updating or appending to a sequence.

NULL Arguments to UDFs

It is possible to pass NULL as an argument value to a user-defined function. For example:

 
    create function sf(s string) returns string in 'lua' as 'return s end';
    create table t(s string);
    insert into t values(['hello', null]);
    select sf(s) from t;
    select sf(null);
     
    #1
    ------------------------------------------------------------------------------
    hello
    null
 
    Selected records: 2