Aggregating User-Defined Functions

The Sql Engine allows user-defined functions (UDF) to perform custom aggregation, i.e. functions used with the SQL group by operator. These user-defined functions can be used like SQL functions sum(), count(), max() etc. (Please see page eXtremeSQL User Defined Functions for further details.)

For an overview see page eXtremeSQL Functions

To use user-defined aggregating functions the following steps must be performed:

     
    set implicit_aggregates=true;
     

Note that even if the UDF is expected to take a scalar argument and return single scalar value, the aggregating UDF must pass array arguments and return an array.

Lua Example

The UDF could be created with Lua code like the following:

     
    create function my_avg(state array(int), ops array(int)) returns array(int) in 'lua' as '
        require("test_udf")
        return my_avg_impl(state, ops)
        end
        ';
     

The function my_avg_body() could be implemented as follows:

     
    function my_avg_impl(state, ops)
        if ops ~= nil then
            -- Aggregating part
            v = ops[1]
            if state == nil then
                state = {v, 1}
            else
                state[1] = state[1] + v
                state[2] = state[2] + 1
            end
            return state
        else
            -- Finalizing part
            local avg = {state[1] / state[2]}
            return avg
        end
    end
     

Consider table Employee initialized with the following data:

     
    create table Employee( id integer, dept_no integer, salary integer);
    insert into Employee values(1, 1      90000);
    insert into Employee values(2, 1      100000);
    insert into Employee values(3, 1      110000);
    insert into Employee values(4, 2      100000);
    insert into Employee values(5, 2      120000);
    insert into Employee values(6, 2      130000);
     

Then the UDF would be invoked with SQL statements like the following:

 
    set implicit_aggregates=true;
    select dept_no, my_avg([null]) from Employee group by dept_no;
    select dept_no, my_avg([salary]) from Employee group by dept_no;
    select dept_no, my_avg([null]) from Employee group by dept_no;
     

and would return the following result set:

 
    dept_no #2
    --------------------------------------------------------
    1       [100000]
    2       [120000]
     

C++ Example

For a C++ UDF, the SQL function registration might look like the following:

 
    create function bx_wt_avg(state array(double), ops array(double)) returns array(double)
        as 'api-sql-09-functions-aggregate',  'bx_wt_avg';
 

The corresponding C++ function prototype could be as follows:

 
    McoSql::Value* bx_wt_avg(McoSql::Runtime* runtime, McoSql::Vector<McoSql::Value>* params);
     

The actual function implementation can be found in the SDK sample source file "samples/native/sql/api/sql-09-functions/aggregate.udf.cpp".

Then the UDF would be invoked with SQL statements like the following:

 
    create table t(x double,y double);
    insert into t values (1,10),(2,20),(3,30);
    select bx_wt_avg([x,y]) from t;
     

and would return the following result set:

 
     #1
    --------------------------------------------------------
    [2.33333333333333, 0.014, 140, 60, 3, 6.95335580371252e-310, 360]