Using Sequences in SQL

Inserting and Updating Sequences with SQL

Sequences can be inserted or updated through SQL in four different ways:

1) As a scalar value:

     
    INSERT OR UPDATE INTO t VALUES (1, 1);
     

2) As an array:

 
    INSERT OR UPDATE INTO t VALUES (1, [1,2,3]);
     

3) As a string:

 
    INSERT OR UPDATE INTO t VALUES (1, '{1,2,3}');
     

4) As a result of a subquery:

 
    INSERT OR UPDATE INTO t SELECT * FROM tt;
     

In eXtremeSQL releases prior to 7.1, when methods 1 through 3 were used, the sequence was appended to, while in the 4th method (subquery) the sequence was overwritten. With release 7.1 and later this behavior has been changed and is now regulated through the append_mode parameter: if append_mode == true, the sequence is appended to (which is the default behavior), otherwise overwritten. The append_mode parameter is modified via the SQL set command:

     
    set append_mode true
     

The INSERT OR UPDATE statement can also be used to insert elements at random positions inside sequences (eXtremeSQL allowes out-of-order modifications of sequences). This feature is useful for filling the "gaps" in ordered sequences (e.g. when some records in a time series are missing and must be filled in later). However, it is subject to the following limitations:

Note that consistency of the sequences must be ensured by the application; eXtremeSQL does not enforce any constraints (apart from ordering where explicitly defined), which means that it is up to the application to define the semantics of the data. For example, given three sequences timestamp, bid, and ask, the application must see to it that all three of them are updated (appended to) when needed.

The seq_remove() function may be used to remove elements from sequences. It accepts a single argument which must be a projection of elements to be removed from the sequence, and returns the number of elements removed.

 

Inserting and Removing from Sequences Example

Step 1: Create and populate table Quote

Create and populate table Quote with day, open and close sequences, indexed by symbol name (sym). As per the aforementioned requirements, the first field of the table (sym) is the primary key, others are sequences; the first sequence (day) is ordered (ascending). Notice that the sequence data contains a gap between 20170505 and 20170509, which we will fill in, in Step 2. Also notice that the parameter append_mode is explicitly set to true.

 
    set append_mode true;
    create table Quotes(sym string primary key,
        day sequence(int asc),
        open sequence(int),
        close sequence(int));
    insert into Quotes values('AAA',
        [20170501, 20170502, 20170503, 20170504, 20170505, 20170509, 20170510],
        [101, 102, 103, 104, 105, 109, 110],
        [111, 112, 113, 114, 115, 119, 120]);
    insert into Quotes values('BBB',
        [20170501, 20170502, 20170503, 20170504, 20170505, 20170509, 20170510],
        [201, 202, 203, 204, 205, 209, 210],
        [211, 212, 213, 214, 215, 219, 220]);
    insert into Quotes values('CCC',
        [20170501, 20170502, 20170503, 20170504, 20170505, 20170509, 20170510],
        [301, 302, 303, 304, 305, 309, 310],
        [311, 312, 313, 314, 315, 319, 320]);
     

Step 2: Create temporary table QuoteHorizontal

Create a temporary table QuoteHorizontal containing the missing elements and populate table Quotes from QuoteHorizontal using the insert or update statement with order by clause (notice that the order by day clause is required to properly order the elements):

 
    create table QuotesHorizontal(sym string , day int, open int, close int);
    insert into QuotesHorizontal values('AAA', 20170506, 106, 116),
        ('BBB', 20170506, 206, 216), ('CCC', 20170506, 306, 316);
    insert into QuotesHorizontal values('AAA', 20170507, 107, 117),
        ('BBB', 20170507, 207, 217), ('CCC', 20170507, 307, 317);
    insert into QuotesHorizontal values('AAA', 20170508, 108, 118),
        ('BBB', 20170508, 208, 218), ('CCC', 20170508, 308, 318);
    insert or update into Quotes select * from QuotesHorizontal order by day;
     

Step 3: Drop some data from the sequences

At this point there are no gaps in the sequence data. We will use the seq_remove() function to drop some data from the sequences, e.g. from May 1st to May 3rd. Note that the for update clause is required to modify the sequences (see the SQL DML Select page for further details regarding the for update clause):

 
    select !seq_search(day, 20170501, 20170503) as daterange,
        seq_remove(day@daterange), seq_remove(open@daterange),
        seq_remove(close@daterange)
    from Quotes for update;
     

(Note that the operator "!" is a shortcut for seq_ignore(). The ignored column may be used in other calculated columns but will not appear in the actual statement output.)

Parallel processing with sequences

Support for the parallel processing of sequences is implemented through a thread pool in the SQL engine that can optimize performance for sequence operations in multi-processor environments. The thread pool allows sequence functions to split sequences into equal parts across threads and perform calculations on the threads in parallel reducing the total execution time.

The size of the thread pool can be regulated by the environment variable MCO_CPU_NUMBER. If no MCO_CPU_NUMBER environment variable is present, then the total number of CPU cores on the system is used as the pool size. If the hardware allows Hyper-Threading (HT) - and most modern CPUs do - the number of threads in the pool is equal to the number of HT cores (normally two per physical core). So if the machine has 4 real cores, the number of threads is 8. This can actually hurt performance sometimes.

The xSQL configuration file the parameter sql_use_connection_pool can disable the use of the thread pool. Note that the default value of this parameter is true (meaning the thread pool is created on startup).

 
    sql_use_connection_pool:false
     

At runtime the function seq_parallelism() can be used to limit the number of threads utilized to run queries. The function accepts a number of threads to use as a parameter and returns the total number of threads in the pool.

 
    seq_parallelism(<n_threads>)
     

The value 1 for n_threads will effectively turn off the thread pool processing. For example, the following command sets the thread pool size to use 4 CPU cores:

 
    XSQL> SELECT seq_parallelism(4);
     

Note that function seq_parallelism() affects the entire server, not just a single client’s session. So seq_parallelism() should not be executed from different clients running concurrently. Also note that if the number of parallel threads requested by functions seq_parallelism() exceeds the size of the pool specified in the environment variable MCO_CPU_NUMBER (the default is the number of available cores), the SQL engine will use the number specified in MCO_CPU_NUMBER.

A Note on thread pool management on Windows

Support for the parallel processing of sequences is implemented thorough the class ThreadPool, which instantiates a global variable ThreadPool. Thus the destructor for the ThreadPool class is called upon exit from the main() automatically. The destructor obviously frees up all resources associated with the pool: memory, synchronization primitives, and tasks. This implementation presents a problem with the Windows C++ runtime. The Windows C++ runtime is implemented so that it stops all threads except for the main thread when the application exits from its main(). The C runtime then calls the destructors for all global objects, which in the case of the ThreadPool tasks is too late - those tasks were already stopped by the ThreadPool destructor.

A partial solution has been implemented to simply not call the ThreadPool destructor (only when running on Windows). It is remotely possible to introduce a resource leak in a scenario in which the database is created in the context of the xSQL server (or a .NET Framework or JNI server application) that loads the database DLL (mcoxsql.dll or mcoxsql_debug.dll) dynamically at runtime. If the database makes use of sequences and the server reloads the DLL multiple times, each time a resource leak is created.

A more resilient solution for this problem is forthcoming in a future release, but at present Windows developers should be aware of this potential problem.

Null Values in Sequences

Null values can be inserted into or appended to sequences. For example:

 
    XSQL>set append_mode true;
    XSQL>create table hor(i integer, f float);
    XSQL>insert into hor values(1, 100.0);
    XSQL>insert into hor values(1, NULL);
    XSQL>insert into hor values(1, 300.0);
    XSQL>insert into hor values(1, NULL);
    XSQL>create table ver(i integer primary key, f sequence(float));
    XSQL>insert or update into ver select * from hor;
     

Now the sequence field f in table ver has the expected values:

 
    XSQL>select * from ver;
    i	f
    ----------------------------------------------------------------
    1	{100, null, 300, null}
     

Using the FLATTENED keyword, the sequence can be displayed as follows:

 
    XSQL>select flattened * from ver;
    i	f
    ----------------------------------------------------------------
    1	100
    1	null
    1	300
    1	null
     

Note that it is not allowed to insert null values into ordered sequences, which is expected behavior. An attempt to do so will cause the SQL compiler to emit an error. For example:

 
    XSQL>insert into ver values(1, [1,2,3,null,5]);
    ERROR: Sequence item is out of order
     

Null Values with Sequence Functions

Where it makes sense, sequence functions support null values. For example, using the table ver defined above with the same inserted values:

 
    XSQL>select seq_last(f) from ver where i=1;
    #1
    ----------------------------------------------------------------
    null
 
    XSQL>select seq_first(f) from ver where i=1;
    #1
    ----------------------------------------------------------------
    100
     

The following SQL functions (and their xSQL counterparts) do not support sequences with null values, and hence will return MCO_E_SEQ_NULL_VALUE if called on a sequence with nulls:

SQL xSQL
mco_seq_window_agg*() seq_window_agg*()

mco_seq_group_agg*()

seq_group_agg*()

mco_seq_diff()

seq_diff()

mco_seq_map() seq_map()
mco_seq_union() seq_union()
mco_seq_cross() seq_cross()
mco_seq_extrema() seq_extrema()
mco_seq_stretch*() seq_stretch*()
mco_seq_asof_join() seq_asof_join()

Inserting Null Values with Array Syntax

There is a use case which is potentially confusing when inserting null values into sequences: when a single literal null value is inserted using the INSERT OR UPDATE clause, it sets the sequence field's value to null, instead of appending the null to the sequence. For example:

 
    XSQL>set append_mode true;
    XSQL>create table ver(i integer primary key, f sequence(float));
    XSQL>insert or update into ver values(1, [1,2,3]);
    XSQL>insert or update into ver values(1, null);
    XSQL>select * from ver;
 
    i	f
    ----------------------------------------------------------------
    1	null
     

If the null value needs to be appended to the sequence, the array syntax must be used. For example:

 
    XSQL>set append_mode true;
    XSQL>create table ver(i integer primary key, f sequence(float));
    XSQL>insert or update into ver values(1, [1,2,3]);
    XSQL>insert or update into ver values(1, [null]);
    XSQL>select * from ver;
     
    i	f
    ----------------------------------------------------------------
    1	{1, 2, 3, null}
     

If the initial sequence is empty, the latter syntax will yield a sequence with single null element. For example:

 
    XSQL>set append_mode true;
    XSQL>create table ver(i integer primary key, f sequence(float));
    XSQL>insert or update into ver values(1, [null]);
    XSQL>select * from ver;
    i	f
    ----------------------------------------------------------------
    1	{null}
     

Please note how this is different from the very first case. The first case sets the sequence field's value to null, while using the array syntax results in a sequence with one null element.

Related Topics Link IconRelated Topics