Sequences can be inserted or updated through SQL in four different ways:
1) As a scalar value:
INSERT OR UPDATE INTO t VALUES (1, 1);2) As an array:
INSERT OR UPDATE INTO t VALUES (1, [1,2,3]);3) As a string:
INSERT OR UPDATE INTO t VALUES (1, '{1,2,3}');4) As a result of a subquery:
INSERT OR UPDATE INTO t SELECT * FROM tt;In eXtremeSQL releases prior to 7.1, when methods 1 through 3 were used, the sequence was appended to, while in the 4th method (subquery) the sequence was overwritten. With release 7.1 and later this behavior has been changed and is now regulated through the
append_mode
parameter: ifappend_mode == true
, the sequence is appended to (which is the default behavior), otherwise overwritten. Theappend_mode
parameter is modified via the SQL set command:set append_mode trueThe
INSERT OR UPDATE
statement can also be used to insert elements at random positions inside sequences (eXtremeSQL allowes out-of-order modifications of sequences). This feature is useful for filling the "gaps" in ordered sequences (e.g. when some records in a time series are missing and must be filled in later). However, it is subject to the following limitations:
- the
append_mode
parameter must be set totrue
, otherwise the sequences will be overwritten with the new data;- the first field (or fields) of the table must be the primary key;
- the first sequence must be ordered;
- when inserting a range of elements, it must not overlap with existing elements;
- as a corollary, multiple ranges cannot be inserted with a single query if they overlap with existing elements;
- the elements may only be inserted from another (intermediate) table.
Note that consistency of the sequences must be ensured by the application; eXtremeSQL does not enforce any constraints (apart from ordering where explicitly defined), which means that it is up to the application to define the semantics of the data. For example, given three sequences
timestamp
,bid
, andask
, the application must see to it that all three of them are updated (appended to) when needed.The
seq_remove()
function may be used to remove elements from sequences. It accepts a single argument which must be a projection of elements to be removed from the sequence, and returns the number of elements removed.
Inserting and Removing from Sequences Example
Step 1: Create and populate table Quote
Create and populate table Quote with
day
,open
andclose
sequences, indexed by symbol name (sym
). As per the aforementioned requirements, the first field of the table (sym
) is the primary key, others are sequences; the first sequence (day
) is ordered (ascending). Notice that the sequence data contains a gap between20170505
and20170509
, which we will fill in, in Step 2. Also notice that the parameterappend_mode
is explicitly set totrue
.set append_mode true; create table Quotes(sym string primary key, day sequence(int asc), open sequence(int), close sequence(int)); insert into Quotes values('AAA', [20170501, 20170502, 20170503, 20170504, 20170505, 20170509, 20170510], [101, 102, 103, 104, 105, 109, 110], [111, 112, 113, 114, 115, 119, 120]); insert into Quotes values('BBB', [20170501, 20170502, 20170503, 20170504, 20170505, 20170509, 20170510], [201, 202, 203, 204, 205, 209, 210], [211, 212, 213, 214, 215, 219, 220]); insert into Quotes values('CCC', [20170501, 20170502, 20170503, 20170504, 20170505, 20170509, 20170510], [301, 302, 303, 304, 305, 309, 310], [311, 312, 313, 314, 315, 319, 320]);Step 2: Create temporary table QuoteHorizontal
Create a temporary table QuoteHorizontal containing the missing elements and populate table Quotes from QuoteHorizontal using the
insert or update
statement withorder by
clause (notice that theorder by day
clause is required to properly order the elements):create table QuotesHorizontal(sym string , day int, open int, close int); insert into QuotesHorizontal values('AAA', 20170506, 106, 116), ('BBB', 20170506, 206, 216), ('CCC', 20170506, 306, 316); insert into QuotesHorizontal values('AAA', 20170507, 107, 117), ('BBB', 20170507, 207, 217), ('CCC', 20170507, 307, 317); insert into QuotesHorizontal values('AAA', 20170508, 108, 118), ('BBB', 20170508, 208, 218), ('CCC', 20170508, 308, 318); insert or update into Quotes select * from QuotesHorizontal order by day;Step 3: Drop some data from the sequences
At this point there are no gaps in the sequence data. We will use the
seq_remove()
function to drop some data from the sequences, e.g. from May 1st to May 3rd. Note that thefor update
clause is required to modify the sequences (see the SQL DML Select page for further details regarding thefor update
clause):select !seq_search(day, 20170501, 20170503) as daterange, seq_remove(day@daterange), seq_remove(open@daterange), seq_remove(close@daterange) from Quotes for update;(Note that the operator "!" is a shortcut for
seq_ignore()
. The ignored column may be used in other calculated columns but will not appear in the actual statement output.)Parallel processing with sequences
Support for the parallel processing of sequences is implemented through a thread pool in the SQL engine that can optimize performance for sequence operations in multi-processor environments. The thread pool allows sequence functions to split sequences into equal parts across threads and perform calculations on the threads in parallel reducing the total execution time.
The size of the thread pool can be regulated by the environment variable
MCO_CPU_NUMBER
. If noMCO_CPU_NUMBER
environment variable is present, then the total number of CPU cores on the system is used as the pool size. If the hardware allows Hyper-Threading (HT) - and most modern CPUs do - the number of threads in the pool is equal to the number of HT cores (normally two per physical core). So if the machine has 4 real cores, the number of threads is 8. This can actually hurt performance sometimes.The xSQL configuration file the parameter
sql_use_connection_pool
can disable the use of the thread pool. Note that the default value of this parameter is true (meaning the thread pool is created on startup).sql_use_connection_pool:falseAt runtime the function
seq_parallelism()
can be used to limit the number of threads utilized to run queries. The function accepts a number of threads to use as a parameter and returns the total number of threads in the pool.seq_parallelism(<n_threads>)The value 1 for
n_threads
will effectively turn off the thread pool processing. For example, the following command sets the thread pool size to use 4 CPU cores:XSQL> SELECT seq_parallelism(4);
Note that function
seq_parallelism()
affects the entire server, not just a single client’s session. Soseq_parallelism()
should not be executed from different clients running concurrently. Also note that if the number of parallel threads requested by functionsseq_parallelism()
exceeds the size of the pool specified in the environment variableMCO_CPU_NUMBER
(the default is the number of available cores), the SQL engine will use the number specified inMCO_CPU_NUMBER
.A Note on thread pool management on Windows
Support for the parallel processing of sequences is implemented thorough the class ThreadPool, which instantiates a global variable
ThreadPool
. Thus the destructor for the ThreadPool class is called upon exit from themain()
automatically. The destructor obviously frees up all resources associated with the pool: memory, synchronization primitives, and tasks. This implementation presents a problem with the Windows C++ runtime. The Windows C++ runtime is implemented so that it stops all threads except for the main thread when the application exits from itsmain().
The C runtime then calls the destructors for all global objects, which in the case of theThreadPool
tasks is too late - those tasks were already stopped by the ThreadPool destructor.A partial solution has been implemented to simply not call the ThreadPool destructor (only when running on Windows). It is remotely possible to introduce a resource leak in a scenario in which the database is created in the context of the xSQL server (or a .NET Framework or JNI server application) that loads the database DLL (
mcoxsql.dll
ormcoxsql_debug.dll
) dynamically at runtime. If the database makes use of sequences and the server reloads the DLL multiple times, each time a resource leak is created.A more resilient solution for this problem is forthcoming in a future release, but at present Windows developers should be aware of this potential problem.
Null Values in Sequences
Null values can be inserted into or appended to sequences. For example:
XSQL>set append_mode true; XSQL>create table hor(i integer, f float); XSQL>insert into hor values(1, 100.0); XSQL>insert into hor values(1, NULL); XSQL>insert into hor values(1, 300.0); XSQL>insert into hor values(1, NULL); XSQL>create table ver(i integer primary key, f sequence(float)); XSQL>insert or update into ver select * from hor;Now the sequence field
f
in tablever
has the expected values:XSQL>select * from ver; i f ---------------------------------------------------------------- 1 {100, null, 300, null}Using the
FLATTENED
keyword, the sequence can be displayed as follows:XSQL>select flattened * from ver; i f ---------------------------------------------------------------- 1 100 1 null 1 300 1 null
Note that it is not allowed to insert null values into ordered sequences, which is expected behavior. An attempt to do so will cause the SQL compiler to emit an error. For example:
XSQL>insert into ver values(1, [1,2,3,null,5]); ERROR: Sequence item is out of orderNull Values with Sequence Functions
Where it makes sense, sequence functions support null values. For example, using the table
ver
defined above with the same inserted values:XSQL>select seq_last(f) from ver where i=1; #1 ---------------------------------------------------------------- null XSQL>select seq_first(f) from ver where i=1; #1 ---------------------------------------------------------------- 100The following SQL functions (and their xSQL counterparts) do not support sequences with null values, and hence will return
MCO_E_SEQ_NULL_VALUE
if called on a sequence with nulls:
SQL xSQL mco_seq_window_agg*() seq_window_agg*() mco_seq_group_agg*()
seq_group_agg*()
mco_seq_diff()
seq_diff()
mco_seq_map() seq_map() mco_seq_union() seq_union() mco_seq_cross() seq_cross() mco_seq_extrema() seq_extrema() mco_seq_stretch*() seq_stretch*() mco_seq_asof_join() seq_asof_join() Inserting Null Values with Array Syntax
There is a use case which is potentially confusing when inserting null values into sequences: when a single literal null value is inserted using the
INSERT OR UPDATE
clause, it sets the sequence field's value to null, instead of appending the null to the sequence. For example:XSQL>set append_mode true; XSQL>create table ver(i integer primary key, f sequence(float)); XSQL>insert or update into ver values(1, [1,2,3]); XSQL>insert or update into ver values(1, null); XSQL>select * from ver; i f ---------------------------------------------------------------- 1 nullIf the null value needs to be appended to the sequence, the array syntax must be used. For example:
XSQL>set append_mode true; XSQL>create table ver(i integer primary key, f sequence(float)); XSQL>insert or update into ver values(1, [1,2,3]); XSQL>insert or update into ver values(1, [null]); XSQL>select * from ver; i f ---------------------------------------------------------------- 1 {1, 2, 3, null}If the initial sequence is empty, the latter syntax will yield a sequence with single null element. For example:
XSQL>set append_mode true; XSQL>create table ver(i integer primary key, f sequence(float)); XSQL>insert or update into ver values(1, [null]); XSQL>select * from ver; i f ---------------------------------------------------------------- 1 {null}Please note how this is different from the very first case. The first case sets the sequence field's value to null, while using the array syntax results in a sequence with one null element.