Hash Aggregate Average
Compute the average
close
value for groups of trades havingvolume
values grouped by millions of trades per day (for the first two weeks of March 2013):SELECT symbol, seq_search(day, 20130304, 20130315) as Mar_2013, volume@Mar_2013 as "volume", close@Mar_2013 as "close", seq_hash_agg_avg(close@Mar_2013, volume@Mar_2013/1000000) as avg_vol, seq_hash_group_by(avg_vol) as "group_by" FROM Quote WHERE symbol='IBM'; symbol Mar_2013{} volume{} close{} avg_vol{} group_by{} ------------------------------------------------------------------------------ IBM {20130304, 20130305, 20130306, 20130307, 20130308, 20130311, 20130312, 20130313, 20130314, 20130315} {3693365, 3807706, 3594899, 3884317, 3700786, 3049701, 3592096, 3356946, 5505584, 7936544} {205.190002, 206.529999, 208.380005, 209.419998, 210.380005, 210.080002, 210.550003, 212.059998, 215.800003, 214.919998} {214.919998, 209.073751, 215.800003} {7, 3, 5}Note that the "hash_aggregate" functions calculate aggregate values for each group determined by an arbitrary sequence (not necessarily sorted, such as
close
above) using a hash table. So to interpret the result sets from these functions a “group_by” sequence is required. The functionseq_hash_group_by
returns the sequence of values used for grouping in order to make the correspondence between the values in the result sequence(s) and the groups.In the example above, the
close
prices are divided into groups associated withvolume
values where thevolume
is in ranges of(0-999,999)
,(1,000,000-1,999,999)
,(2,000,000-2,999,999)
, etc. To illustrate we analyze the output as follows:
- We have highlighted values in the output corresponding to the second element of the
group_by
sequence with value 3 (the last sequence in the output).- The corresponding value in the result sequence
avg_vol
is (209.073751).- To see how this is computed, we have highlighted the elements in the
volume
sequence having tradingvolume
between 3M and 4M (volume/1000000 = 3
). These values have position indexes of(0,1,2,3,4,5,6)
.- The
close
values for elements 0 through 6 are averaged as follows:
(205.190002 + 206.529999 + 208.380005 + 209.419998 + 210.380005 + 210.080002 + 210.550003 + 212.059998)
/ 8 =
209.073751
Note also that the single occurrence of
volume
in the 5M range corresponds to the singleavg_vol
value215.800003
, and the single occurrence ofvolume
in the 7M range corresponds to the singleavg_vol
value214.919998
.To run this example
A sample script to demonstrate this
select
statement using xSQL can be run from thesamples/xsql/scripts/financial
directory with the following command:x 11