Hash Aggregate Average
Compute the average
closevalue for groups of trades havingvolumevalues grouped by millions of trades per day (for the first two weeks of March 2013):SELECT symbol, seq_search(day, 20130304, 20130315) as Mar_2013, volume@Mar_2013 as "volume", close@Mar_2013 as "close", seq_hash_agg_avg(close@Mar_2013, volume@Mar_2013/1000000) as avg_vol, seq_hash_group_by(avg_vol) as "group_by" FROM Quote WHERE symbol='IBM'; symbol Mar_2013{} volume{} close{} avg_vol{} group_by{} ------------------------------------------------------------------------------ IBM {20130304, 20130305, 20130306, 20130307, 20130308, 20130311, 20130312, 20130313, 20130314, 20130315} {3693365, 3807706, 3594899, 3884317, 3700786, 3049701, 3592096, 3356946, 5505584, 7936544} {205.190002, 206.529999, 208.380005, 209.419998, 210.380005, 210.080002, 210.550003, 212.059998, 215.800003, 214.919998} {214.919998, 209.073751, 215.800003} {7, 3, 5}Note that the "hash_aggregate" functions calculate aggregate values for each group determined by an arbitrary sequence (not necessarily sorted, such as
closeabove) using a hash table. So to interpret the result sets from these functions a “group_by” sequence is required. The functionseq_hash_group_byreturns the sequence of values used for grouping in order to make the correspondence between the values in the result sequence(s) and the groups.In the example above, the
closeprices are divided into groups associated withvolumevalues where thevolumeis in ranges of(0-999,999),(1,000,000-1,999,999),(2,000,000-2,999,999), etc. To illustrate we analyze the output as follows:
- We have highlighted values in the output corresponding to the second element of the
group_bysequence with value 3 (the last sequence in the output).- The corresponding value in the result sequence
avg_volis (209.073751).- To see how this is computed, we have highlighted the elements in the
volumesequence having tradingvolumebetween 3M and 4M (volume/1000000 = 3). These values have position indexes of(0,1,2,3,4,5,6).- The
closevalues for elements 0 through 6 are averaged as follows:
(205.190002 + 206.529999 + 208.380005 + 209.419998 + 210.380005 + 210.080002 + 210.550003 + 212.059998)/ 8 =209.073751Note also that the single occurrence of
volumein the 5M range corresponds to the singleavg_volvalue215.800003, and the single occurrence ofvolumein the 7M range corresponds to the singleavg_volvalue214.919998.To run this example
A sample script to demonstrate this
selectstatement using xSQL can be run from thesamples/xsql/scripts/financialdirectory with the following command:x 11