The Java Aggregator Class

The Aggregator class performs aggregation. Grouping and aggregation are defined through two interfaces, Aggregate and GroupBy. Predefined implementations are provided for the standard aggregates (max, min, sum, avg, count, etc.), and user-defined aggregates can also be written. The GroupBy interface is implemented by the developer (anonymous classes are convenient for this) and defines the aggregation for a query, i.e. how the input data is split into groups and which values are aggregated within each group.

For an overview, see the Java Classes page.

Aggregator uses a map to associate aggregate states with groups. This map is returned as the result of the aggregation. Aggregator can use an ordered or an unordered map (i.e. a TreeMap or a HashMap). With an ordered map, the results are returned in ascending order of the group-by values. For example, consider the table:

 
    class Quote
    {
        @Indexable
        public long  date;
        public float open;
        public float close;
        public float low;
        public float high;
        public int   volume;
    };

Now, to execute a query like "select the standard deviation of the difference between the low and high prices for IBM for each month since 1990", we would implement code like the following:

 
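    // Uses java.util.Calendar, java.util.Date, java.util.GregorianCalendar and java.util.Map
    Cursor<Quote> cursor = new Cursor<Quote>(con, Quote.class, "date");
    // Start of 1990 in milliseconds; the deprecated Date(int, int, int) constructor
    // counts years from 1900, so new Date(1990, 0, 1) would denote the year 3890
    long since = new GregorianCalendar(1990, Calendar.JANUARY, 1).getTimeInMillis();
    if (cursor.search(Operation.GreaterOrEquals, since))
    {
        Map<Object,Aggregator.Aggregate> result = Aggregator.<Quote>aggregate(cursor,
            new Aggregator.GroupBy<Quote>()
        {
            // One standard deviation state per group
            public Aggregator.Aggregate getAggregate()
            {
                return new Aggregator.DevAggregate();
            }
            // Group by calendar month (0 = January); Date.getMonth() is deprecated,
            // and the same month of different years falls into one group
            public Object getKey(Quote quote)
            {
                return (new Date(quote.date)).getMonth();
            }
            // The aggregated value: the high-low spread of the quote
            public Object getValue(Quote quote)
            {
                return quote.high - quote.low;
            }
            // Include every record
            public Aggregator.FilterResult filter(Quote quote)
            {
                return Aggregator.FilterResult.Use;
            }
        }, true); // true: return the groups in an ordered map

        for (Map.Entry<Object,Aggregator.Aggregate> pair : result.entrySet())
        {
            System.out.println("Group " + pair.getKey() + "->" + pair.getValue().result());
        }
    }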
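To make the mechanism described above more concrete (a map that associates one aggregate state with each group), the following is a minimal, self-contained sketch in plain Java. It does not use the library: `MiniAggregate`, `MiniGroupBy`, `MiniAggregator` and the method name `accumulate` are hypothetical stand-ins invented for this illustration, and the real Aggregate and GroupBy interfaces may differ.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-ins, NOT the library's Aggregate and GroupBy interfaces.
interface MiniAggregate {
    void accumulate(Object value); // assumed name for "feed one value into the state"
    Object result();
}

interface MiniGroupBy<T> {
    MiniAggregate getAggregate(); // creates the state for a new group
    Object getKey(T item);        // decides which group the item belongs to
    Object getValue(T item);      // extracts the value that is aggregated
}

class MiniAggregator {
    static <T> Map<Object, MiniAggregate> aggregate(Iterable<T> items,
                                                    MiniGroupBy<T> groupBy,
                                                    boolean orderByKey) {
        // One aggregate state per group key
        Map<Object, MiniAggregate> groups = orderByKey
            ? new TreeMap<Object, MiniAggregate>()
            : new HashMap<Object, MiniAggregate>();
        for (T item : items) {
            Object key = groupBy.getKey(item);
            MiniAggregate state = groups.get(key);
            if (state == null) {          // first item of this group
                state = groupBy.getAggregate();
                groups.put(key, state);
            }
            state.accumulate(groupBy.getValue(item));
        }
        return groups;
    }

    // Demo: sum of word lengths, grouped by the first letter of the word
    static Map<Object, MiniAggregate> sumLengthsByInitial(List<String> words) {
        return aggregate(words, new MiniGroupBy<String>() {
            public MiniAggregate getAggregate() {
                return new MiniAggregate() {
                    private long sum;
                    public void accumulate(Object value) { sum += ((Number) value).longValue(); }
                    public Object result() { return sum; }
                };
            }
            public Object getKey(String word) { return word.substring(0, 1); }
            public Object getValue(String word) { return word.length(); }
        }, true);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "avocado", "banana");
        for (Map.Entry<Object, MiniAggregate> e : sumLengthsByInitial(words).entrySet()) {
            System.out.println("Group " + e.getKey() + " -> " + e.getValue().result());
        }
    }
}
```

With the three sample words, group "a" sums to 12 and group "b" to 6. The anonymous-class style mirrors the way GroupBy is implemented in the example above.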

Class Definition

 
    public class Aggregator
    {
        ...
        public enum FilterResult 
        {
            Use,
            Skip,
            Stop
        };
         
        public interface Aggregate<T> {…}
         
        public interface GroupBy<T> {…}
 
        public static <T> Map<Object,Aggregate> ...  {…}
 
        public static void merge(Map<Object,Aggregate> dst, Map<Object,Aggregate> src) {…}
 
        public static class TopAggregate implements Aggregate<Comparable> {…}
         
        public static class MaxAggregate implements Aggregate<Comparable> {…}
         
        public static class MinAggregate implements Aggregate<Comparable> {…}
         
        public static class RealSumAggregate implements Aggregate<Number> {…}
         
        public static class IntegerSumAggregate implements Aggregate<Number> {…}
         
        public static class AvgAggregate implements Aggregate<Number> {…}
         
        public static class PrdAggregate implements Aggregate<Number> {…}
         
        public static class VarAggregate implements Aggregate<Number> {…}
         
        public static class DevAggregate extends VarAggregate {…}
         
        public static class CountAggregate implements Aggregate {…}
         
        public static class DistinctCountAggregate implements Aggregate {…}
         
        public static class RepeatCountAggregate implements Aggregate {…}
         
        public static class ApproxDistinctCountAggregate implements Aggregate {…}
         
        public static class FirstAggregate implements Aggregate {…}
         
        public static class LastAggregate implements Aggregate {…}
         
        public static class CompoundAggregate implements Aggregate {…}
         
    };

Methods

enum FilterResult

Enumerated constants used to control filtering of query results:

   
  public enum FilterResult 
  {
    Use,
    Skip,
    Stop
  };
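The page does not spell out what each constant does. The sketch below illustrates one plausible reading, stated here as an assumption: `Use` includes the record in the aggregation, `Skip` ignores it and continues, and `Stop` ends the traversal. The class `FilterSketch` and its methods are hypothetical, and the enum is redeclared locally so that the example compiles without the library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class FilterSketch {
    // Local copy of the three constants, so the sketch is self-contained
    enum FilterResult { Use, Skip, Stop }

    // Sums the values that the filter accepts (assumed semantics, see the text above)
    static long sumFiltered(List<Integer> values, Function<Integer, FilterResult> filter) {
        long sum = 0;
        for (Integer v : values) {
            FilterResult decision = filter.apply(v);
            if (decision == FilterResult.Stop) {
                break;      // assumed: no further records are visited
            }
            if (decision == FilterResult.Skip) {
                continue;   // assumed: this record does not contribute
            }
            sum += v;       // Use: the record is aggregated
        }
        return sum;
    }

    // Skips negative values and stops at the first value above 100
    static long demo() {
        return sumFiltered(Arrays.asList(5, -1, 7, 200, 9),
            v -> v > 100 ? FilterResult.Stop
               : v < 0   ? FilterResult.Skip
               :           FilterResult.Use);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Under these assumed semantics the demo adds 5 and 7, skips -1, and stops at 200 before reaching 9.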

`Aggregate<T>`	Implemented by all standard aggregates; can also be used to define custom aggregates

`GroupBy<T>`	Used to specify the aggregation operation

<T> Map<Object,Aggregate> aggregate(Iterable<T> iterable, GroupBy<T> groupBy)

Performs the aggregation. Parameters:

`iterable`	The collection of objects to aggregate
`groupBy`	The aggregation operation

Returns: a map with the results of the aggregation: <group-by,aggregate-value> pairs

<T> Map<Object,Aggregate> aggregate(Iterable<T> iterable, GroupBy<T> groupBy, boolean orderByKey)

Performs the aggregation. Parameters:

`iterable`	The collection of objects to aggregate
`groupBy`	The aggregation operation
`orderByKey`	Specifies whether an ordered map (`TreeMap`) should be used for grouping (in this case the group-by key must provide a comparison operation)

Returns: a map with the results of the aggregation: <group-by,aggregate-value> pairs
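The effect of the ordered-map option can be seen with the standard collections alone. The sketch below uses only `java.util`; `OrderSketch` and `countByKey` are hypothetical names, and the counting logic merely stands in for an aggregate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class OrderSketch {
    // Counts occurrences per key, in either an ordered or an unordered map
    static Map<Object, Integer> countByKey(List<Integer> keys, boolean orderByKey) {
        Map<Object, Integer> groups = orderByKey
            ? new TreeMap<Object, Integer>()   // iterates in ascending key order
            : new HashMap<Object, Integer>();  // iteration order is unspecified
        for (Integer key : keys) {
            groups.merge(key, 1, Integer::sum);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(11, 3, 7, 3);
        // Ordered: keys come back as 3, 7, 11
        System.out.println(new ArrayList<Object>(countByKey(keys, true).keySet()));
        // Unordered: same groups, but no guarantee about their order
        System.out.println(countByKey(keys, false));
    }
}
```

A `TreeMap` created without a comparator requires keys that implement `Comparable` and are mutually comparable; otherwise it throws a `ClassCastException`, which is why the group-by key needs a comparison operation when ordering is requested.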

void merge(Map<Object,Aggregate> dst, Map<Object,Aggregate> src)

Merges two aggregation results. This method combines the states of the aggregates in dst with the aggregate states in src and leaves the combined result in dst; conceptually, dst = merge(dst, src)
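The idea of merging aggregate states can be sketched in plain Java. The example below is not the library's implementation: `MergeSketch`, `AvgState` and `mergeFrom` are hypothetical names, and an average is used because its state (count and sum) combines exactly. One plausible use, assumed here, is combining partial results computed over separate parts of the data.

```java
import java.util.HashMap;
import java.util.Map;

class MergeSketch {
    // Hypothetical partial state of an "average" aggregate
    static final class AvgState {
        long count;
        double sum;
        void add(double v) { count++; sum += v; }
        void mergeFrom(AvgState other) { count += other.count; sum += other.sum; }
        double result() { return sum / count; }
    }

    // Combines src into dst, group by group; dst holds the merged result afterwards
    static void merge(Map<Object, AvgState> dst, Map<Object, AvgState> src) {
        for (Map.Entry<Object, AvgState> e : src.entrySet()) {
            AvgState existing = dst.get(e.getKey());
            if (existing == null) {
                dst.put(e.getKey(), e.getValue()); // group only present in src (state object is shared)
            } else {
                existing.mergeFrom(e.getValue());  // group present in both maps
            }
        }
    }

    static void add(Map<Object, AvgState> groups, Object key, double value) {
        groups.computeIfAbsent(key, k -> new AvgState()).add(value);
    }

    static Map<Object, AvgState> demo() {
        Map<Object, AvgState> a = new HashMap<>();
        Map<Object, AvgState> b = new HashMap<>();
        add(a, "x", 1.0);
        add(a, "x", 3.0);
        add(b, "x", 8.0);
        add(b, "y", 5.0);
        merge(a, b); // a now covers the data of both partial results
        return a;
    }

    public static void main(String[] args) {
        Map<Object, AvgState> merged = demo();
        System.out.println("x -> " + merged.get("x").result());
        System.out.println("y -> " + merged.get("y").result());
    }
}
```

After the merge, group "x" averages the values 1, 3 and 8 (result 4.0), and group "y" is carried over from the second map unchanged.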

Embedded Classes

`TopAggregate`	Returns the top N values
`MaxAggregate`	The maximum aggregate
`MinAggregate`	The minimum aggregate
`RealSumAggregate`	The sum aggregate for real values
`IntegerSumAggregate`	The sum aggregate for integer values
`AvgAggregate`	The average aggregate
`PrdAggregate`	The product aggregate
`VarAggregate`	The variance aggregate
`DevAggregate`	The standard deviation aggregate
`CountAggregate`	The "count all" aggregate
`DistinctCountAggregate`	The "distinct count" aggregate (note that this aggregate can have a large memory footprint)
`RepeatCountAggregate`	Counts the number of items repeated N or more times
`ApproxDistinctCountAggregate`	The approximate "distinct count" aggregate (note that the result is not precise)
`FirstAggregate`	The first group element aggregate
`LastAggregate`	The last group element aggregate
`CompoundAggregate`	The compound aggregate: a combination of several aggregates (note that this can calculate more than one aggregate in a single traversal)
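The benefit of a compound aggregate, several results from a single traversal, can be illustrated with a small stand-alone sketch. Everything here is hypothetical (`CompoundSketch`, `Agg`, `Min`, `Max`, `Dev`, `Compound`) and unrelated to the library's actual classes. The deviation uses Welford's one-pass update and the population formula (division by n); whether `VarAggregate` and `DevAggregate` use the population or the sample formula is not stated on this page.

```java
import java.util.Arrays;
import java.util.List;

class CompoundSketch {
    // Hypothetical stand-in for an aggregate state; not the library's Aggregate interface
    interface Agg {
        void add(double v);
        double result();
    }

    static final class Min implements Agg {
        private double min = Double.POSITIVE_INFINITY;
        public void add(double v) { min = Math.min(min, v); }
        public double result() { return min; }
    }

    static final class Max implements Agg {
        private double max = Double.NEGATIVE_INFINITY;
        public void add(double v) { max = Math.max(max, v); }
        public double result() { return max; }
    }

    // Population standard deviation, updated one value at a time (Welford's method)
    static final class Dev implements Agg {
        private long n;
        private double mean;
        private double m2; // running sum of squared deviations from the mean
        public void add(double v) {
            n++;
            double delta = v - mean;
            mean += delta / n;
            m2 += delta * (v - mean);
        }
        public double result() { return n == 0 ? Double.NaN : Math.sqrt(m2 / n); }
    }

    // Feeds every value to all parts, so one pass yields all results
    static final class Compound {
        private final List<Agg> parts;
        Compound(Agg... parts) { this.parts = Arrays.asList(parts); }
        void add(double v) {
            for (Agg part : parts) {
                part.add(v);
            }
        }
        double[] results() {
            double[] out = new double[parts.size()];
            for (int i = 0; i < out.length; i++) {
                out[i] = parts.get(i).result();
            }
            return out;
        }
    }

    static double[] demo() {
        Compound compound = new Compound(new Min(), new Max(), new Dev());
        for (double v : new double[] {2, 4, 4, 4, 5, 5, 7, 9}) {
            compound.add(v);
        }
        return compound.results(); // min, max, standard deviation
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```

For the sample data the minimum is 2, the maximum is 9, and the population standard deviation is 2 (up to floating-point rounding), all obtained in one pass over the values.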