The Java Aggregator Class

The Aggregator class performs grouping and aggregation through two interfaces, Aggregate and GroupBy. Predefined implementations are provided for the standard aggregates (max, min, sum, avg, count, etc.), and user-defined aggregates can also be defined. The GroupBy interface is implemented by the developer (anonymous classes are convenient for this) and defines the aggregation for a query, i.e. how the input data is split into groups and which value of each object is aggregated.

For an overview, see the Java Classes page.

Aggregator uses a map to associate aggregate states with groups, and this map is returned as the result of the aggregation. The map can be ordered or unordered (i.e. a TreeMap or a HashMap); an ordered map returns the results in ascending order of the group-by values. For example, consider the table:

 
    class Quote 
    {
        @Indexable
        public long  date;
        public float open;
        public float close;
        public float low;
        public float high;
        public int   volume;
    };
     

Now to execute a query like "select standard deviation of difference between low and high prices for IBM for each month since 1990", we would implement code like the following:

 
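    import java.util.Calendar;
    import java.util.Date;
    import java.util.GregorianCalendar;
    import java.util.Map;

    Cursor<Quote> cursor = new Cursor<Quote>(con, Quote.class, "date");
    // GregorianCalendar takes the full year; the deprecated Date(int, int, int)
    // constructor would treat 1990 as an offset from 1900.
    long since = new GregorianCalendar(1990, Calendar.JANUARY, 1).getTimeInMillis();
    if (cursor.search(Operation.GreaterOrEquals, since)) 
    {
        Map<Object,Aggregator.Aggregate> result = Aggregator.<Quote>aggregate(cursor,
            new Aggregator.GroupBy<Quote>() 
        {
            // Aggregate computed for each group: standard deviation
            public Aggregator.Aggregate getAggregate() 
            { 
                return new Aggregator.DevAggregate(); 
            }
            // Group key: the calendar month of the quote (0 = January), so quotes
            // from the same month of different years fall into the same group
            public Object getKey(Quote quote) 
            { 
                return (new Date(quote.date)).getMonth(); 
            }
            // Aggregated value: the difference between the high and low prices
            public Object getValue(Quote quote) 
            { 
                return quote.high - quote.low; 
            }
            // Include every quote in the aggregation
            public Aggregator.FilterResult filter(Quote quote) 
            { 
                return Aggregator.FilterResult.Use; 
            }
        }, true); // true: use an ordered map (TreeMap) for the groups
         
        for (Map.Entry<Object,Aggregator.Aggregate> pair : result.entrySet()) 
        {
            System.out.println("Group " + pair.getKey() + "->" + pair.getValue().result());
        }
    }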
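
To make the mechanics of this example concrete, the following is a minimal, self-contained sketch of the same group-by pattern written against plain java.util collections. It does not use the library: GroupBySketch, StdDev, and this aggregate method are hypothetical names introduced for illustration, the input rows are made up, and the sketch assumes a population standard deviation (computed with Welford's method), which may differ from what DevAggregate calculates.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Illustrative only: these types are hypothetical and are NOT the library's Aggregator API.
final class GroupBySketch {

    /** Running state for a population standard deviation (Welford's method). */
    static final class StdDev {
        private long n;
        private double mean;
        private double m2; // sum of squared deviations from the running mean

        void accumulate(double x) {
            n++;
            double delta = x - mean;
            mean += delta / n;
            m2 += delta * (x - mean);
        }

        double result() {
            return n == 0 ? 0.0 : Math.sqrt(m2 / n);
        }
    }

    /** Splits the items into groups by key and feeds each value into its group's state. */
    static <T> Map<Object, StdDev> aggregate(Iterable<T> items,
                                             Function<T, Object> key,
                                             Function<T, Double> value,
                                             boolean orderByKey) {
        Map<Object, StdDev> groups = orderByKey ? new TreeMap<>() : new HashMap<>();
        for (T item : items) {
            groups.computeIfAbsent(key.apply(item), k -> new StdDev())
                  .accumulate(value.apply(item));
        }
        return groups;
    }

    public static void main(String[] args) {
        // Each row is {month, high - low}; the numbers are invented for the example.
        List<double[]> rows = Arrays.asList(
            new double[] {0, 2.0}, new double[] {0, 4.0},
            new double[] {1, 5.0}, new double[] {1, 5.0});
        Map<Object, StdDev> result = aggregate(rows, r -> (int) r[0], r -> r[1], true);
        for (Map.Entry<Object, StdDev> e : result.entrySet()) {
            System.out.println("Group " + e.getKey() + "->" + e.getValue().result());
        }
    }
}
```

With the invented rows above, group 0 holds the values 2.0 and 4.0 and group 1 holds 5.0 twice, so the sketch prints a deviation of 1.0 for group 0 and 0.0 for group 1.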
     

 

Class Definition

 
    public class Aggregator
    {
        ...
        public enum FilterResult 
        {
            Use,
            Skip,
            Stop
        };
         
        public interface Aggregate<T> {…}
         
        public interface GroupBy<T> {…}
 
        public static <T> Map<Object,Aggregate> ...  {…}
 
        public static void merge(Map<Object,Aggregate> dst, Map<Object,Aggregate> src) {…}
 
        public static class TopAggregate implements Aggregate<Comparable> {…}
         
        public static class MaxAggregate implements Aggregate<Comparable> {…}
         
        public static class MinAggregate implements Aggregate<Comparable> {…}
         
        public static class RealSumAggregate implements Aggregate<Number> {…}
         
        public static class IntegerSumAggregate implements Aggregate<Number> {…}
         
        public static class AvgAggregate implements Aggregate<Number> {…}
         
        public static class PrdAggregate implements Aggregate<Number> {…}
         
        public static class VarAggregate implements Aggregate<Number> {…}
         
        public static class DevAggregate extends VarAggregate {…}
         
        public static class CountAggregate implements Aggregate {…}
         
        public static class DistinctCountAggregate implements Aggregate {…}
         
        public static class RepeatCountAggregate implements Aggregate {…}
         
        public static class ApproxDistinctCountAggregate implements Aggregate {…}
         
        public static class FirstAggregate implements Aggregate {…}
         
        public static class LastAggregate implements Aggregate {…}
         
        public static class CompoundAggregate implements Aggregate {…}
         
    };
     

Methods

enum FilterResult

Enumerated constants used to control filtering of query results:

   
  public enum FilterResult 
  {
    Use,
    Skip,
    Stop
  };
   
Aggregate<T> Implemented by all standard aggregates and can be used to define custom aggregates
GroupBy<T> Used to specify the aggregation operation

<T> Map<Object,Aggregate> aggregate(Iterable<T> iterable, GroupBy<T> groupBy)

Performs the aggregation.

Parameters:

iterable A collection of the aggregated objects
groupBy The aggregation operation

Returns: a map with the results of aggregation: <group-by,aggregate-value> pairs

<T> Map<Object,Aggregate> aggregate(Iterable<T> iterable, GroupBy<T> groupBy, boolean orderByKey)

Performs the aggregation.

Parameters:

iterable A collection of the aggregated objects
groupBy The aggregation operation
orderByKey Specifies whether an ordered map (TreeMap) should be used for grouping; in this case the group-by key must provide a comparison operation

Returns: a map with the results of aggregation: <group-by,aggregate-value> pairs
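
The effect of orderByKey can be illustrated with the two java.util map types named above. The sketch below is plain Java and does not call the library; OrderByKeySketch, Ticker, countByKey, and usableAsOrderedKey are hypothetical names. It shows that a TreeMap iterates its keys in ascending order and rejects a key type that has no comparison operation; whether the library reports an unsuitable key in the same way is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: plain java.util code, not the library's Aggregator.
final class OrderByKeySketch {

    /** Hypothetical key type that does not implement Comparable. */
    static final class Ticker {
        final String symbol;
        Ticker(String symbol) { this.symbol = symbol; }
    }

    /** Counts occurrences per key in an ordered (TreeMap) or unordered (HashMap) map. */
    static Map<String, Integer> countByKey(String[] keys, boolean orderByKey) {
        Map<String, Integer> counts = orderByKey ? new TreeMap<>() : new HashMap<>();
        for (String key : keys) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    /** Returns true if a TreeMap with natural ordering accepts the key. */
    static boolean usableAsOrderedKey(Object key) {
        try {
            new TreeMap<Object, Integer>().put(key, 0);
            return true;
        } catch (ClassCastException e) {
            return false; // the key type has no comparison operation
        }
    }

    public static void main(String[] args) {
        String[] symbols = {"MSFT", "IBM", "AAPL", "IBM"};
        System.out.println(countByKey(symbols, true));   // keys in ascending order
        System.out.println(countByKey(symbols, false));  // same entries, no guaranteed order
        System.out.println(usableAsOrderedKey("IBM"));              // String is Comparable
        System.out.println(usableAsOrderedKey(new Ticker("IBM")));  // Ticker is not
    }
}
```

The unordered variant avoids the comparison requirement altogether, which is why it is the natural choice when the order of the groups does not matter.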

void merge(Map<Object,Aggregate> dst, Map<Object,Aggregate> src)

Merges two aggregation results. This method combines the aggregate states in dst with the aggregate states in src, i.e. dst = merge(dst, src). Because the method returns void, the merged result is left in dst.
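
The reason merging works on aggregate states rather than on final values can be sketched with an average: two averages cannot be combined directly, but their sums and counts can. The code below is a plain-Java illustration, not the library's implementation; MergeSketch and Avg are hypothetical names, and leaving src unmodified is a choice made for this sketch only.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: hypothetical types, not the library's Aggregator.merge.
final class MergeSketch {

    /** Mergeable state for an average: the sum and the count are kept, not the mean. */
    static final class Avg {
        double sum;
        long count;

        Avg(double sum, long count) {
            this.sum = sum;
            this.count = count;
        }

        void merge(Avg other) {
            sum += other.sum;
            count += other.count;
        }

        double result() {
            return count == 0 ? 0.0 : sum / count;
        }
    }

    /** Folds every group of src into dst; dst is modified, src is left unchanged. */
    static void merge(Map<Object, Avg> dst, Map<Object, Avg> src) {
        for (Map.Entry<Object, Avg> e : src.entrySet()) {
            Avg existing = dst.get(e.getKey());
            if (existing == null) {
                // Group seen only in src: copy its state into dst.
                dst.put(e.getKey(), new Avg(e.getValue().sum, e.getValue().count));
            } else {
                // Group present in both maps: combine the two states.
                existing.merge(e.getValue());
            }
        }
    }

    public static void main(String[] args) {
        Map<Object, Avg> first = new TreeMap<>();
        first.put("IBM", new Avg(10.0, 2));
        Map<Object, Avg> second = new TreeMap<>();
        second.put("IBM", new Avg(20.0, 1));
        second.put("MSFT", new Avg(8.0, 4));

        merge(first, second);
        for (Map.Entry<Object, Avg> e : first.entrySet()) {
            System.out.println(e.getKey() + "->" + e.getValue().result());
        }
    }
}
```

After the merge, the IBM group in the sketch holds a sum of 30.0 over 3 values (average 10.0) and the MSFT group is carried over from the second map (average 2.0). A typical use for this pattern is combining partial results that were computed separately over different parts of the data.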

Embedded Classes

TopAggregate Aggregate returning the top N values
MaxAggregate The maximum aggregate
MinAggregate The minimum aggregate
RealSumAggregate The sum aggregate for real values
IntegerSumAggregate The sum aggregate for integer values
AvgAggregate The average aggregate
PrdAggregate The product aggregate
VarAggregate The variance aggregate
DevAggregate The standard deviation aggregate
CountAggregate The "count all" aggregate
DistinctCountAggregate The "distinct count" aggregate (note that this aggregate can have a large memory footprint)
RepeatCountAggregate Counts the number of items repeated N or more times
ApproxDistinctCountAggregate The approximate "distinct count" aggregate (note that this is not a precise result)
FirstAggregate The first group element aggregate
LastAggregate The last group element aggregate
CompoundAggregate The compound aggregate: the combination of several aggregates (note that this can calculate more than one aggregate on one traversal)
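
The idea behind a compound aggregate, several results from a single traversal, has a close analogue in the JDK. The sketch below uses java.util.DoubleSummaryStatistics, which accumulates count, sum, minimum, and maximum in one pass; it is offered only as an illustration of the concept and says nothing about how CompoundAggregate is constructed or used. CompoundSketch and summarize are hypothetical names.

```java
import java.util.DoubleSummaryStatistics;

// Illustrative only: a JDK analogue of computing several aggregates in one pass.
final class CompoundSketch {

    /** Traverses the values once and collects count, sum, min, and max together. */
    static DoubleSummaryStatistics summarize(double[] values) {
        DoubleSummaryStatistics stats = new DoubleSummaryStatistics();
        for (double v : values) {
            stats.accept(v);
        }
        return stats;
    }

    public static void main(String[] args) {
        double[] spreads = {2.0, 4.0, 6.0}; // invented high - low differences
        DoubleSummaryStatistics stats = summarize(spreads);
        System.out.println("count=" + stats.getCount());
        System.out.println("min=" + stats.getMin());
        System.out.println("max=" + stats.getMax());
        System.out.println("avg=" + stats.getAverage());
    }
}
```

Computing the aggregates together saves a traversal per additional aggregate, which matters most when the input is a database cursor rather than an in-memory array.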