KD-Tree Indexes

As explained in the Indexes page, eXtremeDB supports a k-dimensional tree index, called kdtree which are ideal for multi-dimensional key value searches.

A kdtree is a data structure for organizing points in a k-dimensional space. The kdtree is a binary tree in which every node is a k-dimensional point. Every non-leaf node generates a splitting hyperplane that divides the space into two subspaces. Points left of the hyperplane represent the left sub-tree of that node and the points right of the hyperplane represent the right sub-tree. The hyperplane direction is chosen in the following way: every node split to sub-trees is associated with one of the k-dimensions, such that the hyperplane is perpendicular to that dimension vector.

The kdtree uses a Query-By-Example approach to locate objects that match a given search condition. The application creates pattern object(s) and assigns values to the fields that are included in the search criteria. The kdtree supports simple exact matches as well as range lookups. In the latter case, two pattern objects will be specified: one for the lower and one for the upper bounds of the search condition. If a field value is defined only for one boundary, it is considered an open interval that corresponds to a greater-than-or-equal-to or less-than-or-equal-to search condition.

Example KD-Tree Searches

To demonstrate kdtree index usage, consider the following sample schema:

 
    class Car
    {
        string vendor;
        string model;
        string color;
        uint4 year;
        uint4 mileage;
        boolean automatic;
        boolean ac;
        uint4 price;
        char<3> state;
        string description;
         
        kdtree <year, mileage, color, model, vendor, automatic, ac, price> index;
    };
     

Suppose we populate the database with the following Car objects:

 
    ("Ford", "Mustang", "grey", 2005, 60000, true, true, 20000, "MT");
    ("Ford", "Explorer", "black", 2000, 80000, true, true, 15000, "MA");
    ("Toyota", "Corolla", "green", 2007, 100000, true, true, 12000, "CA");
     

Exact Match Search

To perform a kdtree search, as with other index types, a cursor must first be created. Then, to search the database for all “Ford Mustangs”, we would first create a Car object with vendor="Ford" and model="Mustang". This Car "pattern object" is then passed to the cursor search operation (using the API of choice) as both the lower and upper bound. Once the search returns, the application can traverse the result set using the standard cursor operations first, last, next and prev. (Note that the order of objects in the result set is unpredictable, but only objects that match the specified search criteria are returned.)

Range Search

Range lookups are similar to the above "exact match". In this case we specify "from" and "to" pattern objects. For example we could create Car objects FromCar with vendor="Ford" and Year=2000 and ToCar with vendor="Ford" and Year=2006. Passing these two "pattern objects" to the cursor search operation would find both Car objects with vendor="Ford".

Spatial Search

To demonstrate another search capability with kdtree indexes, consider the following schema:

 
    class SpatialObject 
    {
        Int4 left;
        Int4 top;
        int4 right;
        int4 bottom;
        int  type;
        …
        kdtree <left, top, right, bottom, type> index;
    };
     

Again, using the Query-By-Example (QBE) approach, we could create "pattern objects" Low with coordinates left=LEFT, top=TOP, and High with coordinates right=RIGHT, bottom=BOTTOM. Then passing Low and High to a cursor search with operation type contains would select objects contained within the specified rectangle. Specifying operation type overlaps would select objects intersecting with the specified rectangle..

Note that kdtree indexes are inherently unbalanced. While they are supported for persistent classes, because they are unbalanced the performance may be sub-optimal and therefore kdtree indexes are most useful for transient (in-memory) classes.