As explained in the Indexes page, the eXtremeDB
hash
indexes are more efficient for exact match searches thantree
indexes within in-memory databases. eXtremeDB provideshash
index algorithms modified for efficient operations in memory when the associated class istransient
. Hash indexes will normally exhibit better average performance, for both insert and lookup operations, than atree
index, but this can depend on the initial hash table size and on key distribution.In the ideal case, the
hash
table will be sized to contain one entry (sometimes called a “bucket”) for each database object being indexed which means that a singlehash
table lookup (a simple hash function calculation) will suffice to access each database object. But in reality, regardless of the table size, the hash function does not guarantee a unique value and consequently multiple index values can map to the same bucket (called a “conflict”) and a linked list of object pointers (called a “collision chain”) will result.So if the number of database objects is known beforehand, this number can be specified in the schema declaration for the index and will result in optimal performance. But if the number of database objects cannot be known with a reasonable degree of confidence, the eXtremeDB runtime will dynamically allocate additional hash table space when needed.
When declared
unique
(which is the default),hash
indexes can only be used for exact match searches or unordered list retrieval. Exact match searches are performed with the find operation for the chosen API. However,hash
indexes can be declarednonunique
to allow duplicate values to be stored with the same hash table entry. When declarednonunique
, search operations can be performed to select all objects with the specified hash value.
Memory Consumption
Memory consumption is comparable for
tree
andhash
indexes oftransient
classes. A rough estimate for atree
index is 10 bytes per entry (exact size depends on the order of insertions or deletions); and H + 8 bytes per entry for ahash
index, where the constant H is fixed size space taken by the hash table and can be calculated as E / 5 * 4 where E is the estimated number of hash entries provided by you in the database schema and 5 is a constant hash factor used by eXtremeDB. If reallocation of the hash table is necessary, then the size will be H * 2.
Autoid indexes
The
autoid
declaration specifies that a class is stored with a system-generated unique identifier. Theautoid
is a guaranteed unique value generated by the eXtremeDB runtime and stored in an internalhash
index. Theautoid
value for a new database object is inserted into the index hash table at the moment it is created and is never changed. (Note that this is different from other types of indexes. Normally new values are inserted into the index (hash table or tree structure) when the transaction within which these values are created or updated is committed or when a transaction checkpoint is performed.)The
autoid
field is based on an unsigned 64-bit integer. Initially, during the database creation phase, all counters are initialized with zeros. During the database lifetime the counters increase each time a class object is created. eXtremeDB never decreasesautoid
values and never re-uses the same value. However technically it is possible for the counter to overflow and in this case, the counter would start to repeat the values. However note that ifautoids
are created every millisecond, the counter will not rollover for 6,405,119,470,038 days (over 17 million years)!Object Identifier indexes
The C API allows definition of unique Object Identifier (
oid
) structures that can be highly efficient. Please refer to the C API Indexes page for further details.