As explained in the Class Relationships page, eXtremeDB provides a number of ways to implement relationships (joins) between tables, each having distinct advantages and costs in terms of memory overhead and performance. The different techniques available to C developers are presented in the following sections.
OID references
In the "satellite radio receiver" example, an
OID
is used to join two classes. This is a special case that can be optimal when the data model contains a reasonable set of data that uniquely identifies the principal object. Please use this link to view details of this example.In this case the
OID
is struct ProgId which contains a single 4-byte integer. But the actual data set that uniquely identifies a class could be quite complex, containing several fields of different types. The eXtremeDB runtime has no knowledge of the internal structure of theOID
. They are represented by byte arrays and accessed via a hash index over that array. Theref
field that references theOID
in class ProgramData (TimeSlot.pId
in this example) has the same structure as theOID
and is used as key in theOID
-based lookup operations. (It is important to note that theoid
/ref
relationship joins this TimeSlot object to a specific ProgramData object but does not include any cascade-type operations or sanity (referential integrity) verifications of any kind.)Effectively this "
OID
-join” is implemented by an internal hash index created by the eXtremeDB runtime through theoid
/ref
declaration. Joins can also be implemented explicitly without using anOID
. Three such techniques are explained below.Index join
A simple one-to-many relationship between two tables, Department and Employee, could be implemented in SQL as follows:
select e.name, d.name from Employee e inner join Department d d.dept_no = d.dept_no;To implement a one-to-many relationship like the Department > Employee example, we might define an eXtremeDB database schema as follows:
#define uint2 unsigned<2> declare database index_join_db; class Department { string name; string code; uint2 dept_no; unique tree<name> Iname; unique tree<code> Icode; unique hash<dept_no> Idept_no[1000]; }; class Employee { string name; uint2 dept_no; unique tree<name> Iname; unique tree<dept_no,name> Idept_name; };Here the unique hash index
Idept_no
acts as the “primary key” on the Department class, and the unique B-Tree indexIdept_name
acts as the “foreign key” in the Employee class. Note thatIdept_name
is a compound index consisting of two field values in the Employee class:dept_no
andname
. It is not necessary that a “foreign key” be compound. A simple B-Tree index on fielddept_no
would be sufficient to form the join, but with the compound index we have the additional advantage that we are able to order all instances with the samedept_no
value in alphabetical order by thename
field.To demonstrate how the join is implemented, consider the following code snippets (taken from the SDK sample "05_indexes/joins/index_join”) that manage Department and Employee objects. First we populate the Department class with some sample data:
// The struct corresponding to class Department struct department_data { char name[20]; char code[10]; uint2 dept_no; }; struct department_data departments[] = { { "Accounting", "Acct", 101 }, { "Engineering", "Eng", 102 }, { "Customer Service", "CS", 103 }, { "Technical Support", "TS", 104 } }; ... printf("\nCreate departments:\n"); for (i = 0; i < sizeof(departments) / sizeof(departments[0]) && MCO_S_OK == rc; ++i) { rc = mco_trans_start(db, MCO_READ_WRITE, MCO_TRANS_FOREGROUND, &t); if (MCO_S_OK == rc) { Department dept; Department_new(t, &dept); Department_name_put(&dept, departments[i].name, strlen(departments[i].name)); Department_code_put(&dept, departments[i].code, strlen(departments[i].code)); Department_dept_no_put(&dept, departments[i].dept_no); printf("\n\t%d) %s, %s, dept_no=%d", i, departments[i].code, departments[i].name, departments[i].dept_no ); rc = mco_trans_commit(t); } }Then we insert Employee objects associating them with the appropriate Department object:
// The struct corresponding to class Employee struct employee_data { char name[20]; uint2 dept_no; }; // Define the relationship of Employees to Departments struct employee_department { char name[20]; char dept_code[10]; }; struct employee_department employee_departments[] = { { "John", "Acct" }, { "Samuel", "Acct" }, { "Thomas", "Acct" }, { "David", "Eng" }, { "James", "Eng" }, { "Robert", "Eng" }, { "William", "CS" }, { "Kevin", "CS" }, { "Alex", "CS" }, { "Daniel", "CS" }, { "Diego", "CS" }, { "Brandon", "TS" } }; ... printf("\n\nCreate employees and join each to a department:\n"); for (i = 0; i < sizeof(employee_departments) / sizeof(employee_departments[0]) && MCO_S_OK == rc; ++i) { rc = mco_trans_start(db, MCO_READ_WRITE, MCO_TRANS_FOREGROUND, &t); if (MCO_S_OK == rc) { // Find Department by code; extract dept_no; // create Employee and assign name, dept_no Department dept; rc = Department_Icode_find(t, employee_departments[i].dept_code, strlen(employee_departments[i].dept_code), &dept); if (MCO_S_OK == rc) { uint2 dept_no = 0; rc = Department_dept_no_get(&dept, &dept_no); if (MCO_S_OK == rc) { Employee emp; Employee_new(t, &emp); Employee_name_put(&emp, employee_departments[i].name, strlen(employee_departments[i].name)); Employee_dept_no_put(&emp, dept_no); printf("\n\t%d) %s, dept_no=%d", i, employee_departments[i].name, dept_no); rc = mco_trans_commit(t); } ... } ... } ... }Note that we use the
Department_Icode
index to find the Department object with the specified code; then the Department’sdept_no
is extracted to be stored in the new Employee object. This creates the association (relationship) between the Department object with the Employee object upon which the join is implemented during a subsequent query.Now to navigate the relationship between (i.e. “join”) these two classes we might use code like the following to display all of a given employee’s coworkers in their department:
// 1. Find the Employee object by name and extract dept_no rc = Employee_Iname_find(t, search_name, (uint2)strlen(search_name), &emp1); if (MCO_S_OK == rc) { printf("\n\n%s's co-workers in ", search_name); Employee_dept_no_get(&emp1, &dept_no1); Employee_Idept_name_index_cursor(t, &csr); // 2. Find the Department object by its dept_no and display Department name rc = Department_Idept_no_find(t, dept_no1, &dept1); if (MCO_S_OK == rc) { Department_name_get(&dept1, name, sizeof(name), &len); printf("%s are:\n", name); // 3. Position the cursor in the Idept_name index to the first object with // this dept_no rc = Employee_Idept_name_search(t, &csr, MCO_GE, dept_no1, "", 0); // Scroll forward through the cursor and display the names found... while (MCO_S_OK == rc) { // Check if the current Employee is same as the found Employee Employee_from_cursor(t, &csr, &emp2); Employee_name_get(&emp2, name, sizeof(name), &len); // If the two names are not equal, display the name (if dept_no is // still the same) if (0 != strcmp(name, search_name)) { // Verify that the dept_no is still the same, otherwise break out Employee_dept_no_get(&emp2, &dept_no2); if (dept_no1 != dept_no2) break; printf("\n\t%s", name); } rc = mco_cursor_next(t, &csr); } } }Note that we use the
Employee_Iname
index to find the Employee object with the specified name; then the Employee’sdept_no
is extracted to perform a search on indexEmployee_Idept_name
, which positions the cursor to the first node in the index tree having thisdept_no
. Then we scroll through the cursor until thedept_no
is different. For each index node we get the Employee object from the cursor and extract itsname
field. If this is not thename
of the original employee we are searching coworkers for, we display the name and callmco_cursor_next()
to go to the next node in the index tree.Pros and Cons:
This is a typical “relational” style join familiar to many developers. While it is intuitive and straightforward to implement, it incurs significant overhead to manage several indexes. When an insert transaction is committed, a hash index and two B-Tree index nodes are created for each Department object, and two B-Tree index nodes are created for each Employee object. This results in memory consumption in addition to the space occupied by the database objects themselves and the B-Tree index structures require “balancing” which can incur significant processor cycles. Hence, the overall performance for inserting objects is not optimal.
Lookup of Department objects by
dept_no
is efficient due to the hash indexIdept_no
. And the cursor created on the Employee objects through compound indexEmployee_Idept_name
facilitates “scrolling” through the index tree to list employees in alphabetical order byname
. Note that to display thename
field, an additional function call,Employee_from_cursor()
, is required to access the database object.Autoid reference
An alternative technique to the index-join example above is to implement the “foreign key” relationship with a direct
reference
from the Employee object to its Department. Anautoid
field can serve this purpose. For example, we might define a database schema as follows:#define uint2 unsigned<2> declare database autoid_ref_db; class Department { autoid[1000]; string name; string code; unique tree<name> Iname; unique tree<code> Icode; }; class Employee { string name; autoid_t dept; unique tree<name> Iname; unique tree<dept,name> Idept_name; };Note that the Department class no longer has the
dept_no
field and thus no indexIdept_no
. Instead it has the declarationautoid[1000];
. This actually causes a “hidden” hash index to be maintained for the Department objectautoid
values that are automatically incremented as new objects are created. And note the fielddept
in the Employee class which will store theautoid
of the associated Department object.
Note that there is an alternative form of the
autoid_t
field declaration that is useful when an SQL interface is used to access the referenced table. For example the Employee class could be defined:class Employee { string name; autoid_t<Department> dept; unique tree<name> Iname; unique tree<dept,name> Idept_name; };This syntax specifies the class being referenced by the field value. The eXtremeDB runtime does not verify this declaration and does not use it in any way (in fact it could be any integer value). However, the eXtremeSQL runtime “trusts" the declaration and makes use of it to implement references in SQL. Again, there are no cascading provisions or referential integrity checks of any kind. So care must be taken if, for example, an attempt is made to delete a Department object that is referenced by one or more Employee objects.
To demonstrate how this join is implemented, consider the following code snippets (taken from the SDK sample “05_indexes/joins/autoid_ref”). We populate the Department class as in the index_join sample above, but note that when inserting Employee objects we extract the
autoid
from the corresponding Department object:printf("\nCreate employees and join each to a department:\n"); for (i = 0; i < sizeof(employee_departments) / sizeof(employee_departments[0]) && MCO_S_OK == rc; ++i) { rc = mco_trans_start(db, MCO_READ_WRITE, MCO_TRANS_FOREGROUND, &t); if (MCO_S_OK == rc) { // Find Department by code; extract autoid; create Employee and // assign name, dept (autoid) Department dept; rc = Department_Icode_find(t, employee_departments[i].code, strlen(employee_departments[i].code), &dept); if (MCO_S_OK == rc) { autoid_t dept_id = 0; rc = Department_autoid_get(&dept, &dept_id); if (MCO_S_OK == rc) { Employee emp; Employee_new(t, &emp); Employee_name_put(&emp, employee_departments[i].name, strlen(employee_departments[i].name)); Employee_dept_put(&emp, dept_id); printf("\n\t%d) %s, Department.Autoid=%ld", i, employee_departments[i].name, (long)dept_id); rc = mco_trans_commit(t); } ... } } }The code to display all of a given employee’s coworkers in their department would be nearly identical to that in the index_join sample above with the exception that we call
Department_autoid_find()
to lookup the Department object by itsautoid
. This is equivalent to the call ofDepartment_Idept_no_find()
which uses hash indexIdept_no
, the difference is that the autoid is automatically generated whereas a unique value must be specified for the nodept_no
field. The same technique of scrolling through a cursor on compound indexEmployee_Idept_name
is used to display the employee names in alphabetical order:// 1. find employee record by name rc = Employee_Iname_find(t, search_name, (uint2)strlen(search_name), &emp); if (MCO_S_OK == rc) { printf("\n\n%s's co-workers in ", search_name); Employee_dept_get(&emp, &dept_id1); // 2. Find the Department object by its autoid and display the // Department name rc = Department_autoid_find(t, dept_id1, &dept1); if (MCO_S_OK == rc) { Department_name_get(&dept1, name, sizeof(name), &len); printf("%s are:\n", name); // 3. Position the cursor in the Idept_name index to the first object // with this autoid Employee_Idept_name_index_cursor(t, &csr); rc = Employee_Idept_name_search(t, &csr, MCO_GE, dept_id1, "", 0); if (MCO_S_OK == rc) { while (MCO_S_OK == rc) { // Check if the current Employee is same as the found Employee Employee_from_cursor(t, &csr, &emp2); Employee_name_get(&emp2, name, sizeof(name), &len); // If the two names are not equal, display the name (if dept_id is // still the same) if (0 != strcmp(name, search_name)) { // Verify that the dept_id is still the same, otherwise break // out of loop Employee_dept_get(&emp2, &dept_id2); if (dept_id1 != dept_id2) break; printf("\n\t%s", name); } rc = mco_cursor_next(t, &csr); } } } }Pros and Cons:
This implementation uses somewhat less memory by eliminating the field
dept_no
and the lookup of Department objects is likewise efficient due to the hash index generated for theautoid
. But some optimization might be gained by relaxing the application requirements.Note that the major drawback for performance and memory consumption, both for this and the index_join relational approach, is incurred by the B-Tree indexes:
Employee_Idept_name
,Employee_Iname
andDepartment_Iname
. For example, if it is not necessary to lookup aDepartment
object by its name, theDepartment_Iname
could be eliminated. And further, if it is not necessary to display the employees of a given department in alphabetical order, then we can eliminate the compound indexEmployee_Idept_name
. The following example illustrates this approach.Vector of autoids
Instead of storing the
autoid
of the Employee object’s Department in the Employee object, we could instead store an array orvector
of Employeeautoids
in the Department object. For example (see the SDK sample “05_indexes/joins/autoid_vector”), we might define a database schema as follows:#define uint2 unsigned<2> declare database autoid_vector_db; class Department { autoid[1000]; string name; string code; vector<autoid_t> employees; unique hash<code> Icode[1000]; }; class Employee { autoid[1000]; string name; autoid_t dept; unique tree<name> Iname; };Note that both classes are defined with
autoid
. The Employeeautoid
will be stored in the Department class vectoremployees
. We will use the Departmentautoid
in Employee fielddept
, as in the autoid_ref example above, to efficiently access the Department object from the Employee.The code to populate the database becomes a little more complicated as we need to allocate space for the dynamic vector of
autoid
values in the Department objects and assign theseautoids
when we insert the Employee objects. So for each Employee object we execute the following code to allocate space and insert an Employeeautoid
into the Departmentemployees
vector:// Get the autoid of this Department then create the new Employee object // and insert its autoid into the vector Department_autoid_get(&dept, &dept_id); Employee_new(t, &emp); Employee_name_put(&emp, employee_departments[i].name, strlen(employee_departments[i].name)); Employee_dept_put(&emp, dept_id); Employee_autoid_get(&emp, &emp_id); // Allocate space for a new vector element and insert this // Employee object's autoid Department_employees_size(&dept, &vector_size); Department_employees_alloc(&dept, vector_size + 1); Department_employees_put(&dept, vector_size, emp_id);Now the code to display all of a given employee’s coworkers in their department will simply find the Department object as in the autoid_ref sample above by calling
Department_autoid_find()
and then list this department’s employees from the vector as follows:Department_name_get(&dept1, name, sizeof(name), &len); printf("%s are:\n", name); // 3. Scroll through the vector of Employee autoids, find the Employee // object and display its name Department_employees_size(&dept1, &vector_size); for (short j = 0; j < vector_size; j++) { rc = Department_employees_at(&dept1, j, &emp_id2); // End loop when the value of this vector element is 0 if (0 == emp_id2) break; // Skip if this is the autoid of the "search_name" object if (emp_id1 != emp_id2) { // Find the Employee object by autoid and display the name rc = Employee_autoid_find(t, emp_id2, &emp2); if (MCO_S_OK == rc) { Employee_name_get(&emp2, name, sizeof(name), &len); printf("\t%s\n", name); } } }Pros and Cons:
This implementation uses significantly less memory and reduces behind-the-scenes tree structure maintenance by eliminating two B-Tree indexes with the small sacrifice of not ordering the list of Employees in alphabetical order. However, it could be a simple matter to introduce a sorting algorithm into the application to address this; but it should be noted that this approach could become untenable if the number of Employees per Department is very large.