Database Class Relationships in C

As explained in the Class Relationships page, eXtremeDB provides a number of ways to implement relationships (joins) between tables, each having distinct advantages and costs in terms of memory overhead and performance. The different techniques available to C developers are presented in the following sections.

OID references

In the "satellite radio receiver" example, an OID is used to join two classes. This is a special case that can be optimal when the data model contains a reasonable set of data that uniquely identifies the principal object. Please use this link to view details of this example.

In this case the OID is struct ProgId which contains a single 4-byte integer. But the actual data set that uniquely identifies a class could be quite complex, containing several fields of different types. The eXtremeDB runtime has no knowledge of the internal structure of the OID. They are represented by byte arrays and accessed via a hash index over that array. The ref field that references the OID in class ProgramData (TimeSlot.pId in this example) has the same structure as the OID and is used as key in the OID-based lookup operations. (It is important to note that the oid/ref relationship joins this TimeSlot object to a specific ProgramData object but does not include any cascade-type operations or sanity (referential integrity) verifications of any kind.)

Effectively this "OID-join” is implemented by an internal hash index created by the eXtremeDB runtime through the oid/ref declaration. Joins can also be implemented explicitly without using an OID. Three such techniques are explained below.

Index join

A simple one-to-many relationship between two tables, Department and Employee, could be implemented in SQL as follows:

 
    select e.name, d.name from Employee e 
    inner join Department d d.dept_no = d.dept_no;
     

To implement a one-to-many relationship like the Department > Employee example, we might define an eXtremeDB database schema as follows:

 
    #define uint2     unsigned<2>
    declare database  index_join_db;
 
    class Department
    {
        string name;
        string code;
        uint2  dept_no;
 
        unique tree<name> Iname;
        unique tree<code> Icode;
        unique hash<dept_no> Idept_no[1000];
    };
     
    class Employee
    {
        string name;
        uint2  dept_no;
 
        unique tree<name> Iname;
        unique tree<dept_no,name> Idept_name;
    };
     

Here the unique hash index Idept_no acts as the “primary key” on the Department class, and the unique B-Tree index Idept_name acts as the “foreign key” in the Employee class. Note that Idept_name is a compound index consisting of two field values in the Employee class: dept_no and name. It is not necessary that a “foreign key” be compound. A simple B-Tree index on field dept_no would be sufficient to form the join, but with the compound index we have the additional advantage that we are able to order all instances with the same dept_no value in alphabetical order by the name field.

To demonstrate how the join is implemented, consider the following code snippets (taken from the SDK sample "05_indexes/joins/index_join”) that manage Department and Employee objects. First we populate the Department class with some sample data:

 
    // The struct corresponding to class Department
    struct department_data 
    {
        char name[20];
        char code[10];
        uint2 dept_no;
    };
 
    struct department_data departments[] = 
    {
        { "Accounting", "Acct", 101 },
        { "Engineering", "Eng", 102 },
        { "Customer Service", "CS", 103 },
        { "Technical Support", "TS", 104 }
    };
    ...
 
    printf("\nCreate departments:\n");
    for (i = 0; i < sizeof(departments) / sizeof(departments[0]) && MCO_S_OK == rc; ++i) 
    {
        rc = mco_trans_start(db, MCO_READ_WRITE, MCO_TRANS_FOREGROUND, &t);
        if (MCO_S_OK == rc)  
        {
            Department dept;
            Department_new(t, &dept);
            Department_name_put(&dept, departments[i].name, strlen(departments[i].name));
            Department_code_put(&dept, departments[i].code, strlen(departments[i].code));
            Department_dept_no_put(&dept, departments[i].dept_no);
            printf("\n\t%d) %s, %s, dept_no=%d", i,
                departments[i].code, departments[i].name,
                departments[i].dept_no );
                rc = mco_trans_commit(t);
        }
    }
     

Then we insert Employee objects associating them with the appropriate Department object:

 
    // The struct corresponding to class Employee
    struct employee_data 
    {
        char name[20];
        uint2 dept_no;
    };
     
    // Define the relationship of Employees to Departments
    struct employee_department 
    {
        char name[20];
        char dept_code[10];
    };
     
    struct employee_department employee_departments[] = 
    {
        { "John", "Acct" }, { "Samuel", "Acct" }, { "Thomas", "Acct" },
        { "David", "Eng" }, { "James", "Eng" }, { "Robert", "Eng" },
        { "William", "CS" }, { "Kevin", "CS" }, { "Alex", "CS" },
        { "Daniel", "CS" }, { "Diego", "CS" }, { "Brandon", "TS" }
    };
    ...
     
    printf("\n\nCreate employees and join each to a department:\n");
    for (i = 0; i < sizeof(employee_departments) / sizeof(employee_departments[0])
            && MCO_S_OK == rc; ++i) 
    {
        rc = mco_trans_start(db, MCO_READ_WRITE, MCO_TRANS_FOREGROUND, &t);
        if (MCO_S_OK == rc) 
        {
            // Find Department by code; extract dept_no;
            // create Employee and assign name, dept_no
            Department dept;
            rc = Department_Icode_find(t, employee_departments[i].dept_code, 
                        strlen(employee_departments[i].dept_code), &dept);
            if (MCO_S_OK == rc)
            {
                uint2 dept_no = 0;
                rc = Department_dept_no_get(&dept, &dept_no);
                if (MCO_S_OK == rc)
                {
                    Employee emp;
                    Employee_new(t, &emp);
                    Employee_name_put(&emp,
                    employee_departments[i].name, strlen(employee_departments[i].name));
                    Employee_dept_no_put(&emp, dept_no);
                    printf("\n\t%d) %s, dept_no=%d", i, employee_departments[i].name, dept_no);
                    rc = mco_trans_commit(t);
                }
                ...
            }
            ...
        }
        ...
    }
     

Note that we use the Department_Icode index to find the Department object with the specified code; then the Department’s dept_no is extracted to be stored in the new Employee object. This creates the association (relationship) between the Department object with the Employee object upon which the join is implemented during a subsequent query.

Now to navigate the relationship between (i.e. “join”) these two classes we might use code like the following to display all of a given employee’s coworkers in their department:

 
    // 1. Find the Employee object by name and extract dept_no
    rc = Employee_Iname_find(t, search_name, (uint2)strlen(search_name), &emp1);
    if (MCO_S_OK == rc) 
    {
        printf("\n\n%s's co-workers in ", search_name);
        Employee_dept_no_get(&emp1, &dept_no1);
        Employee_Idept_name_index_cursor(t, &csr);
        // 2. Find the Department object by its dept_no and display Department name
        rc = Department_Idept_no_find(t, dept_no1, &dept1);
        if (MCO_S_OK == rc) 
        {
            Department_name_get(&dept1, name, sizeof(name), &len);
            printf("%s are:\n", name);
            // 3. Position the cursor in the Idept_name index to the first object with
            // this dept_no
            rc = Employee_Idept_name_search(t, &csr, MCO_GE, dept_no1, "", 0);
            // Scroll forward through the cursor and display the names found...
            while (MCO_S_OK == rc) 
            {
                // Check if the current Employee is same as the found Employee
                Employee_from_cursor(t, &csr, &emp2);
                Employee_name_get(&emp2, name, sizeof(name), &len);
                // If the two names are not equal, display the name (if dept_no is
                // still the same)
                if (0 != strcmp(name, search_name)) 
                {
                    // Verify that the dept_no is still the same, otherwise break out
                    Employee_dept_no_get(&emp2, &dept_no2);
                    if (dept_no1 != dept_no2) break;
                    printf("\n\t%s", name);
                }
                rc = mco_cursor_next(t, &csr);
            }
        }
    }
     

Note that we use the Employee_Iname index to find the Employee object with the specified name; then the Employee’s dept_no is extracted to perform a search on index Employee_Idept_name, which positions the cursor to the first node in the index tree having this dept_no. Then we scroll through the cursor until the dept_no is different. For each index node we get the Employee object from the cursor and extract its name field. If this is not the name of the original employee we are searching coworkers for, we display the name and call mco_cursor_next() to go to the next node in the index tree.

Pros and Cons:

This is a typical “relational” style join familiar to many developers. While it is intuitive and straightforward to implement, it incurs significant overhead to manage several indexes. When an insert transaction is committed, a hash index and two B-Tree index nodes are created for each Department object, and two B-Tree index nodes are created for each Employee object. This results in memory consumption in addition to the space occupied by the database objects themselves and the B-Tree index structures require “balancing” which can incur significant processor cycles. Hence, the overall performance for inserting objects is not optimal.

Lookup of Department objects by dept_no is efficient due to the hash index Idept_no. And the cursor created on the Employee objects through compound index Employee_Idept_name facilitates “scrolling” through the index tree to list employees in alphabetical order by name. Note that to display the name field, an additional function call, Employee_from_cursor(), is required to access the database object.

Autoid reference

An alternative technique to the index-join example above is to implement the “foreign key” relationship with a direct reference from the Employee object to its Department. An autoid field can serve this purpose. For example, we might define a database schema as follows:

 
    #define uint2     unsigned<2>
    declare database  autoid_ref_db;
 
    class Department
    {
        autoid[1000];
        string name;
        string code;
 
        unique tree<name> Iname;
        unique tree<code> Icode;
    };
 
    class Employee
    {
        string name;
        autoid_t dept;
 
        unique tree<name> Iname;
        unique tree<dept,name> Idept_name;
    };
     

Note that the Department class no longer has the dept_no field and thus no index Idept_no. Instead it has the declaration autoid[1000];. This actually causes a “hidden” hash index to be maintained for the Department object autoid values that are automatically incremented as new objects are created. And note the field dept in the Employee class which will store the autoid of the associated Department object.

Note that there is an alternative form of the autoid_t field declaration that is useful when an SQL interface is used to access the referenced table. For example the Employee class could be defined:

     
    class Employee
    {
        string name;
        autoid_t<Department> dept;
 
        unique tree<name> Iname;
        unique tree<dept,name> Idept_name;
    };
     

This syntax specifies the class being referenced by the field value. The eXtremeDB runtime does not verify this declaration and does not use it in any way (in fact it could be any integer value). However, the eXtremeSQL runtime “trusts" the declaration and makes use of it to implement references in SQL. Again, there are no cascading provisions or referential integrity checks of any kind. So care must be taken if, for example, an attempt is made to delete a Department object that is referenced by one or more Employee objects.

To demonstrate how this join is implemented, consider the following code snippets (taken from the SDK sample “05_indexes/joins/autoid_ref”). We populate the Department class as in the index_join sample above, but note that when inserting Employee objects we extract the autoid from the corresponding Department object:

 
    printf("\nCreate employees and join each to a department:\n");
    for (i = 0; i < sizeof(employee_departments) / sizeof(employee_departments[0])
            && MCO_S_OK == rc; ++i) 
    {
        rc = mco_trans_start(db, MCO_READ_WRITE, MCO_TRANS_FOREGROUND, &t);
        if (MCO_S_OK == rc) 
        {
            // Find Department by code; extract autoid; create Employee and
            // assign name, dept (autoid)
            Department dept;
            rc = Department_Icode_find(t, employee_departments[i].code,
                strlen(employee_departments[i].code), &dept);
            if (MCO_S_OK == rc)
            {
                autoid_t dept_id = 0;
                rc = Department_autoid_get(&dept, &dept_id);
                if (MCO_S_OK == rc)
                {
                    Employee emp;
                    Employee_new(t, &emp);
                    Employee_name_put(&emp, employee_departments[i].name,
                        strlen(employee_departments[i].name));
                    Employee_dept_put(&emp, dept_id);
                    printf("\n\t%d) %s, Department.Autoid=%ld", i,
                        employee_departments[i].name, (long)dept_id);
                    rc = mco_trans_commit(t);
                }
                ...
            }
        }
    }
     

The code to display all of a given employee’s coworkers in their department would be nearly identical to that in the index_join sample above with the exception that we call Department_autoid_find() to lookup the Department object by its autoid. This is equivalent to the call of Department_Idept_no_find() which uses hash index Idept_no, the difference is that the autoid is automatically generated whereas a unique value must be specified for the no dept_no field. The same technique of scrolling through a cursor on compound index Employee_Idept_name is used to display the employee names in alphabetical order:

 
    // 1. find employee record by name
    rc = Employee_Iname_find(t, search_name, (uint2)strlen(search_name), &emp);
    if (MCO_S_OK == rc) 
    {
        printf("\n\n%s's co-workers in ", search_name);
        Employee_dept_get(&emp, &dept_id1);
        // 2. Find the Department object by its autoid and display the
        // Department name
        rc = Department_autoid_find(t, dept_id1, &dept1);
        if (MCO_S_OK == rc) 
        {
            Department_name_get(&dept1, name, sizeof(name), &len);
            printf("%s are:\n", name);
            // 3. Position the cursor in the Idept_name index to the first object
            // with this autoid
            Employee_Idept_name_index_cursor(t, &csr);
            rc = Employee_Idept_name_search(t, &csr, MCO_GE, dept_id1, "", 0);
            if (MCO_S_OK == rc) 
            {
                while (MCO_S_OK == rc) 
                {
                    // Check if the current Employee is same as the found Employee
                    Employee_from_cursor(t, &csr, &emp2);
                    Employee_name_get(&emp2, name, sizeof(name), &len);
                    // If the two names are not equal, display the name (if dept_id is
                    // still the same)
                    if (0 != strcmp(name, search_name)) 
                    {
                        // Verify that the dept_id is still the same, otherwise break
                        // out of loop
                        Employee_dept_get(&emp2, &dept_id2);
                        if (dept_id1 != dept_id2) 
                            break;
                        printf("\n\t%s", name);
                    }
                    rc = mco_cursor_next(t, &csr);
                }
            }
        }
    }
     

Pros and Cons:

This implementation uses somewhat less memory by eliminating the field dept_no and the lookup of Department objects is likewise efficient due to the hash index generated for the autoid. But some optimization might be gained by relaxing the application requirements.

Note that the major drawback for performance and memory consumption, both for this and the index_join relational approach, is incurred by the B-Tree indexes: Employee_Idept_name, Employee_Iname and Department_Iname. For example, if it is not necessary to lookup a Department object by its name, the Department_Iname could be eliminated. And further, if it is not necessary to display the employees of a given department in alphabetical order, then we can eliminate the compound index Employee_Idept_name. The following example illustrates this approach.

Vector of autoids

Instead of storing the autoid of the Employee object’s Department in the Employee object, we could instead store an array or vector of Employee autoids in the Department object. For example (see the SDK sample “05_indexes/joins/autoid_vector”), we might define a database schema as follows:

 
    #define uint2     unsigned<2>
    declare database  autoid_vector_db;
 
    class Department
    {
        autoid[1000];
        string name;
        string code;
 
        vector<autoid_t> employees;
 
        unique hash<code> Icode[1000];
    };
 
    class Employee
    {
        autoid[1000];
        string name;
        autoid_t dept;
 
        unique tree<name> Iname;
    };
     

Note that both classes are defined with autoid. The Employee autoid will be stored in the Department class vector employees. We will use the Department autoid in Employee field dept, as in the autoid_ref example above, to efficiently access the Department object from the Employee.

The code to populate the database becomes a little more complicated as we need to allocate space for the dynamic vector of autoid values in the Department objects and assign these autoids when we insert the Employee objects. So for each Employee object we execute the following code to allocate space and insert an Employee autoid into the Department employees vector:

 
    // Get the autoid of this Department then create the new Employee object
    // and insert its autoid into the vector
    Department_autoid_get(&dept, &dept_id);
    Employee_new(t, &emp);
    Employee_name_put(&emp, employee_departments[i].name,
        strlen(employee_departments[i].name));
    Employee_dept_put(&emp, dept_id);
    Employee_autoid_get(&emp, &emp_id);
     
    // Allocate space for a new vector element and insert this
    // Employee object's autoid
    Department_employees_size(&dept, &vector_size);
    Department_employees_alloc(&dept, vector_size + 1);
    Department_employees_put(&dept, vector_size, emp_id);
     

Now the code to display all of a given employee’s coworkers in their department will simply find the Department object as in the autoid_ref sample above by calling Department_autoid_find() and then list this department’s employees from the vector as follows:

 
    Department_name_get(&dept1, name, sizeof(name), &len);
    printf("%s are:\n", name);
    // 3. Scroll through the vector of Employee autoids, find the Employee
    // object and display its name
    Department_employees_size(&dept1, &vector_size);
    for (short j = 0; j < vector_size; j++)
    {
        rc = Department_employees_at(&dept1, j, &emp_id2);
        // End loop when the value of this vector element is 0
        if (0 == emp_id2) 
            break;
         
        // Skip if this is the autoid of the "search_name" object
        if (emp_id1 != emp_id2)
        {
            // Find the Employee object by autoid and display the name
            rc = Employee_autoid_find(t, emp_id2, &emp2);
            if (MCO_S_OK == rc)
            {
                Employee_name_get(&emp2, name, sizeof(name), &len);
                printf("\t%s\n", name);
            }
        }
    }
 

Pros and Cons:

This implementation uses significantly less memory and reduces behind-the-scenes tree structure maintenance by eliminating two B-Tree indexes with the small sacrifice of not ordering the list of Employees in alphabetical order. However, it could be a simple matter to introduce a sorting algorithm into the application to address this; but it should be noted that this approach could become untenable if the number of Employees per Department is very large.