As explained in the Database Recovery page, eXtremeDB provides the “sniffer” utility to allow C and C++ applications to detect and remove “dead” connections.
Using the Sniffer API
Because there is no system-independent way to detect when a process has failed, the “sniffer” API mco_db_sniffer() is provided. Usually mco_db_sniffer() will be called periodically in a separate thread, or from specific places in the application, to check for “dead” connections. A user-supplied callback function is then called by mco_db_sniffer() to actually detect whether a given connection is “alive” and, if not, to terminate it.
To perform this check, some identifying information (typically the process identifier) is added to each connection context with code like the following:
    int pid;
    #ifdef _WIN32
    pid = GetCurrentProcessId();
    #else
    pid = getpid();
    #endif
    ...
    mco_db_connect_ctx(dbName, &pid, &db);
Note that it is also necessary to specify the size of this connection context in the database parameters passed to mco_db_open_dev(). For example:
    db_params.connection_context_size = sizeof(int);
The “sniffer callback” function could then be implemented as follows:
    MCO_RET sniffer_callback(mco_db_h db, void* context, mco_trans_counter_t trans_no)
    {
        /* The connection context holds the pid stored by mco_db_connect_ctx() */
        int pid = *(int*)context;
    #ifdef _WIN32
        /* If a handle to the process can be opened, treat it as alive */
        HANDLE h = OpenProcess(PROCESS_QUERY_INFORMATION, FALSE, pid);
        if (h != NULL) {
            CloseHandle(h);
            return MCO_S_OK;
        }
    #else
        /* Signal 0 sends nothing; it only checks that the process exists */
        if (kill(pid, 0) == 0) {
            return MCO_S_OK;
        }
    #endif
        printf("Process %d is crashed\n", pid);
        return MCO_S_DEAD_CONNECTION;
    }
If the user callback function returns MCO_S_DEAD_CONNECTION, recovery will be performed for this connection. mco_db_sniffer() iterates through the database connections and calls the user callback function according to the policy specified in its third parameter. The possible values for this policy parameter are as follows:
MCO_SNIFFER_INSPECT_ACTIVE_CONNECTIONS: for all active connections,
MCO_SNIFFER_INSPECT_ACTIVE_TRANSACTIONS: for all connections with active transactions, or
MCO_SNIFFER_INSPECT_HANGED_TRANSACTIONS: for connections with active transactions whose transaction number has not changed since the previous invocation of mco_db_sniffer() (such a connection is assumed to be hung; it is up to the user to correctly specify and enforce the interval between successive calls to mco_db_sniffer() to avoid false detection of hung transactions).
A “watchdog” thread could then be implemented in the application as follows:
    THREAD_PROC_DEFINE(sniffer_loop, arg)
    {
        mco_db_h db;
        /* get_pid(): presumably a helper wrapping the platform-specific call shown earlier */
        int pid = get_pid();
        mco_db_connect_ctx(dbName, &pid, &db);
        while (1) {
            /* Invoke the callback for every active connection, then wait */
            mco_db_sniffer(db, sniffer_callback, MCO_SNIFFER_INSPECT_ACTIVE_CONNECTIONS);
            sleep(SNIFFER_INTERVAL);
        }
        /* Not reached while the loop above runs forever */
        mco_db_disconnect(db);
        THREAD_RETURN(0);
    }
Recovery actually consists of two stages. In the first stage, the dead connection is “grabbed”: each connection has private (process-specific) pointers, which must be adjusted so that they can be used in the context of the process performing recovery. In the second stage, internal functions are called to roll back any transactions that might have been in progress and to release the dead connection’s data structures. (Please see SDK sample 19_recovery_sniffer for an example.)
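The callback shown earlier treats any failure of kill(pid, 0) as a dead process. On POSIX systems, however, kill() can also fail with EPERM when the process exists but belongs to another user, so a stricter probe may be worth considering. The following is a minimal sketch of such a check and is not part of the eXtremeDB API: the names pid_is_alive() and probe_reaped_child() are hypothetical helpers introduced here purely for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a process with this id exists, else 0. */
int pid_is_alive(pid_t pid)
{
    if (pid <= 0) {
        /* kill() gives 0 and negative ids special (group/broadcast) meanings */
        return 0;
    }
    if (kill(pid, 0) == 0) {
        /* Signal 0 delivers nothing; only existence and permission are checked */
        return 1;
    }
    /* EPERM: the process exists but we may not signal it.
       ESRCH (or anything else): treat the process as gone. */
    return errno == EPERM;
}

/* Hypothetical demo: fork a child that exits at once, reap it, then probe it.
   Returns the probe result, or -1 if fork() fails. */
int probe_reaped_child(void)
{
    pid_t child = fork();
    if (child < 0) {
        return -1;
    }
    if (child == 0) {
        _exit(0); /* child: terminate immediately */
    }
    pid_t reaped = waitpid(child, NULL, 0); /* parent: remove the zombie entry */
    assert(reaped == child);
    (void)reaped;
    return pid_is_alive(child);
}
```

A process that has exited but has not yet been reaped by its parent (a zombie) still counts as existing for kill(), which is why probe_reaped_child() waits for the child before probing. In a sniffer callback, a result of 0 from such a helper would correspond to returning MCO_S_DEAD_CONNECTION.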
NVRAM database support and recovery
eXtremeDB allows C and C++ applications to reconnect to databases created in non-volatile memory (NVRAM, or battery-backed RAM) after a system restart or a similar event. The database can be created in either conventional or shared memory. If the database is corrupted, the eXtremeDB runtime attempts to recover it based on the content of the memory buffer specified in the call to mco_db_open_dev().
To reconnect to a database in NVRAM, the application passes the memory device to mco_db_open_dev() and sets the flag MCO_DB_OPEN_EXISTING in the mco_db_params_t.mode_mask field. For example:
    mco_db_params_t db_params;
    ...
    mco_db_params_init(&db_params);
    ...
    if (...) {
        db_params.mode_mask |= MCO_DB_OPEN_EXISTING;
    }
    ...
    rc = mco_db_open_dev(db_name... , &db_params);
The database runtime performs the necessary steps to ensure consistency of the database metadata and the database content. If mco_db_open_dev() returns MCO_S_OK, the application can proceed to connect to the database normally by calling mco_db_connect().
Note that database recovery can fail under certain conditions (such as application errors that corrupt the database runtime metadata). If recovery fails, mco_db_open_dev() returns an error code. (Please refer to the “Recovery from failed processes” section above for further discussion of eXtremeDB recovery procedures. Also refer to SDK sample 02-open_nvram.)