Introduction to eXtremeDB Transaction Logging

As explained in the eXtremeDB Transaction Logging User's Guide, transaction logging is an industry-standard term that normally refers to database log file processing. eXtremeDB Transaction Logging supports a set of tasks, only one of which is adding persistence to in-memory databases. It is mostly used for replaying the log to or from a certain point in time and for communicating database transactions to the outside environment (Data Relay). The industry-standard term for the latter is "Change Data Capture (CDC)".

eXtremeDB Transaction Logging thus extends eXtremeDB with three important capabilities:

1. Durability (and, hence, recoverability) for in-memory databases (i.e., transient classes)

2. Data Relay, which facilitates seamless, fine-grained data sharing between real-time systems based on eXtremeDB and external systems such as enterprise DBMSs

3. Extended event queue processing

eXtremeDB Transaction Logging increases the options for persistence of eXtremeDB in-memory databases with the introduction of transaction logging – a process that journals changes made to a database (by transactions), as they are made. With transaction logging enabled, the eXtremeDB runtime captures database changes and writes them to a file known as a transaction log. In the event of a hardware or software failure, the runtime can recover the database using this log.
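The recovery idea can be illustrated with a minimal, self-contained sketch. It is not the eXtremeDB recovery code: the record layout (rec_t), the tiny keyed store (store_t) and the replay() function are all hypothetical, and the sketch assumes recovery starts from an empty store and applies records in log order up to a chosen point in time.

```c
#include <stddef.h>

/* Hypothetical log record: one change to a small keyed integer store.
   This is NOT the eXtremeDB log format. */
typedef enum { OP_PUT, OP_DEL } op_t;

typedef struct {
    unsigned long tx_time; /* commit time of the owning transaction */
    op_t          op;
    int           key;
    int           value;   /* ignored for OP_DEL */
} rec_t;

enum { STORE_SIZE = 16 };

typedef struct {
    int present[STORE_SIZE];
    int value[STORE_SIZE];
} store_t;

/* Rebuild the store from scratch by applying records in log order,
   stopping at the first record committed after `until`.
   Returns the number of records applied. */
static size_t replay(store_t *s, const rec_t *recs, size_t n,
                     unsigned long until)
{
    size_t i, applied = 0;

    for (i = 0; i < STORE_SIZE; i++) {
        s->present[i] = 0;
        s->value[i] = 0;
    }
    for (i = 0; i < n && recs[i].tx_time <= until; i++) {
        int k = recs[i].key;
        if (k < 0 || k >= STORE_SIZE)
            continue;               /* skip records outside the toy store */
        if (recs[i].op == OP_PUT) {
            s->present[k] = 1;
            s->value[k] = recs[i].value;
        } else {
            s->present[k] = 0;
        }
        applied++;
    }
    return applied;
}
```

With three records for key 3 (put 10 at time 1, put 20 at time 2, delete at time 3), replaying up to time 2 leaves the value 20 in place, while replaying up to time 3 removes the key again.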

Transaction logging provides durability for eXtremeDB all-in-memory and hybrid (containing both transient and persistent classes) databases through the eXtremeDB Transaction Logging (TL) API. In a TL-enabled application, every insert, update or delete action of a transaction is recorded in in-memory buffers. When the transaction is committed, these buffers are appended to the transaction log. No records are added to the log if the transaction is read-only or aborted.
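The buffer-then-append behavior described above can be sketched as follows. This is a toy model under stated assumptions, not the TL implementation: tx_t, tx_log_t and every function here are hypothetical names, and a fixed-size character array stands in for the log file.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins, not eXtremeDB types. */
enum { TX_BUF_CAP = 256, LOG_CAP = 1024 };

typedef struct {
    char   buf[TX_BUF_CAP]; /* changes recorded while the transaction runs */
    size_t used;
    int    read_only;
} tx_t;

typedef struct {
    char   data[LOG_CAP];   /* stands in for the log file */
    size_t size;
    size_t appends;         /* number of append operations performed */
} tx_log_t;

static void tx_log_init(tx_log_t *lg) { lg->size = 0; lg->appends = 0; }

static void tx_begin(tx_t *tx, int read_only)
{
    tx->used = 0;
    tx->read_only = read_only;
}

/* Record one insert/update/delete in the transaction's buffer.
   Returns 0 on success, -1 if the transaction is read-only or full. */
static int tx_record(tx_t *tx, const char *change)
{
    size_t n = strlen(change);
    if (tx->read_only || tx->used + n > TX_BUF_CAP)
        return -1;
    memcpy(tx->buf + tx->used, change, n);
    tx->used += n;
    return 0;
}

/* Commit: append the buffered changes to the log in one operation.
   A transaction with no recorded changes leaves the log untouched. */
static int tx_commit(tx_t *tx, tx_log_t *lg)
{
    if (tx->used == 0)
        return 0;
    if (lg->size + tx->used > LOG_CAP)
        return -1;
    memcpy(lg->data + lg->size, tx->buf, tx->used);
    lg->size += tx->used;
    lg->appends++;
    tx->used = 0;
    return 0;
}

/* Abort: discard the buffered changes; the log is not touched. */
static void tx_abort(tx_t *tx) { tx->used = 0; }
```

In this model the log grows only at commit time, and each committing read-write transaction contributes exactly one append regardless of how many changes it buffered.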

For all-in-memory databases, transaction logging does not alter the in-memory architecture of eXtremeDB, which retains its performance advantage over persistent-media-based databases. Read performance is unaffected by transaction logging, and write performance will far exceed that of traditional persistent-media-based databases. The reasons are simple:

(a) eXtremeDB Transaction Logging requires exactly one write to the file system per database transaction,

(b) the write is sequential, and

(c) there is no overhead associated with maintaining a cache and/or cache lookups.

A persistent-media-based database, however, will perform many reads and writes per transaction (data pages, index pages, transaction log, etc.). The larger the transaction and the more indexes it modifies, the more reads and writes are necessary.

In C/C++ applications, the TL API logs all transactions committed to the database between calls to the functions mco_translog_start() and mco_translog_stop(). Thus, logging can be turned on and off at the application’s discretion. For maximum efficiency, the log file is written in blocks aligned with the file system’s pages. There is a single additional requirement for TL applications: the database schema must declare an auto_oid (see the section titled "DDL Requirements" in the TL Applications page).
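The start/stop window and the block-aligned writes can be modeled with a short sketch. It deliberately does not call mco_translog_start() or mco_translog_stop(); logger_t and the logger_* functions are hypothetical, and the sketch assumes that each commit is padded up to a whole number of blocks, which the text above does not state.

```c
#include <stddef.h>

/* Hypothetical logger state; the real state is internal to the runtime. */
typedef struct {
    int    active;         /* 1 between start and stop, 0 otherwise */
    size_t block_size;     /* assumed file-system page size in bytes */
    size_t bytes_on_disk;  /* always a multiple of block_size here */
    size_t commits_logged;
} logger_t;

/* Round len up to the next multiple of block (block must be > 0). */
static size_t align_up(size_t len, size_t block)
{
    return (len + block - 1) / block * block;
}

static void logger_init(logger_t *lg, size_t block_size)
{
    lg->active = 0;
    lg->block_size = block_size;
    lg->bytes_on_disk = 0;
    lg->commits_logged = 0;
}

static void logger_start(logger_t *lg) { lg->active = 1; }
static void logger_stop(logger_t *lg)  { lg->active = 0; }

/* Called at commit time with the size of the buffered changes.
   Commits outside the start/stop window, and commits with nothing
   to write, are ignored. Padding to whole blocks is an assumption. */
static void logger_on_commit(logger_t *lg, size_t payload)
{
    if (!lg->active || payload == 0)
        return;
    lg->bytes_on_disk += align_up(payload, lg->block_size);
    lg->commits_logged++;
}
```

With a 4096-byte block, a 100-byte commit occupies one block and a 4097-byte commit occupies two, while commits made before logger_start() or after logger_stop() leave the counters unchanged.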

The TL facility is indifferent to memory mode (conventional or shared) and transaction manager (MURSIW, Exclusive or MVCC). When transaction logging is used for Data Relay or event handling, it is also indifferent to whether the database is transient, persistent or hybrid.