Shared memory databases on Asymmetric Multiprocessing systems

Modern SoC platforms often include heterogeneous remote processor devices in asymmetric multiprocessing (AMP) configurations. Two widely used examples are NXP®'s i.MX 8 QuadMax and i.MX 8 QuadPlus series, which include varying counts of ARM® Cortex®-A and Cortex-M CPU cores. Another popular alternative is STMicroelectronics' STM32MP157F, which combines a dual-core ARM® Cortex®-A7 with a 32-bit Cortex®-M4 RISC core.

The heterogeneous multicore architecture allows the offloading of critical hard real-time tasks to the Cortex-M processors for extremely low latency processing, while using the Cortex-A cores for high-performance tasks.

Different CPUs usually run instances of different operating systems, e.g., Linux on the A cores and a real-time OS (FreeRTOS, MICROSAR, etc.) on the M core(s).

The shared memory regions integrated with the AMP hardware are directly accessible from every cluster, which allows communication and message passing between the “clusters” (a cluster is a group of cores capable of independent instruction execution and running a separate operating system). Traditionally, shared memory was used for communication and message passing only. However, given the complexity and critical nature of applications that use the i.MX 8 hardware, and the amount of external memory available, real-time data can also be shared between the Cortex-A and Cortex-M cores.

A storage subsystem in the AMP systems can be organised in two ways:

  1. "Client-Server": the database is maintained within the scope of a process running on a single cluster and accessed by this process exclusively. The data is then available to any other processes running within the same cluster or a different cluster through "messages" passed to the "server" process through the shared memory.
  2. "Shared memory database": the storage repository is kept in the device’s shared memory and made directly accessible from all clusters’ processes/applications. We find this attractive for data-driven applications for several reasons: performance is higher with direct access to storage, and the storage is better utilized because the entire DDR memory can be used as a storage device. Managing database access directly from multiple clusters also gives multiple threads greater flexibility, because contention resolution can be fine-tuned. In addition, the absence of a single point of failure (the server process) is essential for data-driven safety-critical systems, which often make use of AMP hardware.
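
The difference between the two organisations can be illustrated with a minimal, single-process sketch. It is not eXtremeDB code: the names mailbox_t, shared_db_t, server_poll, read_via_server and read_direct are hypothetical, and a plain function call stands in for the server running on another cluster. In the client-server variant only the mailbox would live in shared memory, while in the shared-database variant the records themselves do.

```c
#include <assert.h>
#include <stdint.h>

#define N_RECORDS 4

/* Client-server: the records are private to the server process;
 * only the mailbox would be placed in shared memory. */
static int32_t server_records[N_RECORDS] = {10, 20, 30, 40};

typedef struct {
    int32_t key;      /* request: which record to read */
    int32_t value;    /* reply: contents of that record */
    int     pending;  /* 1 while a request is waiting for the server */
} mailbox_t;

/* Server side: answer a pending request, if there is one. */
void server_poll(mailbox_t *mb)
{
    if (mb->pending) {
        mb->value = server_records[mb->key];
        mb->pending = 0;
    }
}

/* Client side: every read costs a request/reply round trip. */
int32_t read_via_server(mailbox_t *mb, int32_t key)
{
    assert(key >= 0 && key < N_RECORDS);
    mb->key = key;
    mb->pending = 1;
    server_poll(mb);  /* stands in for the server on another cluster */
    return mb->value;
}

/* Shared memory database: the records live in the shared region
 * and any cluster reads them in place. */
typedef struct {
    int32_t records[N_RECORDS];
} shared_db_t;

int32_t read_direct(const shared_db_t *db, int32_t key)
{
    assert(key >= 0 && key < N_RECORDS);
    return db->records[key];
}
```

The sketch leaves out synchronization entirely. With direct access, every cluster that touches the records must coordinate with the others, whereas in the client-server variant the server serialises all requests but also becomes the single point of failure.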

eXtremeDB/rt provides the implementation of the AMP shared memory database approach which has the following benefits:

How to use AMP shared database with eXtremeDB/rt API

AMP shared memory database storage is described by an MCO_MEMORY_AMP memory device and the corresponding element of the mco_device_t::dev union:

    struct {
        unsigned int flags;
        char *address;
        char *hint;
    } amp;
where

OS-specific components

Unlike SMP systems, AMP clusters are controlled by different operating systems. Therefore, from the standpoint of sharing data across clusters, the AMP system can be viewed as a distributed network in which "nodes" communicate with each other via shared memory. It follows that a mechanism independent of each node’s operating system is needed to synchronize access to the metadata structures and, consequently, to the shared storage. We call this mechanism a Distributed Semaphore. The distributed semaphore is the component that synchronizes tasks across the RTOS running on the Cortex-M cores and Linux running on the Cortex-A cores.
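
To make the idea of OS-independent synchronization concrete, here is a minimal sketch of a lock word kept next to the metadata it guards. It is not the eXtremeDB distributed semaphore. It uses a C11 atomic_flag spinlock as a stand-in, and it assumes that atomic operations on the shared region are coherent across all participants, which may not hold between Cortex-A and Cortex-M clusters on real hardware. The names shared_meta_t and meta_* are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical metadata block placed in the shared region. */
typedef struct {
    atomic_flag locked;   /* lock word, visible to every participant */
    int         counter;  /* example of metadata guarded by the lock */
} shared_meta_t;

void meta_init(shared_meta_t *m)
{
    assert(m != NULL);
    atomic_flag_clear(&m->locked);
    m->counter = 0;
}

/* Returns true if the caller acquired the lock, false if it is held. */
bool meta_try_lock(shared_meta_t *m)
{
    return !atomic_flag_test_and_set_explicit(&m->locked,
                                              memory_order_acquire);
}

/* Busy-wait until the lock is acquired; no OS primitive is involved. */
void meta_lock(shared_meta_t *m)
{
    while (!meta_try_lock(m)) {
        /* spin */
    }
}

void meta_unlock(shared_meta_t *m)
{
    atomic_flag_clear_explicit(&m->locked, memory_order_release);
}

/* Update the guarded metadata under the lock; returns the new value. */
int meta_increment(shared_meta_t *m)
{
    meta_lock(m);
    int v = ++m->counter;
    meta_unlock(m);
    return v;
}
```

Because the lock state lives in the shared region and the code calls no operating-system service, the same source would compile for Linux and for an RTOS. A production mechanism additionally has to handle blocking, fairness and recovery after a node fails, none of which this sketch attempts.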

On Linux, the distributed semaphore is implemented as a kernel module. To build it, change the working directory to target/sal/sync/rpmsg_sem/, adjust the makefile so that it points to the kernel sources and the cross-compiler, and run "make". Then load the module with "insmod mco_rpmsg_sem.ko".

On FreeRTOS, the distributed semaphore is implemented with the rpmsg_lite library, which should be linked into the application.

Sample

See samples/amp