Digital archive of theses discussed at the University of Pisa


Thesis etd-02012022-171237

Thesis type
Master's thesis
Thesis title
Thesis by Implementation: Improving Performance of Real-time (automotive) Applications on Multicore Platforms using the Logical Execution(LET) model and Inter-Core DMA data Transfer for task communications
Course of study
Supervisor Prof. Di Natale, Marco
Co-supervisor Prof. Biondi, Alessandro
Keywords
  • LET
  • inter-core communications
  • multicore task communications
  • logical execution time
  • DMA
  • direct memory access
  • Aurix Tricore
Graduation session start date
Release date
Abstract: By providing parallel computational units, multicore platforms offer a significant opportunity to improve the performance and predictability of safety-critical real-time embedded applications, such as those in the automotive domain. In shared-memory multicore systems, both the communication architecture that connects the cores and the software that handles communication between application tasks play a major role in the overall real-time performance of the system.
To promote high timing predictability, the Infineon AURIX TriCore family of microcontrollers provides large scratchpad memories and a crossbar interconnect. The crossbar was introduced to reduce inter-core interference when accessing the memory system and peripherals, although it does not prevent contention between requests from different cores to the same target resource (for example, shared memory). The AURIX TC27x/TC29x microcontrollers also provide multiple DMA channels (64, 128, or 256, depending on the MCU version) and two DMA move engines that can run at the maximum operating frequency of the CPU cores, so that two DMA channel service requests can be served in parallel. A channel can transfer a block or a linked list of (shared) data with a single trigger from the CPU.
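The idea of moving a linked list of data blocks with a single trigger can be illustrated with a small software model. This is only a sketch: the `dma_desc` structure and the `dma_trigger` function are hypothetical names introduced here, they do not reflect the AURIX DMA register layout or any vendor driver, and the copy is done with `memcpy` on the calling CPU rather than by a hardware move engine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical software model of one linked-list transfer descriptor.
   This is NOT the AURIX DMA register layout. */
typedef struct dma_desc {
    const void *src;              /* source address, e.g. core-local memory */
    void *dst;                    /* destination address, e.g. shared memory */
    size_t bytes;                 /* size of this block in bytes */
    const struct dma_desc *next;  /* next descriptor, or NULL to stop */
} dma_desc;

/* Models a single trigger: walks the whole chain and copies every block.
   Returns the total number of bytes moved. On real hardware a move engine
   would do this work while the CPU continues with other code. */
size_t dma_trigger(const dma_desc *head)
{
    size_t total = 0;
    for (const dma_desc *d = head; d != NULL; d = d->next) {
        assert(d->src != NULL && d->dst != NULL);
        memcpy(d->dst, d->src, d->bytes);
        total += d->bytes;
    }
    return total;
}
```

In this model, a caller would build one descriptor per shared variable, link them through `next`, and call `dma_trigger` once on the first descriptor to move all of them.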
The ERIKA RTOS, which implements the AUTOSAR OS and OSEK/VDX API specifications, supports multicore communication, for instance through a remote procedure call (RPC) API for inter-core task communication. The Infineon AURIX architecture fully supports the AUTOSAR multicore features.
The Logical Execution Time (LET) model has been a reference communication design model for improving the predictability and correctness of time-critical multicore applications.
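In the LET model, a task's inputs are sampled at the start of its logical interval and its outputs become visible only at the end of that interval, regardless of when the computation actually finishes. The following minimal sketch illustrates that rule for a single shared value. The `let_label` type and the three functions are hypothetical names chosen for illustration, and plain assignments stand in for the data copies that the thesis performs by DMA.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical LET-managed shared value ("label"). */
typedef struct {
    int published;  /* value currently visible to consumers */
    int pending;    /* producer's result, not yet visible */
} let_label;

/* Start of a consumer's LET interval: sample the visible value. */
int let_read(const let_label *l)
{
    assert(l != NULL);
    return l->published;
}

/* Any time inside the producer's LET interval: keep the result private. */
void let_compute_done(let_label *l, int value)
{
    assert(l != NULL);
    l->pending = value;
}

/* End of the producer's LET interval: make the result visible. */
void let_publish(let_label *l)
{
    assert(l != NULL);
    l->published = l->pending;
}
```

Because consumers only ever see `published`, the value they read does not depend on how early or late the producer finishes within its interval, which is the source of the predictability the model aims for.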
This thesis combines the AURIX TriCore architectural features with the ERIKA OS multicore API to provide an implementation of the LET communication model based on DMA transfers of shared data. The goal of the implementation is to improve the performance of real-time applications by maximizing shared-data transfer throughput and by improving processor utilization while data is being transferred by DMA.
This thesis provides a detailed description of both the proposed implementation methodology and its realization on an AURIX TriCore MCU using the ERIKA OS. An implementation of a sample automotive real-time application is provided. [A performance test was carried out with a Lauterbach debugger and tracer.]

The aim of this thesis is to improve the performance of real-time applications running on multicore platforms, specifically on the Infineon AURIX TriCore TC27x/TC297 family, by using DMA data transfers for inter-core task communication. Global data shared between tasks allocated to different cores is stored in a shared memory accessible from all cores, and the DMA transfers data between the shared memory and the core-local memories. In the listed AURIX TriCore architectures, the DMA controller can operate at up to the highest frequency at which the CPU cores can operate. This means that the time the DMA needs to transfer shared data (memory access time plus bus access time) can be equal to, or even less than, the time the CPUs would need to transfer the same data with load operations. The performance benefit is therefore that the CPU cores can execute other important operations while the shared data is being transferred by DMA. Moreover, DMA channel interrupts provide more efficient inter-core synchronization than busy-waiting in a spin loop on a shared pointer.
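The utilization argument can be made concrete with a rough cycle count. The sketch below compares the CPU cycles consumed when the CPU copies the data itself with the cycles consumed when it only triggers a DMA channel and later handles the completion interrupt. All function names and all cost parameters are assumptions made for illustration; they are not measurements from the thesis or figures from the AURIX documentation.

```c
#include <assert.h>

/* CPU cycles spent when the CPU copies the data itself with load/store
   instructions. cycles_per_word is an assumed average cost per word. */
unsigned long cpu_copy_busy_cycles(unsigned long words,
                                   unsigned long cycles_per_word)
{
    assert(cycles_per_word > 0);
    return words * cycles_per_word;
}

/* CPU cycles spent when a DMA channel does the copy: only the trigger
   and the completion interrupt handler occupy the CPU. */
unsigned long dma_copy_busy_cycles(unsigned long trigger_cycles,
                                   unsigned long isr_cycles)
{
    return trigger_cycles + isr_cycles;
}

/* CPU cycles freed for other work by using DMA; returns 0 when the DMA
   overhead is larger than the cost of the CPU copy. */
unsigned long cycles_freed(unsigned long words,
                           unsigned long cycles_per_word,
                           unsigned long trigger_cycles,
                           unsigned long isr_cycles)
{
    unsigned long cpu = cpu_copy_busy_cycles(words, cycles_per_word);
    unsigned long dma = dma_copy_busy_cycles(trigger_cycles, isr_cycles);
    return cpu > dma ? cpu - dma : 0;
}
```

With the assumed values of 256 words at 4 cycles per word, a 20-cycle trigger, and a 60-cycle interrupt handler, `cycles_freed(256, 4, 20, 60)` yields 944 cycles that the CPU could spend on other tasks. Under the same assumptions, a transfer of only 8 words frees nothing, since the fixed trigger and interrupt overhead exceeds the cost of the copy.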
For the implementation, the AURIX TriCore provides the required architectural features: shared memory, core-local memories, multi-channel DMA (64 and 128 channels) with the DMA linked-list feature, two DMA engines, and communication buses. The ERIKA RTOS provides the task and inter-core communication features and primitives. An implementation is provided on the AURIX TriCore TC277.
The timing of the inter-core communications (the timing of shared-data read/write operations and of the synchronization between cores for access to the shared memory) is derived from the LET (Logical Execution Time) implementation paradigm provided by Prof. Marco Di Natale and Prof. Alessandro Biondi.