Virtual Open Systems Scientific Publications
23rd IEEE FRUCT Conference (FRUCT23), Bologna, Italy.
ISO-26262 virtualization, functional safety, mixed-criticality, ARM TrustZone, VOSYSmonitor.
This work has been supported by the H2020 dReDBox project; it has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No. 687632. This work reflects only the authors’ view and the European Commission is not responsible for any use that may be made of the information it contains.
With the emergence of multicore embedded Systems on Chip (SoCs), the integration of several applications with different levels of criticality on the same platform is becoming increasingly popular. These platforms, known as mixed-criticality systems, must meet numerous requirements (e.g. real-time constraints, scheduling of multiple Operating Systems (OS), and temporal and spatial isolation).
In this context, Virtual Open Systems has developed VOSYSmonitor, a thin software layer that allows the co-execution of safety-critical and non-critical applications on a single ARM-based multi-core SoC. This software element has been developed according to the ISO 26262 standard. One of the key aspects of this standard is the control of random and systematic failures, including those induced by faulty or aging hardware.
For a software component, the means of detecting hardware anomalies are limited and depend on choices made by the manufacturer (e.g. implementation of Dual-Core Lockstep (DCLS)). However, software can still check for some of these failures, either by reading the configuration registers of a peripheral or by checking the sanity of a memory region. The purpose of this paper is to showcase how a safety-related software element (e.g. VOSYSmonitor) can detect and recover from failures while ensuring that the safety-related goals are still reached.
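The two software-side checks mentioned above (reading back peripheral configuration registers and checking the sanity of a memory region) can be illustrated with a minimal C sketch. This is not VOSYSmonitor code: the names `reg_check_t`, `check_registers`, `crc32_region` and `region_is_sane` are hypothetical, and the choice of CRC-32 as the memory sanity check is an assumption made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "golden" configuration entry: where the register lives,
 * which bits matter, and the value those bits are expected to hold. */
typedef struct {
    const volatile uint32_t *reg;
    uint32_t mask;
    uint32_t expected;
} reg_check_t;

/* Reads back every register in the table and returns how many of them
 * differ from their expected (masked) value. */
static size_t check_registers(const reg_check_t *table, size_t count)
{
    size_t mismatches = 0;

    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if ((*table[i].reg & table[i].mask) != table[i].expected) {
            mismatches++;
        }
    }
    return mismatches;
}

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320, initial value and
 * final XOR of 0xFFFFFFFF), computed over a memory region. */
static uint32_t crc32_region(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* If the low bit is set, shift and apply the polynomial;
             * otherwise just shift. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Returns 1 if the region still matches the reference checksum that was
 * recorded when the region was known to be good, 0 otherwise. */
static int region_is_sane(const uint8_t *data, size_t len,
                          uint32_t reference_crc)
{
    return crc32_region(data, len) == reference_crc;
}
```

In such a scheme, the reference checksum and the expected register values would be recorded while the system is known to be in a good state, and a non-zero mismatch count or a failed `region_is_sane` call would be the trigger for a recovery or safe-state transition.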
In mixed-criticality domains, functional safety has become a topic of high importance. Functional safety generally means that malfunctions of the operating system, which contains mission-critical tasks, that could lead to any kind of threat or even an accident have to be avoided or mitigated. Therefore, it is fundamental in the field of functional safety to identify and understand the potential risks and failure causes of a system. Ideally, if all potential failure causes are known and their consequences understood, countermeasures can be defined. Failures are then detected before a hazardous event occurs, and the safe state is entered through the appropriate functional safety reaction.
The safe states can vary significantly depending on the final application and on the injuries that a system failure could cause in the absence of countermeasures. Since every application is different, with its own particularities and therefore its own potential failure causes and related safe states, functional safety analysis is a challenging task.
In this context, many functional safety standards have been established to define the main requirements to fulfill during the development of critical systems in order to ensure a high level of reliability. The main functional safety standard is IEC/EN 61508, which defines the basis for functional safety developments for E/E/PE (electrical, electronic or programmable electronic) applications. In addition, IEC/EN 61508 is complemented by industry-sector-specific standards, such as ISO 26262 (Road vehicles, Functional safety), which has been specifically defined for the automotive domain (see Section II-A).
The automotive industry is rapidly evolving towards the connected autonomous vehicle, which will considerably increase hardware/software complexity. Functional safety will be a topic of high importance, since critical features (e.g., autonomous driving) will be controlled by electronic components. Accordingly, ISO 26262 defines a functional safety lifecycle covering each automotive product development phase, ranging from hazard analysis and risk assessment to design, implementation, integration, verification, validation and production release.
In this context, Virtual Open Systems has developed VOSYSmonitor, a hypervisor based on ARM TrustZone that enables the consolidation of mixed-critical Operating Systems (e.g., Linux-KVM along with an RTOS) on a single ARM-based platform, with special attention to safety and security. This software technology has been developed as a Safety Element out of Context (SEooC) in compliance with the ASIL-C requirements of the ISO 26262 standard, and it ensures freedom from interference for the safety-critical partition.
As a matter of fact, VOSYSmonitor is well suited to support a modern generation of car virtual cockpits, where the In-Vehicle Infotainment (IVI) system and the Instrument Digital Cluster are consolidated and interact on a single platform. Traditional gauges and lamps are replaced by digital screens, offering opportunities for new functions and interactivity. Vehicle information, entertainment, navigation, camera/video and device connectivity are being combined into displays. However, these different kinds of information do not have the same level of criticality, and the consolidation of mixed-critical applications represents a real challenge that must respect the stringent requirements of the ISO 26262 functional safety standard.
Since VOSYSmonitor is only a software component, the paper details the definition of safety functionalities by applying ISO 26262-6, Product development at the software level. After a summary of the ISO 26262 standard and the different technologies involved in Section II, we present related work and emphasize the advantages and drawbacks of existing solutions compared to our design in Section III. Then, a more detailed presentation of the safety features of VOSYSmonitor is given in Section IV. These features are divided into two parts: detection and recovery mechanisms. The performance of these mechanisms is evaluated in Section V by measuring the latency between a fault detection and the entry into the mitigation state. Finally, Section VI summarizes the findings of this work and gives directions for future work.