Create Kernel_Self_Monitoring.md #44

Open

igor-stoppa wants to merge 1 commit into elisa-tech:main from igor-stoppa:Kernel_Self_Monitoring

Collaborator

igor-stoppa commented Oct 7, 2024

Requirements for a Kernel Safety Monitor

As discussed during the OSEP call of Week 40, let's try to have this as a collaborative effort, to provide the ARCH WG with OSEP-reviewed requirements.


          Create Kernel_Self_Monitoring.md

1de216e

Signed-off-by: Igor Stoppa <[email protected]>

dweingaertner reviewed

Contributions/Kernel_Self_Monitoring.md

+                - An external HW watchdog.
+                - An external core, running safety-qualified code (e.g. a safety island)
+                - An external execution context running on the same system (e.g. an hypervisor)
+              - The failure analysis must consider that the meta-data used by the self-monitor is exposed to interference as well:

Collaborator

dweingaertner Oct 8, 2024

I understand that you mean the Kernel meta-data here. Perhaps explicitly stating that the meta-data that is to be monitored is the kernel internal data structures, as opposed to userspace data?
I kind of miss the information of what the self monitor is supposed to monitor. Can we place a requirement on that?

Collaborator Author

igor-stoppa Jan 23, 2025

The kernel doesn't even have access - by design - to userspace data of any kind, shape, or form.
It would be a security violation.

Contributions/Kernel_Self_Monitoring.md

+              ## **Structure of the document**
+              - The document presents a brief overview of what safety problems can be found while using Linux, and how they can be mitigated through self-monitoring.
+              - The next sections, instead, discuss more in detail typical pitfalls that must be avoided, to make the monitoring useful and what goals should be targeted.

Collaborator

dweingaertner Oct 8, 2024

remove ", instead,"

Contributions/Kernel_Self_Monitoring.md

		This is where the concept of kernel self monitoring comes into the picture.


		## The concept - Safety-oriented self monitoring

Collaborator

dweingaertner Oct 8, 2024

I do not think the title fits the content. The content could be a continuation of the previous section, as you are still introducing the subject. With this title I would expect you to explain the concept, which is not the case.

paolonig reviewed

Contributions/Kernel_Self_Monitoring.md

+              - Watchdogs, to detect abnormal delays either in processing events or performing actions.
+              - Output vetting, to detect abnormal actuator control signals produced by the system.
+              These external safety measures, though, are not always sufficient, because they might have limited ability to observe the internal status of the kernel, or it might be desirable to exert a tighter control over the evolution of its internal states.

Contributor

paolonig Oct 10, 2024

In the statement above I would add "especially in scenarios where the Kernel is used to enable and manage complex safety workloads"

paolonig reviewed

Contributions/Kernel_Self_Monitoring.md

		This is where the concept of kernel self monitoring comes into the picture.


		## The concept - Safety-oriented self monitoring

Contributor

paolonig Oct 10, 2024

While I agree on the technical content, I think that in this doc we are missing the theoretical principles, that BTW are quite simple:

1. The Self Monitoring must be developed or qualified with a systematic capability level that is equal to or higher than the ASIL or SIL level associated with the safety claim that it supports.
2. The scope of the monitored part of the Kernel and the scope of the self monitoring part shall be clearly identified, including any part of the design or code that is common to the two.
3. A safety analysis on the monitored code shall define the dangerous failure modes originating from it that would violate the target safety claim.
4. Assuming the lack of interference from the monitored code and from any other code running in the Kernel, the safety monitor shall be verified to be effective against a subset of the dangerous failure modes mentioned in 3). Any residual dangerous failure mode shall be covered by additional mitigations (that are out of the scope of this document).
5. A comprehensive analysis of the possible interference failure modes originating from the monitored code or any other code running in the Kernel shall be carried out. With respect to such analysis, the dangerous interference failure modes (leading to a violation of the target safety claim) shall be identified and mitigated through adequate measures.
   For example, adequate measures could range from Assumptions of Use or proper configurations that would lead to the avoidance of such failure modes, through designing and implementing additional detection mechanisms, up to the qualification of the code generating interference to a level of systematic capability that is equal to or higher than the ASIL or SIL allocated to the target safety claim.

Note: When designing the safety monitors and any additional mitigation measure, availability, security and performance requirements shall also be considered.
E.g. a measure that makes the overall safety function unavailable is perfectly safe, but it would be useless.
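
To make two of these principles concrete (the capability-level requirement on the monitor, and the handling of dangerous failure modes that the monitor does not cover), here is a minimal sketch. It is only an illustration under assumptions: the `Monitor` type, the function names, and the failure-mode labels are all hypothetical and come neither from the document under review nor from any standard; only the ordering QM < A < B < C < D is taken from the ASIL scheme.

```python
from dataclasses import dataclass

# ASIL levels from lowest to highest; used only to compare levels.
ASIL_ORDER = ["QM", "A", "B", "C", "D"]


@dataclass(frozen=True)
class Monitor:
    """Hypothetical description of a self-monitor (illustration only)."""
    name: str
    asil: str             # systematic capability level of the monitor
    covered: frozenset    # dangerous failure modes it is verified against


def capability_ok(monitor: Monitor, claim_asil: str) -> bool:
    """The monitor's level must be equal to or higher than the claim's."""
    return ASIL_ORDER.index(monitor.asil) >= ASIL_ORDER.index(claim_asil)


def residual_failure_modes(dangerous: set, monitor: Monitor) -> set:
    """Dangerous failure modes left uncovered; these need other mitigations."""
    return set(dangerous) - set(monitor.covered)


# Made-up failure-mode labels, standing in for the output of a safety analysis.
dangerous = {"corrupted_runqueue", "stale_page_table_entry", "lost_timer_interrupt"}
monitor = Monitor(
    name="hypothetical_monitor",
    asil="B",
    covered=frozenset({"corrupted_runqueue", "stale_page_table_entry"}),
)

print(capability_ok(monitor, "B"))                 # monitor level matches the claim
print(capability_ok(monitor, "D"))                 # monitor level is too low for ASIL D
print(residual_failure_modes(dangerous, monitor))  # modes needing additional mitigation
```

The sketch deliberately leaves out the interference analysis, since that depends on the actual kernel configuration and cannot be reduced to a set difference.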
