orbtop-rtos Technical Documentation
Real-time RTOS Thread Profiling via ITM/DWT for ARM Cortex-M
Overview
orbtop-rtos is a real-time thread profiling tool that monitors RTOS thread execution on ARM Cortex-M targets using the ITM (Instrumentation Trace Macrocell) and DWT (Data Watchpoint and Trace) hardware debug features. It provides live CPU usage statistics without modifying the target firmware.
Architecture

    graph TD
      subgraph "Target MCU (STM32H7)"
        RTOS[RTOS Kernel<br/>RTX5]
        TCB[osRtxInfo.thread.run.curr]
        DWT[DWT Comparator 1]
        ITM[ITM]
        TPIU[TPIU/SWO]
        RTOS -->|writes| TCB
        TCB -->|monitored by| DWT
        DWT -->|HW event| ITM
        ITM -->|ITM packets| TPIU
      end
      subgraph "Debug Probe"
        STLINK[ST-Link<br/>or J-Link]
      end
      subgraph "Host PC"
        OpenOCD[OpenOCD<br/>telnet:4444<br/>ITM/SWO:46000]
        OrbtopRTOS[orbtop-rtos]
        subgraph "Output Formats"
          Console[Console Output]
          JSON[JSON File/UDP]
          FTrace[FTrace Text File]
        end
      end
      TPIU -->|SWO pin| STLINK
      STLINK -->|USB| OpenOCD
      OpenOCD -->|"tcp:46000<br/>(ITM stream)"| OrbtopRTOS
      OrbtopRTOS <-->|"telnet:4444<br/>(memory reads)"| OpenOCD
      OrbtopRTOS --> Console
      OrbtopRTOS --> JSON
      OrbtopRTOS --> FTrace

How It Works

    sequenceDiagram
      participant App as orbtop-rtos
      participant Telnet as OpenOCD Telnet
      participant Kernel as RTOS Kernel
      participant TCB as osRtxInfo.thread.run.curr
      participant DWT as DWT Hardware
      participant ITM as ITM Stream
      App->>Telnet: Find osRtxInfo symbol
      Telnet-->>App: Address 0x20001234
      App->>Telnet: rtos_dwt_config 0x20001248
      Note over Telnet,DWT: Configure DWT_COMP1 to watch<br/>address 0x20001248<br/>(osRtxInfo.thread.run.curr)
      Note over ITM: ITM constantly generates<br/>timestamp packets that<br/>App accumulates
      loop Thread Context Switch
        Note over Kernel: Context switch occurs
        Kernel->>TCB: Write new TCB address<br/>to monitored location
        TCB->>DWT: Memory write detected
        DWT->>ITM: Generate HW event (comp match)
        Note over ITM: HW packet includes:<br/>- Comparator number<br/>- Data value (new TCB addr)<br/>- TIMESTAMP in packet!
        ITM->>App: HW packet WITH timestamp
        App->>App: Use accumulated timestamp<br/>from ITM stream
        App->>Telnet: Read TCB at new address
        Telnet-->>App: Thread name, priority, func
        App->>App: Update thread statistics<br/>with timestamp
      end
      loop Every interval (1000ms)
        App->>App: Calculate CPU percentages
        App->>Console: Display thread statistics
      end

Key Components
1. ITM Configuration Requirements
The target must be configured with specific ITM settings:
- ITM_TCR.TSENA
- ITM_TCR.DWTENA
- ITM_TCR.SYNCENA
- ITM_TER
- DWT_CTRL.CYCCNTENA

2. DWT Configuration (via OpenOCD Telnet)
The rtos_dwt_config function in stm32h74x.cfg configures DWT Comparator 1:
3. RTX5 Thread Control Block Structure
The tool reads RTX5 TCB structures from target memory:
- id
- name
- priority
- thread_addr

4. Memory Reading via Telnet with Caching
The tool uses OpenOCD's telnet interface with an intelligent cache system:
This caching is critical because reading TCB fields (name, priority, function) for each thread switch would otherwise require 3+ telnet round-trips per switch.
Note: Cache is cleared ONLY when a NEW TCB is detected (not on every switch to an existing thread). This means thread properties are read once and cached indefinitely.
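
The read-once behaviour described above can be illustrated with a small sketch. This is not the tool's actual code: the `TcbCache` class, the `read_tcb` callback and the `(name, priority, func)` tuple layout are hypothetical names chosen for illustration, and the real telnet round-trips are replaced by a stand-in function.

```python
from typing import Callable, Dict, Tuple

ThreadInfo = Tuple[str, int, str]  # (name, priority, entry function): assumed layout


class TcbCache:
    """Caches per-thread fields so each TCB is read from the target only once."""

    def __init__(self, read_tcb: Callable[[int], ThreadInfo]):
        self._read_tcb = read_tcb      # hypothetical reader, e.g. backed by telnet
        self._entries: Dict[int, ThreadInfo] = {}
        self.reads = 0                 # number of real target reads performed

    def lookup(self, tcb_addr: int) -> ThreadInfo:
        entry = self._entries.get(tcb_addr)
        if entry is None:              # first time this TCB address is seen
            entry = self._read_tcb(tcb_addr)
            self.reads += 1
            self._entries[tcb_addr] = entry
        return entry                   # later switches are served from the cache


def fake_reader(addr: int) -> ThreadInfo:
    # Stand-in for the 3+ telnet reads needed per TCB.
    return (f"thread_{addr:#x}", 24, "entry")


cache = TcbCache(fake_reader)
for addr in (0x20001234, 0x20001456, 0x20001234, 0x20001234):
    cache.lookup(addr)
print(cache.reads)  # prints 2: one read per distinct TCB
```

Four context switches between two threads cost two target reads instead of four, which is the saving the note above refers to.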
Output Formats
When exceptions are enabled with the -E option, additional exception statistics are displayed:
Output Formats
Console Output (Default)
Features:
- [ITM OVERFLOW DETECTED!] or timing warnings

JSON Output Modes
RTOS Threads Output (-j output.json)

    {
      "threads": [
        { "tcb": "0x20001234", "name": "main", "func": "main_thread", "prio": 24, "time_ms": 451, "cpu": 45.123, "max": 48.567, "switches": 1234 },
        { "tcb": "0x20001456", "name": "sensor_task", "func": "sensor_loop", "prio": 40, "time_ms": 234, "cpu": 23.456, "max": 25.890, "switches": 567 },
        { "tcb": "0x20001000", "name": "idle", "func": "os_idle", "prio": 1, "time_ms": 214, "cpu": 21.400, "max": 22.100, "switches": 2345 }
      ],
      "interval_ms": 1000,
      "cpu_usage": 78.600,
      "cpu_max": 82.345,
      "cpu_freq": 480000000,
      "overflow": false
    }

Exceptions Output (when using -E)

    {
      "exceptions": [
        { "num": 15, "name": "SysTick", "count": 1000, "maxd": 1, "total": 1234567, "pct": 2.5, "ave": 1234, "min": 1000, "max": 2000, "maxwall": 2500 },
        { "num": 37, "name": "IRQ 21", "count": 500, "maxd": 2, "total": 567890, "pct": 1.2, "ave": 1135, "min": 900, "max": 1500, "maxwall": 1800 },
        { "num": 53, "name": "IRQ 37", "count": 250, "maxd": 1, "total": 234567, "pct": 0.5, "ave": 938, "min": 800, "max": 1200, "maxwall": 1400 }
      ],
      "timestamp": 1699123456789
    }

UDP Streaming (-j udp:46006)

    nc -lu 46006

Example UDP stream received:

    $ nc -lu 46006
    {"threads":[{"tcb":"0x20001234","name":"main","func":"main_thread","prio":24,"time_ms":451,"cpu":45.123,"max":48.567,"switches":1234},{"tcb":"0x20001456","name":"sensor_task","func":"sensor_loop","prio":40,"time_ms":234,"cpu":23.456,"max":25.890,"switches":567},{"tcb":"0x20001678","name":"network","func":"net_handler","prio":24,"time_ms":101,"cpu":10.123,"max":12.345,"switches":890},{"tcb":"0x20001000","name":"idle","func":"os_idle","prio":1,"time_ms":214,"cpu":21.400,"max":22.100,"switches":2345}],"interval_ms":1000,"cpu_usage":78.600,"cpu_max":82.345,"cpu_freq":480000000,"overflow":false}
    {"threads":[{"tcb":"0x20001234","name":"main","func":"main_thread","prio":24,"time_ms":502,"cpu":50.234,"max":50.234,"switches":1245},{"tcb":"0x20001456","name":"sensor_task","func":"sensor_loop","prio":40,"time_ms":198,"cpu":19.823,"max":25.890,"switches":578},{"tcb":"0x20001678","name":"network","func":"net_handler","prio":24,"time_ms":95,"cpu":9.500,"max":12.345,"switches":901},{"tcb":"0x20001000","name":"idle","func":"os_idle","prio":1,"time_ms":205,"cpu":20.500,"max":22.100,"switches":2389}],"interval_ms":1000,"cpu_usage":79.557,"cpu_max":82.345,"cpu_freq":480000000,"overflow":false}

When exceptions are enabled (-E), separate exception packets are also sent:

    {"ex":1,"num":15,"name":"SysTick","count":1000,"maxd":1,"total":1234567,"pct":2.5,"ave":1234,"min":1000,"max":2000,"maxwall":2500}
    {"ex":1,"num":37,"name":"IRQ 21","count":500,"maxd":2,"total":567890,"pct":1.2,"ave":1135,"min":900,"max":1500,"maxwall":1800}
    {"ex":1,"num":53,"name":"IRQ 37","count":250,"maxd":1,"total":234567,"pct":0.5,"ave":938,"min":800,"max":1200,"maxwall":1400}

Each packet arrives as a complete JSON object on a single line, making it easy to parse in real time.
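
Because every packet is one JSON object per line, a receiver only needs to split on newlines and inspect a couple of keys. The sketch below is one possible consumer, not part of orbtop-rtos: the function names `classify_packet` and `listen` are made up for this example, and the rule "a `threads` key means a thread report, `"ex": 1` means an exception packet" is inferred from the sample packets above.

```python
import json
import socket


def classify_packet(line: str):
    """Return (kind, payload) for one JSON line from the UDP stream."""
    obj = json.loads(line)
    if "threads" in obj:
        return ("threads", obj)
    if obj.get("ex") == 1:
        return ("exception", obj)
    return ("unknown", obj)


def listen(port: int = 46006):
    """Print a one-line summary per packet; runs until interrupted."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    while True:
        data, _ = sock.recvfrom(65535)
        for line in data.decode("utf-8").splitlines():
            if not line.strip():
                continue
            kind, obj = classify_packet(line)
            if kind == "threads":
                print(f"cpu_usage={obj['cpu_usage']} overflow={obj['overflow']}")
            elif kind == "exception":
                print(f"exception {obj['name']}: count={obj['count']}")


sample = '{"ex":1,"num":15,"name":"SysTick","count":1000}'
print(classify_packet(sample)[0])  # prints "exception"
```

Calling `listen()` while orbtop-rtos runs with `-j udp:46006` would take the place of the `nc -lu 46006` command shown earlier.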
FTrace Output (--ftrace trace.txt)
Generates Linux kernel ftrace text format for analysis with Eclipse TraceCompass:
Key features:
- thread_name|entry_function for easy identification

Visualization with Eclipse TraceCompass
- trace.txt file

Usage:
The FTrace output shows actual thread context switches in real-time, making it ideal for analyzing scheduling behavior, finding priority inversions, and understanding system timing.
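
To make the format concrete, here is a rough sketch of how one context switch could be rendered as an ftrace-style `sched_switch` line. The exact columns orbtop-rtos writes are not shown in this document, so the layout below follows the general Linux ftrace text convention and is an assumption, as are the helper names and the use of a small integer as the "pid". Only the `thread_name|entry_function` label comes from the description above.

```python
def thread_label(name: str, entry: str) -> str:
    """Build the 'thread_name|entry_function' label described above."""
    return f"{name}|{entry}"


def sched_switch_line(ts_s: float, prev: str, prev_pid: int, prev_prio: int,
                      nxt: str, next_pid: int, next_prio: int, cpu: int = 0) -> str:
    """Format one context switch in an ftrace-like text layout (assumed)."""
    return (
        f"{prev}-{prev_pid} [{cpu:03d}] {ts_s:.6f}: sched_switch: "
        f"prev_comm={prev} prev_pid={prev_pid} prev_prio={prev_prio} prev_state=R ==> "
        f"next_comm={nxt} next_pid={next_pid} next_prio={next_prio}"
    )


line = sched_switch_line(
    1.000250,
    thread_label("main", "main_thread"), 1, 24,
    thread_label("sensor_task", "sensor_loop"), 2, 40,
)
print(line)
```

A viewer such as TraceCompass keys on the `sched_switch` event name and the `prev_*`/`next_*` fields, so those are the parts worth comparing against the real trace.txt.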
Usage Examples
Prerequisites: OpenOCD Configuration
Use the provided stm32h74x.cfg from the project. Key parts for ITM/DWT configuration:
Start OpenOCD with the provided cfg file:
THAT'S IT! The cfg file automatically does EVERYTHING:
From the cfg file:
THAT'S ALL! OpenOCD is ALREADY serving the ITM stream on port 46000!
The data flow is simply:
No manual telnet commands needed for ITM! Everything is automatic when you start OpenOCD with the cfg.
BUT THE MAGIC IS: The cfg file defines helper functions that orbtop-rtos WILL USE via telnet:
When orbtop-rtos starts, it:
- Calls rtos_dwt_config 0xXXXXXXXX via telnet to configure DWT Comparator 1

So the cfg provides both:
Basic RTOS Monitoring
Real-world Example
JSON UDP Output (No Console)
With Exception Tracking

    orbtop-rtos \
      -s localhost:46000 \
      -p ITM \
      -e firmware.elf \
      -T rtxv5 \
      -W 4444 \
      -F 480000000 \
      -E  # Enable exception statistics

FTrace Output for TraceCompass
Thread Switch Detection and Processing
When a new TCB is detected via DWT:
Adding Support for Other RTOS
To add FreeRTOS or other RTOS support, implement the rtosOps interface:
Example FreeRTOS implementation would:
- pxCurrentTCB symbol instead of osRtxInfo
- pxCurrentTCB address

Implementation Flow

    flowchart TD
      Start([orbtop-rtos start])
      Start --> LoadELF[Load ELF symbols]
      LoadELF --> DetectRTOS{Detect RTOS type}
      DetectRTOS -->|RTX5 found| FindSymbol[Find osRtxInfo symbol]
      DetectRTOS -->|Not found| Error[Exit: RTOS not supported]
      FindSymbol --> CalcAddr[Calculate thread.run.curr address]
      CalcAddr --> ConnectTelnet[Connect to OpenOCD telnet]
      ConnectTelnet --> ConfigDWT[Call rtos_dwt_config via telnet]
      ConfigDWT --> ConnectITM[Connect to ITM stream]
      ConnectITM --> MainLoop{Process ITM packets}
      MainLoop --> PacketType{Packet type?}
      PacketType -->|HW Event| ReadTCB[Read TCB via telnet]
      PacketType -->|Timestamp| UpdateTime[Update timestamp]
      ReadTCB --> UpdateStats[Update thread statistics]
      UpdateStats --> CheckInterval{Interval complete?}
      UpdateTime --> CheckInterval
      CheckInterval -->|No| MainLoop
      CheckInterval -->|Yes| Output[Generate output]
      Output --> ResetCounters[Reset interval counters]
      ResetCounters --> MainLoop

Technical Details
DWT Comparator Configuration
The DWT comparator monitors writes to osRtxInfo.thread.run.curr:
ITM Timestamp Handling
ITM timestamps are incremental, not absolute. The tool accumulates them to track real time:
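
The accumulation idea can be sketched as follows. This is an illustration under stated assumptions rather than the tool's implementation: the `TimestampAccumulator` class is a made-up name, each timestamp packet is assumed to carry a tick delta since the previous one, and the prescaler is assumed to be the ITM timestamp prescale factor (1 in this example).

```python
class TimestampAccumulator:
    """Turns incremental ITM timestamp deltas into a running time base."""

    def __init__(self, cpu_freq_hz: int, prescaler: int = 1):
        self.cpu_freq_hz = cpu_freq_hz
        self.prescaler = prescaler     # assumed ITM timestamp prescale factor
        self.ticks = 0                 # running total of timestamp ticks

    def add_delta(self, delta_ticks: int) -> int:
        """Add the delta carried by one timestamp packet; return the total."""
        self.ticks += delta_ticks
        return self.ticks

    def elapsed_ms(self) -> float:
        """Convert accumulated ticks to milliseconds of target time."""
        return self.ticks * self.prescaler * 1000.0 / self.cpu_freq_hz


acc = TimestampAccumulator(cpu_freq_hz=480_000_000)
for delta in (120_000, 240_000, 120_000):
    acc.add_delta(delta)
print(acc.elapsed_ms())  # prints 1.0 (480,000 cycles at 480 MHz)
```

Since only deltas are transmitted, any lost packet permanently shortens the accumulated total, which is why ITM overflow distorts the statistics.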
The ITM generates timestamp packets:
With proper prescaler settings:
Thread Statistics Calculation
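
As a rough illustration of how per-thread percentages and the overall CPU load can be derived from accumulated run time, consider the sketch below. It is based on the fields visible in the JSON output (`time_ms`, `cpu`, `cpu_usage`, `interval_ms`), not on the tool's source. In particular, treating `cpu_usage` as 100% minus the idle thread's share, and identifying the idle thread by the name "idle", are assumptions.

```python
from typing import Dict, Tuple


def cpu_percentages(time_by_thread: Dict[str, float], interval: float,
                    idle_name: str = "idle") -> Tuple[Dict[str, float], float]:
    """Return (per-thread CPU %, overall CPU usage %) for one interval.

    time_by_thread and interval must use the same unit (ticks or ms).
    """
    if interval <= 0:
        return {}, 0.0
    pct = {name: 100.0 * t / interval for name, t in time_by_thread.items()}
    usage = 100.0 - pct.get(idle_name, 0.0)   # assumed definition of cpu_usage
    return pct, usage


# Run times in ms over a 1000 ms interval, taken from the UDP sample above.
pct, usage = cpu_percentages(
    {"main": 451, "sensor_task": 234, "network": 101, "idle": 214},
    interval=1000,
)
for name, value in pct.items():
    print(f"{name:12s} {value:6.2f}%")
print(f"cpu_usage    {usage:6.2f}%")
```

If the per-thread percentages do not sum to roughly 100%, some run time was not attributed to any thread, which is one symptom of lost ITM packets.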
ITM Overflow and Its Impact on CPU Measurements
The Problem
When ITM overflow occurs, packets are LOST, including:
Since ITM timestamps are incremental (not absolute), losing packets means:
How It Shows in Output
Or when total doesn't add up to ~100%:
Why This Happens
Solutions
Reduce ITM traffic:
- Run without exception tracking (-E)
- Use a longer reporting interval (-I 2000 for 2 seconds)

Increase SWO bandwidth (in cfg file):

    $_CHIPNAME.swo configure -protocol uart -traceclk 480000000 -pin-freq 4000000

Monitor overflow counter: Watch the Ovf counter in output

IMPORTANT: When overflow occurs, CPU usage percentages are UNRELIABLE! The tool shows warnings but continues running with incorrect data.
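
A back-of-the-envelope estimate helps judge whether a given SWO pin frequency can keep up. The sketch below assumes UART-mode SWO with 10 bits on the wire per byte (start bit, 8 data bits, stop bit) and about 10 bytes of trace per context switch (a 5-byte data trace packet plus a timestamp packet of up to 5 bytes). Both figures are assumptions for illustration. Real traffic also includes sync packets and, with -E, exception trace packets.

```python
def swo_bytes_per_second(pin_freq_hz: int, bits_per_byte: int = 10) -> float:
    """Payload throughput of a UART-mode SWO pin (assumed 8N1 framing)."""
    return pin_freq_hz / bits_per_byte


def max_switch_rate(pin_freq_hz: int, bytes_per_switch: int = 10) -> float:
    """Upper bound on context switches per second before the link saturates."""
    return swo_bytes_per_second(pin_freq_hz) / bytes_per_switch


pin_freq = 4_000_000  # -pin-freq value from the cfg line above
print(swo_bytes_per_second(pin_freq))  # prints 400000.0
print(max_switch_rate(pin_freq))       # prints 40000.0
```

If the per-thread switches counters in the output approach this bound, overflow is likely and raising -pin-freq or reducing traffic is worth trying.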
Troubleshooting
Performance Considerations
References