From 322b7520de6175b63e3f651f8b8ba95751b39dca Mon Sep 17 00:00:00 2001 From: Andrew Tridgell Date: Sat, 14 Dec 2024 13:22:40 +1100 Subject: [PATCH] AP_CANManager: fixed critical race in log_text() the AP_CANManager::log_text() gets called from debug logging in AP_DroneCAN. It is a method on a common AP_CANManager object which is shared by multiple AP_DroneCAN threads. if two threads call the debug log messages at the same time then we can end up with _log_pos greater than LOG_BUFFER_SIZE (1024) and overwrite past the end of the buffer in the crash_dump we have for this case the next piece of memory was hal.can[0], and the overwrite of the buffer had corrupted the MessageRam_ structurre in the ChibiOS CAN interface code. That led to a hardfault on receive of a CAN message Note that this issue only happens if CAN_LOGLEVEL is set to greater than zero, and the default is zero. So users can avoid the bug by checking they have not changed CAN_LOGLEVEL. Also, this is likely an issue that only happens on startup, as once the two AP_DroneCAN threads are fully running they have the same thread priority so can't pre-empt each other. During startup some messages are sent from the main thread which has a different priority to the AP_DroneCAN threads, and can thus trigger this issue --- libraries/AP_CANManager/AP_CANManager.cpp | 1 + 1 file changed, 1 insertion(+) diff --git a/libraries/AP_CANManager/AP_CANManager.cpp b/libraries/AP_CANManager/AP_CANManager.cpp index 5a1caaf9dd..a39d4d1f6b 100644 --- a/libraries/AP_CANManager/AP_CANManager.cpp +++ b/libraries/AP_CANManager/AP_CANManager.cpp @@ -385,6 +385,7 @@ void AP_CANManager::log_text(AP_CANManager::LogLevel loglevel, const char *tag, if (loglevel > _loglevel) { return; } + WITH_SEMAPHORE(_sem); if ((LOG_BUFFER_SIZE - _log_pos) < (10 + strlen(tag) + strlen(fmt))) { // reset log pos