AP_CANManager: fixed critical race in log_text()

the AP_CANManager::log_text() gets called from debug logging in
AP_DroneCAN. It is a method on a common AP_CANManager object which is
shared by multiple AP_DroneCAN threads.

if two threads call the debug log messages at the same time then we
can end up with _log_pos greater than LOG_BUFFER_SIZE (1024) and
overwrite past the end of the buffer

in the crash_dump we have for this case the next piece of memory was
hal.can[0], and the overwrite of the buffer had corrupted the
MessageRam_ structurre in the ChibiOS CAN interface code. That led to
a hardfault on receive of a CAN message

Note that this issue only happens if CAN_LOGLEVEL is set to greater
than zero, and the default is zero. So users can avoid the bug by
checking they have not changed CAN_LOGLEVEL.

Also, this is likely an issue that only happens on startup, as once
the two AP_DroneCAN threads are fully running they have the same
thread priority so can't pre-empt each other. During startup some
messages are sent from the main thread which has a different priority
to the AP_DroneCAN threads, and can thus trigger this issue
This commit is contained in:
Andrew Tridgell 2024-12-14 13:22:40 +11:00 committed by Peter Barker
parent eedb8ae617
commit 322b7520de
1 changed files with 1 additions and 0 deletions

View File

@ -385,6 +385,7 @@ void AP_CANManager::log_text(AP_CANManager::LogLevel loglevel, const char *tag,
if (loglevel > _loglevel) { if (loglevel > _loglevel) {
return; return;
} }
WITH_SEMAPHORE(_sem);
if ((LOG_BUFFER_SIZE - _log_pos) < (10 + strlen(tag) + strlen(fmt))) { if ((LOG_BUFFER_SIZE - _log_pos) < (10 + strlen(tag) + strlen(fmt))) {
// reset log pos // reset log pos