Skip to content

fix(shutdown): Prevent race condition when GlobalObject destruction routine unlocks global mutex #8652

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions src/common/classes/RefCounted.h
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,11 @@ namespace Firebird
return refCnt;
}

AtomicCounter::counter_type getRefCount() const noexcept
{
return m_refCnt.value();
}

void assertNonZero()
{
fb_assert(m_refCnt.value() > 0);
Expand Down
46 changes: 40 additions & 6 deletions src/jrd/Database.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -586,12 +586,39 @@ namespace Jrd
{
MutexLockGuard guard(g_mutex, FB_FUNCTION);

Database::GlobalObjectHolder::DbId* entry = g_hashTable->lookup(id);
// Get entry with incrementing ref counter, so if someone is currently destroying it, the object itself
// will remain alive.
RefPtr<Database::GlobalObjectHolder::DbId> entry(g_hashTable->lookup(id));
if (entry)
{
auto& shutdownMutex = entry->shutdownMutex;
// Check if someone else currently destroying GlobalObject.
if (shutdownMutex.tryEnter(FB_FUNCTION))
{
// No one is destroying GlobalObject, continue init routine.
shutdownMutex.leave();
}
else
{
// Someone is currently destroying GlobalObject, wait until he finish it to eliminate potential
// race conditions.
{
MutexUnlockGuard unlockGuard(g_mutex, FB_FUNCTION);

MutexLockGuard guard(shutdownMutex, FB_FUNCTION);
}
// Now we are the one who owned DbId object.
// It also was removed from hash table, so simply delete it and recreate it next.
fb_assert(entry->getRefCount() == 1);
Copy link
Member

@hvlad hvlad Jul 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are many concurrent initializers, then this assert could be violated.
I think it is not needed.
Instead, you may nullify entry->holder at ~GlobalObjectHolder() and check it for nullptr here.

entry = nullptr;
}
}
if (!entry)
{
const auto holder = FB_NEW Database::GlobalObjectHolder(id, filename, config);
entry = FB_NEW Database::GlobalObjectHolder::DbId(id, holder);
entry = makeRef(FB_NEW Database::GlobalObjectHolder::DbId(id, holder));
g_hashTable->add(entry);
entry->addRef();
}

entry->holder->addRef();
Expand All @@ -601,9 +628,15 @@ namespace Jrd
Database::GlobalObjectHolder::~GlobalObjectHolder()
{
// dtor is executed under g_mutex protection
Database::GlobalObjectHolder::DbId* entry = g_hashTable->lookup(m_id);
if (!g_hashTable->remove(m_id))
fb_assert(false);

// Stole the object from the hash table without incrementing ref counter, so we will be the one who will delete the object
// at the end of this function.
RefPtr<Database::GlobalObjectHolder::DbId> entry(REF_NO_INCR, g_hashTable->lookup(m_id));
fb_assert(entry);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add also fb_assert(entry->holder == this) ?

// We need to unlock the global mutex to safely shutdown some managers, so lock shutdown mutex to make sure that
// other threads will wait until we done our shutdown routine.
// This is done to eliminate potential race conditions involving global objects, such as shared memory.
MutexLockGuard guard(entry->shutdownMutex, FB_FUNCTION);

{ // scope
// here we cleanup what should not be globally protected
Expand All @@ -616,7 +649,8 @@ namespace Jrd
m_eventMgr = nullptr;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On enter g_mutex should be locked.
At line 639 the shutdownMutex is locked.
So we have order of mutexes: g_mutex then shutdownMutex.

At line 643 g_mutex is unlocked, while shutdownMutex still locked.
At line 646 g_mutex will be locked again and we have inverted order of mutexes: shutdownMutex then g_mutex.
This is a way for deadlock.

Am I wrong ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I can see every instance of DbId is linked to a specific GlobalObjectHolder, so shutdownMutex is a different object every time the ~GlobalObjectHolder() is called. Therefore, a deadlock can only occur if the desctructor will be called twice on the same GlobalObjectHolder, which is not possible, so we are safe here.
shutdownMutex can be locked by another thread in GlobalObjectHolder::init, but only when g_mutex is unlocked. Therefore, a deadlock is also not possible here too.

m_replMgr = nullptr;

delete entry;
if (!g_hashTable->remove(m_id))
fb_assert(false);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add entry->holder = nullptr ?

fb_assert(m_tempCacheUsage == 0);
}
Expand Down
4 changes: 3 additions & 1 deletion src/jrd/Database.h
Original file line number Diff line number Diff line change
Expand Up @@ -256,7 +256,7 @@ class Database : public pool_alloc<type_dbb>
typedef Firebird::HashTable<DbId, Firebird::DEFAULT_HASH_SIZE,
Firebird::string, DbId, DbId > DbIdHash;

struct DbId : public DbIdHash::Entry, public Firebird::GlobalStorage
struct DbId : public DbIdHash::Entry, public Firebird::GlobalStorage, public Firebird::RefCounted
{
DbId(const Firebird::string& x, GlobalObjectHolder* h)
: id(getPool(), x), holder(h)
Expand Down Expand Up @@ -286,6 +286,8 @@ class Database : public pool_alloc<type_dbb>

const Firebird::string id;
GlobalObjectHolder* const holder;

Firebird::Mutex shutdownMutex;
};

static Firebird::GlobalPtr<DbIdHash> g_hashTable;
Expand Down
Loading