Server watchdog crash caused by ChunkHandler calling getBlockData() from Netty IO thread #39

Description

@imcalledfyre

The server hard-crashes (the watchdog kills it after a 60-second freeze) because of a deadlock between a Netty IO thread and the main server thread. ChunkHandler.handleBlockChange() calls CraftBlock.getBlockData(), via CustomBlockListener.getExtraDataForSingleBlock(), from inside a Netty pipeline write. That forces a synchronous chunk load (ServerChunkCache.syncLoad) on the IO thread. At the same time, the server thread is blocked waiting on Netty to finish a connection handshake (Connection.syncAfterConfigurationChange). Neither thread can proceed, so the server locks up entirely.


Environment

  • Server software: Purpur 1.21.11-2568-f57bd86 (MC: 1.21.11)
  • TuffXPlus version: 1.0.0 Stable release
  • Java: 21.0.11
  • Hardware: 4-thread CPU, 6GB RAM, Pterodactyl Docker Container
  • Other relevant plugins: GrimAC 2.3.73, ProtocolLib 5.4.0, TAB 6.0.3, ViaVersion 5.9.1, ViaBackwards 5.9.1

Steps to reproduce

Difficult to reproduce on demand, but it consistently occurs under these conditions:

  1. Server is running with active players in multiple worlds
  2. A player initiates a login/join (triggering the ServerboundLoginAcknowledgedPacket handshake on the server thread)
  3. Simultaneously, TuffXPlus intercepts an outgoing packet on a Netty IO thread and calls getExtraDataForSingleBlock() to handle a block change

The two threads deadlock and the server stays frozen until the watchdog terminates it.


What I expected

The plugin should handle block data lookups asynchronously or cache the data beforehand, not block the Netty IO thread on a synchronous main-thread operation.


What actually happened

The server froze completely for ~60 seconds. No players could interact. The watchdog thread eventually printed a full thread dump and forcibly stopped the server. All players were disconnected and any unsaved data was lost.


Thread dump (relevant excerpts)

The deadlock involves exactly two threads:

Server thread — waiting on Netty to complete a protocol change (player joining):

Current Thread: Server thread
  State: WAITING
  Stack:
    java.lang.Object.wait0(Native Method)
    io.netty.util.concurrent.DefaultPromise.awaitUninterruptibly
    io.netty.channel.DefaultChannelPromise.awaitUninterruptibly
    net.minecraft.network.Connection.syncAfterConfigurationChange(Connection.java:306)
    net.minecraft.network.Connection.setupOutboundProtocol(Connection.java:346)
    net.minecraft.server.network.ServerLoginPacketListenerImpl.handleLoginAcknowledgement(ServerLoginPacketListenerImpl.java:424)

Netty Epoll IO #0 — blocking on a synchronous chunk load initiated from within a TuffXPlus packet handler:

Current Thread: Netty Epoll IO #0
  State: WAITING
  Stack:
    java.util.concurrent.CompletableFuture.join(CompletableFuture.java:2117)
    net.minecraft.server.level.ServerChunkCache.syncLoad(ServerChunkCache.java:124)
    net.minecraft.server.level.ServerChunkCache.getChunkFallback(ServerChunkCache.java:154)
    net.minecraft.world.level.Level.getBlockState(Level.java:1347)
    org.bukkit.craftbukkit.block.CraftBlock.getBlockData(CraftBlock.java:169)
    tf.tuff.viablocks.CustomBlockListener.getExtraDataForSingleBlock(CustomBlockListener.java:343)  <-- HERE
    tf.tuff.netty.ChunkHandler.handleBlockChange(ChunkHandler.java:146)
    tf.tuff.netty.ChunkHandler.write(ChunkHandler.java:80)
    [... Netty pipeline write handlers ...]

The IO thread is stuck waiting for the server thread to process the chunk load, but the server thread is stuck waiting for the IO thread to finish the protocol handshake. Classic deadlock.
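The cyclic wait described above can be modeled without any server code. The following is a minimal sketch using only the JDK: two single-threaded executors stand in for the server thread and the IO thread, and each one blocks on work queued behind the other. Every name here (CyclicWaitSketch, cycleForms, timedOut) is hypothetical, and the timeout exists only so the demo terminates; the real waits in the dump are untimed, which is why the server never recovers on its own.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical model of the cyclic wait; not TuffXPlus, Netty, or server code. */
final class CyclicWaitSketch {

    /** Returns true if at least one side had to time out, i.e. the two waits formed a cycle. */
    static boolean cycleForms(long timeoutMillis) {
        ExecutorService server = Executors.newSingleThreadExecutor(); // stands in for "Server thread"
        ExecutorService io = Executors.newSingleThreadExecutor();     // stands in for "Netty Epoll IO #0"
        CountDownLatch ioInsideWrite = new CountDownLatch(1);
        CountDownLatch serverInsideLogin = new CountDownLatch(1);
        try {
            // IO side: in the middle of a write, asks the server thread for a chunk and blocks.
            Future<Boolean> ioSide = io.submit(() -> {
                ioInsideWrite.countDown();
                serverInsideLogin.await();
                Future<?> chunkLoad = server.submit(() -> { });
                return timedOut(chunkLoad, timeoutMillis);
            });
            // Server side: in the middle of a login, asks the IO thread to finish and blocks.
            Future<Boolean> serverSide = server.submit(() -> {
                serverInsideLogin.countDown();
                ioInsideWrite.await();
                Future<?> protocolChange = io.submit(() -> { });
                return timedOut(protocolChange, timeoutMillis);
            });
            // Whichever side gives up first unblocks the other, so only one timeout is guaranteed.
            return ioSide.get() | serverSide.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            server.shutdownNow();
            io.shutdownNow();
        }
    }

    /** True if the future did not complete within the given time. */
    private static boolean timedOut(Future<?> future, long millis) throws Exception {
        try {
            future.get(millis, TimeUnit.MILLISECONDS);
            return false;
        } catch (TimeoutException e) {
            return true;
        }
    }
}
```

Calling CyclicWaitSketch.cycleForms(200) should return true: neither queued task can start until the thread it is queued on gives up its own wait.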


Root cause

CustomBlockListener.getExtraDataForSingleBlock() calls CraftBlock.getBlockData(), which under certain conditions falls through to ServerChunkCache.syncLoad(). This is a blocking call that must be completed by the server thread. Calling it from a Netty IO thread is inherently unsafe: it deadlocks whenever the server thread is itself blocked waiting on that IO thread, which happens routinely during player logins.
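One way to make this class of bug visible is a thread check in front of any lookup that may block. The sketch below is an assumption-laden illustration using only the JDK: OwnerThreadGuardSketch and callIfOnOwner are hypothetical names, and the "owner" thread stands in for the server thread. In a Bukkit plugin the equivalent check would be Bukkit.isPrimaryThread().

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical guard: runs a possibly blocking lookup only on the thread that owns the data. */
final class OwnerThreadGuardSketch {
    private final Thread owner; // stands in for the server thread

    OwnerThreadGuardSketch(Thread owner) {
        this.owner = owner;
    }

    /** Returns the lookup result on the owner thread, or empty (without blocking) anywhere else. */
    <T> Optional<T> callIfOnOwner(Supplier<T> possiblyBlockingLookup) {
        if (Thread.currentThread() != owner) {
            return Optional.empty(); // caller must fall back, e.g. pass the packet through unchanged
        }
        return Optional.ofNullable(possiblyBlockingLookup.get());
    }
}
```

A handler running on an IO thread would then get an empty result instead of blocking, and could pass the packet through unmodified.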


Suggested fix

getExtraDataForSingleBlock (and any similar block state lookups in ChunkHandler) should not be called from the Netty pipeline. Options:

  • Cache block data when chunks are loaded/updated, and read from the cache in the Netty handler (no thread crossing needed)
  • Use getChunkIfLoaded() and skip or fall back gracefully if the chunk isn't available on the IO thread, rather than forcing a sync load
  • Move the lookup off the Netty thread entirely — e.g. schedule it on the main thread before the packet write stage

yes this was generated by my pet clanker claude

Labels

bug, duplicate