Five Polling Loops and Zero Communication: An Architecture Archaeology
I asked Claude why my Discord bot's status updates felt slow. What I discovered: five independent polling loops that don't talk to each other.
The Problem That Felt Simple
The bot checks player count every 30 seconds. Join the Discord server, glance at the bot, and you'll see something like "Playing: 3 players online." Except sometimes it lies. A player leaves, and for up to 30 seconds, the bot still shows them as online.
This matters more than I initially admitted. A friend checks Discord to see if I'm playing before launching the game. They see "1 player online," boot up Minecraft, connect, and find an empty server because I'd logged off 25 seconds ago. Minor? Sure. But it's the kind of friction that makes a system feel unreliable.
I figured I'd just add an event listener. Flag goes up, bot reacts instantly. Easy, right?
What Claude Found Instead
Some background: my Minecraft server runs on two components. An EC2 instance hosts the actual game server plus a Discord bot that handles gameplay features: chat bridging, player notifications, status display. A separate Lambda function manages infrastructure: starting and stopping the EC2 instance on demand, tracking costs. They're different bots because they serve different purposes and have different availability requirements.
After crawling through my codebase, Claude mapped out what I'd actually built. I expected one or two polling loops. Instead:
- Status updater: every 30 seconds (shows player count in Discord)
- Performance monitor: every 5 minutes (alerts if TPS drops)
- Chat watcher: every 2 seconds (bridges Minecraft chat to Discord)
- Player activity check: every 5 minutes (triggers auto-shutdown after idle)
- Lambda startup monitor: every 15 seconds (waits for Minecraft to accept connections)
Five different polling loops. Five different timing intervals. Some running on EC2, some on Lambda, none talking to each other.
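For a sense of the spread, here's a condensed sketch of those timers as discord.py tasks. The function names are mine for illustration, not the actual names in the codebase, and the Lambda loop is shown as a plain wait function:

```python
import time
from discord.ext import tasks

@tasks.loop(seconds=30)
async def status_updater():       # EC2 bot: refresh player count in presence
    ...

@tasks.loop(minutes=5)
async def performance_monitor():  # EC2 bot: alert if TPS drops
    ...

@tasks.loop(seconds=2)
async def chat_watcher():         # EC2 bot: tail logs, bridge chat to Discord
    ...

@tasks.loop(minutes=5)
async def activity_check():       # EC2 bot: idle detection for auto-shutdown
    ...

def lambda_startup_monitor(server_is_ready) -> None:
    """Lambda side: not discord.py at all, just a 15-second wait loop."""
    while not server_is_ready():
        time.sleep(15)
```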
Technical Debt Doesn't Announce Itself
It accumulates in 30-second intervals and 5-minute cron jobs until someone asks "why is this slow?" and discovers you've built a machine that constantly asks "are we there yet?" instead of waiting to be told.
The architecture made sense when I built each piece individually. The status updater was my first feature; 30 seconds seemed responsive enough. The performance monitor came later when I wanted TPS alerts; 5 minutes avoided spam. The chat watcher needed to feel real-time, so 2 seconds. Each decision was reasonable in isolation.
Together? Accidental complexity.
The Hidden Costs
Claude identified four significant limitations, but one stands out: the Lambda bot can't see Minecraft state, only EC2 instance state. It knows the machine is running but not whether Minecraft actually started.
Picture this: you're at work and want to check if the server is ready for your lunch break session. You ask the Discord bot to start the server. Lambda spins up the EC2 instance and tells you "Server starting!" Then... silence. The Lambda startup monitor polls every 15 seconds, but it's checking whether the EC2 instance is running, not whether Minecraft has finished loading. Meanwhile, the EC2 bot isn't running yet because Minecraft hasn't started. You're left refreshing Discord, wondering if it worked.
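In code, the gap looks something like this. A minimal sketch of the two checks, where the first mirrors what the Lambda monitor does today and the second is what it would actually need (identifiers are illustrative):

```python
import socket

import boto3

ec2 = boto3.client("ec2")

def instance_is_running(instance_id: str) -> bool:
    """What the Lambda monitor checks: EC2 machine state."""
    resp = ec2.describe_instances(InstanceIds=[instance_id])
    state = resp["Reservations"][0]["Instances"][0]["State"]["Name"]
    return state == "running"  # true long before Minecraft finishes loading

def minecraft_is_ready(host: str, port: int = 25565) -> bool:
    """What it would need to check: is the game port accepting connections?"""
    try:
        with socket.create_connection((host, port), timeout=2):
            return True
    except OSError:
        return False
```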
The other issues compound this:
- No shared state between Lambda and EC2 bots. Each maintains its own view of reality, so they sometimes disagree about what's happening.
- No notifications for auto-shutdown. The server quietly stops after 60 minutes of inactivity, but Discord users only find out when they try to connect.
- Redundant checks. Both bots independently verify server status. Same data, different timing, different results.
The worst part? I already had webhook infrastructure for some features. The backup script sends Discord notifications when backups complete. The whitelist system uses webhooks for approval flows. The pattern existed; I just never connected the dots.
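That pattern is nothing exotic: a Discord webhook is just an HTTP POST with a JSON body. A minimal version of what the backup script already does (the webhook URL is a placeholder):

```python
import json
import urllib.request

WEBHOOK_URL = "https://discord.com/api/webhooks/..."  # placeholder

def send_discord_notification(message: str) -> None:
    """POST a message to a Discord webhook, the same pattern the backup
    script already uses for completion notices."""
    payload = json.dumps({"content": message}).encode("utf-8")
    req = urllib.request.Request(
        WEBHOOK_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req):
        pass  # Discord returns 204 No Content on success
```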
What Event-Driven Would Actually Look Like
The alternative isn't magic; it's just inverting the responsibility. Instead of the bot asking "what's the player count?" every 30 seconds, the server announces "player joined" when it happens.
```python
from discord.ext import tasks

# Instead of polling...
@tasks.loop(seconds=30)
async def update_status():
    players = await rcon_client.get_online_players()
    await update_presence(players['count'])

# ...parse server logs for state changes
def on_log_line(line):
    if "joined the game" in line:
        webhook_notify("player_joined", parse_player(line))
    elif "left the game" in line:
        webhook_notify("player_left", parse_player(line))
```
Here's the kicker: my chat watcher already reads the server logs at 2-second intervals to bridge messages to Discord. It sees join/leave events scroll past and ignores them. The infrastructure for instant status updates exists. I just never wired it up.
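Wiring it up would be a small addition to the loop that already exists. A sketch, assuming the chat watcher hands each new log line to a function like this (the regexes follow the vanilla server log format; on_player_event is a hypothetical dispatch point):

```python
import re

# Vanilla server logs announce these events, e.g.:
#   [12:34:56] [Server thread/INFO]: Steve joined the game
JOIN_RE = re.compile(r"\]: (\w+) joined the game")
LEAVE_RE = re.compile(r"\]: (\w+) left the game")

def handle_log_line(line: str) -> None:
    """Called by the existing 2-second chat watcher for every new line."""
    if match := JOIN_RE.search(line):
        on_player_event("joined", match.group(1))
    elif match := LEAVE_RE.search(line):
        on_player_event("left", match.group(1))

def on_player_event(event: str, player: str) -> None:
    # Hypothetical: in the real bot this would update the Discord
    # presence immediately instead of waiting for the 30-second loop.
    print(f"{player} {event} the game")
```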
Why I'm Not Fixing It Yet
Here's my honest assessment: polling works. Users rarely notice a 30-second delay in bot status. The system isn't broken; it's just inelegant.
The real question is opportunity cost. Refactoring five polling loops into an event-driven system means touching every bot feature, updating both Lambda and EC2 code, and testing all the edge cases. That's a weekend project. I could spend that weekend adding features users actually request, like better backup management or mod support.
So I'm being strategic. The changes worth making now:
- Auto-shutdown notifications (users actually complain about this: moderate effort, clear value; sketched below the list)
- A "server ready" event instead of polling RCON during startup (removes the "did it work?" uncertainty)
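Both fit the webhook pattern shown earlier. A sketch of the auto-shutdown notification, where stop_server() is a hypothetical stand-in for whatever the activity check calls today:

```python
def notify_then_shutdown(idle_minutes: int = 60) -> None:
    """Announce the idle shutdown before it happens instead of letting
    users discover it when they try to connect."""
    send_discord_notification(
        f"No players for {idle_minutes} minutes; shutting the server down. "
        "Ask the bot to start it again when you're ready to play."
    )
    stop_server()  # hypothetical: the existing stop routine
```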
The full consolidation waits until I add a feature that forces it. I'm considering a "server starting" notification so players know when to connect. Building that on top of polling means adding a sixth timer. Building it event-driven means I finally have to fix the architecture.
Architecture Archaeology
AI assistants are remarkably good at this kind of excavation. I knew I had polling in multiple places. I didn't know I had five polling loops with overlapping concerns and zero communication between them.
Sometimes the most valuable analysis isn't "here's how to fix it"; it's "here's what you actually built." That map of accidental complexity, the list of what each component can and can't see, the inventory of patterns I'd already established but never connected: that's the insight worth preserving.
The bot still polls. But now I have an honest map of the debt and a clear trigger for when to pay it down.