Five Polling Loops and Zero Communication: An Architecture Archaeology
My Discord bot updates its presence status to show how many players are on my Minecraft server. Simple feature, simple implementation. Or so I thought, until I asked Claude about making it faster and discovered I'd accidentally built five independent timers all doing versions of the same work.
The Problem That Felt Simple
The bot checks player count every 30 seconds. Join the Discord server, glance at the bot, and you'll see something like "Playing: 3 players online." Except sometimes it lies. A player leaves, and for up to 30 seconds, the bot still shows them as online.
This matters more than I initially admitted. A friend checks Discord to see if I'm playing before launching the game. They see "1 player online," boot up Minecraft, connect, and find an empty server because I'd logged off 25 seconds ago. Minor? Sure. But it's the kind of friction that makes a system feel unreliable.
I figured I'd just add an event listener. Flag goes up, bot reacts instantly. Easy, right?
What Claude Found Instead
After crawling through my codebase, Claude mapped out what I'd actually built. I expected one or two polling loops. Instead:
- Status updater: every 30 seconds
- Performance monitor: every 5 minutes
- Chat watcher: every 2 seconds
- Player activity check: every 5 minutes
- Lambda startup monitor: every 15 seconds (during boot only)
Five different polling loops. Five different timing intervals. Some running on EC2, some on Lambda, none talking to each other.
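For a sense of scale, here is roughly what those timers look like as discord.py tasks. This is a sketch with illustrative names; in reality the loops are split between EC2 and Lambda rather than sitting in one file:

from discord.ext import tasks

# Five timers, five intervals, no shared results: each loop
# re-queries its own sources on its own schedule.

@tasks.loop(seconds=30)
async def update_status(): ...          # presence text

@tasks.loop(minutes=5)
async def monitor_performance(): ...    # TPS alerts

@tasks.loop(seconds=2)
async def watch_chat(): ...             # chat relay

@tasks.loop(minutes=5)
async def check_player_activity(): ...  # idle tracking for auto-shutdown

@tasks.loop(seconds=15)
async def monitor_startup(): ...        # boot only, cancelled once ready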
Technical debt doesn't announce itself. It accumulates in 30-second intervals and 5-minute cron jobs until someone asks "why is this slow?" and discovers you've built a machine that constantly asks "are we there yet?" instead of waiting to be told.
The architecture made sense when I built each piece individually. The status updater was my first feature; 30 seconds seemed responsive enough. The performance monitor came later when I wanted TPS alerts; 5 minutes avoided spam. The chat watcher needed to feel real-time, so 2 seconds. Each decision was reasonable in isolation.
Together? Accidental complexity.
The Hidden Costs
Claude identified four significant limitations:
- The Lambda bot can't see Minecraft state, only EC2 instance state. It knows the machine is running but not whether Minecraft actually started (see the sketch after this list).
- No shared state between the Lambda and EC2 bots. Each maintains its own view of reality.
- No notifications for auto-shutdown. The server quietly stops after 60 minutes of inactivity, but Discord users only find out when they try to connect.
- Redundant checks. Both bots independently verify server status. Same data, different timing, different results.
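The first limitation is easy to make concrete: "the EC2 instance is running" and "Minecraft is accepting connections" are different facts, and the Lambda bot can only observe the first. A sketch of the gap, assuming boto3 and a plain socket probe (both helper names are hypothetical):

import socket
import boto3

ec2 = boto3.client("ec2")

def instance_is_running(instance_id: str) -> bool:
    # What the Lambda bot can see: EC2 reports the machine's state.
    resp = ec2.describe_instances(InstanceIds=[instance_id])
    return resp["Reservations"][0]["Instances"][0]["State"]["Name"] == "running"

def minecraft_is_up(host: str, port: int = 25565) -> bool:
    # What it can't see without a direct probe: whether the server
    # process is actually listening yet. "running" can precede this
    # by a minute or more while Minecraft loads the world.
    try:
        with socket.create_connection((host, port), timeout=2):
            return True
    except OSError:
        return False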
The worst part? I already had webhook infrastructure for some features. The backup script sends Discord notifications when backups complete. The whitelist system uses webhooks for approval flows. The pattern existed; I just never connected the dots.
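Connecting them would be small. A minimal sketch of a shutdown notice using a standard Discord webhook (the URL is a placeholder, and notify_discord is a hypothetical helper, not my existing code):

import requests

WEBHOOK_URL = "https://discord.com/api/webhooks/<id>/<token>"  # placeholder

def notify_discord(message: str) -> None:
    # Discord webhooks take a plain JSON payload; no bot token required.
    requests.post(WEBHOOK_URL, json={"content": message}, timeout=5)

# Called from the auto-shutdown script:
# notify_discord("Server idle for 60 minutes, shutting down.")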
What Event-Driven Would Actually Look Like
The alternative isn't magic; it's just inverting the responsibility:
from discord.ext import tasks

# Instead of polling...
@tasks.loop(seconds=30)
async def update_status():
    players = await rcon_client.get_online_players()
    await update_presence(players['count'])

# ...push state changes as they happen
def on_player_join(player_name):
    webhook_notify("player_joined", player_name)

def on_player_leave(player_name):
    webhook_notify("player_left", player_name)
Minecraft server logs already emit join/leave events. The chat watcher reads them at 2-second intervals. I could trigger status updates from those same events instead of running a separate polling loop.
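A sketch of that bridge, assuming a vanilla server writing to logs/latest.log (the regexes match stock join/leave lines, and tail_events is a hypothetical helper):

import re
import time

# Stock log lines look like:
# [12:03:45] [Server thread/INFO]: Steve joined the game
JOIN = re.compile(r"\]: (\w+) joined the game")
LEAVE = re.compile(r"\]: (\w+) left the game")

def tail_events(log_path: str = "logs/latest.log"):
    """Follow the log and yield (event, player) tuples as they appear."""
    with open(log_path) as f:
        f.seek(0, 2)  # start at the end of the file
        while True:
            line = f.readline()
            if not line:
                time.sleep(0.5)
                continue
            if m := JOIN.search(line):
                yield ("player_joined", m.group(1))
            elif m := LEAVE.search(line):
                yield ("player_left", m.group(1))

# One reader fans out to every consumer:
# for event, player in tail_events():
#     webhook_notify(event, player)

This still wakes up to read the file, but the point is fan-out: one reader feeds presence updates, notifications, and activity tracking, instead of five timers each asking their own questions.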
Prioritizing the Fix
Here's my honest assessment: polling works. Users rarely notice a 30-second delay in bot status. The system isn't broken, just inelegant.
But Claude's analysis gave me a clear prioritization framework:
High value, low effort:
- Add webhook notifications for auto-shutdown (users actually complain about this)
- Emit a "server ready" event instead of polling RCON during startup
High value, moderate effort:
- Share state between Lambda and EC2 through DynamoDB so they stop maintaining separate views of reality (a sketch follows this list)
Lower priority:
- Consolidate the five polling loops into event-driven reactions (significant refactor, marginal user benefit)
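The shared-state item has a simple shape: whichever bot observes a change writes one record, and the other reads that record instead of re-checking the server itself. A sketch with boto3 (the table name and attributes are hypothetical):

import time
import boto3

# Assumed table: partition key "server_id" (string).
table = boto3.resource("dynamodb").Table("minecraft-server-state")

def publish_state(server_id: str, players_online: int, status: str) -> None:
    # Whichever bot observed the change writes it once...
    table.put_item(Item={
        "server_id": server_id,
        "players_online": players_online,
        "status": status,  # e.g. "starting", "ready", "stopped"
        "updated_at": int(time.time()),
    })

def read_state(server_id: str) -> dict:
    # ...and the other bot reads the same record instead of polling
    # EC2 or RCON on its own schedule.
    return table.get_item(Key={"server_id": server_id}).get("Item", {})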
The trigger for actually making these changes? When I add the next feature that needs real-time state. Right now I'm considering a "server starting" notification so players know when to connect. Building that on top of polling would mean adding a sixth timer. Building it event-driven means I finally have to fix the architecture.
The Real Lesson
AI assistants are remarkably good at architecture archaeology. I knew I had polling in multiple places. I didn't know I had five polling loops with overlapping concerns and zero communication between them.
Sometimes the most valuable analysis isn't "here's how to fix it"; it's "here's what you actually built." That map of accidental complexity is worth the conversation alone.
The bot still polls. But now I know exactly where the debt lives, I have a prioritized plan, and I know what will finally force me to pay it down: the next feature that makes six timers feel absurd.