De-flake TestDrainPendingSerializesPerHost (CI stability) #33
Fixes the intermittent `server-http` CI failures that forced merging-past-red on recent PRs.

Root cause
`-race` reports no data race; the per-host drain mutex is correct. Under parallel CI load the test's WS conn could be dropped and unregistered, and `DrainPending` correctly no-ops when `Hub.Conn(hostID) == nil`. The test client stopped reading the socket during its concurrent burst (a real agent is always in a read loop), so under load the server-side conn keepalive lapsed, the conn was unregistered, and the drains no-op'd. The test then saw a partial or empty drain (observed both `1 job / 4 pending` and `0 jobs / 5 pending`) and failed its assertion of immediate, complete drainage, which production only guarantees eventually, via repeated drains (30s tick / on reconnect).

Fix
5-jobs assertion (which actually proves the mutex prevents double-dispatch) is unchanged; the test isn't weakened.

No production code changed; test-only.
Verification
`-race` iterations.