test: poll pending-row count in drain-on-reconnect test (race fix)

CI run #50 failed with:

  --- FAIL: TestDrainPendingDispatchesOnReconnect (1.03s)
      pending_drain_test.go:150: pending rows after drain: got 1, want 0

The test waits for a backup command.run envelope on the wire and
then checks the pending-row count. But conn.Send (the wire write)
returns before DeletePendingRun runs in the drain goroutine. Both
calls run serially inside drainOne, Send first, so the wire-side
reader can observe the Send while the delete is still pending.

Use the existing waitForPendingCount helper to poll the count with
a 2s deadline. Behaviour is unchanged when the delete is fast (the
count hits 0 immediately); the polling only matters under CI
scheduling pressure. Running the test locally with -race -count=10
now passes consistently.
2026-05-04 10:20:54 +01:00
parent 51a7ea302f
commit e850f6f44c
+6 -1
@@ -145,7 +145,12 @@ func TestDrainPendingDispatchesOnReconnect(t *testing.T) {
 		t.Errorf("backup tag: %q", got.Tag)
 	}
-	// Pending row should be gone.
+	// Pending row should be gone. Poll briefly: the drain goroutine
+	// sends command.run via conn.Send and only then calls
+	// DeletePendingRun. Reading the envelope off the wire above proves
+	// the send happened, but the delete runs after that on the drain
+	// goroutine — small window where the count is still 1.
+	waitForPendingCount(t, st, hostID, 0, 2*time.Second)
 	if n := countPendingForHost(t, st, hostID); n != 0 {
 		t.Errorf("pending rows after drain: got %d, want 0", n)
 	}