-
Notifications
You must be signed in to change notification settings - Fork 0
Re-attempt pub and sub operations #28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thanks @corrideat! I'll have a closer look, in the meantime, here's a quick review from Opus 4.5: DetailsCritical Bug: First Retry Has Zero DelayIn the // 4. Schedule retry
setTimeout(() => {
if (++attemptNo > options.maxOpRetries) {
console.warn(`Giving up ${type} for channel`, channelID)
return
}
send()
}, options.opRetryInterval * attemptNo) // attemptNo is 0 here!The flow is:
Fix: // 4. Schedule retry
setTimeout(() => {
if (++attemptNo > options.maxOpRetries) {
console.warn(`Giving up ${type} for channel`, channelID)
return
}
send()
}, options.opRetryInterval * (attemptNo + 1))Potential Issue: Retries Continue After SuccessWhen a SUB/UNSUB succeeds, the entry is deleted from the pending map, but the scheduled Consider tracking and clearing timeouts on success: function runWithRetry (
client: PubSubClient,
channelID: string,
type: RequestTypeEnum,
instance: object
) {
let attemptNo = 0
let retryTimeoutID: ReturnType<typeof setTimeout> | undefined
const { socket, options } = client
const send = () => {
if (client.socket !== socket || socket?.readyState !== WebSocket.OPEN) return
const currentInstance = type === REQUEST_TYPE.SUB
? client.pendingSubscriptionMap.get(channelID)
: client.pendingUnsubscriptionMap.get(channelID)
if (currentInstance !== instance) return
const kvFilter = client.kvFilter.get(channelID)
const payload: JSONType = (type === REQUEST_TYPE.SUB && kvFilter)
? { channelID, kvFilter }
: { channelID }
socket.send(createRequest(type, payload))
retryTimeoutID = setTimeout(() => {
if (++attemptNo > options.maxOpRetries) {
console.warn(`Giving up ${type} for channel`, channelID)
return
}
send()
}, options.opRetryInterval * (attemptNo + 1))
}
send()
}Code Clarity: Redundant-Looking Loop in Close HandlerThe close handler has two sequential operations on if (client.shouldReconnect) {
const instance = {}
client.subscriptionSet.forEach((channelID) => {
if (!client.pendingUnsubscriptionMap.has(channelID)) {
client.pendingSubscriptionMap.set(channelID, instance)
}
})
for (const [channelID] of client.pendingSubscriptionMap) {
client.pendingSubscriptionMap.set(channelID, instance)
}
}The second loop re-sets items that may have just been added. While this is intentional (to reset instances for existing pending subscriptions so old retry timers are invalidated), it's not immediately clear. Consider adding a comment or refactoring: if (client.shouldReconnect) {
const instance = {}
// Invalidate all existing pending subscription instances so that their
// retry timers will stop (due to instance mismatch checks in runWithRetry)
// and prepare them to be retried with the new socket.
for (const [channelID] of client.pendingSubscriptionMap) {
client.pendingSubscriptionMap.set(channelID, instance)
}
// Re-add current subscriptions as pending so they get restored on reconnect.
client.subscriptionSet.forEach((channelID) => {
if (!client.pendingUnsubscriptionMap.has(channelID)) {
client.pendingSubscriptionMap.set(channelID, instance)
}
})
}Minor: No Retry for UNSUB on ReconnectThe // Send any pending subscription request.
for (const [channelID, instance] of client.pendingSubscriptionMap) {
runWithRetry(client, channelID, REQUEST_TYPE.SUB, instance)
}
// There should be no pending unsubscription since we just got connected.While the comment explains the rationale (and Summary
|
|
Addressed the zero-delay issue (result of refactoring) and the double loop in The second issue (Retry timeouts not cleared on success) seems like something where the benefit is a bit unclear. It is true that the timeout will not be cleared on success or error, but then it'll fail fairly quickly at the 'Closure check' or 'Cancellation check' steps. I could keep track of these and cancel them, for example upon receiving an OK response or when closing the connection, but that'll add extra logic, and the way it is now shouldn't break anything or be too wasteful. |
taoeffect
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this latest Opus 4.5 review require any changes in your mind?
Details
I've reviewed the changes carefully. Overall, the refactoring from Set to Map for tracking pending operations with instance tokens is a reasonable approach for implementing cancellable retries. However, I found several issues worth addressing.
1. Retry Timers Are Never Cleared (Memory/Resource Leak)
The setTimeout calls in runWithRetry are never tracked or cleared. When destroy() is called or the socket closes, these timers continue to exist and will fire (though they exit early).
Problematic code in runWithRetry:
// 4. Schedule retry
setTimeout(() => {
if (++attemptNo > options.maxOpRetries) {
console.warn(`Giving up ${type} for channel`, channelID)
return
}
send()
}, options.opRetryInterval * (attemptNo + 1))Suggested improvement — Track retry timers so they can be cleared:
function runWithRetry (
client: PubSubClient,
channelID: string,
type: RequestTypeEnum,
instance: object
) {
let attemptNo = 0
let retryTimeoutID: TimeoutID | undefined
const { socket, options } = client
const send = () => {
// 1. Closure check: ensure socket instance hasn't been replaced
if (client.socket !== socket || socket?.readyState !== WebSocket.OPEN) return
// 2. Cancellation check
const currentInstance = type === REQUEST_TYPE.SUB
? client.pendingSubscriptionMap.get(channelID)
: client.pendingUnsubscriptionMap.get(channelID)
if (currentInstance !== instance) {
clearTimeout(retryTimeoutID)
return
}
// 3. Send logic
const kvFilter = client.kvFilter.get(channelID)
const payload: JSONType = (type === REQUEST_TYPE.SUB && kvFilter)
? { channelID, kvFilter }
: { channelID }
socket.send(createRequest(type, payload))
// 4. Schedule retry
retryTimeoutID = setTimeout(() => {
if (++attemptNo > options.maxOpRetries) {
console.warn(`Giving up ${type} for channel`, channelID)
return
}
send()
}, options.opRetryInterval * (attemptNo + 1))
}
send()
}You may also want to store all active retry timer IDs at the client level so clearAllTimers() can cancel them on destroy.
2. Duplicate Requests Sent Before Server Response
The retry is scheduled immediately after each send, regardless of whether a response arrives. With opRetryInterval: 2000, if the server takes longer than 2 seconds to respond, a duplicate request will be sent.
Problematic code:
socket.send(createRequest(type, payload))
// 4. Schedule retry
setTimeout(() => {
// ... retry logic
}, options.opRetryInterval * (attemptNo + 1))Suggestion: Consider whether this is acceptable behavior. If server latency commonly exceeds 2 seconds, you may want to:
- Increase
opRetryIntervaldefault - Add a comment documenting that duplicate requests are expected and the server should be idempotent
- Or implement a different pattern where retry only happens if the connection drops (not as a timeout)
3. Redundant kvFilter.delete() Call
The kvFilter is deleted in two places for UNSUB operations:
In unsub() method:
unsub (channelID: string) {
const client = this
if (!client.pendingUnsubscriptionMap.has(channelID)) {
// ...
client.kvFilter.delete(channelID) // <-- First deletion
runWithRetry(client, channelID, REQUEST_TYPE.UNSUB, instance)
}
}In OK response handler:
case REQUEST_TYPE.UNSUB: {
console.debug(`[pubsub] Unsubscribed from ${channelID}`)
client.pendingUnsubscriptionMap.delete(channelID)
client.subscriptionSet.delete(channelID)
client.kvFilter.delete(channelID) // <-- Second deletion (redundant)
break
}Suggestion: Remove the deletion from the OK handler since it's already done eagerly in unsub():
case REQUEST_TYPE.UNSUB: {
console.debug(`[pubsub] Unsubscribed from ${channelID}`)
client.pendingUnsubscriptionMap.delete(channelID)
client.subscriptionSet.delete(channelID)
// kvFilter already deleted in unsub()
break
}4. unsub() Allows Unsubscribing From Non-Subscribed Channels
Unlike the new sub() method which checks !client.subscriptionSet.has(channelID), the unsub() method has no such guard:
sub (channelID: string) {
const client = this
if (!client.pendingSubscriptionMap.has(channelID) && !client.subscriptionSet.has(channelID)) {
// ... guarded against already-subscribed
}
},
unsub (channelID: string) {
const client = this
if (!client.pendingUnsubscriptionMap.has(channelID)) {
// ... no check for subscriptionSet or pendingSubscriptionMap
}
}Suggestion: Add a guard to prevent unnecessary UNSUB requests:
unsub (channelID: string) {
const client = this
// Only unsub if we're subscribed or have a pending subscription
if (!client.pendingUnsubscriptionMap.has(channelID) &&
(client.subscriptionSet.has(channelID) || client.pendingSubscriptionMap.has(channelID))) {
const instance = {}
client.pendingSubscriptionMap.delete(channelID)
client.pendingUnsubscriptionMap.set(channelID, instance)
client.kvFilter.delete(channelID)
runWithRetry(client, channelID, REQUEST_TYPE.UNSUB, instance)
}
}5. Potential Confusion: Total Attempts vs. Retries
With maxOpRetries: 4, the code allows for 5 total attempts (1 initial + 4 retries). The option name suggests retries, so this is technically correct, but worth documenting clearly.
Suggestion: Add a clarifying comment:
const defaultOptions: Options = {
// ...
maxOpRetries: 4, // Maximum retry attempts after initial send (5 total attempts)
opRetryInterval: 2000 // Base interval between retries (linear backoff: 2s, 4s, 6s, 8s)
}6. Minor: setKvFilter Lacks Retry Mechanism
For consistency, setKvFilter sends without the retry mechanism:
setKvFilter (channelID: string, kvFilter?: string[]) {
// ...
if (client.subscriptionSet.has(channelID)) {
if (socket?.readyState === WebSocket.OPEN) {
socket.send(
createRequest(REQUEST_TYPE.KV_FILTER, kvFilter ? { channelID, kvFilter } : { channelID })
)
}
}
},Is this intentional? If the KV_FILTER message is lost, the filter won't be applied. Consider whether this should also use a retry mechanism or document why it doesn't need one.
Summary
| Issue | Severity | Action |
|---|---|---|
| Retry timers not cleared | Medium | Track and clear timers on destroy/cancel |
| Duplicate requests on slow servers | Low-Medium | Document or increase interval |
Redundant kvFilter.delete() |
Low | Remove from OK handler |
unsub() missing guard |
Low | Add subscription check |
| Unclear total attempts | Low | Add documenting comment |
setKvFilter no retry |
Low | Document intentionality or add retry |
|
@taoeffect I've also noticed this piece of logic: // If we are definetely offline then do not try to reconnect now,
// unless the server is local.
if (!isDefinetelyOffline() || client.isLocal) {
client.scheduleConnectionAttempt()
}Personally, I'm not sure this makes a lot of sense to keep. For one, logic is already complex as is to have special cases like |
|
I challenged Opus with making DetailsYou're right. A single unified map with composite keys would be cleaner than creating a separate map for each message type. Solution: Unified
|
| Before | After |
|---|---|
3 separate maps (pendingSubscriptionMap, pendingUnsubscriptionMap, + awkward reuse for KV_FILTER) |
1 unified pendingOperations map |
Type-checking inside runWithRetry to determine which map to use |
Generic runWithRetry that works with any operation |
Payload construction inside runWithRetry with type-specific logic |
Each operation carries its own getPayload() function |
| Hard to add new retriable message types | Add new types by simply creating an operation object and calling runWithRetry |
Adding New Retriable Operations
To add retry support for any new message type, you just need to:
const opKey = createOpKey(REQUEST_TYPE.NEW_TYPE, identifier)
client.pendingOperations.set(opKey, {
type: REQUEST_TYPE.NEW_TYPE,
getPayload: () => ({ /* your payload */ }),
instance: {}
})
runWithRetry(client, opKey)And handle the response in the OK/ERROR handlers by deleting from pendingOperations.
|
Here's Gemini Pro 3: DetailsHere is the solution. The core issue is that the client was maintaining separate Structure Change:
Here is the refactored code. // ... imports ...
export type PubSubClient = {
// ... existing fields ...
// REPLACED: pendingSubscriptionMap / pendingUnsubscriptionMap
// WITH: A unified map tracking any pending request type for a channel
pendingRequests: Map<string, Map<string, object>>;
// ... existing fields ...
};
// ... helpers ...
function runWithRetry (
client: PubSubClient,
channelID: string,
type: RequestTypeEnum,
instance: object
) {
let attemptNo = 0
const { socket, options } = client
const send = () => {
// 1. Closure check: ensure socket instance hasn't been replaced
if (client.socket !== socket || socket?.readyState !== WebSocket.OPEN) return
// 2. Cancellation check
// We now look up the nested map. If the instance token for this
// specific type and channel has changed, we stop.
const currentInstance = client.pendingRequests.get(channelID)?.get(type)
if (currentInstance !== instance) return
// 3. Send logic
const kvFilter = client.kvFilter.get(channelID)
const payload: JSONType = (
kvFilter && ([REQUEST_TYPE.SUB, REQUEST_TYPE.KV_FILTER] as string[]).includes(type)
)
? { channelID, kvFilter }
: { channelID }
socket.send(createRequest(type, payload))
// 4. Schedule retry
setTimeout(() => {
if (++attemptNo > options.maxOpRetries) {
console.warn(`Giving up ${type} for channel`, channelID)
return
}
send()
}, options.opRetryInterval * (attemptNo + 1))
}
send()
}
// ... API ...
export function createClient (url: string, options: Partial<Options> = {}): PubSubClient {
const client: PubSubClient = {
// ... other props ...
// Initialize the unified map
pendingRequests: new Map(),
// pendingSubscriptionMap: new Map(), // Deleted
// pendingUnsubscriptionMap: new Map(), // Deleted
// ... other props ...
}
// ... listeners setup ...
return client
}
// ...
// These handlers receive the PubSubClient instance through the `this` binding.
const defaultClientEventHandlers: ClientEventHandlers = {
close (event) {
const client = this
// ... logging and cleanup ...
if (client.shouldReconnect) {
// Create a new instance token to invalidate any in-flight retry closures
// from the previous connection.
const instance = {}
// 1. Refresh all currently pending requests with the new token
for (const requests of client.pendingRequests.values()) {
for (const type of requests.keys()) {
requests.set(type, instance)
}
}
// 2. Restore subscriptions: Ensure every channel we are subscribed to
// is queued for re-subscription (SUB), unless we are explicitly unsubscribing.
client.subscriptionSet.forEach((channelID) => {
let requests = client.pendingRequests.get(channelID)
if (!requests) {
requests = new Map()
client.pendingRequests.set(channelID, requests)
}
// Only queue a SUB if we aren't waiting for an UNSUB
if (!requests.has(REQUEST_TYPE.UNSUB)) {
requests.set(REQUEST_TYPE.SUB, instance)
}
})
} else {
client.pendingRequests.clear()
}
client.subscriptionSet.clear()
// No need to clear pending maps here, logic above handled it or we cleared all
// ... reconnection logic ...
},
// ... error, message, offline, online handlers ...
open () {
// ... existing setup ...
// Send ANY pending request (SUB, KV_FILTER, or even UNSUB if pending)
for (const [channelID, requests] of client.pendingRequests) {
for (const [type, instance] of requests) {
runWithRetry(client, channelID, type as RequestTypeEnum, instance)
}
}
},
// ... other handlers ...
}
const defaultMessageHandlers: MessageHandlers = {
// ... other handlers ...
[RESPONSE_TYPE.ERROR] ({ data }) {
const { type, channelID, reason } = data
const client = this
// Generic cleanup for any request type
const requests = client.pendingRequests.get(channelID)
if (requests) {
requests.delete(type)
if (requests.size === 0) client.pendingRequests.delete(channelID)
}
// ... existing switch/case for specific logging ...
},
[RESPONSE_TYPE.OK] ({ data: { type, channelID, kvFilter } }) {
const client = this
// Generic cleanup: Request succeeded, stop retrying.
const requests = client.pendingRequests.get(channelID)
if (requests) {
requests.delete(type)
if (requests.size === 0) client.pendingRequests.delete(channelID)
}
switch (type) {
case REQUEST_TYPE.SUB: {
client.subscriptionSet.add(channelID)
sbp('okTurtles.events/emit', PUBSUB_SUBSCRIPTION_SUCCEEDED, client, { channelID })
// ... KV Filter logic stays largely the same ...
// Only change: calling this.setKvFilter checks the new map structure
// (Logic omitted for brevity, logic matches original snippet provided)
break
}
case REQUEST_TYPE.UNSUB: {
console.debug(`[pubsub] Unsubscribed from ${channelID}`)
client.subscriptionSet.delete(channelID)
break
}
case REQUEST_TYPE.KV_FILTER: {
client.subscriptionSet.add(channelID)
console.debug(`[pubsub] Set KV filter for ${channelID}`)
break
}
// ...
}
}
}
// ...
const publicMethods: {
// ...
destroy(this: PubSubClient): void;
// ...
} = {
// ...
destroy () {
const client = this
// ...
client.pendingRequests.clear() // Clean unified map
client.subscriptionSet.clear()
// ...
},
sub (channelID: string) {
const client = this
// Get or Create the request map for this channel
let requests = client.pendingRequests.get(channelID)
if (!requests) {
requests = new Map()
client.pendingRequests.set(channelID, requests)
}
// Calling sub() cancels any pending unsub()
requests.delete(REQUEST_TYPE.UNSUB)
// If we aren't subscribed and aren't already trying to subscribe
if (!client.subscriptionSet.has(channelID) && !requests.has(REQUEST_TYPE.SUB)) {
const instance = {}
requests.set(REQUEST_TYPE.SUB, instance)
runWithRetry(client, channelID, REQUEST_TYPE.SUB, instance)
}
},
setKvFilter (channelID: string, kvFilter?: string[]) {
const client = this
if (kvFilter) {
client.kvFilter.set(channelID, kvFilter)
} else {
client.kvFilter.delete(channelID)
}
// Helper to check pending status
const requests = client.pendingRequests.get(channelID)
const isUnsubscribing = requests?.has(REQUEST_TYPE.UNSUB)
// If subscribed and not currently unsubscribing
if (client.subscriptionSet.has(channelID) && !isUnsubscribing) {
const reqs = requests || new Map()
if (!requests) client.pendingRequests.set(channelID, reqs)
const instance = {}
reqs.set(REQUEST_TYPE.KV_FILTER, instance)
runWithRetry(client, channelID, REQUEST_TYPE.KV_FILTER, instance)
}
},
unsub (channelID: string) {
const client = this
let requests = client.pendingRequests.get(channelID)
if (!requests) {
requests = new Map()
client.pendingRequests.set(channelID, requests)
}
// Calling unsub() cancels pending sub() and kv_filter()
const hadPendingSub = requests.delete(REQUEST_TYPE.SUB)
requests.delete(REQUEST_TYPE.KV_FILTER)
client.kvFilter.delete(channelID)
// If we are currently subscribed, OR if we were trying to subscribe
// (meaning the server might have received the SUB), we must send UNSUB.
if (
!requests.has(REQUEST_TYPE.UNSUB) &&
(client.subscriptionSet.has(channelID) || hadPendingSub)
) {
const instance = {}
requests.set(REQUEST_TYPE.UNSUB, instance)
runWithRetry(client, channelID, REQUEST_TYPE.UNSUB, instance)
}
// Cleanup empty map if nothing happened
if (requests.size === 0) client.pendingRequests.delete(channelID)
}
}And here's Opus 4.5 again but with thinking enabled: DetailsRefactoring
|
|
Ran the latest changes through Opus 4.5: DetailsBugs1.
|
taoeffect
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nicely done!
No description provided.