-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Startup Retry & Configurable Startup Time-out #166
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 6 out of 8 changed files in this pull request and generated no comments.
Files not reviewed (2)
- src/ConfigurationClientWrapper.ts: Evaluated as low risk
- src/load.ts: Evaluated as low risk
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 6 out of 8 changed files in this pull request and generated no comments.
Files not reviewed (2)
- src/ConfigurationClientWrapper.ts: Evaluated as low risk
- src/load.ts: Evaluated as low risk
Comments suppressed due to low confidence (1)
src/failover.ts:35
- The comment should be 'random value between [-1, 1) * JITTER_RATIO * calculatedBackoffDuration' for consistency.
// jitter: random value between [-1, 1) * jitterRatio * calculatedBackoffMs
Is there a reason the retry can be disabled? |
If people want to throw immediatly when initial load fails, they can disable it. Setting a very short timeout is not feasible for this case. The |
Anyone asked for it? What's the scenario for it? |
@zhenlan No one asks for it. I personally want to disable the retry so that I can know whether there is something wrong immediately. Anyway, I am ok with removing the |
If the connection string is malformed, I hope it will throw so we can fail immediately. Otherwise, you can't really tell what's wrong. Set a shorter timeout if you don't want to wait for too long.
Configuration is in critical code path. Most applications can't start without configuration loaded properly. Transient errors can happen for any cloud solutions. We rely on the client retry (and failover etc.) to deliver customers high availability. Please don't design a "feature" so customers can shoot their own feet. |
…ScriptProvider into zhiyuanliang/startup-timeout
src/ConfigurationClientManager.ts
Outdated
@@ -58,7 +65,7 @@ export class ConfigurationClientManager { | |||
this.#id = regexMatch[2]; | |||
this.#secret = regexMatch[3]; | |||
} else { | |||
throw new Error(`Invalid connection string. Valid connection strings should match the regex '${ConnectionStringRegex.source}'.`); | |||
throw new RangeError(`Invalid connection string. Valid connection strings should match the regex '${ConnectionStringRegex.source}'.`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case, call it as a RangeError sounds odd, how about rename it to ArgumentError
or ParameterError
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RangeError is a built-in error type in JavaScript. There is no built-in ArgumentError. I did consider about create an ArgumentError type by myself. But I think it overlaps with RangeError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like JS doesnt have anything equivalent to ArgumentError. By definition, RangeError is also not a good fit here. I'd vote for custom error that extends Error
class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 6 out of 16 changed files in this pull request and generated 2 comments.
Files not reviewed (10)
- src/ConfigurationClientWrapper.ts: Evaluated as low risk
- src/load.ts: Evaluated as low risk
- test/failover.test.ts: Evaluated as low risk
- src/keyvault/AzureKeyVaultKeyValueAdapter.ts: Evaluated as low risk
- src/AzureAppConfigurationOptions.ts: Evaluated as low risk
- src/refresh/RefreshTimer.ts: Evaluated as low risk
- test/keyvault.test.ts: Evaluated as low risk
- src/ConfigurationClientManager.ts: Evaluated as low risk
- test/requestTracing.test.ts: Evaluated as low risk
- test/clientOptions.test.ts: Evaluated as low risk
Comments suppressed due to low confidence (1)
src/AzureAppConfigurationImpl.ts:249
- [nitpick] The error message 'Load operation timed out.' is unclear. Consider making it more descriptive, such as 'Loading configuration settings timed out.'
reject(new Error("Load operation timed out."));
await new Promise(resolve => setTimeout(resolve, backoffDuration)); | ||
console.warn("Failed to load configuration settings at startup. Retrying..."); | ||
} | ||
} while (!abortSignal.aborted); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the check on abortSignal is at the end of each try operation, so there may be a scenario that when it's still in loading process but timeout happens, program still need to run a little while before abort. May be you can setup timeout for signal like this https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal/timeout_static
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did think about this option, it is an alternative for using Promise.race for timeout.
If we take this option, the code will be
async function load() {
try {
await initializeWithRetryPolicy(AbortSignal.timeout(5200)); // 200ms longer than default backoff duration
} catch (error) {}
}
async function initializeWithRetryPolicy(signal) {
do {
try {
//--> Position A
// load keyvalue from App Config
} catch (error) {
//--> Position B
// wait for 5000ms (default backoff duration)
}
} while (signal.abort)
if (signal.abort) {
throw new Error("Timeout")
}
}
Let's say one failure of load kv from App Config (position A) costs 50ms
At T0, we set timeout for the signal (At T0+5200, it should be aborted)
At T0 + 50ms, the first attempt failed and it went to position B and started to wait for 5000ms
At T0+5050ms, it started to retry and it took 50m to fail again
At T0+5100ms, failed at position A again and started to wait for 5000ms or maybe even longer due to exponential backoff duration increase
At T0+5200ms, the signal timeout, but the code is still stuck at await initialize.
program still need to run a little while before abort.
This is half-true. Unless the customer write code like this:
try {
const setting = await load();
} catch (err) {
// do not throw here
}
The initializeWithRetryPolicy promise will not be resolved immediately after timeout. Otherwise, error is thrown, the node process is terminated. All promises running no longer exist.
0d91484
to
9e32db4
Compare
Why this PR?
Support startup retry and timeout.
The retry/backoff behavior is the same as .NET provider.
reference: #488
Usage
By default the startup timeout will be 100 seconds.
The pseudo code of the whole startup retry + failover logic: