Improve URL checks to reduce false-negatives
This commit improves the URL health checking mechanism to reduce false negatives. - Treat all 2XX status codes as successful, addressing issues with codes like `204`. - Exclude URLs within Markdown inline code blocks. - Send the Host header for improved handling of webpages behind proxies. - Improve formatting and context for output messages. - Fix the defaulting options for redirects and cookie handling. - Add URL exclusion support for non-responsive URLs. - Update the user agent pool to modern browsers and platforms. - Improve CI/CD workflow to respond to modifications in the `test/checks/external-urls` directory, offering immediate feedback on potential impacts to the external URL test. - Add support for randomizing TLS fingerprint to mimic various clients better, improving the effectiveness of checks. However, this is not fully supported by Node.js's HTTP client; see nodejs/undici#1983 for more details. - Use `AbortSignal` instead of `AbortController` as more modern and simpler way to handle timeouts.
This commit is contained in:
@@ -13,7 +13,10 @@ A CLI and SDK for checking the availability of external URLs.
|
||||
- 😇 **Rate Limiting**: Queues requests by domain to be polite.
|
||||
- 🔁 **Retries**: Implements retry pattern with exponential back-off.
|
||||
- ⌚ **Timeouts**: Configurable timeout for each request.
|
||||
- 🎭️ **User-Agent Rotation**: Change user agents for each request.
|
||||
- 🎭️ **Impersonation**: Impersonate different browsers for each request.
|
||||
- **🌐 User-Agent Rotation**: Change user agents.
|
||||
- **🔑 TLS Handshakes**: Perform TLS and HTTP handshakes that are identical to that of a real browser.
|
||||
- 🫙 **Cookie jar**: Preserve cookies during redirects to mimic real browser.
|
||||
|
||||
## CLI
|
||||
|
||||
@@ -54,6 +57,7 @@ const statuses = await getUrlStatusesInParallel([ 'https://privacy.sexy', /* ...
|
||||
- **`sameDomainDelayInMs`** (*number*), default: `3000` (3 seconds)
|
||||
- Sets the delay between requests to the same domain.
|
||||
- `requestOptions` (*object*): See [request options](#request-options).
|
||||
- `followOptions` (*object*): See [follow options](#follow-options).
|
||||
|
||||
### `getUrlStatus`
|
||||
|
||||
@@ -72,7 +76,6 @@ console.log(`Status code: ${status.code}`);
|
||||
- The longer the base time, the greater the intervals between retries.
|
||||
- **`additionalHeaders`** (*object*), default: `false`
|
||||
- Additional HTTP headers to send along with the default headers. Overrides default headers if specified.
|
||||
- **`followOptions`** (*object*): See [follow options](#follow-options).
|
||||
- **`requestTimeoutInMs`** (*number*), default: `60000` (60 seconds)
|
||||
- Time limit to abort the request if no response is received within the specified time frame.
|
||||
|
||||
@@ -83,19 +86,7 @@ Follows `3XX` redirects while preserving cookies.
|
||||
Same fetch API except third parameter that specifies [follow options](#follow-options), `redirect: 'follow' | 'manual' | 'error'` is discarded in favor of the third parameter.
|
||||
|
||||
```js
|
||||
const status = await fetchFollow('https://privacy.sexy', {
|
||||
// First argument is same options as fetch API, except `redirect` options
|
||||
// that's discarded in favor of next argument follow options
|
||||
headers: {
|
||||
'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:47.0) Gecko/20100101 Firefox/47.0'
|
||||
},
|
||||
}, {
|
||||
// Second argument sets the redirect behavior
|
||||
followRedirects: true,
|
||||
maximumRedirectFollowDepth: 20,
|
||||
enableCookies: true,
|
||||
}
|
||||
);
|
||||
const status = await fetchFollow('https://privacy.sexy', 1000 /* timeout in milliseconds */);
|
||||
console.log(`Status code: ${status.code}`);
|
||||
```
|
||||
|
||||
|
||||
Reference in New Issue
Block a user