stygian_browser/interstitial_router/mod.rs
//! Queue and interstitial detection routing.
//!
//! ## What is an "interstitial"?
//!
//! Anti-bot vendors (`Cloudflare`, `DataDome`, `PerimeterX`,
//! `Akamai` Bot Manager, `Kasada`, `Fingerprint.com`) often
//! respond to a high-risk navigation with a **non-success page**
//! that is neither a hard `4xx`/`5xx` nor the target
//! document. These intermediate pages come in four shapes:
//!
//! - **Queue / waiting room**: the user is told to wait
//!   ("Please wait...", "You are #5 in line", "Estimated wait
//!   time 2 minutes"). The page often returns a `2xx` or `3xx`
//!   with `queue` / `wait` body markers.
//! - **Challenge interstitial**: a vendor-issued
//!   captcha / turnstile / proof-of-work challenge that the
//!   client must solve before being allowed through to the target
//!   document. Common markers: `cf-chl-bypass`, `g-recaptcha`,
//!   `h-captcha`, `cf-turnstile`, `akamai`, `perimeterx`,
//!   `_abck`.
//! - **Hard block**: a terminal vendor block page
//!   ("Access denied", "Request blocked", a `Just a moment...`
//!   page that does not auto-resolve, or vendor-specific
//!   `/blocked` / `/forbidden` URLs).
//! - **Transient redirect**: a `3xx` redirect chain that
//!   should be followed before classifying the response
//!   (often the case for region / cookie-consent
//!   redirects that vendors insert before the
//!   challenge).
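//!
//! The four shapes above can be illustrated with a small marker-based
//! sketch. This is not the crate's classifier: `Shape`, `sketch_shape`,
//! the marker lists (copied from the prose above), and the precedence
//! order are all assumptions made for illustration.
//!
//! ```rust
//! /// Stand-in for the four shapes (hypothetical, not `InterstitialKind`).
//! #[derive(Debug, Clone, Copy, PartialEq, Eq)]
//! enum Shape {
//!     Queue,
//!     Challenge,
//!     HardBlock,
//!     Transient,
//! }
//!
//! // Assumed marker lists, lower-cased, taken from the examples above.
//! const CHALLENGE_MARKERS: [&str; 4] =
//!     ["cf-chl-bypass", "g-recaptcha", "h-captcha", "cf-turnstile"];
//! const BLOCK_MARKERS: [&str; 2] = ["access denied", "request blocked"];
//! const QUEUE_MARKERS: [&str; 3] = ["please wait", "in line", "estimated wait"];
//!
//! /// True when `body` contains any of `markers`.
//! fn has_any(body: &str, markers: &[&str]) -> bool {
//!     markers.iter().any(|m| body.contains(*m))
//! }
//!
//! /// Returns `None` when nothing matches (treated as the target document).
//! fn sketch_shape(status: Option<u16>, body: &str) -> Option<Shape> {
//!     let body = body.to_ascii_lowercase();
//!     // Assumed precedence: challenge, hard block, queue, then redirect.
//!     if has_any(&body, &CHALLENGE_MARKERS) {
//!         Some(Shape::Challenge)
//!     } else if has_any(&body, &BLOCK_MARKERS) {
//!         Some(Shape::HardBlock)
//!     } else if has_any(&body, &QUEUE_MARKERS) {
//!         Some(Shape::Queue)
//!     } else if matches!(status, Some(300..=399)) {
//!         Some(Shape::Transient)
//!     } else {
//!         None
//!     }
//! }
//! ```
//!
//! Body markers are checked before the status code in this sketch because
//! the prose above notes that queue pages may themselves return a `3xx`.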
//!
//! ## What this module provides
//!
//! 1. [`InterstitialClassifier`]: a pure, deterministic
//!    classifier that consumes a [`PageSignature`] and
//!    returns an [`InterstitialKind`].
//! 2. [`InterstitialRouter`]: maps the classification to an
//!    [`InterstitialRoute`] (the dedicated
//!    acquisition strategy for each kind) with explicit
//!    diagnostics in a [`RouterDecision`].
//! 3. A stable [`severity`][InterstitialSeverity] field on
//!    [`RouterDecision`] that observability tooling can use
//!    to distinguish [`InterstitialKind::Queue`] (retryable
//!    wait) from [`InterstitialKind::HardBlock`] (terminal
//!    escalation) without branching on the kind itself.
//!
//! ## Routing behavior table
//!
//! | [`InterstitialKind`] | Default route | Default severity | Strategy hint |
//! |---|---|---|---|
//! | [`Queue`][InterstitialKind::Queue] | [`InterstitialRoute::WaitAndRetry`] | [`Retryable`][InterstitialSeverity::Retryable] | Wait the configured interval, then retry. Honors the optional queue position hint. |
//! | [`Challenge`][InterstitialKind::Challenge] | [`InterstitialRoute::ChallengeSolve`] | [`RequiresSolve`][InterstitialSeverity::RequiresSolve] | Escalate to a browser with sticky session + solve budget. Optional vendor hint narrows the strategy. |
//! | [`HardBlock`][InterstitialKind::HardBlock] | [`InterstitialRoute::HardBlock`] | [`Terminal`][InterstitialSeverity::Terminal] | Rotate session + invalidate sticky context + escalate to the strongest available strategy. |
//! | [`Transient`][InterstitialKind::Transient] | [`InterstitialRoute::Transient`] | [`Retryable`][InterstitialSeverity::Retryable] | Follow redirect chain (bounded hops), then re-classify. |
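//!
//! The table above can be read as a single `match`. The sketch below
//! uses stand-in enums (`Kind`, `Route`, `Severity`) rather than the
//! crate's own types, and assumes unit variants for brevity; the real
//! [`InterstitialRoute`] variants may carry configuration.
//!
//! ```rust
//! // Hypothetical mirrors of the crate's enums, for illustration only.
//! #[derive(Debug, Clone, Copy, PartialEq, Eq)]
//! enum Kind {
//!     Queue,
//!     Challenge,
//!     HardBlock,
//!     Transient,
//! }
//!
//! #[derive(Debug, Clone, Copy, PartialEq, Eq)]
//! enum Route {
//!     WaitAndRetry,
//!     ChallengeSolve,
//!     HardBlock,
//!     Transient,
//! }
//!
//! #[derive(Debug, Clone, Copy, PartialEq, Eq)]
//! enum Severity {
//!     Retryable,
//!     RequiresSolve,
//!     Terminal,
//! }
//!
//! /// Default route and severity per kind, one arm per table row.
//! fn default_route(kind: Kind) -> (Route, Severity) {
//!     match kind {
//!         Kind::Queue => (Route::WaitAndRetry, Severity::Retryable),
//!         Kind::Challenge => (Route::ChallengeSolve, Severity::RequiresSolve),
//!         Kind::HardBlock => (Route::HardBlock, Severity::Terminal),
//!         Kind::Transient => (Route::Transient, Severity::Retryable),
//!     }
//! }
//!
//! /// Observability code can test the severity without knowing the kind.
//! fn is_retryable(severity: Severity) -> bool {
//!     severity == Severity::Retryable
//! }
//! ```
//!
//! Keeping the mapping exhaustive means a new kind cannot be added
//! without the compiler asking for its route and severity.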
//!
//! ## Integration with `AcquisitionRunner`
//!
//! [`AcquisitionRequest::interstitial`][crate::acquisition::AcquisitionRequest::interstitial]
//! carries a previously observed [`PageSignature`] plus an
//! [`InterstitialPolicy`] into the runner. The runner
//! evaluates the signature via [`InterstitialClassifier`]
//! before any stage executes:
//!
//! 1. The resulting [`RouterDecision`] is attached to
//!    [`AcquisitionResult::interstitial`][crate::acquisition::AcquisitionResult::interstitial]
//!    so downstream policy mapping (T83 / T85 / T89 / T93)
//!    can consume the decision as a strategy hint.
//! 2. When the decision is non-[`Transient`][InterstitialKind::Transient]
//!    **and** [`InterstitialPolicy::short_circuit_on_classified`]
//!    is `true` (the default), the runner short-circuits
//!    with a structured
//!    [`StageFailureKind::InterstitialRouted`][crate::acquisition::StageFailureKind::InterstitialRouted]
//!    failure tagged with the decision so the calling
//!    layer can route via the dedicated strategy without
//!    burning through the generic ladder.
//!
//! Transient redirects do not short-circuit by default;
//! they flow through the ladder so the redirect can be
//! followed normally.
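//!
//! The short-circuit rule described above reduces to one boolean
//! expression. The sketch below uses hypothetical stand-ins (`Kind`,
//! `Policy`, `should_short_circuit`) instead of the crate's types, and
//! assumes that an unclassified page (`None`) never short-circuits.
//!
//! ```rust
//! // Hypothetical mirror of the crate's kind enum, for illustration only.
//! #[derive(Debug, Clone, Copy, PartialEq, Eq)]
//! enum Kind {
//!     Queue,
//!     Challenge,
//!     HardBlock,
//!     Transient,
//! }
//!
//! /// Stand-in for the one policy field the rule depends on.
//! struct Policy {
//!     short_circuit_on_classified: bool,
//! }
//!
//! impl Default for Policy {
//!     fn default() -> Self {
//!         // The prose above states that `true` is the default.
//!         Self { short_circuit_on_classified: true }
//!     }
//! }
//!
//! /// Short-circuit only for classified, non-transient pages when the
//! /// policy allows it; everything else flows through the ladder.
//! fn should_short_circuit(kind: Option<Kind>, policy: &Policy) -> bool {
//!     match kind {
//!         Some(Kind::Transient) | None => false,
//!         Some(_) => policy.short_circuit_on_classified,
//!     }
//! }
//! ```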
//!
//! ## Feature flag
//!
//! This module is **default-on** and is always compiled as
//! part of the `stygian-browser` crate. No new feature gate
//! is introduced; the integration is purely additive on
//! [`crate::acquisition::AcquisitionRequest`] and
//! [`crate::acquisition::AcquisitionResult`].
//!
//! # Example
//!
//! ```
//! use stygian_browser::interstitial_router::{
//!     InterstitialClassifier, InterstitialKind, InterstitialRouter, PageSignature,
//! };
//!
//! // A Cloudflare challenge interstitial observed on a previous attempt.
//! let signature = PageSignature::new(
//!     "https://example.com/cdn-cgi/challenge-platform/h/b",
//!     Some(403),
//! )
//! .with_body_marker("cf-chl-bypass")
//! .with_header("cf-mitigated");
//!
//! let classifier = InterstitialClassifier::new();
//! let kind = classifier.classify(&signature);
//! assert_eq!(kind, InterstitialKind::Challenge);
//!
//! let router = InterstitialRouter::with_defaults();
//! let decision = router.route(&signature, kind);
//! assert!(decision.is_classified());
//! assert_eq!(decision.kind(), InterstitialKind::Challenge);
//! ```

mod classifier;
mod policy;
mod report;
mod router;

pub use classifier::{InterstitialClassifier, PageSignature};
pub use policy::{
    DEFAULT_CHALLENGE_SOLVE_BUDGET_MS, DEFAULT_HARD_BLOCK_ESCALATION, DEFAULT_MAX_TRANSIENT_HOPS,
    DEFAULT_QUEUE_INTERVAL_MS, DEFAULT_QUEUE_MAX_RETRIES, DEFAULT_TRANSIENT_FOLLOW_REDIRECT,
    InterstitialKind, InterstitialPolicy, InterstitialRoute, InterstitialSeverity,
};
pub use report::{PageSignatureEvidence, RouterDecision, RouterDecisionLog};
pub use router::{InterstitialRouter, classify_and_route, route};