stygian_charon/pow_profile/mod.rs
//! Proof-of-work capability profile (T93).
//!
//! ## What this module does
//!
//! Quantifies the scraper's **proof-of-work (PoW) handling
//! capability** for a `(domain, target_class, vendor_family)`
//! triple and feeds the resulting score into the runtime
//! policy. PoW challenges are the vendor-issued JS / WASM
//! computations a scraper must solve to pass an
//! anti-bot gate (e.g. `Akamai` `_abck` derivation,
//! `Fingerprint.com` proof-of-work, `DataDome` interstitial
//! challenge). Naïve scrapers that always try the same
//! solve strategy train the vendor to escalate the
//! challenge, eventually locking the scraper out.
//!
//! A [`PowCapabilityProfile`] aggregates solve latency,
//! success rate, retry count, and failure modes into a
//! stable, serialisable record. A
//! [`PowCapabilityScorer`] consumes the profile and
//! produces a deterministic unit-interval score plus a
//! coarse [`PowCapabilityBand`] label. The policy mapper
//! ([`adjust_runtime_policy_for_pow`]) then nudges the
//! runtime policy toward a posture that matches the
//! observed capability (faster pacing for `Strong`,
//! browser+sticky escalation for `Weak`).
//!
//! ## Schema overview
//!
//! | Field                     | Range / type                    | Source                                    |
//! |---------------------------|---------------------------------|-------------------------------------------|
//! | `solved_count`            | `u32`                           | count of solved samples                   |
//! | `failed_count`            | `u32`                           | count of failed samples                   |
//! | `retry_count`             | `u32` (cumulative)              | sum of sample retries                     |
//! | `solve_latency_ms_p50`    | `Option<u64>`                   | running median of solved samples          |
//! | `solve_latency_ms_p95`    | `Option<u64>`                   | running 95th percentile of solved samples |
//! | `failure_modes`           | `BTreeMap<PowFailureMode, u32>` | histogram of failure modes                |
//! | `observation_window_secs` | `u64`                           | width of the sampling window              |
//! | `recorded_at_unix_secs`   | `u64`                           | wall-clock timestamp of last merge        |
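//!
//! As a rough illustration of what the two latency fields represent, the
//! sketch below computes a nearest-rank percentile over a batch of
//! solved-sample latencies. It is not this module's aggregation code: the
//! helper name `percentile_ms` is hypothetical, and the profile itself is
//! described as keeping running estimates rather than re-sorting a stored
//! batch.
//!
//! ```rust
//! /// Hypothetical helper: nearest-rank percentile over solve latencies (ms).
//! /// Returns `None` when there are no samples or `pct` is outside 1..=100,
//! /// mirroring the `Option<u64>` shape of the latency fields.
//! fn percentile_ms(latencies: &[u64], pct: u32) -> Option<u64> {
//!     if latencies.is_empty() || pct == 0 || pct > 100 {
//!         return None;
//!     }
//!     let mut sorted = latencies.to_vec();
//!     sorted.sort_unstable();
//!     // Nearest-rank: rank = ceil(pct / 100 * n), counted from 1.
//!     let rank = (sorted.len() * pct as usize + 99) / 100;
//!     sorted.get(rank - 1).copied()
//! }
//! ```
//!
//! With latencies `[800, 200, 400, 1000, 600]` this sketch yields `Some(600)`
//! for `pct = 50` and `Some(1000)` for `pct = 95`; an empty slice yields
//! `None`, the state in which the optional fields are omitted from the
//! serialised record.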
//!
//! ## Sampling window defaults
//!
//! The default sampling window is
//! [`DEFAULT_SAMPLE_WINDOW_SECS`] (one hour). The default
//! store TTL ([`DEFAULT_POW_TTL`]) matches the default
//! window, so a profile built over the default window
//! expires exactly when the window elapses. Operators can
//! override the window by adjusting
//! `observation_window_secs` on the profile before calling
//! [`PowCapabilityProfile::merge`] with a sample, or by
//! constructing the store via [`PowCapabilityStore::new`]
//! with a custom TTL.
//!
//! ## Sparse-telemetry fallback
//!
//! When the profile's `total_attempts` is below
//! [`MIN_OBSERVATIONS_FOR_SCORING`] (3), the scorer returns
//! [`SPARSE_FALLBACK_SCORE`] (`0.5`) and the band is
//! [`PowCapabilityBand::Unknown`]. The fallback is the
//! **same** value the empty profile returns, so the policy
//! mapper treats unobserved targets as the no-op baseline
//! (no escalation, no risk-score lift). This is the
//! "I have no signal" default: the operator's policy is
//! not perturbed by a profile that has not earned
//! statistical confidence.
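//!
//! A minimal, self-contained sketch of the fallback rule, under stated
//! assumptions: `SketchProfile` and `sketch_score` are hypothetical
//! stand-ins, and the dense branch uses a plain success rate purely for
//! illustration (the real scorer's weighting is not reproduced here). Only
//! the two constant values are taken from the text above.
//!
//! ```rust
//! // Values as documented in this section.
//! const MIN_OBSERVATIONS_FOR_SCORING: u32 = 3;
//! const SPARSE_FALLBACK_SCORE: f64 = 0.5;
//!
//! /// Hypothetical, simplified stand-in for `PowCapabilityProfile`.
//! struct SketchProfile {
//!     solved_count: u32,
//!     failed_count: u32,
//! }
//!
//! impl SketchProfile {
//!     fn total_attempts(&self) -> u32 {
//!         self.solved_count.saturating_add(self.failed_count)
//!     }
//! }
//!
//! /// Returns `(score, is_sparse)`. Below the observation floor the score
//! /// is the fixed fallback; otherwise this sketch returns the success
//! /// rate, which is an assumption made for illustration only.
//! fn sketch_score(profile: &SketchProfile) -> (f64, bool) {
//!     let total = profile.total_attempts();
//!     if total < MIN_OBSERVATIONS_FOR_SCORING {
//!         return (SPARSE_FALLBACK_SCORE, true);
//!     }
//!     (f64::from(profile.solved_count) / f64::from(total), false)
//! }
//! ```
//!
//! In this sketch an empty profile and a two-sample profile both map to
//! `(0.5, true)`, while a profile with three solved and one failed sample
//! maps to `(0.75, false)`.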
//!
//! ## Persistence
//!
//! The persistence layer reuses the same
//! `LruTtlStore` primitive
//! the T83 [`ChallengeMemory`][crate::challenge_feedback::ChallengeMemory]
//! and the T91 [`NonceBook`][crate::token_lifecycle::NonceBook]
//! use. That keeps eviction + expiry semantics consistent
//! across all three short-horizon stores and satisfies the
//! "no new cache store" requirement. The key namespace is
//! `charon:pow:...` (see [`pow_profile_key`]) so PoW
//! entries never collide with `charon:challenge:...` (T83)
//! or `charon:token_nonce:...` (T91) on a shared backing
//! primitive.
//!
//! ## Feature flag
//!
//! The module is **default-on** (gated behind the
//! `caching` feature, which is part of the `stygian-charon`
//! default feature set). It is purely additive: no
//! existing public type gains a new field, no existing
//! behaviour changes, and no new feature gate is
//! introduced. The schema is serialised as a flat record
//! with additive `Option<T>` fields
//! (`#[serde(default, skip_serializing_if = "Option::is_none")]`
//! on `solve_latency_ms_p50` and `solve_latency_ms_p95`)
//! so older JSON payloads still deserialize and newer
//! payloads omit the optional fields when no latency has
//! been observed yet.
//!
//! # Example
//!
//! ```
//! use stygian_charon::pow_profile::{
//!     PowCapabilityProfile, PowCapabilitySample, PowCapabilityScorer,
//!     PowCapabilityStore, adjust_runtime_policy_for_pow, PowPolicyThresholds,
//!     PowCapabilityScore,
//! };
//! use stygian_charon::types::{ExecutionMode, RuntimePolicy, SessionMode, TargetClass, TelemetryLevel};
//! use stygian_charon::vendor_classifier::VendorId;
//! use std::collections::BTreeMap;
//!
//! // Record a few samples into a store.
//! let store = PowCapabilityStore::with_defaults();
//! for _ in 0..6 {
//!     store.record_sample(
//!         "example.com",
//!         TargetClass::ContentSite,
//!         VendorId::Cloudflare,
//!         &PowCapabilitySample::solved(800, 0),
//!     );
//! }
//!
//! // Look up the aggregated profile and score it.
//! let profile = store
//!     .lookup("example.com", TargetClass::ContentSite, VendorId::Cloudflare)
//!     .expect("profile");
//! let scorer = PowCapabilityScorer::new();
//! let value = scorer.score(&profile);
//! let score = PowCapabilityScore::new(value);
//!
//! // Apply the policy mapper.
//! let policy = RuntimePolicy {
//!     execution_mode: ExecutionMode::Http,
//!     session_mode: SessionMode::Stateless,
//!     telemetry_level: TelemetryLevel::Standard,
//!     rate_limit_rps: 3.0,
//!     max_retries: 2,
//!     backoff_base_ms: 250,
//!     enable_warmup: false,
//!     enforce_webrtc_proxy_only: false,
//!     sticky_session_ttl_secs: None,
//!     required_stygian_features: Vec::new(),
//!     config_hints: BTreeMap::new(),
//!     risk_score: 0.30,
//! };
//! let adjusted =
//!     adjust_runtime_policy_for_pow(&policy, &score, &PowPolicyThresholds::default());
//! assert!(adjusted.rate_limit_rps >= 1.0);
//! assert!(adjusted
//!     .config_hints
//!     .contains_key("pow.capability"));
//! ```

mod policy;
mod profile;
mod scorer;
mod store;

pub use policy::{
    MAX_POW_RISK_DELTA, PowCapabilityScore, PowPolicyThresholds, adjust_runtime_policy_for_pow,
};
pub use profile::{
    DEFAULT_SAMPLE_WINDOW_SECS, PowCapabilityProfile, PowCapabilitySample, PowFailureMode,
};
pub use scorer::{
    DEFAULT_LATENCY_BUDGET_MS, DEFAULT_RETRY_BUDGET, MIN_OBSERVATIONS_FOR_SCORING,
    PowCapabilityBand, PowCapabilityScorer, ProfileWeights, SPARSE_FALLBACK_SCORE, band_for_score,
};
pub use store::{DEFAULT_POW_CAPACITY, DEFAULT_POW_TTL, PowCapabilityStore, pow_profile_key};

/// Convenience helper: score a profile and wrap the result
/// in a [`PowCapabilityScore`].
///
/// This is the "operator-friendly" path: most callers want
/// "give me a score I can pass to the policy mapper" and
/// do not want to assemble the [`PowCapabilityScore`]
/// manually.
#[must_use]
pub fn score_from_profile(
    profile: &PowCapabilityProfile,
    scorer: &PowCapabilityScorer,
) -> PowCapabilityScore {
    PowCapabilityScore::new(scorer.score(profile))
}