§Stygian Graph
A high-performance, graph-based web scraping engine for Rust.
§Overview
Stygian models a scraping pipeline as a Directed Acyclic Graph (DAG) in which each node is a pluggable service module (HTTP fetcher, AI extractor, headless browser). It is built for high concurrency and extensibility, using a hexagonal architecture.
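To make the DAG idea concrete, here is a minimal, dependency-free sketch of how nodes and edges can be turned into an execution order with Kahn's algorithm. It is not Stygian's implementation (the crate lists petgraph for this); the function name `topological_order` and the node ids are hypothetical, and node ids are assumed to be unique.

```rust
use std::collections::{HashMap, VecDeque};

/// Returns the node ids in an order where every node comes after all of its
/// dependencies, or `None` if an edge names an unknown node or the edges
/// form a cycle (so the graph is not a DAG).
/// Hypothetical helper for illustration; not part of the stygian_graph API.
fn topological_order<'a>(
    nodes: &[&'a str],
    edges: &[(&'a str, &'a str)],
) -> Option<Vec<&'a str>> {
    let mut in_degree: HashMap<&'a str, usize> = nodes.iter().map(|&n| (n, 0)).collect();
    let mut children: HashMap<&'a str, Vec<&'a str>> = HashMap::new();

    for &(from, to) in edges {
        // An edge that references an undeclared node makes the graph invalid.
        if !in_degree.contains_key(from) {
            return None;
        }
        *in_degree.get_mut(to)? += 1;
        children.entry(from).or_default().push(to);
    }

    // Start with the nodes that have no dependencies, in declaration order.
    let mut ready: VecDeque<&'a str> = nodes
        .iter()
        .copied()
        .filter(|n| in_degree[n] == 0)
        .collect();
    let mut order = Vec::with_capacity(nodes.len());

    while let Some(node) = ready.pop_front() {
        order.push(node);
        if let Some(kids) = children.get(node) {
            for &child in kids {
                let degree = in_degree.get_mut(child)?;
                *degree -= 1;
                if *degree == 0 {
                    ready.push_back(child);
                }
            }
        }
    }

    // Nodes on a cycle never reach in-degree zero, so they are never emitted.
    (order.len() == nodes.len()).then_some(order)
}

fn main() {
    let nodes = ["fetch", "render", "extract"];
    let edges = [("fetch", "render"), ("render", "extract")];
    println!("{:?}", topological_order(&nodes, &edges));
}
```

Rejecting cycles up front is what lets a pipeline be validated before any node runs, which is presumably the role of the `validate()` step in the Quick Start.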
§Quick Start
use stygian_graph::domain::graph::Pipeline;
use stygian_graph::domain::pipeline::PipelineUnvalidated;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a simple scraping pipeline
    let config = serde_json::json!({
        "nodes": [],
        "edges": []
    });

    let pipeline = PipelineUnvalidated::new(config)
        .validate()?
        .execute()
        .complete(serde_json::json!({"status": "success"}));

    println!("Pipeline complete: {:?}", pipeline.results());
    Ok(())
}
§Architecture
Stygian follows hexagonal (ports & adapters) architecture:
- Domain: Core business logic (graph execution, pipeline orchestration)
- Ports: Trait definitions (service interfaces, abstractions)
- Adapters: Implementations (HTTP, AI providers, storage, caching)
- Application: Orchestration (service registry, executor, CLI)
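The four layers above can be sketched in a few lines. This is an illustration of the ports and adapters pattern only, under stated assumptions: `FetchPort`, `StaticFetcher`, and `page_length` are hypothetical names and do not correspond to items in the crate's `ports` or `adapters` modules.

```rust
use std::collections::HashMap;

/// Port: the interface the domain depends on (hypothetical name).
trait FetchPort {
    fn fetch(&self, url: &str) -> Result<String, String>;
}

/// Adapter: one concrete implementation of the port. A real adapter would
/// perform HTTP I/O; this stub serves canned bodies so the sketch has no
/// external dependencies.
struct StaticFetcher {
    pages: HashMap<String, String>,
}

impl FetchPort for StaticFetcher {
    fn fetch(&self, url: &str) -> Result<String, String> {
        self.pages
            .get(url)
            .cloned()
            .ok_or_else(|| format!("no page for {url}"))
    }
}

/// Domain: written against the port only, so adapters can be swapped
/// without touching this function.
fn page_length(fetcher: &dyn FetchPort, url: &str) -> Result<usize, String> {
    Ok(fetcher.fetch(url)?.len())
}

fn main() {
    // Application: chooses an adapter and wires it into the domain logic.
    let mut pages = HashMap::new();
    pages.insert("https://example.com".to_string(), "<html></html>".to_string());
    let fetcher = StaticFetcher { pages };
    println!("{:?}", page_length(&fetcher, "https://example.com"));
}
```

Because the domain function only sees `&dyn FetchPort`, a test double like `StaticFetcher` and a network-backed adapter are interchangeable.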
§Features
- 🕸️ Graph-based execution: DAG pipelines with petgraph
- 🤖 Multi-AI support: Claude, GPT, Gemini, Copilot, Ollama
- 🌐 JavaScript rendering: Optional browser automation via stygian-browser
- 📊 Multi-modal extraction: HTML, PDF, images, video, audio
- 🛡️ Anti-bot handling: User-Agent rotation, proxy support, rate limiting
- 🚀 High concurrency: Worker pools, backpressure, Tokio + Rayon
- 🔄 Idempotent operations: Safe retries with idempotency keys
- 📈 Observability: Metrics, tracing, monitoring
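As an illustration of the "safe retries with idempotency keys" item, the sketch below caches each successful result under a caller-supplied key, so a retry with the same key returns the stored result instead of repeating the work. It is a generic rendering of the technique, not the crate's mechanism; `IdempotentStore` and `run_once` are hypothetical names.

```rust
use std::collections::HashMap;

/// Remembers the result of each successfully completed operation by key.
/// Hypothetical type for illustration; not part of the stygian_graph API.
struct IdempotentStore {
    completed: HashMap<String, String>,
}

impl IdempotentStore {
    fn new() -> Self {
        Self { completed: HashMap::new() }
    }

    /// Runs `op` unless `key` has already succeeded, in which case the
    /// stored result is returned and `op` is not called. Failures are not
    /// recorded, so a failed operation can be retried under the same key.
    fn run_once<F>(&mut self, key: &str, op: F) -> Result<String, String>
    where
        F: FnOnce() -> Result<String, String>,
    {
        if let Some(result) = self.completed.get(key) {
            return Ok(result.clone());
        }
        let result = op()?;
        self.completed.insert(key.to_string(), result.clone());
        Ok(result)
    }
}

fn main() {
    let mut store = IdempotentStore::new();
    let mut calls = 0;

    let first = store.run_once("job-1", || {
        calls += 1;
        Ok("scraped".to_string())
    });
    // The retry reuses the key, so the closure is skipped.
    let retry = store.run_once("job-1", || {
        calls += 1;
        Ok("scraped again".to_string())
    });

    println!("first={first:?} retry={retry:?} calls={calls}");
}
```

A production version would also need expiry and a shared backing store so that keys survive process restarts; both are omitted here.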
§Crate Features
- browser (default): Include stygian-browser for JavaScript rendering
- full: All features enabled
§Request Signing
Use ports::signing::SigningPort + adapters::signing::HttpSigningAdapter to attach
HMAC signatures, AWS Sig V4, OAuth 1.0a, or Frida RPC tokens to any outbound request.
No feature flag required — zero additional dependencies.
Re-exports§
pub use stygian_browser;
Modules§
- adapters: Adapter implementations - infrastructure concerns
- application: Application layer - orchestration and coordination
- domain: Core domain logic - graph execution, pipelines, orchestration
- error: Error types used throughout the crate
- ports: Port trait definitions - service abstractions
- prelude: Re-exports for convenient imports