Crate stygian_graph

§Stygian Graph

A high-performance, graph-based web scraping engine for Rust.

§Overview

Stygian models each scraping pipeline as a directed acyclic graph (DAG) in which every node is a pluggable service module, such as an HTTP fetcher, an AI extractor, or a headless browser. It is built for high concurrency and extensibility on a hexagonal architecture.
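To make the DAG idea concrete, the following is a minimal sketch of how a dependency-respecting execution order could be derived from nodes and edges. It uses only the standard library and Kahn's algorithm; it is not Stygian's implementation (the crate lists petgraph for its graph handling), and the function name `topological_order` and the node names are hypothetical.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical helper, not part of stygian_graph.
/// Returns the nodes in an order where each node comes after all of its
/// dependencies, or `None` if an edge names an unknown node or the edges
/// contain a cycle (so the graph is not a DAG). Node names are assumed unique.
fn topological_order<'a>(
    nodes: &[&'a str],
    edges: &[(&'a str, &'a str)],
) -> Option<Vec<&'a str>> {
    // Count incoming edges per node and record each node's successors.
    let mut in_degree: HashMap<&'a str, usize> = nodes.iter().map(|n| (*n, 0)).collect();
    let mut successors: HashMap<&'a str, Vec<&'a str>> = HashMap::new();
    for &(from, to) in edges {
        if !in_degree.contains_key(from) {
            return None;
        }
        *in_degree.get_mut(to)? += 1;
        successors.entry(from).or_default().push(to);
    }

    // Start with the nodes that have no dependencies, in declaration order.
    let mut ready: VecDeque<&'a str> = nodes
        .iter()
        .copied()
        .filter(|n| in_degree[n] == 0)
        .collect();

    let mut order = Vec::with_capacity(nodes.len());
    while let Some(node) = ready.pop_front() {
        order.push(node);
        // Finishing `node` releases one dependency of each successor.
        for &next in successors.get(node).into_iter().flatten() {
            let remaining = in_degree.get_mut(next)?;
            *remaining -= 1;
            if *remaining == 0 {
                ready.push_back(next);
            }
        }
    }

    // Any node left unprocessed sits on a cycle.
    if order.len() == nodes.len() {
        Some(order)
    } else {
        None
    }
}

fn main() {
    let nodes = ["fetch", "extract", "store"];
    let edges = [("fetch", "extract"), ("extract", "store")];
    println!("{:?}", topological_order(&nodes, &edges));
}
```

Nodes that become ready at the same time have no ordering constraint between them, which is what lets a DAG executor run them concurrently.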

§Quick Start

use stygian_graph::domain::graph::Pipeline;
use stygian_graph::domain::pipeline::PipelineUnvalidated;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a simple scraping pipeline
    let config = serde_json::json!({
        "nodes": [],
        "edges": []
    });
     
    let pipeline = PipelineUnvalidated::new(config)
        .validate()?
        .execute()
        .complete(serde_json::json!({"status": "success"}));
     
    println!("Pipeline complete: {:?}", pipeline.results());
    Ok(())
}

§Architecture

Stygian follows hexagonal (ports & adapters) architecture:

  • Domain: Core business logic (graph execution, pipeline orchestration)
  • Ports: Trait definitions (service interfaces, abstractions)
  • Adapters: Implementations (HTTP, AI providers, storage, caching)
  • Application: Orchestration (service registry, executor, CLI)
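The layering above can be illustrated with a small, self-contained sketch. The names `FetchPort`, `StaticFetcher`, and `page_length` are hypothetical and are not taken from the crate's ports or adapters modules; the point is only to show domain code depending on a trait (the port) while an interchangeable implementation (the adapter) is supplied from outside.

```rust
use std::collections::HashMap;

/// Port: the only thing the domain layer knows about fetching.
/// (Hypothetical trait, not the crate's actual port definition.)
trait FetchPort {
    fn fetch(&self, url: &str) -> Result<String, String>;
}

/// Adapter: an in-memory implementation, useful for tests.
/// A real adapter would perform an HTTP request instead.
struct StaticFetcher {
    pages: HashMap<String, String>,
}

impl FetchPort for StaticFetcher {
    fn fetch(&self, url: &str) -> Result<String, String> {
        self.pages
            .get(url)
            .cloned()
            .ok_or_else(|| format!("no page for {url}"))
    }
}

/// Domain logic: written against the port, unaware of which adapter is used.
fn page_length(fetcher: &dyn FetchPort, url: &str) -> Result<usize, String> {
    Ok(fetcher.fetch(url)?.len())
}

fn main() {
    let mut pages = HashMap::new();
    pages.insert("https://example.com".to_string(), "<html></html>".to_string());
    let fetcher = StaticFetcher { pages };

    println!("{:?}", page_length(&fetcher, "https://example.com"));
    println!("{:?}", page_length(&fetcher, "https://example.org"));
}
```

Because `page_length` accepts any `FetchPort`, swapping the in-memory adapter for a networked one requires no change to the domain function.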

§Features

  • 🕸️ Graph-based execution: DAG pipelines with petgraph
  • 🤖 Multi-AI support: Claude, GPT, Gemini, Copilot, Ollama
  • 🌐 JavaScript rendering: Optional browser automation via stygian-browser
  • 📊 Multi-modal extraction: HTML, PDF, images, video, audio
  • 🛡️ Anti-bot handling: User-Agent rotation, proxy support, rate limiting
  • 🚀 High concurrency: Worker pools, backpressure, Tokio + Rayon
  • 🔄 Idempotent operations: Safe retries with idempotency keys
  • 📈 Observability: Metrics, tracing, monitoring
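As an illustration of the idempotency-key idea in the list above, here is a minimal sketch, assuming results are kept in memory and an operation always succeeds. `IdempotentStore` and `run_once` are hypothetical names and do not describe the crate's actual API.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory store: keeps one result per idempotency key so a
/// retried request does not repeat its side effect.
struct IdempotentStore {
    results: HashMap<String, String>,
}

impl IdempotentStore {
    fn new() -> Self {
        Self { results: HashMap::new() }
    }

    /// Runs `operation` only the first time `key` is seen;
    /// later calls with the same key return the stored result.
    fn run_once<F>(&mut self, key: &str, operation: F) -> String
    where
        F: FnOnce() -> String,
    {
        if let Some(existing) = self.results.get(key) {
            return existing.clone();
        }
        let result = operation();
        self.results.insert(key.to_string(), result.clone());
        result
    }
}

fn main() {
    let mut store = IdempotentStore::new();
    let mut calls = 0;

    // The first attempt executes the operation.
    let first = store.run_once("job-1", || {
        calls += 1;
        "fetched".to_string()
    });
    // The retry reuses the same key, so the operation is skipped.
    let retry = store.run_once("job-1", || {
        calls += 1;
        "fetched again".to_string()
    });

    println!("first = {first}, retry = {retry}, calls = {calls}");
}
```

A production version would also need to decide how failed attempts are recorded and when stored keys expire; this sketch leaves both out.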

§Crate Features

  • browser (default): Include stygian-browser for JavaScript rendering
  • full: All features enabled

§Request Signing

Use ports::signing::SigningPort together with adapters::signing::HttpSigningAdapter to attach HMAC signatures, AWS Sig V4, OAuth 1.0a, or Frida RPC tokens to any outbound request. Request signing needs no feature flag and adds no dependencies.

Re-exports§

pub use stygian_browser;

Modules§

  • adapters: Adapter implementations - infrastructure concerns
  • application: Application layer - orchestration and coordination
  • domain: Core domain logic - graph execution, pipelines, orchestration
  • error: Error types used throughout the crate
  • ports: Port trait definitions - service abstractions
  • prelude: Re-exports for convenient imports