Graph MCP Tools

stygian-graph exposes scraping, graph inspection, and optional Charon diagnostics tools.


Enabling

[dependencies]
stygian-graph = { version = "*", features = ["mcp"] }

Enable Charon-backed diagnostics/planning tools with:

stygian-graph = { version = "*", features = ["mcp", "charon"] }

To use as a standalone MCP server (without the aggregator), embed McpGraphServer in your own binary:

use stygian_graph::mcp::McpGraphServer;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    McpGraphServer::new().run().await
}

When using the aggregator, all tools are prefixed with graph_ (e.g. graph_scrape instead of scrape).


Tools

scrape

Fetch a URL with anti-bot User-Agent rotation and automatic retries. Returns raw HTML or JSON content with response metadata.

Parameters:

  • url (string, required): Target URL
  • timeout_secs (integer): Request timeout in seconds (default: 30)
  • proxy_url (string): HTTP/SOCKS5 proxy URL, e.g. socks5://user:pass@host:1080
  • rotate_ua (boolean): Rotate the User-Agent header on each request (default: true)

Returns:

{
  "data": "<html>...</html>",
  "metadata": { "status": 200, "url": "https://...", "content_type": "text/html" }
}
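
As a sketch, a scrape call routed through a proxy might look like the following (the proxy address and timeout are placeholder values, not defaults):

```json
{
  "url": "https://example.com",
  "timeout_secs": 15,
  "proxy_url": "socks5://user:pass@proxy.example.com:1080",
  "rotate_ua": true
}
```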

scrape_rest

Call a REST/JSON API endpoint. Supports all common HTTP methods, authentication schemes, query parameters, arbitrary request bodies, pagination, and dot-path response extraction.

Parameters:

  • url (string, required): API endpoint URL
  • method (string): HTTP method: GET, POST, PUT, PATCH, DELETE (default: GET)
  • auth (object): Authentication config (see below)
  • query (object): URL query parameters as key-value pairs
  • body (object): JSON request body
  • headers (object): Custom request headers
  • pagination (object): Pagination config (see below)
  • data_path (string): Dot-separated path to extract from response, e.g. data.items

auth object:

  • type (bearer | api_key | basic | header): Auth scheme
  • token (string): Token or credential value
  • header (string): Custom header name (when type = "header")

pagination object:

  • strategy (link_header | offset | cursor): Pagination style
  • max_pages (integer): Maximum pages to fetch (default: 1)

Example — GitHub issues list:

{
  "url": "https://api.github.com/repos/owner/repo/issues",
  "auth": { "type": "bearer", "token": "ghp_..." },
  "query": { "state": "open", "per_page": "100" },
  "data_path": ""
}
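
Pagination can be layered onto the same call. GitHub uses Link-header pagination, so a multi-page variant of the example above might look like this (the page count here is illustrative):

```json
{
  "url": "https://api.github.com/repos/owner/repo/issues",
  "auth": { "type": "bearer", "token": "ghp_..." },
  "query": { "state": "open", "per_page": "100" },
  "pagination": { "strategy": "link_header", "max_pages": 5 }
}
```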

scrape_graphql

Execute a GraphQL query or mutation against any spec-compliant endpoint.

Parameters:

  • url (string, required): GraphQL endpoint URL
  • query (string, required): GraphQL query or mutation string
  • variables (object): Query variables (JSON object)
  • auth (object): Auth config (see below)
  • data_path (string): Dot-separated path to extract, e.g. data.countries
  • timeout_secs (integer): Request timeout in seconds (default: 30)

auth object:

  • kind (bearer | api_key | header | none): Auth scheme
  • token (string): Auth token or key
  • header_name (string): Custom header name (default: X-Api-Key)

Example — countries query:

{
  "url": "https://countries.trevorblades.com/graphql",
  "query": "{ countries { name capital currency } }",
  "data_path": "data.countries"
}
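
A parameterized query with bearer auth might be sketched as follows (the endpoint, query shape, and token are placeholders, not a real API):

```json
{
  "url": "https://api.example.com/graphql",
  "query": "query Country($code: ID!) { country(code: $code) { name capital } }",
  "variables": { "code": "DE" },
  "auth": { "kind": "bearer", "token": "..." },
  "data_path": "data.country"
}
```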

scrape_sitemap

Parse a sitemap.xml or sitemap index and return all discovered URLs with their priorities and change frequencies.

Parameters:

  • url (string, required): Sitemap URL (sitemap.xml or sitemap index)
  • max_depth (integer): Maximum sitemap index recursion depth (default: 5)

Returns: A JSON array of URL entries:

{
  "data": [
    { "url": "https://example.com/page", "priority": 0.8, "changefreq": "weekly" }
  ],
  "metadata": { "total_urls": 1234, "source": "https://example.com/sitemap.xml" }
}
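
A minimal call, capping recursion below the default for a large sitemap index (the URL is a placeholder):

```json
{
  "url": "https://example.com/sitemap.xml",
  "max_depth": 2
}
```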

scrape_rss

Parse an RSS 2.0 or Atom feed and return all items as structured JSON.

Parameters:

  • url (string, required): RSS/Atom feed URL

Returns: A JSON array of feed items:

{
  "data": [
    {
      "title": "Article title",
      "link": "https://...",
      "published": "2025-03-01T12:00:00Z",
      "description": "..."
    }
  ],
  "metadata": { "feed_title": "My Blog", "total_items": 20 }
}

pipeline_validate

Parse and validate a TOML pipeline definition without executing it. Returns the parsed node and service lists, detected cycles, and computed topological execution order.

Parameters:

  • toml (string, required): TOML pipeline definition string

Returns on success:

{
  "valid": true,
  "services": ["http_default", "graphql_api"],
  "nodes": ["fetch_homepage", "extract_links"],
  "execution_order": ["fetch_homepage", "extract_links"]
}

Returns on failure:

{
  "valid": false,
  "error": "Cycle detected: node_a → node_b → node_a"
}
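
Because the pipeline is passed as a single string, the TOML must be embedded with escaped newlines. A sketch of a call validating a one-node pipeline (node and service names are illustrative):

```json
{
  "toml": "[[services]]\nname = \"http_default\"\nkind = \"http\"\n\n[[nodes]]\nname = \"fetch\"\nservice = \"http_default\"\nurl = \"https://example.com\""
}
```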

pipeline_run

Parse, validate, and execute a TOML pipeline DAG.

  • Nodes of kind http, rest, graphql, sitemap, and rss are executed directly.
  • Nodes of kind ai are recorded in the skipped list.
  • Nodes of kind browser are executed only when the node includes an opt-in acquisition block and stygian-graph is built with the acquisition-runner feature. If either condition is missing — no acquisition block, or the feature is not enabled — the node is skipped.

Parameters:

  • toml (string, required): TOML pipeline definition string
  • timeout_secs (integer): Per-node timeout in seconds (default: 30)

Returns:

{
  "outputs": {
    "fetch_homepage": { "data": "<html>...", "metadata": { "status": 200 } }
  },
  "skipped": ["ai_extract"],
  "errors": {}
}

Browser acquisition opt-in example:

[[services]]
name = "browser_service"
kind = "browser"

[[nodes]]
name = "render"
service = "browser_service"
url = "https://example.com"

[nodes.params.acquisition]
enabled = true
mode = "resilient"
wait_for_selector = "main"
total_timeout_secs = 45

If the acquisition block is omitted, browser nodes remain non-breaking and are added to skipped as before.


Optional Charon tools

These tools are available only when stygian-graph is built with the charon feature.

charon_classify_transaction

Classify a single HTTP transaction for likely anti-bot provider signals.

charon_investigate_har

Turn a HAR payload into a normalized InvestigationReport.

charon_infer_requirements

Infer operational requirements from an existing investigation report.

charon_build_runtime_policy

Build a runtime policy from an investigation report plus inferred requirements.

charon_map_runtime_policy

Map a runtime policy into acquisition hints suitable for downstream runners.

charon_analyze_and_plan

Run HAR investigation, requirement inference, runtime policy planning, and acquisition mapping in one call.

As with the core tools, the aggregator prefixes these names automatically, for example:

  • graph_charon_classify_transaction
  • graph_charon_investigate_har
  • graph_charon_analyze_and_plan

Example pipeline TOML:

[[services]]
name  = "http_default"
kind  = "http"

[[nodes]]
name    = "fetch_homepage"
service = "http_default"
url     = "https://example.com"

[[nodes]]
name       = "fetch_about"
service    = "http_default"
url        = "https://example.com/about"
depends_on = ["fetch_homepage"]
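
Passed to pipeline_run, the pipeline above would be supplied as one escaped string; a sketch of the call:

```json
{
  "toml": "[[services]]\nname = \"http_default\"\nkind = \"http\"\n\n[[nodes]]\nname = \"fetch_homepage\"\nservice = \"http_default\"\nurl = \"https://example.com\"\n\n[[nodes]]\nname = \"fetch_about\"\nservice = \"http_default\"\nurl = \"https://example.com/about\"\ndepends_on = [\"fetch_homepage\"]",
  "timeout_secs": 30
}
```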