
Skill: RAG corpus prep

FREE with proof-of-work · or $0.050 in USDC · POST /api/skill/rag-prep

Bundled execution of the RAG corpus prep workflow: take a raw document and turn it into a vector-DB-ready JSONL dataset, deterministically. The workflow measures the corpus, counts tokens with the OpenAI BPE tokenizer, chunks on token boundaries, attaches entities and keywords as metadata, emits NDJSON, then validates every record against a JSON Schema before you ingest it. All seven tools are pure-CPU and free-tier eligible, so this is the canonical 'prep my docs for embeddings' workflow done as deterministic tool calls instead of a hand-rolled Python script. One x402 payment runs the 7 underlying tools (text-stats, token-count, text-chunk, extract-entities, keywords, jsonl, json-validate), and each step reports success or failure independently (partial success).

Input

Field   Type     Description
doc     string   the source document to prep for embedding (raw text, no markup required)

Example output

{
  "pack": "rag-prep",
  "args": {
    "doc": "Alice from acme@example.com filed a support ticket on 2026-06-21 about the checkout flow returning a 502 from api.acme.com/v2/orders. Engineer Bob investigated and found the issue was a connection-pool exhaustion in the order-service: postgres max_connections was 100 and the pool had been silently leaking since the rollout of feature flag #orders-2026. Fix landed in commit 9a3b2c1; deploy went out 2026-06-22. Follow-up: add pgbouncer in front of the order-service and an alert on pool.in_use / max_connections > 0.8 in PagerDuty. Slack thread: #incident-orders-502. Mentioned engineers: @alice @bob @carol."
  },
  "steps": [
    {
      "slug": "text-stats",
      "ok": true,
      "result": {}
    },
    {
      "slug": "token-count",
      "ok": true,
      "result": {}
    },
    {
      "slug": "text-chunk",
      "ok": true,
      "result": {}
    },
    {
      "slug": "extract-entities",
      "ok": true,
      "result": {}
    },
    {
      "slug": "keywords",
      "ok": true,
      "result": {}
    },
    {
      "slug": "jsonl",
      "ok": true,
      "result": {}
    },
    {
      "slug": "json-validate",
      "ok": true,
      "result": {}
    }
  ],
  "summary": "7/7 steps succeeded"
}
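
Because each step succeeds or fails on its own, a caller will usually want to check the per-step `ok` flags before ingesting anything. The following is a minimal sketch, assuming the response has the shape shown in the example output above. The helper name `summarizeSteps` is hypothetical, and treating a failed step as `ok: false` is an assumption, since the example only shows successful steps.

```javascript
// Assumes the response shape from the example output:
// { pack, args, steps: [{ slug, ok, result }], summary }.
// That a failed step reports `ok: false` is an assumption, not documented behavior.
function summarizeSteps(response) {
  const steps = Array.isArray(response?.steps) ? response.steps : [];
  // Anything not explicitly `ok: true` is treated as a failure.
  const failed = steps.filter((step) => step.ok !== true).map((step) => step.slug);
  return { total: steps.length, succeeded: steps.length - failed.length, failed };
}

// Hypothetical response in which one step failed.
const sample = {
  pack: "rag-prep",
  steps: [
    { slug: "text-stats", ok: true, result: {} },
    { slug: "text-chunk", ok: false, result: {} },
    { slug: "jsonl", ok: true, result: {} },
  ],
};

const report = summarizeSteps(sample);
console.log(`${report.succeeded}/${report.total} steps succeeded`);
if (report.failed.length > 0) {
  console.log("failed steps:", report.failed.join(", "));
}
```

Checking the flags directly, instead of parsing the `summary` string, keeps the caller independent of how that string is worded.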

Try it — see the 402 challenge (free)

curl -i -X POST https://agent402.tools/api/skill/rag-prep \
  -H "Content-Type: application/json" \
  -d '{"doc":"Alice from acme@example.com filed a support ticket on 2026-06-21 about the checkout flow returning a 502 from api.acme.com/v2/orders. Engineer Bob investigated and found the issue was a connection-pool exhaustion in the order-service: postgres max_connections was 100 and the pool had been silently leaking since the rollout of feature flag #orders-2026. Fix landed in commit 9a3b2c1; deploy went out 2026-06-22. Follow-up: add pgbouncer in front of the order-service and an alert on pool.in_use / max_connections > 0.8 in PagerDuty. Slack thread: #incident-orders-502. Mentioned engineers: @alice @bob @carol."}'

The response is HTTP 402 Payment Required with exact payment requirements. Any x402 v2 client pays automatically and retries:

Paid call (JavaScript agent)

import { wrapFetchWithPayment } from "@x402/fetch";
import { x402Client } from "@x402/core/client";
import { registerExactEvmScheme } from "@x402/evm/exact/client";
import { privateKeyToAccount } from "viem/accounts";

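// KEY is the agent wallet's private key as a 0x-prefixed hex string.
// Reading it from an environment variable named PRIVATE_KEY is an assumption; use your own secret store.
const KEY = process.env.PRIVATE_KEY;

const client = new x402Client();
registerExactEvmScheme(client, { signer: privateKeyToAccount(KEY) });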
const payFetch = wrapFetchWithPayment(fetch, client);

const res = await payFetch("https://agent402.tools/api/skill/rag-prep", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    "doc": "Alice from acme@example.com filed a support ticket on 2026-06-21 about the checkout flow returning a 502 from api.acme.com/v2/orders. Engineer Bob investigated and found the issue was a connection-pool exhaustion in the order-service: postgres max_connections was 100 and the pool had been silently leaking since the rollout of feature flag #orders-2026. Fix landed in commit 9a3b2c1; deploy went out 2026-06-22. Follow-up: add pgbouncer in front of the order-service and an alert on pool.in_use / max_connections > 0.8 in PagerDuty. Slack thread: #incident-orders-502. Mentioned engineers: @alice @bob @carol."
  }),
});

No wallet? Pay with compute

This is a pure-CPU tool, so an agent without a wallet can pay with proof-of-work instead of USDC: fetch a challenge, solve the SHA-256 puzzle (16 leading zero bits, which takes a fraction of a second of CPU and costs no money and no AI tokens), and resend the request with the X-Pow-Solution header.

import { createHash } from "node:crypto";
// Count the leading zero bits of a byte buffer (the hash digest).
const lz = (b) => { let t = 0; for (const x of b) { if (!x) { t += 8; continue; } t += Math.clz32(x) - 24; break; } return t; };
// Fetch a proof-of-work challenge for this skill.
const c = await (await fetch("https://agent402.tools/api/pow/challenge?slug=skill-rag-prep")).json();
// Brute-force a nonce until sha256(challenge + ":" + nonce) meets the required difficulty.
let n = 0;
while (lz(createHash("sha256").update(c.challenge + ":" + n).digest()) < c.difficulty) n++;
// Resend the original request with the solved token in the X-Pow-Solution header.
await fetch("https://agent402.tools/api/skill/rag-prep", { method: "POST", headers: { "X-Pow-Solution": c.token + ":" + n, "Content-Type": "application/json" }, body: JSON.stringify({"doc":"Alice from acme@example.com filed a support ticket on 2026-06-21 about the checkout flow returning a 502 from api.acme.com/v2/orders. Engineer Bob investigated and found the issue was a connection-pool exhaustion in the order-service: postgres max_connections was 100 and the pool had been silently leaking since the rollout of feature flag #orders-2026. Fix landed in commit 9a3b2c1; deploy went out 2026-06-22. Follow-up: add pgbouncer in front of the order-service and an alert on pool.in_use / max_connections > 0.8 in PagerDuty. Slack thread: #incident-orders-502. Mentioned engineers: @alice @bob @carol."}) });
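
To make the difficulty check concrete, here is the leading-zero-bit count from the `lz` helper above, written out on its own with a few hand-checkable inputs. This is only an illustrative sketch: the names `leadingZeroBits` and `expectedAttempts` are hypothetical, and the attempt estimate assumes SHA-256 output behaves like uniformly random bits.

```javascript
// Same logic as the compact `lz` helper, expanded for readability.
function leadingZeroBits(bytes) {
  let total = 0;
  for (const byte of bytes) {
    if (byte === 0) {
      total += 8; // a zero byte contributes eight zero bits; keep scanning
      continue;
    }
    // Math.clz32 counts zeros in a 32-bit word; a byte sits in the low 8 bits,
    // so subtract the 24 bits of padding above it.
    total += Math.clz32(byte) - 24;
    break;
  }
  return total;
}

const samples = [
  Uint8Array.of(0x80, 0x00),       // first bit is set: 0 leading zero bits
  Uint8Array.of(0x00, 0x01),       // 8 + 7 = 15 leading zero bits
  Uint8Array.of(0x00, 0x00, 0x7f), // 8 + 8 + 1 = 17 leading zero bits
];
for (const sample of samples) {
  console.log(leadingZeroBits(sample));
}

// Under the uniform-hash assumption, each attempt meets difficulty d with
// probability 2^-d, so the expected number of attempts is 2^d.
const expectedAttempts = (difficulty) => 2 ** difficulty;
console.log(expectedAttempts(16)); // 65536
```

At 16 bits that is roughly 65,000 hashes on average, which is consistent with the fraction-of-a-second figure quoted above.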

Related tools

Skill: Security audit

USDC $0.12 · POST /api/skill/security-audit

Bundled execution of the Security audit workflow — Enumerate a domain's external attack surface in one workflow: certs, …

Skill: Email deliverability

USDC $0.10 · POST /api/skill/email-deliverability

Bundled execution of the Email deliverability workflow — Diagnose why a domain's email lands in spam: SPF posture, DMARC…

Skill: Financial research

USDC $1.50 · POST /api/skill/financial-research

Bundled execution of the Financial research workflow — Pull SEC filings, real-time quotes, historical prices, and macro …