41 utilities found
Token Optimizer
Count tokens and estimate cost across every major model.
Standalone Python script that estimates token counts and cost for GPT-4, Claude, Llama, Mistral, Gemini, and DeepSeek models. Runs locally, no API key.
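
The core estimate is small enough to sketch. Illustrative only: the roughly-four-characters-per-token heuristic and the prices below are placeholder assumptions, not the script's actual tables.

    PRICE_PER_1K_INPUT = {"gpt-4o": 0.0025, "claude-sonnet": 0.003}  # USD, hypothetical

    def estimate_tokens(text: str) -> int:
        # Crude heuristic: English prose averages ~4 characters per token.
        return max(1, len(text) // 4)

    def estimate_cost(text: str, model: str) -> float:
        return estimate_tokens(text) / 1000 * PRICE_PER_1K_INPUT[model]

    print(estimate_cost("Summarize this document in three bullets.", "gpt-4o"))
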
Prompt Compressor
Shrink prompts without losing meaning.
Reduce prompt length while preserving intent. Strip filler, deduplicate, compress whitespace. Runs locally.
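
A minimal sketch of the three passes, with an assumed filler-word list:

    import re

    FILLER = {"please", "kindly", "basically", "really", "very"}  # assumed list

    def compress(prompt: str) -> str:
        seen, out = set(), []
        for line in prompt.splitlines():
            line = re.sub(r"\s+", " ", line).strip()                # compress whitespace
            line = " ".join(w for w in line.split() if w.lower() not in FILLER)
            if line and line not in seen:                           # deduplicate lines
                seen.add(line)
                out.append(line)
        return "\n".join(out)
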
Model Router
Pick the right model for the task and the budget.
Routing logic that selects the cheapest model capable of handling a given task class. Drop-in decision layer.
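
The decision layer reduces to "cheapest model whose capabilities cover the task". A sketch with a hypothetical capability table:

    MODELS = [  # names, prices, and capability sets are illustrative
        {"name": "small",  "usd_per_1m": 0.15, "handles": {"classify", "extract"}},
        {"name": "medium", "usd_per_1m": 0.60, "handles": {"classify", "extract", "summarize"}},
        {"name": "large",  "usd_per_1m": 3.00, "handles": {"classify", "extract", "summarize", "reason"}},
    ]

    def route(task_class: str) -> str:
        capable = [m for m in MODELS if task_class in m["handles"]]
        if not capable:
            raise ValueError(f"no model handles {task_class!r}")
        return min(capable, key=lambda m: m["usd_per_1m"])["name"]

    print(route("summarize"))  # -> "medium"
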
Prompt Chain Builder
Compose multi-step LLM workflows without a framework.
Lightweight chain builder for wiring a sequence of LLM calls, passing context from step to step. No LangChain required.
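
The pattern is a fold over callables: each step's output becomes the next step's context. A sketch, with fake_llm standing in for a real client:

    def run_chain(steps, seed):
        context = seed
        for step in steps:
            context = step(context)   # each call's output feeds the next
        return context

    fake_llm = lambda prompt: f"[answer to: {prompt}]"   # stand-in for your client
    steps = [
        lambda ctx: fake_llm(f"Extract the key claims from: {ctx}"),
        lambda ctx: fake_llm(f"Rank these claims by importance: {ctx}"),
    ]
    print(run_chain(steps, "LLM frameworks add overhead."))
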
Agent Monitor
Watch what your agents are actually doing.
Drop-in observability layer for LLM agents. Logs prompts, responses, latency, and spend to local files.
JSON Repair
Fix the #1 LLM output problem: broken JSON.
Repairs malformed JSON from LLM output. Handles trailing commas, single quotes, unquoted keys, code fences, truncation, Python booleans, and more.
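
A few of those fixes are easy to sketch; these are deliberately naive versions, not the utility's real logic:

    import json, re

    def repair_json(raw: str):
        s = raw.strip()
        s = re.sub(r"^```(?:json)?\s*|\s*```$", "", s)           # strip code fences
        s = s.replace("True", "true").replace("False", "false")  # naive global swap
        s = s.replace("None", "null")
        s = re.sub(r",\s*([}\]])", r"\1", s)                     # trailing commas
        s = re.sub(r"(?<=[{,])\s*'([^']*)'\s*:", r'"\1":', s)    # single-quoted keys
        s = re.sub(r":\s*'([^']*)'", r': "\1"', s)               # single-quoted values
        return json.loads(s)

    print(repair_json("```json\n{'ok': True, 'items': [1, 2,],}\n```"))
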
Response Validator
Validate LLM JSON output against a schema and auto-retry on failure.
Pure-stdlib JSON Schema subset validator + a retry harness that asks the LLM to fix its output when validation fails. Complements json-repair (text fixes) by validating structure.
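
The retry harness is the interesting half. A sketch in which call_llm and validate are stand-ins for your client and schema check (validate returns a list of error strings, empty when valid):

    def validate_with_retry(call_llm, validate, prompt, max_attempts=3):
        attempt_prompt = prompt
        for _ in range(max_attempts):
            raw = call_llm(attempt_prompt)
            errors = validate(raw)
            if not errors:
                return raw
            # Feed the errors back so the model can repair its own output.
            attempt_prompt = (prompt + "\n\nYour previous reply failed validation:\n"
                              + "\n".join(errors) + "\nReturn corrected JSON only.")
        raise ValueError("output never passed validation")
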
Output Sanitizer
Strip the garbage LLMs add to their output.
Removes the preamble, postamble, filler, and formatting junk LLMs love to tack on so you get just the answer.
Retry with Backoff
Universal retry wrapper for LLM API calls.
Exponential backoff with jitter, rate-limit awareness, and error classification. Wraps any callable.
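
The core recipe is exponential backoff with full jitter. A minimal sketch (error classification omitted):

    import random, time

    def retry(fn, max_attempts=5, base=0.5, cap=30.0):
        for attempt in range(max_attempts):
            try:
                return fn()
            except Exception:
                if attempt == max_attempts - 1:
                    raise
                # Full jitter: sleep a random amount up to the exponential ceiling.
                time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
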
Rate Limit Handler
Queue LLM requests and respect per-provider limits.
Token-bucket queue that keeps you under per-provider RPM and TPM caps without getting 429'd.
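
A token bucket in miniature; this sketch models RPM only, and a real limiter would track TPM with a second bucket:

    import time

    class TokenBucket:
        def __init__(self, rate: float, capacity: float):
            self.rate, self.capacity = rate, capacity          # refill/sec, burst size
            self.tokens, self.updated = capacity, time.monotonic()

        def acquire(self, cost: float = 1.0) -> None:
            while True:
                now = time.monotonic()
                self.tokens = min(self.capacity,
                                  self.tokens + (now - self.updated) * self.rate)
                self.updated = now
                if self.tokens >= cost:
                    self.tokens -= cost
                    return
                time.sleep((cost - self.tokens) / self.rate)   # wait for refill

    bucket = TokenBucket(rate=60 / 60.0, capacity=5)  # roughly 60 requests/minute
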
Streaming Handler
Clean SSE parser for streaming LLM responses.
Parses Server-Sent Events from OpenAI/Anthropic/OpenRouter streaming endpoints. No aiohttp, no OpenAI SDK: just raw bytes in, tokens out.
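
SSE framing is just events separated by blank lines, with payloads behind "data: " prefixes. A sketch over raw bytes:

    def parse_sse(byte_stream):
        buffer = b""
        for chunk in byte_stream:
            buffer += chunk
            while b"\n\n" in buffer:                    # one event per blank line
                event, buffer = buffer.split(b"\n\n", 1)
                for line in event.splitlines():
                    if line.startswith(b"data: "):
                        payload = line[6:].decode()
                        if payload != "[DONE]":         # OpenAI-style terminator
                            yield payload

    for data in parse_sse([b'data: {"delta": "Hel"}\n\ndata: {"delta": "lo"}\n\n']):
        print(data)
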
System Prompt Builder
Build modular, composable system prompts from reusable blocks.
Compose system prompts from named blocks: role, constraints, format, examples. Swap blocks per task.
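
The pattern in miniature, with hypothetical block contents:

    BLOCKS = {  # illustrative blocks; swap or extend per task
        "role": "You are a careful data-extraction assistant.",
        "constraints": "Never invent fields. If a value is missing, output null.",
        "format": "Respond with a single JSON object and nothing else.",
    }

    def build_system_prompt(*block_names: str) -> str:
        return "\n\n".join(BLOCKS[name] for name in block_names)

    print(build_system_prompt("role", "constraints", "format"))
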
Prompt Templates
Battle-tested system prompts for common LLM tasks.
A library of system prompts for extraction, classification, summarization, reasoning, and formatting — copy/paste ready.
Few-Shot Builder
Pack the most effective examples into the fewest tokens.
Selects and compresses few-shot examples to maximize signal per token in your prompt.
Chain-of-Thought Wrapper
Wrap any prompt in a CoT reasoning frame.
Seven CoT templates (standard, critique, scientific, recursive, adversarial, JSON, zero-shot). Drop-in wrapper.
Token Budget Allocator
Plan token spend across a multi-turn conversation.
Allocate your context window across turns: even, front-loaded, back-loaded, or decreasing budgets.
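
The strategies boil down to weight vectors over turns. An illustrative sketch; the utility's actual weighting may differ:

    def allocate(total: int, turns: int, strategy: str = "even") -> list[int]:
        if strategy == "even":
            weights = [1.0] * turns
        elif strategy == "front_loaded":       # decreasing budgets, e.g. 4,3,2,1
            weights = [float(turns - i) for i in range(turns)]
        elif strategy == "back_loaded":
            weights = [float(i + 1) for i in range(turns)]
        else:
            raise ValueError(strategy)
        scale = total / sum(weights)
        return [int(w * scale) for w in weights]

    print(allocate(8000, 4, "front_loaded"))  # -> [3200, 2400, 1600, 800]
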
Context Gauge
See how close you are to blowing the context window.
Measure context usage against a model's limit. Warns before you truncate.
Cost Calculator
What will this LLM call actually cost?
Per-provider cost calculator with up-to-date pricing for GPT, Claude, Gemini, and OSS models.
Cost Forecaster
Project monthly LLM spend before you ship.
Projects LLM costs from a sample of calls. Useful before a feature launch.
Latency Tester
Benchmark LLM endpoint latency.
Hammers an API endpoint and reports mean, median, p95, min, and max latency.
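
The reporting half fits in a few lines. A sketch where fn stands in for one API call and p95 uses a nearest-rank approximation:

    import statistics, time

    def benchmark(fn, runs: int = 20) -> dict:
        samples = []
        for _ in range(runs):
            start = time.perf_counter()
            fn()
            samples.append(time.perf_counter() - start)
        samples.sort()
        return {
            "mean": statistics.mean(samples),
            "median": statistics.median(samples),
            "p95": samples[int(0.95 * (len(samples) - 1))],
            "min": samples[0],
            "max": samples[-1],
        }
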
Prompt Translator
Translate prompts between model families.
Rewrites prompts to match the conventions of a different model family (Claude XML, OpenAI system messages, etc.).
Prompt Version Diff
Diff two prompts and see what actually changed.
Word-level and structural diff for prompts. Spot regressions between prompt revisions.
Embedding Similarity
Cosine similarity without numpy.
Pure-stdlib vector similarity for embedding comparison. Cosine, dot-product, euclidean.
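
The cosine case really is this small without numpy:

    import math

    def cosine(a: list[float], b: list[float]) -> float:
        # dot(a, b) / (|a| * |b|), with a guard for zero vectors
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    print(cosine([1.0, 0.0, 1.0], [1.0, 1.0, 0.0]))  # -> 0.5
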
Prompt Injection Scanner
Scan untrusted input for prompt-injection patterns.
Detects prompt-injection, jailbreak, and data-exfiltration patterns in user-supplied text.
PII Scrubber
Strip PII before sending to a model provider.
Redacts emails, phone numbers, SSNs, credit cards, and names before they leave your box.
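
A sketch of the regex approach; the patterns here are illustrative only, and real PII detection (names especially) needs far more coverage:

    import re

    PATTERNS = {  # assumed patterns, not the utility's full set
        "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
        "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
        "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    }

    def scrub(text: str) -> str:
        for label, pattern in PATTERNS.items():
            text = pattern.sub(f"[{label}]", text)
        return text

    print(scrub("Reach me at jane@example.com or 555-867-5309."))
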
Recursive Summarizer
Summarize massive codebases for LLM context.
Walks a directory, summarizes files, and rolls them up into a hierarchical overview that fits in a context window.
Dependency Minimizer
Strip unused imports from bloated files.
Removes unused imports and dead symbols. Useful after an agent leaves a trail of leftovers.
Synthetic Data Generator
Generate training data from a schema.
Schema-driven synthetic data generator with deterministic seeds. JSON/JSONL/CSV export.
JSONL Converter
Convert between JSON, JSONL, and CSV at scale.
Streaming conversion between JSON array, JSONL, and CSV. Flattens nested structures.
JSON to CSV
Flatten any JSON blob into CSV.
Takes nested JSON and produces flat CSV with dotted column names. Handles arrays, missing fields, and mixed types.
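
The flattening step is the heart of it. A sketch:

    def flatten(obj, prefix=""):
        # {"a": {"b": 1}} -> {"a.b": 1}; list indices become path segments
        flat = {}
        if isinstance(obj, dict):
            for k, v in obj.items():
                flat.update(flatten(v, f"{prefix}{k}."))
        elif isinstance(obj, list):
            for i, v in enumerate(obj):
                flat.update(flatten(v, f"{prefix}{i}."))
        else:
            flat[prefix[:-1]] = obj
        return flat

    print(flatten({"user": {"name": "Ada", "tags": ["a", "b"]}}))
    # {'user.name': 'Ada', 'user.tags.0': 'a', 'user.tags.1': 'b'}
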
Markdown Table Parser
Extract structured tables from LLM-generated markdown.
Parse markdown tables into dicts. Handles LLM quirks: inconsistent spacing, pipe escapes, empty cells.
Markdown to Schema
Extract a JSON schema from a markdown spec.
Parses a markdown spec (types, fields, enums) into a JSON Schema you can validate against.
PDF Text Stripper
Extract text from PDFs without external deps.
Stdlib-only PDF text extraction. Handles FlateDecode streams. No pypdf, no pdfplumber.
Response Cache
Stop paying twice for the same LLM call.
File-based cache wrapper for any LLM API. Exact-match and TTL support. Tracks estimated cost savings.
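
The mechanism in miniature; .llm_cache is an assumed location and call_llm a stand-in client:

    import hashlib, json, time
    from pathlib import Path

    CACHE_DIR = Path(".llm_cache")
    CACHE_DIR.mkdir(exist_ok=True)

    def cached_call(call_llm, prompt: str, ttl: float = 3600.0) -> str:
        key = hashlib.sha256(prompt.encode()).hexdigest()   # exact-match key
        path = CACHE_DIR / f"{key}.json"
        if path.exists():
            entry = json.loads(path.read_text())
            if time.time() - entry["ts"] < ttl:             # TTL still valid
                return entry["response"]                    # hit: no API spend
        response = call_llm(prompt)
        path.write_text(json.dumps({"ts": time.time(), "response": response}))
        return response
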
Batch Processor
Process thousands of prompts in parallel without rate-limit pain.
Concurrent batch runner for LLM APIs with built-in rate limiting, retry logic, and progress reporting.
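
A thread-pool sketch of the concurrency core; pair it with a limiter like the token bucket above to stay under provider caps. call_llm is a stand-in:

    from concurrent.futures import ThreadPoolExecutor, as_completed

    def run_batch(call_llm, prompts, workers=8):
        results = [None] * len(prompts)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = {pool.submit(call_llm, p): i for i, p in enumerate(prompts)}
            for done, future in enumerate(as_completed(futures), 1):
                results[futures[future]] = future.result()    # keep input order
                print(f"\r{done}/{len(prompts)} complete", end="")
        return results
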
Agent Error Recovery
Catch agent failures before they spiral.
Error handling and recovery wrapper for autonomous agents. Detects loops, hangs, and tool-call failures; retries with exponential backoff.
Context Window Manager
Smart trimming so long conversations stay under the limit.
Trim message history with strategies: smart, recent-only, summarize, system+recent. Per-model context limits built in.
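
The system+recent strategy is representative. A sketch, with count_tokens as a stand-in estimator:

    def trim_system_recent(messages, limit, count_tokens):
        system = [m for m in messages if m["role"] == "system"]
        rest = [m for m in messages if m["role"] != "system"]
        kept, used = [], sum(count_tokens(m["content"]) for m in system)
        for msg in reversed(rest):                 # walk newest-first
            cost = count_tokens(msg["content"])
            if used + cost > limit:
                break
            kept.append(msg)
            used += cost
        return system + list(reversed(kept))       # restore chronological order
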
Hallucination Checker
Catch fabricated facts before they reach the user.
Cross-references LLM output against source material. Flags invented numbers, names, and quotes.
Stripe Webhook Handler
Verified, idempotent, retry-safe Stripe webhooks in one file.
Express router that verifies Stripe-Signature, deduplicates events by ID with a file-backed store, dispatches to typed handlers, and uses status codes Stripe's retry logic respects.
Tailwind Button
Accessible, variant-driven Tailwind button — drop in and ship.
One React + Tailwind button component covering 10 variants × 4 sizes. Forward-ref, ARIA-busy on loading, focus-visible rings, prop-driven sizing, full TypeScript types.
Clerk + Prisma User Sync
Clerk webhooks → Prisma upsert. Drop-in Next.js route handler.
Next.js Route Handler that consumes Clerk's user.created/updated/deleted webhooks, verifies the Svix signature, and upserts into your Prisma User table. Idempotent on retry.