Source: https://crabglamp.com/docs/llm-proxy/reference
Last updated: 2026-06-09
Type: reference

This is the catalog for the LLM proxy and voice surface. For an end-to-end walkthrough, see the Tutorial.

## Supported providers and proxy base URLs

The proxy is a transparent pipe: it strips the `/openai`, `/anthropic`, or `/elevenlabs` prefix and forwards the rest of the path to the upstream provider. The OpenAI and Anthropic SDKs don't auto-append `/v1`, so include `/v1` in the base URL. For ElevenLabs, the version segment (`/v1` or `/v2`) is part of the request path, so the base URL ends at `/elevenlabs`.
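
The prefix stripping described above can be pictured with a small sketch. It is an illustration only, not the proxy's code: the upstream hosts in `UPSTREAM_HOSTS` are assumptions (this page does not list them), and `upstream_url` is a hypothetical helper name.

```python
from urllib.parse import urlsplit

# Assumed upstream API hosts; this reference does not state them.
UPSTREAM_HOSTS = {
    "openai": "https://api.openai.com",
    "anthropic": "https://api.anthropic.com",
    "elevenlabs": "https://api.elevenlabs.io",
}


def upstream_url(proxy_url: str) -> str:
    """Map a proxy URL to an upstream URL by dropping the provider prefix."""
    parts = urlsplit(proxy_url)
    # "/openai/v1/chat/completions" -> ("openai", "v1/chat/completions")
    provider, _, rest = parts.path.lstrip("/").partition("/")
    if provider not in UPSTREAM_HOSTS:
        raise ValueError(f"unknown provider prefix: {provider!r}")
    url = f"{UPSTREAM_HOSTS[provider]}/{rest}"
    # Keep the query string untouched, as a transparent pipe would.
    return f"{url}?{parts.query}" if parts.query else url
```

Under these assumptions, `https://llm.crabglamp.dev/anthropic/v1/messages` maps to `/v1/messages` on the Anthropic host, which is why the version segment has to appear somewhere after the provider prefix.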

| Provider | Proxy base URL | Auth header |
|---|---|---|
| OpenAI | `https://llm.crabglamp.dev/openai/v1` | `Authorization: Bearer cg-pk-…` |
| Anthropic | `https://llm.crabglamp.dev/anthropic/v1` | `x-api-key: cg-pk-…` (or `Authorization: Bearer cg-pk-…`) |
| ElevenLabs | `https://llm.crabglamp.dev/elevenlabs` | `xi-api-key: cg-pk-…` |

The proxy accepts any of `Authorization: Bearer`, `x-api-key`, or `xi-api-key` — pick whichever is the SDK default for the upstream provider you are calling.
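
As a rough sketch of how the table translates into a request, the helper below builds the URL and the default auth header for each provider. `proxy_request` is a hypothetical name, and `cg-pk-example` in the usage note is a placeholder, not a real token. The sketch only assembles values and sends nothing.

```python
PROXY_ROOT = "https://llm.crabglamp.dev"

# (base path, default auth header) per provider, copied from the table above.
PROVIDERS = {
    "openai": ("/openai/v1", "Authorization"),
    "anthropic": ("/anthropic/v1", "x-api-key"),
    "elevenlabs": ("/elevenlabs", "xi-api-key"),
}


def proxy_request(provider: str, path: str, token: str) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for a call through the proxy."""
    base_path, header = PROVIDERS[provider]
    url = f"{PROXY_ROOT}{base_path}/{path.lstrip('/')}"
    # Only the Authorization header carries the "Bearer " scheme prefix.
    value = f"Bearer {token}" if header == "Authorization" else token
    return url, {header: value}
```

For example, `proxy_request("elevenlabs", "/v1/text-to-speech/VOICE_ID", "cg-pk-example")` puts the version segment in the path, while the OpenAI and Anthropic entries already carry `/v1` in their base path.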

## Supported models and prices

Prices below are what you pay: per 1M tokens for chat models and per 1k characters for voice models. They are reviewed monthly. The proxy passes through any model the provider offers; a model not listed below is billed at the provider's published rate.

Chat models — OpenAI:

<table>
  <thead>
    <tr><th>Model</th><th>Input ($/1M tokens)</th><th>Output ($/1M tokens)</th></tr>
  </thead>
  <tbody>
    <tr><td>gpt-5.4</td><td>3.75</td><td>22.50</td></tr>
    <tr><td>gpt-5.4-mini</td><td>1.125</td><td>6.75</td></tr>
    <tr><td>gpt-5.4-nano</td><td>0.30</td><td>1.875</td></tr>
    <tr><td>gpt-5.4-pro</td><td>45.00</td><td>270.00</td></tr>
    <tr><td>gpt-5.3-codex</td><td>2.625</td><td>21.00</td></tr>
    <tr><td>gpt-4.1</td><td>3.00</td><td>12.00</td></tr>
    <tr><td>gpt-4.1-mini</td><td>0.60</td><td>2.40</td></tr>
    <tr><td>gpt-4.1-nano</td><td>0.15</td><td>0.60</td></tr>
    <tr><td>gpt-4o</td><td>3.75</td><td>15.00</td></tr>
    <tr><td>gpt-4o-mini</td><td>0.225</td><td>0.90</td></tr>
    <tr><td>o3</td><td>3.00</td><td>12.00</td></tr>
    <tr><td>o4-mini</td><td>1.65</td><td>6.60</td></tr>
    <tr><td>o3-mini</td><td>1.65</td><td>6.60</td></tr>
  </tbody>
</table>

Chat models — Anthropic:

<table>
  <thead>
    <tr><th>Model</th><th>Input ($/1M tokens)</th><th>Output ($/1M tokens)</th></tr>
  </thead>
  <tbody>
    <tr><td>claude-opus-4-6</td><td>7.50</td><td>37.50</td></tr>
    <tr><td>claude-sonnet-4-6</td><td>4.50</td><td>22.50</td></tr>
    <tr><td>claude-opus-4-5</td><td>7.50</td><td>37.50</td></tr>
    <tr><td>claude-sonnet-4-5</td><td>4.50</td><td>22.50</td></tr>
    <tr><td>claude-haiku-4-5</td><td>1.50</td><td>7.50</td></tr>
    <tr><td>claude-opus-4-1</td><td>22.50</td><td>112.50</td></tr>
    <tr><td>claude-opus-4</td><td>22.50</td><td>112.50</td></tr>
    <tr><td>claude-sonnet-4</td><td>4.50</td><td>22.50</td></tr>
  </tbody>
</table>
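
To estimate what a chat request costs, multiply each token count by its per-1M price and sum the two. The sketch below copies two rows from the tables above. `chat_cost` is a hypothetical helper, and it ignores any cached-token or batch pricing, which this page does not cover.

```python
from decimal import Decimal

# (input, output) price in $ per 1M tokens, copied from the tables above.
PRICES = {
    "gpt-4.1-mini": (Decimal("0.60"), Decimal("2.40")),
    "claude-haiku-4-5": (Decimal("1.50"), Decimal("7.50")),
}

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def chat_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated cost in dollars of one request, from its token counts."""
    input_price, output_price = PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_PRICE_UNIT
```

With these numbers, 200,000 input tokens and 50,000 output tokens on `gpt-4.1-mini` come to $0.12 + $0.12 = $0.24. `Decimal` is used so that small per-request amounts add up without floating-point drift.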

Voice — ElevenLabs (per 1k characters):

<table>
  <thead>
    <tr><th>Model</th><th>Price ($/1k chars)</th></tr>
  </thead>
  <tbody>
    <tr><td>eleven_v3</td><td>0.18</td></tr>
    <tr><td>eleven_multilingual_v2</td><td>0.18</td></tr>
    <tr><td>eleven_flash_v2_5</td><td>0.09</td></tr>
    <tr><td>eleven_flash_v2</td><td>0.09</td></tr>
  </tbody>
</table>
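
Voice cost follows the same pattern, scaled per 1k characters. The sketch assumes the billed character count equals `len(text)` in Python (Unicode code points), which this page does not confirm. `voice_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Price in $ per 1k characters, copied from the table above.
VOICE_PRICES = {
    "eleven_v3": Decimal("0.18"),
    "eleven_multilingual_v2": Decimal("0.18"),
    "eleven_flash_v2_5": Decimal("0.09"),
    "eleven_flash_v2": Decimal("0.09"),
}


def voice_cost(model: str, text: str) -> Decimal:
    """Estimated cost in dollars of synthesizing `text` with `model`."""
    # Assumption: every character of the input text is billed once.
    return VOICE_PRICES[model] * len(text) / Decimal(1000)
```

Under that assumption, a 2,500-character script on `eleven_flash_v2_5` costs 2.5 × $0.09 = $0.225.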

## Spend-cap math

The cap is the maximum a key may spend per month across LLM and voice combined. Once running spend reaches the cap, the next request is rejected with HTTP 429 (`Monthly spend limit reached`). Enforcement is best-effort: each request is checked against a spend total that refreshes every couple of minutes, so a request or two may be accepted after spend has already passed the cap.
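
The toy model below shows why a periodically refreshed total lets a request or two through. It is a sketch of the behavior described here, not the proxy's implementation: `SpendCapGate` and its fields are hypothetical, and the refresh is triggered by hand instead of by a timer.

```python
from decimal import Decimal


class SpendCapGate:
    """Toy cap check that reads a periodically refreshed spend total."""

    def __init__(self, cap: Decimal):
        self.cap = cap
        self.actual_spend = Decimal("0")  # true running spend
        self.cached_spend = Decimal("0")  # total the check sees between refreshes

    def refresh(self) -> None:
        # Stands in for the refresh that happens every couple of minutes.
        self.cached_spend = self.actual_spend

    def request(self, cost: Decimal) -> int:
        # Reject once the (possibly stale) total has reached the cap.
        if self.cached_spend >= self.cap:
            return 429
        self.actual_spend += cost
        return 200


gate = SpendCapGate(cap=Decimal("20"))
gate.actual_spend = Decimal("19.90")
gate.refresh()
# Both requests are checked against the stale 19.90 total, so both pass.
statuses = [gate.request(Decimal("0.20")) for _ in range(2)]
gate.refresh()
# After the refresh the check sees 20.30 and rejects the next request.
statuses.append(gate.request(Decimal("0.20")))
```

In this run the key ends at $20.30 against a $20 cap. On the client side, a 429 carrying `Monthly spend limit reached` is not a transient rate limit, so retrying the same request with the same key will not help while the cap is reached.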

## CLI

On-VM CLI for virtual keys:

<table>
  <thead>
    <tr><th>Command</th><th>Purpose</th></tr>
  </thead>
  <tbody>
    <tr><td><code>crabglamp keys create</code></td><td>Create this Agent's virtual key and point OpenClaw at the proxy</td></tr>
    <tr><td><code>crabglamp keys status</code></td><td>Show the key's status and this month's spend</td></tr>
    <tr><td><code>crabglamp keys configure</code></td><td>Switch the provider/model the key routes to (OpenClaw)</td></tr>
    <tr><td><code>crabglamp keys regenerate</code></td><td>Rotate the token; the old token stops working immediately</td></tr>
    <tr><td><code>crabglamp keys revoke</code></td><td>Revoke the key; run <code>crabglamp keys regenerate</code> to restore access</td></tr>
  </tbody>
</table>

## Limits

<table>
  <thead>
    <tr><th>Limit</th><th>Value</th></tr>
  </thead>
  <tbody>
    <tr><td>Minimum spend cap</td><td>$20</td></tr>
    <tr><td>Keys per Agent</td><td>1</td></tr>
    <tr><td>Usage reporting cadence</td><td>within ~2 minutes</td></tr>
    <tr><td>Token prefix</td><td><code>cg-pk-</code></td></tr>
  </tbody>
</table>
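
A client-side pre-flight check against two of these limits might look like the sketch below. `check_key_settings` is a hypothetical helper. It only checks the token prefix and the minimum cap, and does not verify that a token is valid.

```python
from decimal import Decimal

TOKEN_PREFIX = "cg-pk-"          # from the Limits table
MIN_SPEND_CAP = Decimal("20")    # minimum spend cap in dollars


def check_key_settings(token: str, spend_cap: Decimal) -> list[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    if not token.startswith(TOKEN_PREFIX):
        problems.append(f"token does not start with {TOKEN_PREFIX!r}")
    if spend_cap < MIN_SPEND_CAP:
        problems.append(f"spend cap {spend_cap} is below the minimum {MIN_SPEND_CAP}")
    return problems
```

A check like this catches the common mistake of pasting an upstream provider key where the proxy expects a `cg-pk-` token.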
