The same cg-pk-… virtual key authenticates against the ElevenLabs proxy. Voice spend counts against the same combined cap as LLM spend.
Export the ElevenLabs base URL
export ELEVENLABS_API_KEY="cg-pk-..."
export ELEVENLABS_BASE_URL="https://llm.crabglamp.dev/elevenlabs"
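The Python SDK reads ELEVENLABS_BASE_URL automatically. Other clients must use this URL as the base of every request, in place of https://api.elevenlabs.io, and append the API path (for example /v1/voices) to it.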
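For scripts that build request URLs by hand, a small helper keeps the base URL and the API path joined consistently. This is a minimal sketch, not part of the proxy or any SDK: the function name el_url is hypothetical, and it assumes ELEVENLABS_BASE_URL is exported as shown above.

```shell
# Hypothetical helper: join the exported base URL and an API path.
# Assumes ELEVENLABS_BASE_URL is set as in the export step.
el_url() {
  # Drop one trailing slash from the base and one leading slash from the
  # path so the result contains exactly one slash between them.
  el_base="${ELEVENLABS_BASE_URL%/}"
  el_path="${1#/}"
  printf '%s/%s\n' "$el_base" "$el_path"
}
```

With the exports above, `el_url /v1/voices` prints https://llm.crabglamp.dev/elevenlabs/v1/voices, which can be passed straight to curl.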
Make a TTS request
# Set VOICE_ID to a voice ID from your account before running this.
curl -X POST "$ELEVENLABS_BASE_URL/v1/text-to-speech/$VOICE_ID" \
-H "xi-api-key: $ELEVENLABS_API_KEY" \
-H "content-type: application/json" \
-d '{
"text": "Hello from the CrabGlamp proxy.",
"model_id": "eleven_multilingual_v2",
"voice_settings": { "stability": 0.5, "similarity_boost": 0.75 }
}' \
-o output.mp3
The response body is the audio bytes; the proxy passes them through unchanged.
Spend tracking
The proxy counts spend on every TTS request, at the per-1k-character rate shown in the LLM proxy reference.
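To get a rough idea of what a request will cost before sending it, the per-1k-character arithmetic can be sketched as below. The function name estimate_tts_spend is hypothetical, the rate passed in the usage example is a placeholder rather than the real price, and the sketch assumes spend is prorated per character with no rounding or minimum charge. Check the LLM proxy reference for the actual rate.

```shell
# Hypothetical helper: estimate the spend for one TTS request.
#   $1 = the text to synthesize
#   $2 = rate per 1,000 characters (take the real value from the LLM proxy reference)
# Assumption: spend is prorated per character, with no rounding or minimum.
estimate_tts_spend() {
  # ${#1} is the length of the text. For ASCII text this equals the
  # character count; for multi-byte text the result depends on the shell
  # and locale.
  awk -v chars="${#1}" -v rate="$2" \
    'BEGIN { printf "%.4f\n", chars / 1000 * rate }'
}
```

For example, `estimate_tts_spend "Hello from the CrabGlamp proxy." 0.10` prints 0.0031 for this 31-character string at a placeholder rate of 0.10 per 1,000 characters.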
The dashboard at /dashboard/llm shows LLM and voice spend in one chart by default. Click the model dropdown to filter to voice-only or to a specific TTS model.
Verify routing
curl -sS -I "$ELEVENLABS_BASE_URL/v1/voices" \
-H "xi-api-key: $ELEVENLABS_API_KEY" | head -1
If the output is HTTP/2 200 (or HTTP/1.1 200 OK on clients without HTTP/2) and ELEVENLABS_BASE_URL points at llm.crabglamp.dev, the proxy accepted the key and the routing is correct.
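The status line alone does not show which host answered, so it can help to confirm that the exported base URL really points at the proxy. The following is a minimal sketch using only shell parameter expansion; the function name url_host is hypothetical.

```shell
# Hypothetical helper: print the host part of a URL.
url_host() {
  host_tmp="${1#*://}"             # drop the scheme, e.g. "https://"
  host_tmp="${host_tmp%%/*}"       # drop the path
  printf '%s\n' "${host_tmp%%:*}"  # drop any port
}

# Compare the host in the exported base URL with the proxy host.
if [ "$(url_host "${ELEVENLABS_BASE_URL:-}")" = "llm.crabglamp.dev" ]; then
  echo "base URL points at the proxy"
else
  echo "base URL does not point at the proxy" >&2
fi
```

If the second message appears, re-run the export step before repeating the curl check.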