Model Context Protocol (MCP) server for GammaInfra: intelligent LLM routing across every major provider via one OpenAI-shape API.
Drop this server into Claude Code, Claude Desktop, Cursor, Cline, Continue, or any MCP-compatible host, and your agent gets direct tool access to:

- chat_completions: call any supported model (or gammainfra/auto for smart routing) with cost, latency, and quality controls. Routing metadata (which provider served the request, exact cost in USD, fallback chain) is returned as a structured routing_meta field.
- list_models: full model catalog with pricing and capability flags.
- get_balance: managed and BYOK balances.
- get_status: overall and per-provider health, plus the 24h request count.

The server runs via npx, so no manual install is needed. The first invocation downloads and caches the package.
For Claude Code, register the server from the command line:
claude mcp add gammainfra \
--env GAMMAINFRA_API_KEY=sk-gammainfra-... \
-- npx -y @gammainfra/mcp-server
Or edit ~/.claude.json and add to the mcpServers block:
{
"mcpServers": {
"gammainfra": {
"command": "npx",
"args": ["-y", "@gammainfra/mcp-server"],
"env": { "GAMMAINFRA_API_KEY": "sk-gammainfra-..." }
}
}
}
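When the config file already lists other servers, the gammainfra entry has to be merged in rather than pasted over the whole mcpServers block. A minimal sketch of that merge follows; the helper name `withGammaInfraServer` and the interface names are hypothetical and not part of the package.

```typescript
// Shape of one entry in the mcpServers block, as shown in the JSON above.
interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

// Host config file; only mcpServers is modeled, other keys pass through.
interface McpHostConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Hypothetical helper: returns a copy of `config` with the gammainfra entry
// added or replaced, leaving every other server and setting untouched.
function withGammaInfraServer(config: McpHostConfig, apiKey: string): McpHostConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      gammainfra: {
        command: "npx",
        args: ["-y", "@gammainfra/mcp-server"],
        env: { GAMMAINFRA_API_KEY: apiKey },
      },
    },
  };
}
```

The function is pure: reading the file, parsing the JSON, and writing the result back are left to the caller.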
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
"mcpServers": {
"gammainfra": {
"command": "npx",
"args": ["-y", "@gammainfra/mcp-server"],
"env": { "GAMMAINFRA_API_KEY": "sk-gammainfra-..." }
}
}
}
Restart Claude Desktop. The "GammaInfra" server should appear in the tools menu.
Edit ~/.cursor/mcp.json:
{
"mcpServers": {
"gammainfra": {
"command": "npx",
"args": ["-y", "@gammainfra/mcp-server"],
"env": { "GAMMAINFRA_API_KEY": "sk-gammainfra-..." }
}
}
}
Open Cline's settings (gear icon → MCP Servers tab) and add:
{
"gammainfra": {
"command": "npx",
"args": ["-y", "@gammainfra/mcp-server"],
"env": { "GAMMAINFRA_API_KEY": "sk-gammainfra-..." },
"disabled": false
}
}
| Var | Required | Default | Description |
|---|---|---|---|
| GAMMAINFRA_API_KEY | yes | none | Your GammaInfra API key, format sk-gammainfra-{32_chars}. |
| GAMMAINFRA_BASE_URL | no | https://api.gammainfra.com/v1 | Override for staging/dev. |
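The two variables could be resolved at startup roughly as follows. This is a sketch, not the server's actual code: `loadConfig`, `GammaInfraConfig`, and the key pattern are assumptions, and the pattern only checks the documented prefix and length because the allowed character set is not stated here.

```typescript
const DEFAULT_BASE_URL = "https://api.gammainfra.com/v1";

interface GammaInfraConfig {
  apiKey: string;
  baseUrl: string;
}

// Assumption: the 32 characters after the prefix may be any non-whitespace
// characters; only the prefix and the length are documented.
const API_KEY_PATTERN = /^sk-gammainfra-\S{32}$/;

// Hypothetical loader: validates the key and applies the default base URL.
function loadConfig(env: Record<string, string | undefined>): GammaInfraConfig {
  const apiKey = env["GAMMAINFRA_API_KEY"];
  if (!apiKey) {
    throw new Error("GAMMAINFRA_API_KEY is required");
  }
  if (!API_KEY_PATTERN.test(apiKey)) {
    throw new Error("GAMMAINFRA_API_KEY must look like sk-gammainfra-{32_chars}");
  }
  // An empty or missing override falls back to the production URL.
  const baseUrl = env["GAMMAINFRA_BASE_URL"] || DEFAULT_BASE_URL;
  // Trim trailing slashes so paths can be appended uniformly.
  return { apiKey, baseUrl: baseUrl.replace(/\/+$/, "") };
}
```

In a Node process this would be called as `loadConfig(process.env)`.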
chat_completions
Send a chat completion request and receive the model response plus routing metadata.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| model | string | yes | gammainfra/auto for smart routing, gammainfra/fast or gammainfra/cheap for tier shortcuts, or pin a specific model like openai/gpt-5-mini. |
| messages | array | yes | OpenAI-shape conversation messages. |
| temperature | number | no | 0..2. |
| max_tokens | int | no | |
| max_completion_tokens | int | no | GPT-5 family requires this instead of max_tokens. |
| cost_quality | float | no | 0.0..1.0 continuous dial. Sent as X-GammaInfra-Cost-Quality. |
| max_latency_ms | int | no | 60..600000. Caps total wall-clock time, including fallback retries. Also enforced client-side as a hard request abort. |
| preference | string | no | quality, cost, or latency. |
| region | string | no | us, eu, apac, or a specific AWS region. |
| tools, tool_choice, response_format, top_p, frequency_penalty, presence_penalty | various | no | Standard OpenAI fields, forwarded as-is. |
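The documented ranges can be checked before a call is made. The sketch below covers only the routing-related fields from the table; the type and function names are hypothetical, `content` is simplified to a string, and the non-empty `messages` check is an assumption (the table only marks the field as required).

```typescript
type Preference = "quality" | "cost" | "latency";

// Simplified message shape; OpenAI-shape messages can carry richer content.
interface ChatMessage {
  role: string;
  content: string;
}

interface ChatCompletionsArgs {
  model: string;
  messages: ChatMessage[];
  temperature?: number;
  max_tokens?: number;
  max_completion_tokens?: number;
  cost_quality?: number;
  max_latency_ms?: number;
  preference?: Preference;
  region?: string;
}

// Hypothetical validator: returns a list of problems, empty when the
// arguments fit the ranges in the parameter table.
function validateChatArgs(args: ChatCompletionsArgs): string[] {
  const problems: string[] = [];
  const inRange = (v: number | undefined, lo: number, hi: number): boolean =>
    v === undefined || (v >= lo && v <= hi);

  if (args.model.trim() === "") problems.push("model is required");
  // Assumption: an empty conversation is treated as invalid.
  if (args.messages.length === 0) problems.push("messages must not be empty");
  if (!inRange(args.temperature, 0, 2)) problems.push("temperature must be in 0..2");
  if (!inRange(args.cost_quality, 0, 1)) problems.push("cost_quality must be in 0.0..1.0");
  if (
    args.max_latency_ms !== undefined &&
    (!Number.isInteger(args.max_latency_ms) || !inRange(args.max_latency_ms, 60, 600000))
  ) {
    problems.push("max_latency_ms must be an integer in 60..600000");
  }
  return problems;
}
```

Returning a list rather than throwing on the first failure lets an agent fix every bad field in one retry.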
Returns: { response: <OpenAI response>, routing_meta: { provider, endpoint, cost_usd, input_cost_usd, output_cost_usd, router_version, logical_model, fallback_chain, attempted_count, request_id, ... } }
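For callers that log or display routing results, the listed routing_meta fields might be typed and summarized as below. Only the field names come from the return shape above; the field types, the `describeRouting` helper, and the reading of attempted_count greater than 1 as "a fallback happened" are assumptions.

```typescript
// Field names are taken from the documented return shape; the types are
// assumed, and fields elided by "..." in the source are not modeled.
interface RoutingMeta {
  provider: string;
  endpoint: string;
  cost_usd: number;
  input_cost_usd: number;
  output_cost_usd: number;
  router_version: string;
  logical_model: string;
  fallback_chain: string[];
  attempted_count: number;
  request_id: string;
}

// Hypothetical helper: one-line summary of who served the request and at
// what cost.
function describeRouting(meta: RoutingMeta): string {
  const cost = meta.cost_usd.toFixed(6);
  const base = `${meta.logical_model} served by ${meta.provider} for ${cost} USD`;
  // Assumption: more than one attempt means at least one fallback occurred.
  return meta.attempted_count > 1
    ? `${base} after ${meta.attempted_count} attempts`
    : base;
}
```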
Timeout note: Every request has a 10-minute client-side hard timeout (via AbortController) so a hung upstream can't wedge the MCP process. For chat_completions, a supplied max_latency_ms replaces that default as the hard abort bound.
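The timeout rule can be sketched with AbortController as follows. The function names are hypothetical and this is not the server's actual implementation; it only illustrates how a supplied max_latency_ms would replace the 10-minute default as the abort bound.

```typescript
const DEFAULT_TIMEOUT_MS = 10 * 60 * 1000; // 10-minute hard timeout

// A supplied max_latency_ms replaces the default as the abort bound.
function effectiveTimeoutMs(maxLatencyMs?: number): number {
  return maxLatencyMs ?? DEFAULT_TIMEOUT_MS;
}

// Runs `run` with a signal that is aborted once `timeoutMs` elapses.
// The abort only takes effect if `run` passes the signal on to whatever
// it awaits (for example a fetch call).
async function withHardTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await run(controller.signal);
  } finally {
    // Always clear the timer so a finished request does not keep it alive.
    clearTimeout(timer);
  }
}
```

A request would then look like `withHardTimeout((signal) => fetch(url, { signal }), effectiveTimeoutMs(args.max_latency_ms))`, where `url` and `args` are placeholders for the caller's own values.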
Streaming note: MCP tool responses are non-streaming. The server always sends stream: false to the upstream and does not accept a stream parameter on the tool input (it's rejected by schema validation). For streaming, use the GammaInfra HTTP API directly.
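The streaming rule amounts to rejecting a `stream` key on input and forcing `stream: false` on the way out. The sketch below shows that behavior with a hand-written check; `toUpstreamBody` is a hypothetical name, and the real server performs the rejection through its schema validation. It also simplifies by copying every input field into the body, whereas some routing controls are sent as headers.

```typescript
// Hypothetical helper: builds the upstream request body from tool input.
function toUpstreamBody(input: Record<string, unknown>): Record<string, unknown> {
  // Reject the key outright, whatever its value, instead of ignoring it.
  if ("stream" in input) {
    throw new Error(
      "stream is not a supported tool parameter; MCP tool responses are non-streaming"
    );
  }
  // The upstream request is always non-streaming.
  return { ...input, stream: false };
}
```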
list_models
No parameters. Returns the full model catalog including direct-pin slugs, per-token pricing, and capability flags (supports_tools, supports_vision).
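An agent might use the capability flags to pick a model before pinning it. The sketch below assumes each catalog entry carries its slug in an `id` field, which is a hypothetical name; only supports_tools and supports_vision come from the description above, and pricing fields are left out because their names are not given here.

```typescript
// Assumed entry shape: `id` is a hypothetical name for the direct-pin slug.
interface ModelEntry {
  id: string;
  supports_tools: boolean;
  supports_vision: boolean;
}

// Hypothetical helper: returns the slugs of models that have every
// capability requested in `needs`.
function modelsWith(
  catalog: ModelEntry[],
  needs: { tools?: boolean; vision?: boolean }
): string[] {
  return catalog
    .filter(
      (m) =>
        (!needs.tools || m.supports_tools) &&
        (!needs.vision || m.supports_vision)
    )
    .map((m) => m.id);
}
```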
get_balance
| Name | Type | Required | Description |
|---|---|---|---|
| include_byok | boolean | no | Default false. Also fetch the BYOK balance. Off by default to avoid an extra request (and a guaranteed 404) for customers without BYOK enrollment. |
Returns { managed_balance_usd, byok_balance_usd, currency }. With include_byok omitted/false, byok_balance_usd is null and no BYOK request is made (no byok_error). With include_byok: true, if BYOK isn't enrolled, byok_balance_usd is null and a byok_error field describes the cause.
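The three cases described above (BYOK not requested, BYOK requested but not enrolled, BYOK available) can be told apart from the result alone. A sketch follows, assuming the balances are numbers and byok_error is a string; `describeBalance` is a hypothetical helper.

```typescript
// Field names follow the documented return shape; the types are assumed.
interface BalanceResult {
  managed_balance_usd: number;
  byok_balance_usd: number | null;
  currency: string;
  byok_error?: string; // present only when include_byok was true and failed
}

// Hypothetical helper: renders the balance and explains a null BYOK value.
function describeBalance(b: BalanceResult): string {
  const managed = `managed: ${b.managed_balance_usd.toFixed(2)} ${b.currency}`;
  if (b.byok_balance_usd !== null) {
    return `${managed}, BYOK: ${b.byok_balance_usd.toFixed(2)} ${b.currency}`;
  }
  if (b.byok_error !== undefined) {
    // include_byok was true but the BYOK lookup failed.
    return `${managed}, BYOK unavailable (${b.byok_error})`;
  }
  // include_byok was omitted or false, so no BYOK request was made.
  return `${managed}, BYOK not requested`;
}
```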
get_status
No parameters. Returns GammaInfra's current overall health, per-provider state and live p50 latency, and 24h request count.
git clone https://github.com/yuz0101/gammainfra-mcp-server.git
cd gammainfra-mcp-server
npm install
npm run test # 30 tests, ~1s
npm run build # tsc → dist/
npm run typecheck
MIT — see LICENSE.