One endpoint.
Open models.
OpenAI Chat Completions-compatible inference for open models.
Use familiar SDKs with api.inference.idyl.dev.
Drop-in compatible with the tools you already use
What is idyl.inference?
Why idyl
Built for how developers actually work.
You already have a preferred IDE, coding agent, and workflow. idyl fits into them with a familiar API surface and no proprietary SDK.
Drop-in compatible
Use the OpenAI Chat Completions shape with familiar SDKs. In many tools, setup is as simple as changing the base URL and model name.
Open models
Run open models through a consistent API, starting with qwen3:8b and expanding across Qwen, DeepSeek, Llama, Mistral, Gemma, Phi, Kimi, and more.
Powered by idle compute
Inference is built to run across underutilized GPUs on the idyl network, reducing dependence on centralized GPU capacity as the network grows.
OpenAI-compatible chat API
Streaming responses, system prompts, JSON mode, and tool calls are supported where model and backend capabilities allow. If your tool supports configurable OpenAI-compatible chat endpoints, it can speak idyl.
// Familiar chat completions request shape
const stream = await client.chat.completions.create({
  model: "qwen3:8b",
  messages: [{ role: "user", content: "..." }],
  stream: true,
  tools: [{ type: "function", ... }],
});
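If you are not using an SDK, a streamed response can also be read by hand. The sketch below assumes idyl streams use the same server-sent-events framing as OpenAI's Chat Completions API (`data: {...}` lines ending with `data: [DONE]`), which this page implies but does not spell out. `collectStreamText` is a hypothetical helper name, not part of any idyl library.

```typescript
// Minimal shape of one streamed chunk (assumed to match OpenAI's format).
interface StreamChunk {
  choices: { delta: { content?: string | null } }[];
}

// Concatenate the text deltas found in a raw SSE response body.
function collectStreamText(sseBody: string): string {
  let text = "";
  for (const rawLine of sseBody.split("\n")) {
    const line = rawLine.trim();
    // Skip blank separator lines and anything that is not a data field.
    if (!line.startsWith("data:")) continue;
    const payload = line.slice("data:".length).trim();
    if (payload === "") continue;
    // The stream ends with a literal [DONE] sentinel.
    if (payload === "[DONE]") break;
    const chunk = JSON.parse(payload) as StreamChunk;
    // Some chunks carry only a role or a tool call and have no text content.
    text += chunk.choices[0]?.delta?.content ?? "";
  }
  return text;
}
```

In a real client you would feed this incrementally from the response stream instead of buffering the whole body; the parsing per line stays the same.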
Models
Live on Qwen3.
Built for the 2026 open model wave.
Start with qwen3:8b today. The roadmap is built around Qwen, DeepSeek, Llama, Mistral, Gemma, Phi, Kimi, and other open-weight families as model support and network capacity expand.
Getting Started
Three steps. That's it.
Get an API key
Sign up free. No credit card, no commitment. Your key is ready in seconds.
Point your tools
Set your base URL to api.inference.idyl.dev and your model name to qwen3:8b.
Start building
Use common OpenAI-compatible clients and tools with a familiar chat completions API.
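The three steps above can be sketched without any SDK at all. This is an illustration under stated assumptions, not official client code: the page only gives the host `api.inference.idyl.dev`, so the `https://` scheme, the `/v1` prefix, the `/chat/completions` path, and Bearer-token authentication are borrowed from OpenAI's conventions. `buildChatRequest` is a hypothetical helper name.

```typescript
// Assumption: HTTPS and the conventional "/v1" prefix. The page states only the host.
const BASE_URL = "https://api.inference.idyl.dev/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build a Chat Completions request from an API key, a model name, and messages.
function buildChatRequest(
  apiKey: string,
  model: string,
  messages: ChatMessage[],
  baseUrl: string = BASE_URL,
): PreparedRequest {
  return {
    // Strip trailing slashes so the joined path has exactly one separator.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Assumption: Bearer auth, as in OpenAI-compatible APIs.
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}
```

Calling `buildChatRequest(key, "qwen3:8b", [{ role: "user", content: "Hello" }])` and passing the resulting `url` and `init` to `fetch` sends the request. Keeping the base URL as a parameter mirrors how most OpenAI-compatible tools expose it as a single setting.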