Model Library
Access open-source AI models through a single, unified API.
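As a sketch of what calling the unified API might look like, assuming it follows the common OpenAI-compatible chat-completions convention; the base URL, environment variable, and model identifier below are placeholders, not confirmed values.

```python
# Minimal sketch of a chat request through an OpenAI-compatible gateway.
# Base URL, API key env var, and model id are placeholders.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.example.com/v1",  # placeholder gateway URL
    api_key=os.environ["API_KEY"],          # placeholder env var name
)

response = client.chat.completions.create(
    model="deepseek-v3.2",  # hypothetical id for a model in this library
    messages=[
        {"role": "user", "content": "Explain mixture-of-experts in one paragraph."},
    ],
)
print(response.choices[0].message.content)
```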
DeepSeek's flagship MoE model. 685B total parameters (37B active) with MIT license. Achieves GPT-5-level performance across reasoning and coding benchmarks. Best-in-class for complex code generation and multi-step problem solving.
Specialized reasoning variant of DeepSeek V3.2. Surpasses GPT-5 on AIME and HMMT benchmarks, matching Gemini-3.0-Pro. Optimized for mathematics, logic, and scientific reasoning tasks.
Meta's natively multimodal MoE model. 109B total parameters (17B active, 16 experts) with an industry-leading 10M token context window. Strong all-rounder for chat, code, and image understanding.
Alibaba's latest model from the Qwen 3.5 series. Competes with Claude Sonnet 4.5 and GPT-5.2 on major benchmarks. Supports text, image, and video input with 262K native context.
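For models like this one that accept image input, a hedged sketch of a multimodal request, assuming the gateway accepts the widely used OpenAI-style content-part format; the model id and image URL are placeholders.

```python
# Sketch of a multimodal chat request using OpenAI-style content parts.
# Model id and image URL are illustrative placeholders.
import os

from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key=os.environ["API_KEY"])

response = client.chat.completions.create(
    model="qwen-3.5",  # hypothetical id for a vision-capable model
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```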
State-of-the-art for software engineering and front-end development. Excels at image/video-to-code, visual debugging, and UI reconstruction. 256K context handles large codebases.
Meta's largest openly available Llama 4 model. 400B total MoE parameters (17B active, 128 experts) with natively multimodal image understanding and 1M token context.
Mistral's dedicated code model. 24B parameters focused on code generation, completion, and explanation. Optimized for IDE integration and agentic coding workflows.
Zhipu AI's flagship for complex systems engineering and long-horizon agentic tasks. 744B total parameters with 40B active per token. Trained on 28.5 trillion tokens.
Google's open multimodal model. Handles text and image inputs. A strong general-purpose choice for chat, summarization, and visual question answering.
Ultra-fast reasoning model from Xiaomi. 309B total parameters with only 15B active per token, giving an exceptional speed-to-quality ratio. Designed for coding and agentic workflows.
General-purpose chat model. 24B parameters with 128K context and strong multilingual support. A reliable workhorse for production chat applications.
Mistral's Fill-in-the-Middle code model. 22B parameters with exceptional autocomplete and code infilling. Supports 80+ programming languages.
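Fill-in-the-Middle completion takes both the code before the cursor (the prompt) and the code after it (the suffix), and generates the span in between. A minimal sketch, assuming the gateway exposes an OpenAI-style completions endpoint with `suffix` support; the base URL and model id are placeholders.

```python
# Sketch of a fill-in-the-middle request: the model generates the code
# between `prompt` (before the cursor) and `suffix` (after it).
# Assumes OpenAI-style completions with `suffix`; ids are placeholders.
import os

from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key=os.environ["API_KEY"])

completion = client.completions.create(
    model="codestral-22b",  # hypothetical id for the FIM model above
    prompt="def is_even(n: int) -> bool:\n    return ",
    suffix="\n\nprint(is_even(4))",
    max_tokens=32,
)
print(completion.choices[0].text)  # e.g. "n % 2 == 0"
```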
Compact model from the Qwen 3.5 series. Same architecture as the 27B with excellent quality-per-parameter. Great for latency-sensitive and cost-efficient deployments.
Microsoft's small but powerful reasoning model. Punches above its weight on math, science, and logical reasoning. Excellent choice for strong reasoning at low cost.
Distilled reasoning model. 8B parameters with chain-of-thought reasoning capabilities inherited from the larger R1. Best reasoning model in the sub-10B class.
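R1-family distills typically emit their chain of thought inside `<think>...</think>` tags before the final answer. A small sketch of separating the two, assuming that output convention:

```python
import re


def split_reasoning(text: str) -> tuple[str, str]:
    """Split R1-style output into (reasoning, answer).

    Assumes the convention of emitting chain-of-thought inside
    <think>...</think> tags ahead of the final answer.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()


reasoning, answer = split_reasoning(
    "<think>2 + 2 is 4 because ...</think>The answer is 4."
)
print(answer)  # -> "The answer is 4."
```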
Lightweight model from the Qwen 3.5 series. 4B parameters yet supports the full 262K context. Ideal for edge deployments, mobile, and high-throughput serving.