Available Models
Browse and compare the models available through the LLM Gateway.
32 models
| Model | Input | Output | Input Price | Output Price | Cache Read | Cache Write | Context | Max Output |
|---|---|---|---|---|---|---|---|---|
| Claude Haiku 4.5: Claude Haiku 4.5 is Anthropic’s fastest and most cost-efficient model, delivering performance comparable to Sonnet 4 across coding, computer use, and agentic tasks. | Text, Image | Text | $1 | $5 | $0.1 | $1.25 | 200K | 64K |
| Claude Opus 4.5: Claude Opus 4.5 is Anthropic’s frontier reasoning model, designed for complex software engineering, agentic workflows, and long-running computer tasks. It delivers strong multimodal performance, excels on coding and reasoning benchmarks, and offers improved resistance to prompt injection attacks. The model allows developers to adjust inference effort based on task needs, balancing response speed, reasoning depth, and token usage. It also supports advanced tool use, extended context management, and multi-agent workflows, making it well suited for research, code debugging, planning, and browser or spreadsheet tasks. | Text, Image | Text | $5 | $25 | $0.5 | $6.25 | 200K | 64K |
| Claude Opus 4.6: Anthropic has released the new-generation Claude Opus 4.6 model, featuring a 1M context window and increasing the maximum output token limit to 128K, double the previous generation’s 64K cap. The model introduces an adaptive thinking mode that dynamically adjusts reasoning depth based on task complexity, along with a new highest-level “max effort” parameter. | Text, Image | Text | $5 | $25 | $0.5 | $6.25 | 1M | 128K |
| Claude Sonnet 4.5: Claude Sonnet 4.5 is the world’s leading coding model and Anthropic’s most capable model for building complex agents and using computers. It also delivers significant improvements in reasoning and mathematics. | Text, Image | Text | $3 | $15 | $0.3 | $3.75 | 200K | 64K |
| Claude Sonnet 4.6: Sonnet 4.6 is Anthropic’s most powerful Sonnet model to date, achieving frontier-level performance in programming, AI agents, and professional work. It excels at iterative development, navigating complex codebases, end-to-end project management with memory, professional document writing, and reliable computer use for web-based question answering and workflow automation. | Text, Image | Text | $3 | $15 | $0.3 | $3.75 | 1M | 64K |
| DeepSeek V3.2: DeepSeek has released the official V3.2 version, significantly enhancing its agentic and reasoning capabilities. It reaches GPT-5-level performance on mainstream benchmarks and supports tool use in thinking mode. At the same time, the newly introduced Speciale experimental version has achieved gold-medal-level results in multiple international competitions. The model is now fully open for use. | Text | Text | $0.286 | $0.429 | $0.029 | - | 128K | 8K |
| DeepSeek V3.2 Thinking: DeepSeek-V3.2-thinking is DeepSeek’s first model to integrate reasoning into tool use, and it serves as the thinking mode of DeepSeek-V3.2. | Text | Text | $0.286 | $0.429 | $0.029 | - | 128K | 64K |
| GLM 4.7: GLM-4.7 is Zhipu’s latest flagship model. It is enhanced for Agentic Coding scenarios, with stronger coding ability, long-horizon task planning, and tool coordination, and it achieves leading performance among open-source models across multiple public benchmarks. Its general capabilities have also improved, with more concise and natural responses and more immersive writing. | Text, Image | Text | $0.571 | $2.286 | $0.114 | - | 200K | 128K |
| GLM 5: GLM-5 is Zhipu’s next-generation flagship foundation model, built for Agentic Engineering and designed to deliver reliable productivity in complex systems engineering and long-horizon agent tasks. In coding and agent capabilities, it achieves state-of-the-art performance among open-source models, and in real-world programming scenarios its user experience approaches that of Claude Opus 4.5, making it a strong foundation for general-purpose agent assistants. | Text | Text | $0.857 | $3.142 | $0.214 | - | 200K | 128K |
| GPT 5.2: GPT-5.2 is the best model for coding and intelligence tasks across industries. | Text, Image | Text | $1.75 | $14 | $0.175 | - | 400K | 128K |
| GPT-5.3 Codex: GPT-5.3-Codex redefines the role of AI in programming and general productivity through improved performance, broader capability generalization, and enhanced safety. | Text, Image | Text | $1.75 | $14 | $0.175 | - | 400K | 128K |
| GPT-5.4: GPT-5.4 is OpenAI’s frontier model for complex professional work. | Text, Image | Text | $2.5 | $15 | $0.25 | - | 1M | 128K |
| GPT-5.4 Pro: GPT-5.4 Pro uses more compute to think more deeply and deliver consistently better answers. It is available exclusively through the Responses API, which supports multi-turn model interaction before generating a response and will enable additional advanced API capabilities in the future. | Text, Image | Text | $30 | $180 | - | - | 1.1M | 128K |
| Gemini 2.5 Pro: Gemini 2.5 Pro is Google’s latest AI model and its most advanced model to date, excelling at coding and complex prompts. With “Deep Think,” it can reason before responding, improving performance and accuracy. The model performs exceptionally well across multiple benchmarks and ranks first on the LMArena leaderboard for reasoning and code generation. It supports multimodal input, including text, images, audio, video, and code. | Text, Image, Audio, Video | Text | $2.5 | $15 | $0.25 | $4.5 | 1M | 65.5K |
| Gemini 3 Pro: Gemini 3 Pro Preview is part of Google’s most intelligent model family to date, built on advanced reasoning capabilities. It is designed to turn ideas into reality through agentic workflows, autonomous coding, and complex multimodal tasks. | Text, Image, Audio, Video | Text | $2 | $12 | $0.2 | $4.5 | 1M | 64K |
| Gemini 3.1 Pro: Gemini 3.1 is Google’s most intelligent model family to date, built on advanced reasoning capabilities. It is designed to turn ideas into reality through agentic workflows, autonomous coding, and complex multimodal tasks. Gemini 3.1 Pro Preview is best suited for complex tasks requiring broad world knowledge and advanced cross-modal reasoning. | Text, Image, Audio, Video | Text | $2 | $12 | $0.2 | $4.5 | 1M | 64K |
| GLM 5 Turbo: GLM-5-Turbo is a foundation model deeply optimized for OpenClaw’s Lobster scenarios. From the training stage, it has been optimized around the core requirements of Lobster tasks, strengthening key capabilities such as tool calling, instruction following, scheduled and persistent task handling, and long-chain execution. This lets it act effectively even in complex, dynamic, and long-horizon tasks. | Text | Text | $1 | $3.714 | $0.257 | - | 200K | 128K |
| Grok 4.1 Fast Non Reasoning: Grok-4-1-fast-non-reasoning is an AI model developed by xAI, optimized for maximum speed in generating responses and carrying out agentic tasks. Unlike its “reasoning” counterpart, this variant skips the use of “thinking tokens,” allowing it to deliver immediate, pattern-matched answers for simple and straightforward queries. | Text, Image | Text | $0.2 | $0.5 | $0.05 | - | 2M | 64K |
| Grok 4.1 Fast Reasoning: Grok 4.1 excels in creative, emotional, and collaborative interactions, showing a sharper ability to capture subtle intent and delivering a more engaging conversational experience. It also maintains a highly consistent personality while fully inheriting the sharp intelligence and reliable performance of its predecessor. | Text, Image | Text | $0.2 | $0.5 | $0.05 | - | 2M | 64K |
| Kimi K2.5: Kimi K2.5 is Kimi’s smartest model to date, achieving open-source state-of-the-art performance in agents, coding, visual understanding, and a wide range of general intelligence tasks. It is also Kimi’s most versatile model so far, featuring a natively multimodal architecture that supports both visual and text inputs, thinking and non-thinking modes, as well as dialogue and agent tasks. | Text, Image | Text | $0.571 | $3 | $0.1 | - | 256K | 32K |
| Kimi K2: Kimi K2 is a breakthrough mixture-of-experts model designed for outstanding performance in frontier knowledge, reasoning, and coding tasks. It is built for autonomous action and intelligent problem-solving. | Text | Text | $0.571 | $2.286 | $0.143 | - | 262.1K | 32K |
| Llama 4 Maverick: Meta's latest mixture-of-experts model | Text, Image | Text | $0.5 | $0.75 | $0.125 | $0.5 | 1M | 65.5K |
| MiniMax M2.1: MiniMax-M2.1 is a lightweight, cutting-edge large language model optimized for coding, agentic workflows, and modern application development. Activating only 10 billion parameters, it delivers a major leap in real-world capabilities while maintaining excellent latency, scalability, and cost efficiency. | Text | Text | $0.3 | $1.2 | $0.03 | $0.375 | 204.8K | 1.3M |
| MiniMax M2.5: MiniMax-M2.5 has achieved or surpassed the industry state of the art across productivity scenarios such as coding, tool use, search, and office work. | Text | Text | $0.3 | $1.2 | $0.06 | $0.375 | 204.8K | 131.1K |
| Mistral Large: Mistral's flagship model for complex tasks | Text | Text | $2 | $6 | $0.5 | $2 | 131.1K | 8.2K |
| Nano Banana 2: Nano Banana 2 provides high-quality image generation and conversational editing at a mainstream price point and low latency. | Text, Image | Text, Image | $0.5 | $60 | - | - | 131.1K | 32.8K |
| Nano Banana Pro: Gemini 3 Pro Image Preview (Nano Banana Pro) is the next-generation AI image generation and editing model in Google’s Gemini family, serving as an upgraded version of Gemini 2.5 Flash Image (Nano Banana). Combining a multimodal Transformer with diffusion-based modeling, it natively supports 2K (2048×2048) and 4K output, with significant improvements in image quality, text rendering, and physical reasoning. | Text, Image | Text, Image | $2 | $120 | - | - | 65.5K | 32.8K |
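| Qwen 3.5 397B A17B: The Qwen3.5 series 397B-A17B native vision-language model adopts a hybrid architecture that combines linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference efficiency. Across language understanding, logical reasoning, code generation, agent tasks, image understanding, video understanding, graphical user interface (GUI) tasks, and more, it demonstrates outstanding performance comparable to today’s leading frontier models. It has strong code generation and agent capabilities and generalizes well across agent scenarios. | Text, Image, Video | Text | $0.429 | $2.571 | - | - | 256K | 64K |
| Qwen 3.5 Flash: The Qwen3.5 native vision-language Flash model adopts a hybrid architecture that combines linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference efficiency. It delivers a significant leap over the 3 series in both pure text and multimodal performance, and it responds quickly, balancing inference speed and quality. | Text, Image | Text | $0.171 | $1.714 | $0.017 | $0.214 | 1M | 64K |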
| Qwen 3.5 Plus: The Qwen3.5 native vision-language Plus models adopt a hybrid architecture that combines linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference efficiency. Across a wide range of benchmark tasks, the 3.5 series demonstrates outstanding performance comparable to today’s leading frontier models, while delivering a significant leap over the 3 series in both pure text and multimodal capabilities. | Text, Image, Video | Text | $0.57 | $3.426 | $0.06 | $0.714 | 1M | 64K |
| Qwen3 Max Preview: The latest Qwen3-Max-Preview model is the preview version of the Max model in the Qwen3 series. Compared with the Qwen 2.5 series, it delivers substantial overall improvements in general capabilities, with significantly enhanced Chinese and English text understanding, complex instruction following, subjective open-ended task performance, multilingual ability, and tool-use capability. The model also produces fewer knowledge hallucinations. | Text | Text | $2.143 | $8.571 | $0.429 | - | 256K | 64K |
| Step 3.5 Flash: Step 3.5 Flash is the strongest open-source foundation model released by StepFun to date. Its capabilities in agent scenarios and mathematical tasks approach those of closed-source models, enabling it to handle complex, long-chain tasks effectively. | Text | Text | $0.1 | $0.3 | - | - | 256K | 128K |
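
To compare models on cost, the price columns can be turned into a rough per-request estimate. The sketch below is illustrative only and rests on assumptions the table does not state: that prices are in USD per 1M tokens, that cached input tokens are billed at the Cache Read rate instead of the Input Price, and that a model showing "-" for Cache Read bills cached tokens at the normal input rate. The names `ModelPrice`, `PRICES`, and `estimate_cost` are hypothetical helpers, not part of any gateway API.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: the table quotes prices per 1M tokens (the unit is not stated).
TOKENS_PER_PRICE_UNIT = 1_000_000


@dataclass(frozen=True)
class ModelPrice:
    """Prices copied from one table row (hypothetical container type)."""
    input_price: float
    output_price: float
    cache_read: Optional[float] = None  # None where the table shows "-"


# A few rows transcribed from the table above.
PRICES = {
    "Claude Haiku 4.5": ModelPrice(1.0, 5.0, 0.1),
    "GPT-5.4 Pro": ModelPrice(30.0, 180.0),
    "Step 3.5 Flash": ModelPrice(0.1, 0.3),
}


def estimate_cost(price: ModelPrice, input_tokens: int, output_tokens: int,
                  cached_tokens: int = 0) -> float:
    """Estimate the cost of one request under the assumptions above."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    # Assumption: without a listed cache-read price, cached tokens cost
    # the same as ordinary input tokens.
    cached_rate = price.cache_read if price.cache_read is not None else price.input_price
    fresh_tokens = input_tokens - cached_tokens
    total = (fresh_tokens * price.input_price
             + cached_tokens * cached_rate
             + output_tokens * price.output_price)
    return total / TOKENS_PER_PRICE_UNIT


# 100K input tokens (20K of them cached) and 10K output tokens.
cost = estimate_cost(PRICES["Claude Haiku 4.5"], 100_000, 10_000, cached_tokens=20_000)
print(f"{cost:.4f}")  # prints 0.1320 under the stated assumptions
```

Cache Write pricing is left out of the sketch; a fuller estimate would add it for the first request that populates the cache.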