Available Models

Browse and compare the models available through the LLM Gateway.

85 models

Model	Input	Output	Input Price	Output Price	Cache Read	Cache Write	Context	Max Output
Claude Haiku 4.5 Claude Haiku 4.5 is Anthropic’s fastest and most cost-efficient model, delivering performance comparable to Sonnet 4 across coding, computer use, and agentic tasks.	TextImage	Text	$1	$5	$0.1	$1.25	200K	64K
Claude Opus 4.5 Claude Opus 4.5 is Anthropic’s frontier reasoning model, designed for complex software engineering, agentic workflows, and long-running computer tasks. It delivers strong multimodal performance, excels on coding and reasoning benchmarks, and offers improved resistance to prompt injection attacks. The model allows developers to adjust inference effort based on task needs, balancing response speed, reasoning depth, and token usage. It also supports advanced tool use, extended context management, and multi-agent workflows, making it well suited for research, code debugging, planning, and browser or spreadsheet tasks.	TextImage	Text	$5	$25	$0.5	$6.25	200K	64K
Claude Opus 4.6 Anthropic has released the new-generation Claude Opus 4.6 model, featuring a 1M context window and increasing the maximum output token limit to 128K—double the previous generation’s 64K cap. The model introduces an adaptive thinking mode that dynamically adjusts reasoning depth based on task complexity, along with a new highest-level “max effort” parameter.	TextImage	Text	$5	$25	$0.5	$6.25	1M	128K
Claude Opus 4.7 Opus 4.7 has made significant improvements over Opus 4.6 in advanced software engineering, showing clear progress on the most challenging tasks. It can confidently handle the most complex coding work—which previously required close supervision. Opus 4.7 processes complex and long-running tasks in a rigorous and consistent manner, follows instructions precisely, and devises methods to verify its output before reporting.	TextImage	Text	$5	$25	$0.5	$6.25	1M	128K
Claude Sonnet 4.5 Claude Sonnet 4.5 is the world’s leading coding model and Anthropic’s most capable model for building complex agents and using computers. It also delivers significant improvements in reasoning and mathematics.	TextImage	Text	$3	$15	$0.3	$3.75	200K	64K
Claude Sonnet 4.6 Sonnet 4.6 is Anthropic’s most powerful Sonnet model to date, achieving frontier-level performance in programming, AI agents, and professional work. It excels at iterative development, navigating complex codebases, end-to-end project management with memory, professional document writing, and reliable computer use for web-based question answering and workflow automation.	TextImage	Text	$3	$15	$0.3	$3.75	1M	64K
Claude Sonnet 5 Claude Sonnet 5 is built to be the most dominant Sonnet model to date. It can formulate plans, use tools such as browsers and terminals, and operate autonomously, reaching a level that just a few months ago would have required a much larger and more expensive model. For many developers, the era of agentic AI began with Sonnet-class models: Claude Sonnet 3.5, 3.6, and 3.7 were the first to demonstrate astonishing capabilities in coding and tool use. Recently, however, the most notable advancement in agentic capabilities has emerged in our Opus-tier models.	TextImage	TextImage	$2	$10	$0.2	$2.5	1M	128K
DeepSeek V4 Pro DeepSeek-V4-Pro is a high-performance open-source large language model released by DeepSeek. It features top-tier reasoning and agent capabilities, supports ultra-long context, is compatible with Huawei's Ascend chips, and offers exceptional cost-effectiveness.	Text	Text	$0.42857143	$0.85714286	$0.00357143	-	1M	384K
DeepSeek V4 Flash DeepSeek-V4-Flash is the lightweight version of the DeepSeek V4 series, focusing on cost-effectiveness and high throughput. It is suitable for general dialogue and basic text tasks, while supporting a million-token context and efficient inference.	Text	Text	$0.143	$0.286	$0.029	-	1M	384K
Doubao Seedance 2.0 260128 Seedance 2.0 is a new generation of professional-grade multimodal video creation model launched by the Doubao Model Team. It supports multiple modalities such as images, videos, and audio as reference inputs to generate videos, breaking the limitations of creating with single materials. It also has video editing and video extension capabilities, and can accurately reproduce the details, materials, timbre, visual effects style, and camera movement of objects. Character characteristics can also be stably maintained, giving creators control like a director.	Text	Video	$7.286	$7.286	-	-	-	-
Doubao Seedance 2.0 fast 260128 Seedance 2.0 Fast supports generating videos from multimodal materials such as images, videos, and audio, while also possessing video editing and extension capabilities, enabling video generation tools to enter a new industrialized stage of accurate generation and reusable iteration. The model's understanding of physical laws continues to deepen, becoming more closely aligned with the real world, and its intent understanding ability is significantly improved, strictly adhering to the constraints of instruction details, thereby ensuring the credibility of professional-grade narratives.	Text	Video	$5.286	$5.286	-	-	-	-
GLM 4.7 GLM-4.7 is Zhipu’s latest flagship model. It is enhanced for Agentic Coding scenarios, with stronger coding ability, long-horizon task planning, and tool coordination, and it has achieved leading performance among open-source models on current rankings across multiple public benchmarks. Its general capabilities have also been improved, with more concise and natural responses and more immersive writing.	TextImage	Text	$0.286	$1.143	$0.057	-	200K	128K
GLM 5 GLM-5 is Zhipu’s next-generation flagship foundation model, built for Agentic Engineering and designed to deliver reliable productivity in complex systems engineering and long-horizon agent tasks. In coding and agent capabilities, GLM-5 achieves state-of-the-art performance among open-source models. In real-world programming scenarios, its user experience approaches that of Claude Opus 4.5. It excels at complex systems engineering and long-horizon agent tasks, making it an ideal foundation for general-purpose agent assistants.	Text	Text	$0.571	$2.571	$0.143	-	200K	128K
GLM 5.2 GLM-5.2 is a flagship model designed for the era of long tasks. It supports a truly usable 1M context, and in actual tests, it can handle project-level engineering context, enabling more stable execution of long-range tasks, more reliable adherence to engineering specifications, and further improvement in the success rate of development scenarios. A single task can complete the entire development chain from requirements to multi-terminal deployable products.	Text	Text	$1.143	$4	$0.286	-	1M	128K
GLM-5.1 GLM-5.1 is the latest flagship model launched by Zhipu.	Text	Text	$0.857	$3.429	$0.186	-	200K	128K
GPT 5.1 Codex Max GPT-5.1-Codex-Max is built on an updated version of the base reasoning model, which has been trained to handle agent tasks in fields such as software engineering, mathematics, research, medicine, and computer applications. It is our first model natively trained to operate across multiple context windows through a process called compaction, coherently working over millions of tokens in a single task. Like its predecessor, GPT-5.1-Codex-Max has been trained on real-world software engineering tasks, such as pull request (PR) creation, code review, front-end development, and question answering.	TextImageAudioVideo	TextImageAudioVideo	$1.25	$10	$0.125	-	1M	256K
GPT 5.1 Codex Mini GPT-5.1-Codex mini is a lightweight, cost-efficient variant of GPT-5.1-Codex optimized for coding, software development, and interactive programming tasks.	TextImage	Text	$0.25	$2	$0.025	-	400K	64K
GPT 5.4 Nano GPT-5.4-Nano is an OpenAI model for general-purpose reasoning, coding, instruction following, tool use, and production chat workloads, with the exact speed, reasoning depth, and cost depending on the selected variant.	TextImage	Text	$0.2	$1.25	-	-	400K	272K
GPT Image 1.5 GPT Image 1.5 is our latest image generation model, with better instruction following and adherence to prompts.	Text	Image	$7	$10	-	-	64K	16K
GPT Image 2 GPT Image 2 is a state-of-the-art image generation model that supports fast, high-quality image generation and editing. It supports flexible image sizes and high-fidelity image input.	TextImage	Image	$8	$30	$1.25	-	32K	32K
GPT-4.1 GPT-4.1 is a high-performance multimodal model released by OpenAI on April 15, 2025, positioned as a comprehensive upgrade to GPT-4o. It excels in advanced code generation, long-context understanding, fast inference, and multimodal processing, with key strengths of a 1M-token context window and lower cost, optimized for software development, complex instruction execution, and long-document analysis.	TextImage	Text	$2	$8	$0.5	-	1M	32K
GPT-4.1 Mini GPT-4.1 Mini is a lightweight multimodal large language model launched by OpenAI. It balances strong reasoning capability, million-level ultra-long context window and ultra-low cost, with low latency and high throughput. It is suitable for large-scale high-concurrency services, long document processing, daily conversation and lightweight multimodal application scenarios.	TextImage	Text	$0.1	$0.4	$0.025	-	1M	32K
GPT-4.1 Nano GPT-4.1 Nano is the smallest and most cost-efficient model in OpenAI’s GPT-4.1 family. It is designed for very low-cost, high-throughput tasks such as classification, extraction, simple reasoning, lightweight coding assistance, data transformation, and large-scale automation. It supports long-context text processing and image understanding, but its reasoning and coding ability are generally weaker than GPT-4.1 Mini and the full GPT-4.1 model.	TextImage	Text	$0.1	$0.4	$0.025	-	1M	32K
GPT-5 GPT-5 is an OpenAI model for general-purpose reasoning, coding, instruction following, tool use, and production chat workloads, with the exact speed, reasoning depth, and cost depending on the selected variant.	TextImage	Text	$1.25	$10	$0.125	-	400K	128K
GPT-5 Chat Latest GPT-5 Chat Latest is a new flagship multimodal conversational large model launched by OpenAI. It delivers powerful logical reasoning, ultra-low hallucinations, ultra-long context comprehension and excellent multimodal analysis capabilities, balancing response speed with professional-level generation quality. It is suitable for daily conversation, complex problem solving, long document analysis, creative content creation and multimodal intelligent interaction scenarios.	TextImage	Text	$1.25	$10	$0.125	-	400K	128K
GPT-5 Codex GPT-5-Codex is a professional large model released by OpenAI on September 15, 2025, based on GPT-5 and deeply optimized for agentic coding and software engineering. Trained on real-world development scenarios, it delivers both fast interactive responses and long-duration independent task execution (over 7 hours of continuous work). It excels in code generation, review, debugging, refactoring, multi-language development, and system-level interaction, and is deeply integrated with IDE, CLI, cloud, and GitHub workflows, serving as an AI coding teammate for professional developers.	Text	Text	$1.25	$10	$0.125	-	400K	128K
GPT-5 Mini GPT-5 Mini is an OpenAI model for general-purpose reasoning, coding, instruction following, tool use, and production chat workloads, with the exact speed, reasoning depth, and cost depending on the selected variant.	TextImage	Text	$0.25	$2	$0.025	-	400K	128K
GPT-5 Nano GPT-5-Nano is an OpenAI model for general-purpose reasoning, coding, instruction following, tool use, and production chat workloads, with the exact speed, reasoning depth, and cost depending on the selected variant.	TextImage	Text	$0.05	$0.4	$0.005	-	400K	128K
GPT-5 Pro GPT-5-Pro is an OpenAI model for general-purpose reasoning, coding, instruction following, tool use, and production chat workloads, with the exact speed, reasoning depth, and cost depending on the selected variant.	TextImage	Text	$15	$120	-	-	400K	128K
GPT-5.1 GPT-5.1 is an OpenAI model for general-purpose reasoning, coding, instruction following, tool use, and production chat workloads, with the exact speed, reasoning depth, and cost depending on the selected variant.	TextImage	Text	$1.25	$10	$0.125	-	400K	128K
GPT-5.1 Chat Latest GPT-5.1-Chat-Latest is an OpenAI model for general-purpose reasoning, coding, instruction following, tool use, and production chat workloads, with the exact speed, reasoning depth, and cost depending on the selected variant.	TextImage	Text	$1.25	$10	$0.125	-	400K	128K
GPT-5.2 GPT-5.2 is the best model for coding and intelligence tasks across industries.	TextImage	Text	$1.75	$14	$0.175	-	400K	128K
GPT-5.2 Chat Latest GPT-5.2-Chat-Latest is an OpenAI model for general-purpose reasoning, coding, instruction following, tool use, and production chat workloads, with the exact speed, reasoning depth, and cost depending on the selected variant.	TextImage	Text	$1.75	$14	$0.175	-	400K	128K
GPT-5.2 Codex GPT-5.2-Codex is an OpenAI model for general-purpose reasoning, coding, instruction following, tool use, and production chat workloads, with the exact speed, reasoning depth, and cost depending on the selected variant.	Text	Text	$1.75	$14	$0.175	-	400K	128K
GPT-5.2 Pro GPT-5.2-Pro is an OpenAI model for general-purpose reasoning, coding, instruction following, tool use, and production chat workloads, with the exact speed, reasoning depth, and cost depending on the selected variant.	TextImage	Text	$21	$168	-	-	400K	272K
GPT-5.3 Codex GPT-5.3-Codex redefines the role of AI in programming and general productivity through improved performance, broader capability generalization, and enhanced safety.	TextImage	Text	$1.75	$14	$0.175	-	400K	128K
GPT-5.4 GPT-5.4 is our frontier model for complex professional work.	TextImage	Text	$2.5	$15	$0.25	-	1M	128K
GPT-5.4 Pro GPT-5.4 Pro uses more compute to think more deeply and deliver consistently better answers. It is available exclusively through the Responses API, which supports multi-turn model interaction before generating a response and will enable additional advanced API capabilities in the future.	TextImage	Text	$30	$180	-	-	1.1M	128K
GPT-5.5 GPT-5.5 (codenamed "Spud") is OpenAI's flagship multimodal large model released on April 23, 2026. As the first complete ground-up retrain since GPT-4.5, it is positioned as an agentic model for real work, featuring native computer control, autonomous agents, 1M+ token context, deep reasoning, and advanced coding & scientific capabilities, optimized for complex professional workflows, autonomous programming, long-document analysis, and enterprise AI applications.	TextImage	Text	$5	$30	$0.5	-	1M	128K
Gemini 2.5 Flash Gemini 2.5 Flash is a multimodal large language model launched by Google DeepMind in April 2025. It features ultra-fast response, million-level context window, controllable reasoning (Thinking Budget) and high cost performance, serving as the flagship version of the Gemini 2.5 family for large-scale real-time applications.	Text	TextImage	$0.3	$2.5	$0.03	$1	1M	64K
Gemini 2.5 Flash Lite Gemini 2.5 Flash-Lite is an ultra-lightweight multimodal large model launched by Google DeepMind in June 2025. It features ultra-low cost, ultra-low latency, and high throughput, supports a 1M-token context window and controllable reasoning (Thinking Budget), serving as the cost-effective flagship of the Gemini 2.5 family for large-scale, high-concurrency, lightweight real-time applications (e.g., classification, translation, data processing).	Text	TextImageAudioVideo	$0.1	$0.4	$0.01	$1	1M	64K
Gemini 2.5 Pro Gemini 2.5 Pro is Google’s latest AI model and its most advanced model to date, excelling at coding and complex prompts. With “Deep Think,” it can reason before responding, improving performance and accuracy. The model performs exceptionally well across multiple benchmarks and ranks first on the LMArena leaderboard for reasoning and code generation. It supports multimodal input, including text, images, audio, video, and code.	TextImageAudioVideo	Text	$2.5	$15	$0.25	$4.5	1M	65.5K
Gemini 3 Flash Preview Gemini 3 Flash Preview is a preview multimodal large model released by Google DeepMind in December 2025. It features ultra-fast inference, near-Pro level intelligence, 1M-token context window, multimodal inputs (text/images/audio/video), and configurable thinking levels, designed specifically for high-concurrency real-time scenarios such as agentic workflows, interactive development, long-document analysis, and coding.	Text	TextImageAudioVideo	$0.5	$3	$0.05	$1	1M	64K
Gemini 3 Pro Gemini 3 Pro Preview is part of Google’s most intelligent model family to date, built on advanced reasoning capabilities. It is designed to turn ideas into reality through agentic workflows, autonomous coding, and complex multimodal tasks.	TextImageAudioVideo	Text	$2	$12	$0.2	$4.5	1M	64K
Gemini 3.1 Pro Gemini 3.1 is Google’s most intelligent model family to date, built on advanced reasoning capabilities. It is designed to turn ideas into reality through agentic workflows, autonomous coding, and complex multimodal tasks. Gemini 3.1 Pro Preview is best suited for complex tasks requiring broad world knowledge and advanced cross-modal reasoning.	TextImageAudioVideo	Text	$2	$12	$0.2	$4.5	1M	64K
GLM 5 Turbo GLM-5-Turbo is a foundation model deeply optimized for OpenClaw’s Lobster scenarios. From the training stage, it has been specially optimized around the core requirements of Lobster tasks, strengthening key capabilities such as tool calling, instruction following, scheduled and persistent task handling, and long-chain execution. This enables it to be truly actionable even in complex, dynamic, and long-horizon tasks.	Text	Text	$0.714	$3.143	$0.171	-	200K	128K
Grok 4 The Grok 4 model possesses deep reasoning capabilities and is trained on xAI's Colossus supercomputer, promising stronger logical reasoning and text generation abilities. xAI claims it is the world's most powerful AI model, achieving PhD-level performance in handling academic problems. It excels in real-time speed, reasoning ability, and advanced vision.	TextImage	Text	$3	$15	$0.75	-	256K	16K
Grok 4 Fast Non Reasoning Grok-4-Fast-Non-Reasoning is an xAI Grok-family model aimed at conversational reasoning, real-time style assistance, coding, and agentic workflows, with fast and reasoning variants optimized for different latency and depth trade-offs.	TextImage	Text	$0.2	$0.5	$0.05	-	2M	30K
Grok 4 Fast Reasoning Grok-4-Fast-Reasoning is an xAI Grok-family model aimed at conversational reasoning, real-time style assistance, coding, and agentic workflows, with fast and reasoning variants optimized for different latency and depth trade-offs.	TextImage	Text	$0.2	$0.5	$0.05	-	2M	30K
Grok 4.1 Fast Non Reasoning Grok-4-1-fast-non-reasoning is an AI model developed by xAI, optimized for maximum speed in generating responses and carrying out agentic tasks. Unlike its “reasoning” counterpart, this variant skips the use of “thinking tokens,” allowing it to deliver immediate, pattern-matched answers for simple and straightforward queries.	TextImage	Text	$0.2	$0.5	$0.05	-	2M	64K
Grok 4.1 Fast Reasoning Grok 4.1 excels in creative, emotional, and collaborative interactions, showing a sharper ability to capture subtle intent and delivering a more engaging conversational experience. It also maintains a highly consistent personality while fully inheriting the sharp intelligence and reliable performance of its predecessor.	TextImage	Text	$0.2	$0.5	$0.05	-	2M	64K
Grok Code Fast 1 Grok-Code-Fast-1 is an xAI Grok-family model aimed at conversational reasoning, real-time style assistance, coding, and agentic workflows, with fast and reasoning variants optimized for different latency and depth trade-offs.	Text	Text	$0.2	$1.5	$0.02	-	2M	30K
Kimi K2.5 Kimi K2.5 is Kimi’s smartest model to date, achieving open-source state-of-the-art performance in agents, coding, visual understanding, and a wide range of general intelligence tasks. It is also Kimi’s most versatile model so far, featuring a natively multimodal architecture that supports both visual and text inputs, thinking and non-thinking modes, as well as dialogue and agent tasks.	TextImage	Text	$0.6	$3	$0.1	-	256K	32K
Kimi K2.6 Kimi-K2.6 is a Moonshot AI Kimi model designed for long-context understanding, multilingual chat, coding, document analysis, and agentic tasks, with newer K2 variants emphasizing stronger reasoning and tool-use capability.	TextImage	Text	$0.95	$4	$0.16	-	256K	64K
Llama 4 Maverick Meta's latest mixture-of-experts model	TextImage	Text	$0.5	$0.75	$0.125	$0.5	1M	65.5K
MiMo V2.5 Pro MiMo-V2.5-Pro is our most powerful model to date, achieving significant improvements over its predecessor, MiMo-V2-Pro, in overall intelligence, complex software engineering, and long-horizon tasks. It is a mixture-of-experts model with 1.02T total parameters and 42B active parameters, built on a hybrid attention architecture, with a 1 million token context window.	Text	Text	$0.429	$0.857	$0.00357	-	1M	128K
MiMo V2.5 MiMo-V2.5 is a sparse MoE model with 310B total parameters (15B activated), trained on 48T tokens. Its language backbone inherits the MiMo-V2-Flash hybrid sliding window attention architecture and is equipped with dedicated visual and audio encoders (both pre-trained in-house), connected via a lightweight projector.	Text	Text	$0.143	$0.286	$0.00286	-	1M	128K
MiniMax M2.1 MiniMax-M2.1 is a lightweight, cutting-edge large language model optimized for coding, agentic workflows, and modern application development. Activating only 10 billion parameters, it delivers a major leap in real-world capabilities while maintaining excellent latency, scalability, and cost efficiency.	Text	Text	$0.3	$1.2	$0.03	$0.375	204.8K	1.3M
MiniMax M2.5 MiniMax-M2.5 has achieved or surpassed the industry state of the art across productivity scenarios such as coding, tool use, search, and office work.	Text	Text	$0.3	$1.2	$0.03	$0.375	204.8K	131.1K
MiniMax M2.7 MiniMax-M2.7 is a MiniMax model for multilingual chat, reasoning, coding, and agent workflows, typically positioned as a high-throughput text model with competitive cost and broad enterprise use cases.	Text	Text	$0.3	$1.2	$0.06	$0.375	204.8K	131.1K
MiniMax M3 M3 achieves state-of-the-art performance in professional tasks such as coding and agent work. It adopts MSA (MiniMax Sparse Attention), a new attention architecture proposed by our team, and supports an ultra-long context window of up to 1 million tokens. It is also a natively multimodal model that supports image and video input and can operate desktop computers.	Text	Text	$0.3	$1.2	$0.06	-	1M	128K
Mistral Large Mistral's flagship model for complex tasks	Text	Text	$2	$6	$0.5	$2	131.1K	8.2K
Nano Banana Gemini 2.5 Flash Image (codename "Nano Banana") is an image generation and editing model launched by Google DeepMind in August 2025. It features ultra-fast text-to-image, precise natural language editing, multi-image fusion, character consistency, and real-world physical reasoning, serving as the professional imaging model in the Gemini 2.5 family for creative design, e-commerce, and content creation.	Text	TextImage	$0.3	$2.5	$0.03	$1	32K	32K
Nano Banana 2 Nano Banana 2 provides high-quality image generation and conversational editing at a mainstream price point and low latency.	TextImage	TextImage	$0.5	$60	-	-	131.1K	32.8K
Nano Banana Pro Gemini 3 Pro Image Preview (Nano Banana Pro) is the next-generation AI image generation and editing model in Google’s Gemini family, serving as an upgraded version of Gemini 2.5 Flash Image (Nano Banana). Combining a multimodal Transformer with diffusion-based modeling, it natively supports 2K (2048×2048) and 4K output, with significant improvements in image quality, text rendering, and physical reasoning.	TextImage	TextImage	$2	$120	$0.2	$4.5	65.5K	32.8K
O3 O3 is an OpenAI model for general-purpose reasoning, coding, instruction following, tool use, and production chat workloads, with the exact speed, reasoning depth, and cost depending on the selected variant.	TextImage	Text	$2	$8	$0.5	-	200K	100K
O3 Mini O3-Mini is an OpenAI model for general-purpose reasoning, coding, instruction following, tool use, and production chat workloads, with the exact speed, reasoning depth, and cost depending on the selected variant.	Text	Text	$1.1	$4.4	$0.55	-	200K	100K
O3 Pro O3-Pro is an OpenAI model for general-purpose reasoning, coding, instruction following, tool use, and production chat workloads, with the exact speed, reasoning depth, and cost depending on the selected variant.	TextImage	Text	$20	$80	-	-	200K	100K
O4 Mini O4-Mini is an OpenAI model for general-purpose reasoning, coding, instruction following, tool use, and production chat workloads, with the exact speed, reasoning depth, and cost depending on the selected variant.	TextImage	Text	$1.1	$4.4	$0.275	-	200K	100K
QwQ Plus The QwQ reasoning model, trained on Qwen2.5-32B, significantly improves its reasoning capabilities through reinforcement learning. On core math and code benchmarks (AIME 24/25, LiveCodeBench) and some general benchmarks (IFEval, LiveBench, etc.), it reaches the level of the full DeepSeek-R1, and on all metrics it significantly surpasses DeepSeek-R1-Distill-Qwen-32B, which is also based on Qwen2.5-32B.	Text	Text	$0.229	$0.571	-	-	128K	7K
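Qwen 3.5 397B A17B The 397B-A17B native vision-language model in the Qwen3.5 series uses a hybrid architecture that combines linear attention with a sparse mixture-of-experts design, achieving higher inference efficiency. Across language understanding, logical reasoning, code generation, agent tasks, image understanding, video understanding, graphical user interface (GUI) tasks, and more, it delivers performance comparable to today’s leading frontier models. It has strong code generation and agent capabilities and generalizes well across agent scenarios.	TextImageVideo	Text	$0.429	$2.571	-	-	256K	64K
Qwen 3.5 Flash The Flash model in the Qwen3.5 native vision-language series uses a hybrid architecture that combines linear attention with a sparse mixture-of-experts design, achieving higher inference efficiency. It delivers a major leap over the 3 series in both pure-text and multimodal performance, and it responds quickly, balancing inference speed and quality.	TextImage	Text	$0.171	$1.714	$0.017	$0.214	1M	64K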
Qwen 3.5 Plus The Qwen3.5 native vision-language Plus models adopt a hybrid architecture that combines linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference efficiency. Across a wide range of benchmark tasks, the 3.5 series demonstrates outstanding performance comparable to today’s leading frontier models, while delivering a significant leap over the 3 series in both pure text and multimodal capabilities.	TextImageVideo	Text	$0.57	$3.426	$0.06	$0.714	1M	64K
Qwen3 Max Preview The latest Qwen3-Max-Preview model is the preview version of the Max model in the Qwen3 series. Compared with the Qwen 2.5 series, it delivers substantial overall improvements in general capabilities, with significantly enhanced Chinese and English text understanding, complex instruction following, subjective open-ended task performance, multilingual ability, and tool-use capability. The model also produces fewer knowledge hallucinations.	Text	Text	$2.143	$8.571	$0.429	-	256K	64K
Step 3.5 Flash Step 3.5 Flash is the strongest open-source foundation model released by StepFun to date. Its capabilities in agent scenarios and mathematical tasks approach those of closed-source models, enabling it to handle complex, long-chain tasks effectively.	Text	Text	$0.1	$0.3	-	-	256K	128K
Text Embedding 3 Large OpenAI's third-generation embedding model, more powerful than the small version, suitable for tasks requiring the highest performance	Text	Text	$0.13	$0.13	-	-	-	-
claude-sonnet-5 Claude Sonnet 5 is built to be the most dominant Sonnet model to date. It can devise plans, use tools such as browsers and terminals, and operate autonomously, achieving a level that just a few months ago was attainable only by larger and more expensive models. For many developers, the era of agentic AI began with Sonnet-class models: Claude Sonnet 3.5, 3.6, and 3.7 were the first models to demonstrate impressive coding and tool-use capabilities. Recently, however, the most significant advancements in agent capabilities have emerged in our Opus-level models.	Text	Text	$2	$10	$0.2	$2.5	-	-
gpt-5.6-luna You can start with GPT-5.6 Sol for complex reasoning and coding, choose GPT-5.6 Terra to balance intelligence and cost, or use GPT-5.6 Luna to handle cost-sensitive, high-traffic tasks.	TextImage	Text	$1	$6	$0.1	$1.25	-	-
gpt-5.6-sol You can start with GPT-5.6 Sol for complex reasoning and coding, choose GPT-5.6 Terra to balance intelligence and cost, or use GPT-5.6 Luna to handle cost-sensitive, high-traffic tasks.	TextImage	Text	$5	$30	$0.5	$6.25	-	-
gpt-5.6-terra You can start with GPT-5.6 Sol for complex reasoning and coding, choose GPT-5.6 Terra to balance intelligence and cost, or use GPT-5.6 Luna to handle cost-sensitive, high-traffic tasks.	TextImage	Text	$2.5	$15	$0.25	$3.125	-	-
kimi-k2.7-code Kimi K2.7 Code is Kimi’s most intelligent coding model, completing programming tasks with higher success rates in long contexts. It features a natively multimodal architecture that supports text, image, and video input, thinking modes, and both dialogue and agent tasks. Kimi K2.7 Code HighSpeed is the high-speed version of the same model, with an output speed of approximately 180 tokens/s and up to 260 tokens/s in short-context scenarios, for a faster coding experience. The context length is 256K, with support for long thinking and deep reasoning. It also supports automatic context caching, ToolCalls, JSON Mode, and Partial Mode.	TextImageVideo	Text	$0.95	$4	$0.19	-	-	-
qvq-max The Qianwen QVQ visual reasoning model supports visual input and chain-of-thought output, demonstrating stronger abilities in mathematics, programming, visual analysis, creativity, and general tasks.	TextImageAudioVideo	TextImageAudioVideo	$1.14	$4.57	-	-	-	-
qwen-coder-plus The Qianwen series code and programming model is a language model specifically designed for programming and code generation, with excellent performance and outstanding effects.	TextImageAudioVideo	TextImageAudioVideo	$0.5	$1	-	-	-	-
qwen3-max Compared with the preview version, the Tongyi Qianwen 3 series Max model has been specifically upgraded for agentic programming and tool calling. This official release reaches state-of-the-art level in the field and suits more complex agent scenarios.	TextImageAudioVideo	TextImageAudioVideo	$0.3571	$1.4286	$0.714	$0.4464	-	-
qwen3.6-plus The Qwen3.6 native vision-language series Plus model delivers performance comparable to current frontier models, a significant improvement over the 3.5 series. Its coding capabilities, including agentic coding, front-end programming, and vibe coding, are significantly enhanced, as are multimodal abilities such as object recognition, OCR, and object localization.	TextImageAudioVideo	TextImageAudioVideo	$0.167	$1.714	$0.286	$0.3571	-	-
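
To compare rows programmatically, the table above can be read as tab-separated text. The following is a minimal Python sketch, not an official client for the LLM Gateway. It rests on several assumptions that the page does not state: that each row has the nine tab-separated columns shown in the header, that "K" and "M" are decimal multipliers (1,000 and 1,000,000), that "-" means the value is not listed, and, most importantly, that token prices are quoted in USD per 1 million tokens. The names `ModelRow`, `parse_row`, and `estimate_cost` are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


def parse_price(cell: str) -> Optional[float]:
    """Convert a price cell such as '$0.125' to a float; '-' means not listed."""
    cell = cell.strip()
    if cell == "-":
        return None
    return float(cell.lstrip("$"))


def parse_tokens(cell: str) -> Optional[int]:
    """Convert a size cell such as '200K', '1M', or '65.5K' to a token count.

    ASSUMPTION: K and M are decimal multipliers; the page does not say.
    """
    cell = cell.strip()
    if cell == "-":
        return None
    multipliers = {"K": 1_000, "M": 1_000_000}
    suffix = cell[-1].upper()
    if suffix in multipliers:
        return round(float(cell[:-1]) * multipliers[suffix])
    return int(cell)


@dataclass
class ModelRow:
    model: str                      # name fused with description, as on the page
    input_modalities: str           # e.g. 'TextImage', kept as an opaque string
    output_modalities: str
    input_price: Optional[float]
    output_price: Optional[float]
    cache_read: Optional[float]
    cache_write: Optional[float]
    context: Optional[int]
    max_output: Optional[int]


def parse_row(line: str) -> ModelRow:
    """Split one tab-separated table row into typed fields."""
    cells = line.rstrip("\n").split("\t")
    if len(cells) != 9:
        raise ValueError(f"expected 9 columns, got {len(cells)}")
    return ModelRow(
        cells[0], cells[1], cells[2],
        parse_price(cells[3]), parse_price(cells[4]),
        parse_price(cells[5]), parse_price(cells[6]),
        parse_tokens(cells[7]), parse_tokens(cells[8]),
    )


def estimate_cost(row: ModelRow, input_tokens: int, output_tokens: int,
                  unit: int = 1_000_000) -> float:
    """Estimate request cost, ASSUMING prices are USD per `unit` tokens."""
    if row.input_price is None or row.output_price is None:
        raise ValueError("row has no listed token prices")
    return (input_tokens * row.input_price
            + output_tokens * row.output_price) / unit


# The Claude Haiku 4.5 row from the table, with the description omitted.
row = parse_row("Claude Haiku 4.5\tTextImage\tText\t$1\t$5\t$0.1\t$1.25\t200K\t64K")
print(row.context)                                # 200000
print(estimate_cost(row, 2_000_000, 1_000_000))   # 7.0
```

Under the per-million assumption, 2M input tokens and 1M output tokens on that row come to 2 × $1 + 1 × $5 = $7. The same formula is unlikely to apply to the video and image generation rows, whose billing units are not given on this page, so check the unit before relying on any estimate.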