284B parameter (13B activated) Mixture-of-Experts language model with 1M token context length. Features hybrid attention architecture combining Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA), Manifold-Constrained Hyper-Connections (mHC), and Muon optimizer for faster convergence. Supports three reasoning effort modes: Non-think, Think High, and Think Max. Pre-trained on 32T+ tokens with comprehensive post-training via GRPO and on-policy distillation.
Modalities
Context window
1,000,000
Pricing
input / output per 1M
Reasoning
Streaming
Real-time token-by-token response streaming
Function calling
Connect the model to external tools and systems
Structured outputs
Return responses in JSON schema format
Fine-tuning
Custom model training on your data
Reasoning
Extended thinking before responding
| MMLU PRO | 86.4 |
| SIMPLEQA VERIFIED | 34.1 |
| GPQA DIAMOND | 88.1 |
| HLE | 34.8 |
| LIVECODEBENCH | 91.6 |
| CODEFORCES RATING | 3052 |
| HMMT 2026 FEB | 94.8 |
| IMO ANSWERBENCH | 88.4 |
| SWE BENCH VERIFIED | 79.0 |
| SWE PRO | 52.6 |
| SWE MULTILINGUAL | 73.3 |
| TERMINALBENCH 2 | 56.9 |
| BROWSECOMP | 73.2 |
| HLE WITH TOOLS | 45.1 |
| MCP ATLAS | 69.0 |
| TOOLATHLON | 47.8 |
| MRCR 1M | 78.7 |
| CORPUSQA 1M | 60.5 |
| Release date | 2026-05-01 |
| Model ID | deepseek-v4-flash |
| Provider | DeepSeek |
Get detailed information about DeepSeek-V4-Flash, including its context window of 1000000 tokens, pricing per million tokens, supported input and output modalities, and benchmark scores. This model from DeepSeek offers specific capabilities for natural language processing, code generation, and complex reasoning tasks that set it apart from alternatives.
Compare input and output token pricing for DeepSeek-V4-Flash against other models in its class. Understanding LLM pricing is essential for budgeting your AI applications at scale. We break down the cost per million tokens for both input and output so you can estimate the total cost of your workloads and compare value across providers.
Review benchmark performance data for DeepSeek-V4-Flash across key evaluation metrics. Compare its reasoning, coding, and language understanding capabilities against competing models to determine if it is the right fit for your specific requirements, whether that involves complex analysis, creative generation, or efficient inference at scale.