DeepSeek V4 is credible, but not the public leader.
Public benchmark snapshot
A static keyword page for people searching "deepseek v4 benchmark", built from vendor-published model cards, API docs, and launch posts captured on April 24, 2026.
Snapshot
The page focuses on published signals that matter for buyers, builders, and people comparing model families quickly.
Reasoning surface
DeepSeek V4 Pro Max reports GPQA Diamond 90.1 and HLE 37.7. In the same public snapshot set, Gemini 3.1 Pro reports GPQA 94.3 and ARC-AGI-2 77.1.
Coding pressure
DeepSeek V4 Pro Max posts LiveCodeBench 93.5, Codeforces 3206, and SWE-Bench Verified 80.6, which is enough to keep it in the same conversation as the top closed models.
Long context
DeepSeek V4, Claude Opus 4.6, and Gemini 3.1 Pro all publish 1M token context support. OpenAI's GPT-5.4 API page lists a 1,050,000 token context window.
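For a quick sanity check on whether a document actually fits one of these published windows, a rough character-based estimate is often good enough. The sketch below uses the common ~4 characters-per-token heuristic, which is an approximation only; real counts depend on the tokenizer and language.

```python
# Rough check of whether a prompt fits a published context window.
CHARS_PER_TOKEN = 4  # heuristic only; real counts depend on the tokenizer


def fits_in_context(text: str, window_tokens: int, reply_budget: int = 8_000) -> bool:
    """Estimate prompt tokens and check they fit alongside a reserved reply budget."""
    est_tokens = len(text) // CHARS_PER_TOKEN
    return est_tokens + reply_budget <= window_tokens


doc = "x" * 4_100_000  # roughly 1,025,000 estimated tokens
print(fits_in_context(doc, 1_000_000))   # False: overflows a 1M-token window
print(fits_in_context(doc, 1_050_000))   # True: fits the 1,050,000-token window
```

The `reply_budget` reservation matters in practice: a prompt that exactly fills the window leaves no room for the model's answer.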
Cost view
OpenAI, Anthropic, and Google expose public API rates. DeepSeek's current public pricing pages still describe legacy model IDs, so the DeepSeek V4 row stays marked as Not publicly disclosed.
Comparison
These are published highlights, not normalized lab reruns. Use them to understand positioning, then verify the original source before making a technical or purchasing decision.
| Model | Public status | Published highlights | Context and price | Read |
|---|---|---|---|---|
| DeepSeek V4 Pro Max (DeepSeek) | Preview released on April 24, 2026. | GPQA Diamond 90.1, HLE 37.7, LiveCodeBench 93.5, Codeforces 3206, SWE-Bench Verified 80.6 | 1M token context; pricing not publicly disclosed | See Sources |
| GPT-5.4 (OpenAI) | Released on March 5, 2026. | Publishes the sharper top-end reasoning and agentic-search numbers in this set | 1,050,000 token context; public API rates | See Sources |
| Claude Opus 4.6 (Anthropic) | Released on February 5, 2026. | Competitive on long-running coding and knowledge work | 1M token context; public API rates | See Sources |
| Gemini 3.1 Pro (Google) | Preview released on February 19, 2026. | GPQA 94.3, ARC-AGI-2 77.1 | 1M token context; public API rates | See Sources |
Reading tip: if you need a single headline, DeepSeek V4's public strength is coding. If you need the strongest public reasoning snapshot in this set, Gemini 3.1 Pro and GPT-5.4 still publish the sharper top-end numbers.
Methodology
The page avoids invented rollups, hidden scoring, and synthetic averages. Everything here maps back to a public vendor page.
FAQ
What is this page?
It is a static landing page that aggregates public benchmark, context window, and pricing signals for DeepSeek V4 and nearby frontier models, so the search intent lands on something factual instead of speculation.
Is DeepSeek V4 actually released?
Yes. DeepSeek announced the V4 preview on Friday, April 24, 2026, and its API docs now list deepseek-v4-pro and deepseek-v4-flash.
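The new model IDs can be exercised through an OpenAI-style chat completions request. This is a hedged sketch only: DeepSeek's current API is OpenAI-compatible, but whether V4 keeps the same endpoint URL and payload shape is an assumption, and the API key shown is a placeholder. The sketch builds the request without sending it.

```python
# Hedged sketch: addressing the new V4 model IDs through an OpenAI-style
# chat completions endpoint. The endpoint URL and payload shape are
# assumptions carried over from DeepSeek's current OpenAI-compatible API.
import json
import urllib.request

API_URL = "https://api.deepseek.com/chat/completions"  # assumed unchanged for V4


def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Construct (but do not send) a chat completions request for a given model ID."""
    payload = {
        "model": model,  # e.g. "deepseek-v4-pro" or "deepseek-v4-flash"
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


req = build_request("deepseek-v4-pro", "Summarize GPQA Diamond in one line.", "sk-placeholder")
print(req.get_full_url())
```

Sending the request (for example with `urllib.request.urlopen(req)`) requires a real API key and should be verified against DeepSeek's own API reference once the V4 table is published.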
Does DeepSeek V4 beat the other frontier models?
Not across the board. DeepSeek V4 Pro Max looks strongest in coding-heavy public benchmarks, while GPT-5.4 and Gemini 3.1 Pro still publish stronger reasoning and agentic-search results, and Claude Opus 4.6 remains very competitive on long-running coding and knowledge work.
Why can't these scores be compared directly?
Vendors use different harnesses, different reasoning budgets, and sometimes different tool stacks. That is why this page uses the phrase "public snapshot" instead of pretending these rows are a single normalized leaderboard.
Why is there no DeepSeek V4 price?
Because the public DeepSeek pricing pages still describe legacy model IDs and do not yet publish a dedicated V4 API table. Making up an inferred price would be less useful than leaving the field empty.
Sources
Open each source directly if you need the exact benchmark table, pricing page, or launch date context.