
Gemma 4 31B by Google DeepMind — 256K Context

30.7B-parameter dense multimodal model from Google DeepMind with a 256K-token context window. Accepts text, image, and video input and produces text output. Features hybrid attention (sliding window plus global layers), a configurable thinking mode, native function calling, and multilingual support across 140+ languages. Optimized for coding, reasoning, and agentic workflows.
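The hybrid attention scheme mentioned above can be pictured as a per-layer causal mask: most layers attend only within a sliding window, while periodic global layers attend over the full context. The window size and the one-global-per-six-layers ratio below are illustrative assumptions, not published specifications for this model.

```python
import numpy as np

def attention_mask(seq_len: int, layer_idx: int, window: int = 1024,
                   global_every: int = 6) -> np.ndarray:
    """Boolean mask: True where query position i may attend to key position j.

    Window size and global-layer ratio are assumed values for illustration.
    """
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    causal = j <= i                      # never attend to future tokens
    if (layer_idx + 1) % global_every == 0:
        return causal                    # global layer: full causal attention
    return causal & (i - j < window)     # sliding-window layer: local only

# A sliding-window layer far into the sequence cannot see token 0,
# while a global layer can.
sw = attention_mask(4096, layer_idx=0)
gl = attention_mask(4096, layer_idx=5)
print(sw[2000, 0], gl[2000, 0])  # → False True
```

Mixing cheap local layers with occasional global layers is a common way to keep long-context attention cost manageable; the exact interleaving here is a sketch.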

At a glance

Modalities: text, image, and video input; text output
Context window: 256,000 tokens
Pricing (input / output per 1M tokens): /
Reasoning: Enabled

Capabilities

Streaming: real-time token-by-token response streaming
Function calling: connect the model to external tools and systems
Structured outputs: return responses in JSON schema format
Fine-tuning: custom model training on your data
Reasoning: extended thinking before responding
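The function-calling and streaming capabilities above can be sketched as a request payload. The OpenAI-style message/tool shape is an assumption about the serving API; only the model ID `gemma-4-31b` comes from this card, and the `get_weather` tool is hypothetical.

```python
import json

# Hypothetical chat-completion request exercising streaming and
# function calling. The payload shape follows the common OpenAI-style
# convention as an assumption; only the model ID is from the card.
request = {
    "model": "gemma-4-31b",
    "stream": True,  # token-by-token streaming, per the capability above
    "messages": [
        {"role": "user", "content": "What's the weather in Zurich?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool, for illustration
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

print(json.dumps(request, indent=2))
```

With a schema-constrained variant of the same request, the structured-outputs capability would replace `tools` with a response JSON schema; the model then emits either a tool call or schema-conforming JSON rather than free text.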

Benchmarks

MMLU-Pro: 85.2
AIME 2026: 89.2
LiveCodeBench v6: 80.0
Codeforces Elo: 2150
GPQA Diamond: 84.3
Tau2-bench average: 76.9
HLE (no tools): 19.5
HLE (with search): 26.5
BIG-Bench Extra Hard: 74.4
MMMLU: 88.4
MMMU-Pro: 76.9
OmniDocBench 1.5: 0.131
MATH-Vision: 85.6
MedXpertQA-MM: 61.3
Long-context MRCR v2: 66.4

Details

Release date: 2026-05-01
Model ID: gemma-4-31b
Provider: Google DeepMind