30.7B-parameter dense multimodal model from Google DeepMind with 256K token context. Handles text, image, and video input with text output. Features hybrid attention (sliding window + global), configurable thinking mode, native function calling, and multilingual support in 140+ languages. Optimized for coding, reasoning, and agentic workflows.
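The hybrid attention design interleaves two mask patterns across layers: sliding-window layers attend only to recent tokens, while global layers attend to the full causal prefix. A minimal sketch of the two patterns (the window size here is illustrative, not the model's actual configuration):

```python
def causal_mask(n):
    # Global causal attention: token i attends to every token j <= i.
    return [[j <= i for j in range(n)] for i in range(n)]

def sliding_window_mask(n, window):
    # Local causal attention: token i attends only to the most recent
    # `window` tokens (including itself).
    return [[j <= i and j > i - window for j in range(n)] for i in range(n)]

# A hybrid stack alternates layers using these two mask types, which keeps
# most layers' attention cost linear in sequence length while the global
# layers preserve long-range access across the 256K context.
sw = sliding_window_mask(6, window=3)
full = causal_mask(6)
```

Here `sw[5]` allows only positions 3–5, while `full[5]` allows all six positions.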
| Spec | Value |
|---|---|
| Modalities | Text, image, and video input; text output |
| Context window | 256,000 tokens |
| Pricing | Input / output per 1M tokens |
- Streaming: real-time token-by-token response streaming
- Function calling: connect the model to external tools and systems
- Structured outputs: return responses conforming to a JSON Schema
- Fine-tuning: custom model training on your data
- Reasoning: extended thinking before responding
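For function calling, tools are typically described to the model with JSON-Schema-style parameter definitions, and the model replies with a tool name plus JSON arguments. The exact request wrapper varies by SDK, so the tool shape and helper below are illustrative (`get_weather` is a hypothetical tool, not part of any real API):

```python
import json

# Hypothetical tool definition in the JSON-Schema style most chat APIs accept.
get_weather_tool = {
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}

def parse_tool_call(raw_args, tool):
    # Parse the model-emitted argument string and check required fields
    # before dispatching to the real function.
    args = json.loads(raw_args)
    missing = [k for k in tool["parameters"]["required"] if k not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    return args

args = parse_tool_call('{"city": "Oslo", "unit": "celsius"}', get_weather_tool)
```

Validating arguments before dispatch keeps a malformed tool call from reaching the underlying system.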
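Structured outputs work the same way from the caller's side: you supply a response schema with the request and the model is constrained to emit conforming JSON. How the schema is attached differs per SDK, so this sketch only shows a plausible schema shape and a local conformance check over a reply (`review_schema` and `conforms` are illustrative, not a library API):

```python
import json

# Illustrative response schema for a structured-output request.
review_schema = {
    "type": "object",
    "properties": {
        "sentiment": {"type": "string"},
        "score": {"type": "number"},
    },
    "required": ["sentiment", "score"],
}

def conforms(reply, schema):
    # Lightweight check: reply parses as JSON and every required key
    # is present with the expected primitive type.
    type_map = {"string": str, "number": (int, float)}
    try:
        obj = json.loads(reply)
    except json.JSONDecodeError:
        return False
    for key in schema["required"]:
        if key not in obj:
            return False
        expected = type_map[schema["properties"][key]["type"]]
        if not isinstance(obj[key], expected):
            return False
    return True

ok = conforms('{"sentiment": "positive", "score": 0.92}', review_schema)
```

A production system would use a full JSON Schema validator rather than this minimal key/type check.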
| Benchmark | Score |
|---|---|
| MMLU-Pro | 85.2 |
| AIME 2026 | 89.2 |
| LiveCodeBench v6 | 80.0 |
| Codeforces Elo | 2150 |
| GPQA Diamond | 84.3 |
| Tau2 (avg) | 76.9 |
| HLE (no tools) | 19.5 |
| HLE (with search) | 26.5 |
| BIG-Bench Extra Hard | 74.4 |
| MMMLU | 88.4 |
| MMMU-Pro | 76.9 |
| OmniDocBench 1.5 | 0.131 |
| MathVision | 85.6 |
| MedXpertQA MM | 61.3 |
| Long-context MRCR v2 | 66.4 |
| Detail | Value |
|---|---|
| Release date | 2026-05-01 |
| Model ID | gemma-4-31b |
| Provider | Google DeepMind |