8B-parameter end-to-end unified multimodal model based on NEO-Unify architecture. Native unified text and image understanding + generation without separate visual encoder or VAE. Supports visual Q&A, text-to-image generation, image editing, and interleaved text-image generation. Context length up to 32K tokens. Available in both 8B dense MoT and A3B MoE variants. Optimized for infographic generation and complex visual reasoning.
Modalities
Context window
32,000
Pricing
input / output per 1M
Reasoning
Streaming
Real-time token-by-token response streaming
Function calling
Connect the model to external tools and systems
Structured outputs
Return responses in JSON schema format
Image generation
Generate images from text descriptions
| Release date | 2026-05-01 |
| Model ID | sensenova-u1-8b-mot |
| Provider | SenseNova |