GPT-4.1 excels at instruction following and tool calling, with broad knowledge across domains. It offers a 1M-token context window and low latency, with no reasoning step.
Modalities
Context window
1,047,576 tokens
Pricing
Input / output price per 1M tokens
Reasoning
Streaming
Real-time token-by-token response streaming
Function calling
Connect the model to external tools and systems
Structured outputs
Return responses in JSON schema format
Fine-tuning
Custom model training on your data
Web search
Search the internet for real-time information
File search
Search and retrieve from uploaded files
Code execution
Execute code in a sandboxed environment
Image generation
Generate images from text descriptions
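For the function calling capability listed above, a tool is typically described to the model as a JSON schema, and the application runs the matching function when the model asks for it. The sketch below is a minimal illustration under stated assumptions: the dictionary layout follows the shape commonly used for the `tools` parameter of OpenAI's Chat Completions API, the `get_weather` tool and its stub implementation are hypothetical, and no network request is made. The model's tool call is simulated as a name plus a JSON argument string.

```python
import json

# Hypothetical tool definition; the name, description, and parameters are
# illustrative and not taken from the model card.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current temperature for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
            "additionalProperties": False,
        },
    },
}


def get_weather(city: str) -> dict:
    """Stand-in implementation; a real tool would query a weather service."""
    return {"city": city, "temperature_c": 21}


# Maps tool names to the local callables that implement them.
TOOL_REGISTRY = {"get_weather": get_weather}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the named tool with JSON-encoded arguments and return a JSON result."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    arguments = json.loads(arguments_json)
    result = TOOL_REGISTRY[name](**arguments)
    return json.dumps(result)


# Simulated tool call, shaped like the name/arguments pair a model might return.
print(dispatch_tool_call("get_weather", '{"city": "Oslo"}'))
```

In a real integration the returned JSON string would be sent back to the model as the tool result, so that it can compose the final answer.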
| Property | Value |
|---|---|
| Release date | 2025-04-14 |
| Knowledge cutoff | 2024-06-01 |
| Model ID | gpt-4.1 |
| Provider | OpenAI |

| Tier | Requests per minute (RPM) | Tokens per minute (TPM) | Batch queue limit |
|---|---|---|---|
| Free Tier | 3 | 40,000 | — |
| Tier 1 | 500 | 200,000 | 2,000,000 |
| Tier 2 | 5,000 | 2,000,000 | 20,000,000 |
| Tier 3 | 5,000 | 4,000,000 | 40,000,000 |
| Tier 4 | 10,000 | 10,000,000 | 1,000,000,000 |
| Tier 5 | 30,000 | 150,000,000 | 15,000,000,000 |
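The per-tier limits above can be turned into a client-side pacing estimate. The sketch below copies the requests-per-minute and tokens-per-minute values from the table and computes the shortest uniform gap between requests that stays under both limits. The function name, the tier keys, and the assumption that every request uses the same number of tokens are illustrative; how the provider actually enforces limits (for example over shorter windows) is not described in this card.

```python
# Limits copied from the tier table: (requests per minute, tokens per minute).
TIER_LIMITS = {
    "free": (3, 40_000),
    "tier1": (500, 200_000),
    "tier2": (5_000, 2_000_000),
    "tier3": (5_000, 4_000_000),
    "tier4": (10_000, 10_000_000),
    "tier5": (30_000, 150_000_000),
}


def min_seconds_between_requests(tier: str, tokens_per_request: int) -> float:
    """Shortest uniform delay that respects both the RPM and TPM limits.

    Assumes every request consumes `tokens_per_request` tokens (a simplification).
    """
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    rpm, tpm = TIER_LIMITS[tier]
    if tokens_per_request > tpm:
        raise ValueError("a single request exceeds the per-minute token limit")
    # Requests per minute allowed by the token budget alone.
    requests_by_tokens = tpm // tokens_per_request
    allowed_per_minute = min(rpm, requests_by_tokens)
    return 60.0 / allowed_per_minute


# Free tier is bound by its 3 RPM limit; Tier 1 at 1,000 tokens per request
# is bound by its 200,000 TPM limit (200 requests per minute).
print(min_seconds_between_requests("free", 1_000))   # 20.0
print(min_seconds_between_requests("tier1", 1_000))  # 0.3
```

Which limit binds depends on request size: small requests tend to hit the RPM ceiling first, large ones the TPM ceiling.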