
Mira. Built for speed.

Sub-400ms time to first token, high-volume throughput, and pricing that works at scale. Mira is the model for the moments where latency matters more than depth.

Join waitlist · API reference →

128K context · 4K max output · Text only · 26+ languages · <400ms first token · ₹29/1M input · ₹182/1M output

Capabilities

Mira is built for scale.

Real-time chat

Sub-400ms to first token means Mira feels instant. Build chatbots, support agents, and conversational interfaces where any perceptible lag breaks the experience.

High-volume classification

Tagging, routing, moderation, sentiment, intent: Mira handles the small decisions that happen millions of times a day. At ₹29 per million input tokens, the economics hold up at high volume.

Autocomplete and suggestion

Fast enough to live inside an editor or search box, Mira powers the inline AI experiences where response time is the product.

Pricing

Priced to run at scale.

Input

₹29 per 1M tokens

For prompts at real-time-chat volume. $0.30 per 1M in USD.

Output

₹182 per 1M tokens

For generated responses, at any volume. $1.90 per 1M in USD.

Per-token pricing that holds up at high volume, billed in your currency.
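
To see how the per-token rates add up, here is a minimal Python sketch of a cost estimate. It uses only the two rates listed on this page. The function name, the workload figures (one million classification calls a day at 200 input and 5 output tokens each), and the assumption that charges are strictly linear with no minimums, rounding rules, or taxes are all illustrative, not confirmed billing behavior.

```python
from decimal import Decimal

# Rates as listed on this page, in INR per 1M tokens.
INPUT_RATE_INR = Decimal("29")
OUTPUT_RATE_INR = Decimal("182")
TOKENS_PER_UNIT = Decimal("1000000")


def estimate_cost_inr(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the charge in INR, assuming purely linear per-token pricing.

    Assumption: no minimum charge, rounding rule, or tax is applied.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_RATE_INR
        + Decimal(output_tokens) * OUTPUT_RATE_INR
    )
    return total / TOKENS_PER_UNIT


# Hypothetical workload: 1M classification calls per day,
# each with 200 input tokens and 5 output tokens.
calls_per_day = 1_000_000
daily = estimate_cost_inr(calls_per_day * 200, calls_per_day * 5)
print(f"Estimated daily cost: ₹{daily:,.2f}")
```

Under these assumptions, input comes to 200M tokens × ₹29/1M = ₹5,800 and output to 5M tokens × ₹182/1M = ₹910, for roughly ₹6,710 a day. Because output tokens cost about six times as much as input tokens, short outputs such as labels or scores keep a classification workload cheap.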

Choose well

When to pick Mira.

Pick Mira when speed and cost are the product: real-time chat, classification, autocomplete, anything that happens thousands of times per minute.

Pick Vaani when you need reasoning quality or vision, and the workload isn't latency-critical. Most interactive apps fit Vaani, not Mira.

Pick Kavi when the task is rare, hard, and accuracy-critical.

Built for volume.

When latency matters, Mira answers first.
