Google has introduced the Gemini 3.1 Flash-Lite AI model, describing it as the fastest and most cost-efficient model in the Gemini 3 series. The Mountain View-based tech giant said the model is designed for high-volume developer workloads. It is currently limited to developers and enterprises and is not available to general users. The company also said the new model delivers faster output speed than the Gemini 2.5 series.
According to Google, the model is available in preview through the Gemini API in Google AI Studio and through Vertex AI for enterprise users. The company claims that Gemini 3.1 Flash-Lite delivers a 2.5x faster "Time to First Answer Token" and a 45 percent improvement in output speed compared with Gemini 2.5 Flash. It reportedly achieved an Elo score of 1432 on the Arena.ai leaderboard and is said to outperform models such as GPT‑5 mini, Claude 4.5 Haiku, and Grok 4.1 Fast in output speed.
The model supports both standard and thinking modes, letting developers control how much reasoning time a task receives. Google said it can handle high-volume translation, content moderation, and more complex tasks such as generating user interfaces, dashboards, and simulations. The company also highlighted its affordability: input tokens are priced at $0.25 per million and output tokens at $1.50 per million, making it more cost-efficient than Gemini 2.5 Flash.
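To put the published per-token rates in context, the cost of a request can be estimated directly from its token counts. The sketch below uses only the prices stated above; the request and workload sizes are hypothetical, chosen to resemble a high-volume moderation job.

```python
# Estimating request cost from the published preview pricing:
# $0.25 per million input tokens, $1.50 per million output tokens.
INPUT_PRICE_PER_M = 0.25   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 1.50  # USD per 1M output tokens

def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single request at the preview rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Hypothetical workload: 10,000 moderation requests, each with
# ~2,000 input tokens and ~300 output tokens.
per_request = request_cost_usd(2_000, 300)
print(f"per request: ${per_request:.6f}")                # $0.000950
print(f"per 10k requests: ${per_request * 10_000:.2f}")  # $9.50
```

At these rates, output tokens dominate the bill only when responses are long relative to prompts, which is why short-output workloads like classification and moderation stay cheap at volume.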