The Gemini API additionally now offers extra granular management over multimodal imaginative and prescient processing, with a media_resolution parameter for configuring what number of tokens are used for picture, video, and doc inputs. Builders can stability visible constancy with token utilization. Decision may be set utilizing media_resolution_low, media_resolution_medium, or media_resolution_high. Increased decision boosts the mannequin’s capacity to learn positive textual content or establish small particulars, Google mentioned.
Beginning with Gemini 3, Gemini API additionally brings again thought signatures to enhance perform calling and picture era. Thought signatures are encrypted representations of the mannequin’s inner thought course of. By passing these signatures again to the mannequin in subsequent API calls, builders can be certain that Gemini 3 maintains its chain of reasoning throughout a dialog. That is essential for complicated, multi-step agentic workflows the place preserving the “why” behind a call is as essential as the choice itself, Google mentioned.
Moreover, builders now can mix structured outputs with Gemini-hosted instruments, particularly Grounding with Google Search and URL Context. Combining structured outputs is very highly effective for constructing brokers that should fetch dwell info from the net or particular net pages and extract the information right into a JSON format for downstream duties, Google mentioned, noting that it has up to date the pricing of Grounding with Google Search to higher help agentic workflows. The pricing mannequin adjustments from a flat fee of $35 per 1K prompts to a usage-based fee of $14 per 1,000 search queries.
