Google released a research paper describing SAGE, an agentic AI system whose name stands for Steerable Agentic Data Generation for Deep Search with Execution Feedback. The paper explains how Google generates training data for deep research agents, with a focus on improving multi-step reasoning during complex search tasks.
Limits of Existing Research Datasets
The research identifies problems in commonly used question-answer datasets, including MuSiQue, HotpotQA, and Natural Questions. Most questions in these datasets require only a few reasoning steps, and many can be solved with limited searching. This makes them less effective for training advanced research agents.
Researchers observed that AI systems often completed tasks too easily. In other cases, questions were unanswerable because the required information was unavailable. Both outcomes reduced training value.
How the SAGE System Works
SAGE uses two AI agents. One agent generates questions. The second agent attempts to answer them using search. The system measures how many steps are needed to complete each task. If a question is too simple or unsolvable, feedback is returned to the generator. The process continues until the question requires deeper reasoning. This approach produces more complex training data.
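The loop described above can be sketched in a few lines of Python. This is a minimal illustration of the generate-solve-feedback pattern, not the paper's actual implementation: the class names, the difficulty model, and the `min_steps` threshold are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Attempt:
    solved: bool
    steps: int  # number of search/reasoning steps the solver used

class StubGenerator:
    """Toy question generator: each revision makes the question harder.
    (Hypothetical stand-in for the paper's generator agent.)"""
    def __init__(self):
        self.difficulty = 1
    def generate(self):
        return f"question(difficulty={self.difficulty})"
    def revise(self, question, feedback):
        self.difficulty += 1  # pretend the feedback adds a reasoning hop
        return f"question(difficulty={self.difficulty})"

class StubSolver:
    """Toy solver agent: needs one step per difficulty level."""
    def attempt(self, question):
        difficulty = int(question.split("=")[1].rstrip(")"))
        return Attempt(solved=difficulty <= 6, steps=difficulty)

def refine_question(generator, solver, min_steps=4, max_rounds=10):
    """Iteratively harden a question until the solver needs at least
    min_steps search/reasoning steps, returning feedback otherwise."""
    question = generator.generate()
    for _ in range(max_rounds):
        result = solver.attempt(question)
        if result.solved and result.steps >= min_steps:
            return question, result  # hard enough: keep as training data
        if not result.solved:
            feedback = "unsolvable: required information unavailable"
        else:
            feedback = f"too easy: solved in {result.steps} steps"
        question = generator.revise(question, feedback)
    return None, None  # never reached target difficulty: discard

question, result = refine_question(StubGenerator(), StubSolver())
print(question, result.steps)  # → question(difficulty=4) 4
```

The key design idea is that both failure modes the paper describes, too easy and unsolvable, feed back into the generator, so difficulty climbs only as far as the solver can still verify an answer.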
Search Patterns Identified in the Study
The paper highlights several shortcuts that reduced research depth. One example is information co-location, where all facts appear on one page. Another is multi-query collapse, where a single search retrieves multiple answers. Other cases involved surface-level complexity or overly narrow questions.
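Two of these shortcuts lend themselves to simple heuristic checks. The sketch below is an assumption about how such patterns could be flagged from a solver's trace; the function name and inputs are illustrative, not from the paper.

```python
def shortcut_flags(evidence_pages, queries_issued, answers_found):
    """Heuristically flag shortcut patterns in a solver trace.

    evidence_pages: URLs of pages that supplied supporting facts
    queries_issued: search queries the solver ran
    answers_found:  sub-answers the solver extracted
    (All inputs are hypothetical trace fields for illustration.)
    """
    flags = []
    # Information co-location: every supporting fact came from one page.
    if len(set(evidence_pages)) == 1:
        flags.append("information co-location")
    # Multi-query collapse: fewer searches than sub-answers implies
    # a single query retrieved multiple answers at once.
    if len(queries_issued) < len(answers_found):
        flags.append("multi-query collapse")
    return flags

trace = shortcut_flags(
    evidence_pages=["example.com/a", "example.com/a"],
    queries_issued=["single combined query"],
    answers_found=["fact 1", "fact 2"],
)
print(trace)  # → ['information co-location', 'multi-query collapse']
```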
Search Ranking Observations
During testing, agents frequently used information from the top three search results. This detail shows how ranking position influenced data retrieval during training. The researchers note that the experiment does not fully represent live search systems. Real-world behavior may differ.
Source: https://www.searchenginejournal.com/googles-sage-agentic-ai-research-what-it-means-for-seo/566215/
