📑 arXiv 3d ago
IG-Search: Step-Level Information Gain Rewards for Search-Augmented Reasoning
IG-Search introduces a step-level information gain reward for search-augmented reasoning: each search step is scored by how much its retrieved documents raise the model's confidence in the answer relative to random baselines. Because the reward is assigned per step, it still provides a learning signal when every sampled trajectory fails, which is the case where trajectory-level RL suffers gradient collapse, and it can distinguish precise queries from vague ones within a rollout group.
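The idea can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the choice of log-likelihood as the confidence measure, the use of random documents as the baseline, and the GRPO-style group normalization are all assumptions made for illustration, and the numbers are invented. It shows that identical outcome rewards give zero advantages for the whole group, while a hypothetical per-step information gain still ranks the queries.

```python
from statistics import mean, pstdev


def information_gain(logp_with_retrieved: float, logp_with_random: list[float]) -> float:
    """Hypothetical step reward: answer log-likelihood given the retrieved
    documents minus the mean log-likelihood given random documents."""
    return logp_with_retrieved - mean(logp_with_random)


def group_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Group-normalized advantages (assumed GRPO-style): (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four rollouts that all end in a wrong answer: the outcome rewards are
# identical, so every advantage is zero and no gradient signal remains.
outcome_rewards = [0.0, 0.0, 0.0, 0.0]
outcome_adv = group_advantages(outcome_rewards)

# Illustrative (made-up) log-likelihoods of the answer for one search step in
# each rollout, compared against the same random-document baseline.
random_baseline = [-3.0, -3.0]
step_rewards = [
    information_gain(-1.0, random_baseline),  # precise query, gain 2.0
    information_gain(-2.5, random_baseline),  # vaguer query, gain 0.5
    information_gain(-3.0, random_baseline),  # no better than random, gain 0.0
    information_gain(-3.5, random_baseline),  # misleading documents, gain -0.5
]
step_adv = group_advantages(step_rewards)

print("outcome advantages:", outcome_adv)
print("step rewards:      ", step_rewards)
print("step advantages:   ", step_adv)
```

In this toy setup the step-level advantages are ordered from the most precise query to the least, even though the trajectory-level signal has vanished.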