Reinforcement-learning 2 items


📑 arXiv 3d ago

IG-Search: Step-Level Information Gain Rewards for Search-Augmented Reasoning

IG-Search introduces step-level information gain rewards for search-augmented reasoning. Each reward measures how much the documents retrieved at a search step raise the model's confidence in the answer, relative to a baseline of random documents. This addresses the gradient collapse that occurs in trajectory-level RL when every sampled trajectory fails, since identical outcome rewards leave nothing to separate one rollout from another. It also makes it possible to distinguish precise queries from vague ones within a rollout group.
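
To make the idea concrete, here is a minimal sketch, not the paper's implementation. It assumes that "confidence" is the log-probability the model assigns to the answer, that the random baseline is the mean of that log-probability over a few random document sets, and that rollouts are compared with a mean/standard-deviation group normalization. The names `information_gain` and `group_advantages`, and all numbers, are hypothetical.

```python
import math
from statistics import mean, pstdev


def information_gain(logp_retrieved, logp_random):
    """Assumed step-level reward: answer log-probability given the retrieved
    documents, minus the mean log-probability given random documents."""
    return logp_retrieved - mean(logp_random)


def group_advantages(rewards, eps=1e-6):
    """Assumed group-relative advantage: normalize rewards within one rollout group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Trajectory-level outcome rewards when every rollout fails:
# all advantages are zero, so there is no learning signal.
failed_group = [0.0, 0.0, 0.0, 0.0]
outcome_adv = group_advantages(failed_group)

# Step-level rewards for the same failed group (made-up log-probabilities).
random_baseline = [-2.0, -2.2, -1.8]
step_rewards = [
    information_gain(-0.5, random_baseline),  # precise query, helpful documents
    information_gain(-1.9, random_baseline),  # vague query, barely better than random
    information_gain(-2.0, random_baseline),  # no better than random
    information_gain(-2.4, random_baseline),  # worse than random
]
step_adv = group_advantages(step_rewards)

print(outcome_adv)
print(step_adv)
```

Under these assumptions, the outcome-only advantages are all zero, while the step-level rewards still rank the precise query above the vague one even though no rollout reached a correct answer.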