Route to Rome Attack (R²A) exploits LLM routers by optimizing adversarial suffixes that force the selection of expensive models, inflating inference costs. Uses a hybrid ensemble of surrogate routers to mimic the black-box routing logic, demonstrating a new attack surface in cost-aware inference systems.
TRACER trains lightweight ML surrogates on LLM production traces to route classification traffic, activating them only when agreement with the base LLM exceeds a user-specified threshold. This approach converts logged inference data into a continuously growing training set that handles routine traffic at near-zero marginal cost while deferring edge cases to the full model.
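The agreement gate described for TRACER can be illustrated with a minimal Python sketch. This is not TRACER's actual code or API: the class name `AgreementGatedRouter`, the `min_traces` warm-up parameter, and the toy keyword classifiers are all hypothetical, and the surrogate is a fixed callable rather than a model retrained on the growing trace log. For brevity the sketch gates the surrogate globally and measures agreement on the same logged traces; per-input deferral of edge cases and a held-out agreement estimate are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Label = str


@dataclass
class AgreementGatedRouter:
    """Hypothetical sketch: serve from a surrogate only once it agrees with the LLM."""

    llm: Callable[[str], Label]          # expensive base model (assumed interface)
    surrogate: Callable[[str], Label]    # cheap classifier (assumed interface)
    threshold: float = 0.95              # user-specified agreement threshold
    min_traces: int = 20                 # assumed warm-up size, not from the source
    traces: List[Tuple[str, Label]] = field(default_factory=list)

    def agreement(self) -> float:
        """Fraction of logged LLM labels that the surrogate reproduces."""
        if not self.traces:
            return 0.0
        hits = sum(self.surrogate(text) == label for text, label in self.traces)
        return hits / len(self.traces)

    def surrogate_active(self) -> bool:
        """The surrogate is activated only when agreement exceeds the threshold."""
        return len(self.traces) >= self.min_traces and self.agreement() > self.threshold

    def classify(self, text: str) -> Tuple[Label, str]:
        """Return (label, source), where source is 'surrogate' or 'llm'."""
        if self.surrogate_active():
            return self.surrogate(text), "surrogate"
        label = self.llm(text)
        self.traces.append((text, label))  # every LLM call grows the trace log
        return label, "llm"


# Toy stand-ins for the two models; both are keyword rules for illustration only.
def toy_llm(text: str) -> Label:
    return "refund" if "refund" in text else "other"


def toy_surrogate(text: str) -> Label:
    return "refund" if "refund" in text else "other"


router = AgreementGatedRouter(toy_llm, toy_surrogate, threshold=0.9, min_traces=3)
for query in ["refund please", "hello", "refund now", "hi again"]:
    print(query, "->", router.classify(query))
```

In this toy run the first three queries go to the LLM and are logged; once the warm-up size is reached and agreement exceeds the threshold, the fourth query is answered by the surrogate.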
Source-available AI gateway from 35m.ai supporting unified access to text, image, video, audio, and music generation APIs with intelligent multi-provider routing and hybrid BYOK (bring-your-own-key) workflows. Optimizes compute utilization across heterogeneous provider backends.