📑 arXiv 2d ago
AST: Adaptive, Seamless, and Training-Free Precise Speech Editing
AST is a training-free speech editing framework that uses pre-trained autoregressive TTS models with Latent Recomposition to edit speech segments precisely while preserving speaker identity and acoustic context. It removes the trade-off between editing quality and consistency by selectively stitching preserved and synthesized segments, with no task-specific training.
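
As a rough illustration of what stitching preserved and synthesized segments could look like, the sketch below splices a newly generated run of latent frames into an untouched original sequence. Everything here is an assumption made for illustration: the `recompose` helper, the frame representation, and the span indices are hypothetical and are not taken from the AST paper, which may select and blend segments quite differently.

```python
from typing import List, Sequence

# Hypothetical stand-in for one latent frame produced by a TTS model.
Frame = List[float]


def recompose(
    original: Sequence[Frame],
    synthesized: Sequence[Frame],
    start: int,
    end: int,
) -> List[Frame]:
    """Replace original[start:end] with synthesized frames.

    Frames outside the edit span are copied through unchanged, which is
    the sense in which the surrounding context is "preserved" here.
    """
    if not 0 <= start <= end <= len(original):
        raise ValueError("edit span must lie inside the original sequence")
    return list(original[:start]) + list(synthesized) + list(original[end:])


# Toy data: five original frames, with frames 1 and 2 replaced by three new ones.
original = [[0.0], [1.0], [2.0], [3.0], [4.0]]
synthesized = [[9.0], [9.5], [9.9]]
edited = recompose(original, synthesized, start=1, end=3)
```

The edited sequence may be longer or shorter than the original, since the replacement text need not match the duration of the span it replaces.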