🍡 feedmeAI
Multimodal 5 items

Everything Multimodal

💬 Reddit Apr 22

Qwen3 TTS is seriously underrated - I got it running locally in real-time and it's one of the most expressive open TTS models I've tried

The author runs Qwen3 TTS locally in real time with notably expressive output and has integrated it into the open-source Persona Engine project (an ASR→LLM→TTS pipeline with a lip-synced avatar). The post positions it as a meaningful step up from prior local TTS options such as Sesame for latency-sensitive, fully offline deployments.

💬 Reddit Apr 16

Qwen3.6-35B-A3B released!

Qwen3.6-35B-A3B is a sparse MoE model with 35B total and only 3B active parameters, released under Apache 2.0. The release claims agentic coding performance on par with models 10× its active size. It is multimodal and supports both thinking and non-thinking modes. The small active-parameter footprint makes inference practical on constrained hardware.

🔶 Anthropic Apr 16
⭐ Editor's Pick

Introducing Claude Opus 4.7

Anthropic's official Claude Opus 4.7 GA post confirms the same pricing as 4.6, image resolution raised to a 2,576px long edge (~3.75 MP, 3× the prior limit), and a new xhigh effort tier. Reported benchmarks: +13% task resolution on an internal 93-task coding harness, 70% on CursorBench (vs. 58%), and 98.5% on XBOW visual-acuity (vs. 54.5%). It is the first model shipped with real-time cyber safeguards derived from the restricted Mythos Preview testbed.

Ⓜ️ Meta AI Apr 8

Meta Muse Spark: first model from Meta Superintelligence Labs, proprietary pivot from Llama

Muse Spark, the first model from Meta Superintelligence Labs, is a small, fast proprietary model with native multimodal perception and parallel multi-agent (subagent) execution. It marks a sharp departure from Meta's open-source Llama strategy. The lab is led by Alexandr Wang. Muse Spark powers the revamped Meta AI app with Instant and Thinking modes and is rolling out across WhatsApp, Instagram, Facebook, Messenger, and Ray-Ban glasses. API access is restricted to select partners.