SRMU introduces relevance-gated updates for Vector Symbolic Architectures to prevent stale information from accumulating in streaming sequential associative memories. Traditional additive updates keep reinforcing old observations even when no new information arrives, causing failures in non-stationary environments; this work targets the imbalanced sampling and temporal dynamics of real-world incremental learning.
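The gating idea can be made concrete with a minimal sketch, assuming an HRR-style circular-convolution binding and a cosine-based novelty gate; the function names and the specific gating rule are illustrative, not taken from the paper:

```python
import numpy as np

def bind(a, b):
    """HRR-style binding: circular convolution via FFT."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def unbind(m, k):
    """Approximate inverse of bind: circular correlation with the key."""
    return np.real(np.fft.ifft(np.fft.fft(m) * np.conj(np.fft.fft(k))))

def cosine(a, b):
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    return 0.0 if na == 0 or nb == 0 else float(a @ b) / (na * nb)

def relevance_gated_update(memory, key, value, lr=0.5):
    """Scale the additive update by how much *new* information the pair carries.

    A plain additive memory would do `memory + bind(key, value)` on every
    observation; here the step size shrinks toward zero once the pair is
    already stored, so repeated stale observations stop reinforcing it.
    (Illustrative gate -- the paper's relevance measure may differ.)
    """
    probe = unbind(memory, key)                # what memory already holds for this key
    novelty = 1.0 - abs(cosine(probe, value))  # 1 = unseen pair, 0 = fully known
    return memory + lr * novelty * bind(key, value)
```

Presenting the same key-value pair twice produces a smaller second step, which is exactly the behavior a plain additive superposition lacks.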
Three-Phase Transformer (3PT) partitions hidden states into cyclic channels maintained by phase-respecting operations, including per-channel normalization and 2D Givens rotations applied between attention and FFN layers. The result is a self-stabilizing architecture with a DC subspace for absolute position encoding, orthogonal to RoPE, acting as a structural prior rather than an added module.
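The Givens-rotation piece of those phase-respecting operations is easy to illustrate; a minimal sketch that rotates each adjacent channel pair by a shared angle (the channel pairing and fixed angle are assumptions for illustration, not details from the paper):

```python
import numpy as np

def givens_rotate_pairs(x, theta):
    """Apply the same 2D Givens rotation to every (even, odd) channel pair.

    Each rotation is orthogonal, so per-token norms are preserved exactly --
    the kind of norm-preserving, phase-respecting map that can be interleaved
    between attention and FFN layers without destabilizing activations.
    """
    x = np.asarray(x, dtype=float)
    c, s = np.cos(theta), np.sin(theta)
    out = np.empty_like(x)
    out[..., 0::2] = c * x[..., 0::2] - s * x[..., 1::2]  # rotated even channels
    out[..., 1::2] = s * x[..., 0::2] + c * x[..., 1::2]  # rotated odd channels
    return out
```

Because the map is orthogonal, rotating by `-theta` inverts it exactly, which is what makes such operations self-stabilizing across depth.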
Leaked Claude Code source reveals a three-layer memory architecture (including file-read deduplication and structured session memory), dedicated repository-navigation tools (Grep, Glob, LSP) rather than reliance on model context alone, and forked subagents for parallelized background analysis. It demonstrates that coding-agent performance stems from careful harness engineering around the model, not just model intelligence.
Comprehensive visual reference documenting LLM architectures from GPT-2 through March 2026, including standardized fact sheets, decoder block diagrams, and architectural lineage tracking. Covers recent innovations such as DeepSeek V3's MLA and Qwen3.5's Gated DeltaNet hybrid. Available as a 182-megapixel poster with source data on GitHub, serving as a canonical resource for understanding architectural evolution.