💬 Reddit 6d ago
I scaled a pure Spiking Neural Network (SNN) to 1.088B parameters from scratch. Ran out of budget, but here is what I found [R]
An independent researcher trained a 1.088B-parameter pure Spiking Neural Network for language modeling from random initialization, reaching a loss of 4.4 and 93% activation sparsity at step 27k before exhausting the compute budget. This challenges the conventional wisdom that billion-scale SNNs require ANN-to-SNN conversion to sidestep vanishing gradients, demonstrating that direct spike-domain training is viable at this scale. Cross-lingual ability emerged around step 25k despite no explicit multilingual objective.
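The post doesn't share code, but the standard way to train an SNN directly in the spike domain, and the usual answer to the vanishing-gradient objection, is surrogate gradients: the forward pass uses the hard, non-differentiable spike threshold, while the backward pass substitutes a smooth approximation. A minimal PyTorch sketch of a leaky integrate-and-fire (LIF) layer trained this way follows; the class names, the fast-sigmoid surrogate, and all hyperparameters are illustrative assumptions, not the author's actual setup.

```python
import torch
import torch.nn as nn


class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike forward, smooth surrogate gradient backward."""

    @staticmethod
    def forward(ctx, v, threshold=1.0, beta=10.0):
        ctx.save_for_backward(v)
        ctx.threshold, ctx.beta = threshold, beta
        # Hard threshold: emit a binary spike where membrane potential crosses it.
        return (v >= threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        # Fast-sigmoid surrogate (one common choice among several):
        # d(spike)/dv ~= 1 / (1 + beta * |v - threshold|)^2
        surrogate = 1.0 / (1.0 + ctx.beta * (v - ctx.threshold).abs()) ** 2
        # No gradients for the non-tensor threshold/beta arguments.
        return grad_output * surrogate, None, None


class LIFLayer(nn.Module):
    """Leaky integrate-and-fire layer unrolled over discrete time steps."""

    def __init__(self, in_features, out_features, decay=0.9):
        super().__init__()
        self.fc = nn.Linear(in_features, out_features)
        self.decay = decay  # membrane leak per step (assumed constant)

    def forward(self, x):  # x: (time, batch, in_features)
        mem = torch.zeros(x.size(1), self.fc.out_features, device=x.device)
        spikes = []
        for t in range(x.size(0)):
            mem = self.decay * mem + self.fc(x[t])  # leaky integration
            s = SurrogateSpike.apply(mem)           # binary spike output
            mem = mem - s                           # soft reset on spike
            spikes.append(s)
        return torch.stack(spikes)  # (time, batch, out_features), values in {0, 1}
```

Because the output tensor is binary, an activation-sparsity figure like the reported 93% can be read off directly as `1 - spikes.mean()`; the gradient never vanishes at the threshold because the backward pass only ever sees the smooth surrogate.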