ML Seminar Bojian Yin (Institute of Automation, Chinese Academy of Sciences, Beijing)

Selective-Update RNNs: A New Architecture for Long-Range Sequence Modeling

When
18 Feb 2026, 4 p.m. to 5 p.m. CET (GMT+0100)
Where
CWI, room L016


Abstract: Real-world sequential signals contain critical information that is often embedded within long periods of silence or noise; in standard recurrent modeling, however, it is difficult to preserve these sparse but crucial events without them being overwritten by the continuous stream of incoming data. We therefore asked: how can recurrent models dynamically filter their own activity so that distant, critical events are retained? We designed Selective-Update Recurrent Neural Networks (suRNNs) to model sequences with highly non-uniform information density, using a neuron-level binary switch that opens only for informative events. RNNs equipped with this selective mechanism decouple their recurrent updates from the raw sequence length, keeping their memory of the past exactly unchanged during low-information intervals. The rigid design of standard RNNs suggests that integrating information at every moment is necessary for continuous sequence processing; we demonstrate, however, that this constant activity forces the model to overwrite its own memory and hinders the learning signal from reaching back to distant past events. We validate our architectural findings across multiple modalities, showing that suRNNs match or exceed the accuracy of far more complex models such as Transformers on benchmarks including the Long Range Arena and WikiText. Moreover, by allowing each neuron to learn its own update timescale, our approach resolves the fundamental mismatch between sequence length and information content, establishing an efficient and principled framework for sequence modeling.
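
To make the selective-update idea concrete, the following is a minimal PyTorch sketch of a recurrent cell with a per-neuron binary update gate. It assumes the gate is computed from the current input and previous state and hard-thresholded with a straight-through estimator; all names (SelectiveUpdateCell, gate_proj, etc.) and design choices here are illustrative assumptions, not the speaker's implementation.

import torch
import torch.nn as nn

class SelectiveUpdateCell(nn.Module):
    # Illustrative recurrent cell with a per-neuron binary update gate:
    # a closed gate (0) carries the previous state over unchanged, while an
    # open gate (1) overwrites it with the candidate update. Hypothetical
    # sketch, not the architecture presented in the talk.
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.in_proj = nn.Linear(input_size, hidden_size)
        self.rec_proj = nn.Linear(hidden_size, hidden_size)
        self.gate_proj = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, x_t: torch.Tensor, h_prev: torch.Tensor) -> torch.Tensor:
        # Candidate state from the current input and the previous state.
        h_cand = torch.tanh(self.in_proj(x_t) + self.rec_proj(h_prev))
        # Per-neuron open probability, thresholded to a hard 0/1 switch;
        # the straight-through trick keeps the gate trainable.
        p = torch.sigmoid(self.gate_proj(torch.cat([x_t, h_prev], dim=-1)))
        m = (p > 0.5).float() + p - p.detach()
        # Closed neurons keep their memory exactly; open neurons update.
        return m * h_cand + (1.0 - m) * h_prev

# Example: step through a long, mostly uninformative sequence.
cell = SelectiveUpdateCell(input_size=32, hidden_size=128)
h = torch.zeros(1, 128)
for x_t in torch.randn(1000, 1, 32):
    h = cell(x_t, h)

In such a sketch, neurons whose gate stays closed over a long uninformative stretch carry their hidden state forward bit-for-bit, which is the property the abstract relates to preserving distant events and to decoupling recurrent updates from sequence length.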