Robust Cognitive–Flexible Filtering
under Noisy Innovation Scores

Structure selection that stays reliable — even when the scores are noisy.

Submitted to IEEE Signal Processing Letters, 2026

Slides: beamer-SPL.pdf

Julia source code:

Detect structural mismatch, switch only when evidence is clear, and stabilise under noise, with guarantees.


Belief \(\mathfrak{B}_t\) → Noisy Score \(\hat{\Phi}_t(s)\) → Margin Rule \(\delta>2\bar{\varepsilon}\) → Structure \(s_{t+1}\) → Belief Update \(\mathfrak{B}_{t+1}\)

A hysteresis-based switching rule that provably suppresses spurious transitions under bounded score noise.

What Is the Problem?

Cognitive Flexibility (CF) [Nuchkrua, Boonto & Liu, arXiv:2604.08130] selects, at each step, the latent structure \(s\in\mathcal{S}\) minimising an innovation-based predictive score \(\Phi_t(s):=-\log\ell_{\theta,s}(y_{t+1}|\mathfrak{B}_t,u_t)\). Under exact scores, three guarantees hold: descent in expectation, finite switching, and non-chattering.

In practice, scores are estimated from a particle filter and carry additive perturbations \(\epsilon_t(s)\) satisfying \(\mathbb{E}[\epsilon_t(s)\mid\mathcal{I}_t]=0\) and \(|\epsilon_t(s)|\leq\bar{\varepsilon}\) a.s. When score differences are of the same order as \(\bar{\varepsilon}\), direct score minimisation induces spurious switching — noise-driven transitions that degrade predictive accuracy and inflate computational cost. Whether CF remains stable under corrupted scores was entirely open.
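As a rough illustration of where the perturbation comes from, the sketch below estimates the score as the negative log of a particle-averaged observation likelihood. It is written in Python for brevity and is not the authors' Julia implementation; the function name `innovation_score`, the equally weighted particle set, and the Gaussian observation density are all assumptions made for this example.

```python
import math

def innovation_score(y, predicted_particles, obs_fn, obs_var):
    """Monte Carlo estimate of -log p(y | belief) from equally weighted particles.

    Assumes (for illustration only) y = obs_fn(z) + v with v ~ N(0, obs_var).
    """
    # Gaussian log-likelihood of y under each predicted particle
    log_liks = [
        -0.5 * math.log(2.0 * math.pi * obs_var)
        - 0.5 * (y - obs_fn(z)) ** 2 / obs_var
        for z in predicted_particles
    ]
    # log-mean-exp, shifted by the maximum for numerical stability
    m = max(log_liks)
    log_mean = m + math.log(sum(math.exp(ll - m) for ll in log_liks) / len(log_liks))
    return -log_mean

# Hypothetical usage with a quadratic observation map and a handful of particles
score = innovation_score(0.4, [2.5, 3.0, -2.8, 3.2], lambda z: z * z / 20.0, 1.0)
```

Because the average runs over a finite particle set, repeated evaluations with different particle draws give different scores, which is the kind of error the bound \(\bar{\varepsilon}\) is meant to capture.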


⚙ The Margin-Based Solution

We introduce the margin-based switching rule:

\[ s_{t+1} = \begin{cases} \hat{s}_t, & \hat{\Phi}_t(s_t) - \hat{\Phi}_t(\hat{s}_t) > \delta, \\ s_t, & \text{otherwise,} \end{cases} \quad \hat{s}_t\in\arg\min_{s}\hat{\Phi}_t(s), \]

with \(\delta > 2\bar{\varepsilon}\). This single design choice restores all three stability properties of the noiseless theory without modifying the underlying filter recursion. The rule is inspired by hysteresis switching control [Morse 1996; Hespanha & Morse 1999].
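The rule can be sketched in a few lines. The Python below is a minimal illustration, not the authors' Julia code: the names `margin_switch` and `count_switches`, the time-invariant true scores, and the uniform noise model used in the demo are assumptions chosen to keep the example small.

```python
import random

def margin_switch(current, noisy_scores, delta):
    """Return the next structure: switch to the noisy minimiser only if the
    noisy score improvement over the current structure exceeds delta."""
    best = min(noisy_scores, key=noisy_scores.get)
    if noisy_scores[current] - noisy_scores[best] > delta:
        return best
    return current

def count_switches(true_scores, eps_bar, delta, steps, seed=0):
    """Count switches when each score is perturbed by Uniform(-eps_bar, eps_bar)."""
    rng = random.Random(seed)
    current = min(true_scores, key=true_scores.get)  # start at the true minimiser
    switches = 0
    for _ in range(steps):
        noisy = {s: phi + rng.uniform(-eps_bar, eps_bar)
                 for s, phi in true_scores.items()}
        nxt = margin_switch(current, noisy, delta)
        if nxt != current:
            switches += 1
        current = nxt
    return switches

# Hypothetical, time-invariant exact scores whose gap (0.2) is below eps_bar
true_scores = {"s_nl": 1.0, "s_lin": 1.2}
eps_bar = 0.5
no_margin = count_switches(true_scores, eps_bar, delta=0.0, steps=200)
with_margin = count_switches(true_scores, eps_bar, delta=2.5 * eps_bar, steps=200)
print(no_margin, with_margin)
```

With these hypothetical numbers, the largest noisy gap that can arise when starting from the true minimiser is 0.8, which is below \(\delta=1.25\), so the margin rule cannot switch. The \(\delta=0\) rule switches whenever noise flips the ordering of the two scores.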

Lock-in and switching regimes (let \(\gamma_t=\Phi_t(s_t)-\Phi_t(s^\star)\)):

  • \(\gamma_t > \delta+2\bar{\varepsilon}\): switching guaranteed
  • \(\gamma_t-2\bar{\varepsilon}\leq\delta <\gamma_t+2\bar{\varepsilon}\): switching possible but not guaranteed
  • \(\delta\geq\gamma_t+2\bar{\varepsilon}\): switching impossible (noise-only switch suppressed)
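These thresholds follow from a two-sided bound on the noisy gap. The derivation below is a sketch under two assumptions that the text above does not spell out: \(s^\star\) is a minimiser of the exact score \(\Phi_t\), and \(\hat{\Phi}_t(s)=\Phi_t(s)+\epsilon_t(s)\) with \(|\epsilon_t(s)|\leq\bar{\varepsilon}\). Since \(\hat{s}_t\) minimises the noisy score and \(s^\star\) minimises the exact one,

\[ \hat{\Phi}_t(s_t)-\hat{\Phi}_t(\hat{s}_t) \;\geq\; \hat{\Phi}_t(s_t)-\hat{\Phi}_t(s^\star) \;=\; \gamma_t+\epsilon_t(s_t)-\epsilon_t(s^\star) \;\geq\; \gamma_t-2\bar{\varepsilon}, \]

\[ \hat{\Phi}_t(s_t)-\hat{\Phi}_t(\hat{s}_t) \;=\; \Phi_t(s_t)-\Phi_t(\hat{s}_t)+\epsilon_t(s_t)-\epsilon_t(\hat{s}_t) \;\leq\; \gamma_t+2\bar{\varepsilon}. \]

If \(\gamma_t-2\bar{\varepsilon}>\delta\), the noisy gap exceeds \(\delta\) for every noise realisation, so a switch occurs. If \(\gamma_t+2\bar{\varepsilon}\leq\delta\), the noisy gap can never exceed \(\delta\), so no switch occurs. In between, the outcome depends on the realised noise.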

📐 Main Theoretical Guarantees

  • Theorem 1 (Descent in Expectation): \(\mathbb{E}[\Phi_t(s_{t+1})\mid\mathcal{I}_t] \leq\mathbb{E}[\Phi_t(s_t)\mid\mathcal{I}_t]\) for all \(t\geq 0\) — the expected predictive score never increases after a switch, regardless of noise realisations.
  • Theorem 2 (Finite Expected Switching): \[ \mathbb{E}[N_T]\leq\frac{1}{\delta-2\bar{\varepsilon}} \sum_{t=0}^{T-1} \mathbb{E}\!\left[\Phi_t(s_t)-\min_{s\in\mathcal{S}} \Phi_t(s)\right] \] — total switches scale as \((\delta-2\bar{\varepsilon})^{-1}\); finite whenever score suboptimality is summable.
  • Theorem 3 (Non-Chattering): Once \(\limsup_{t\to\infty}[\Phi_t(s_t)-\min_s\Phi_t(s)] <\delta-2\bar{\varepsilon}\) a.s., no further switching occurs a.s. — CF stabilises after finitely many transitions.

Setting \(\delta=\alpha\bar{\varepsilon}\) with \(\alpha\in(2,4]\) provides a practical operating range: \(\alpha\) close to 2 maximises responsiveness to genuine structural change, while larger \(\alpha\) provides greater noise immunity. The experiments use \(\alpha=2.5\); \(W_0=50\) steps is recommended for estimating \(\bar{\varepsilon}\) in practice.
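To make the scaling in Theorem 2 concrete, the Python sketch below sets the margin from \(\alpha\) and evaluates the right-hand side of the bound for a given sequence of expected suboptimality gaps. The helper names `margin_from_alpha` and `switching_bound` and the gap values are hypothetical; in particular, the gaps are illustrative numbers rather than quantities from the experiments.

```python
def margin_from_alpha(alpha, eps_bar):
    """Margin delta = alpha * eps_bar; the theory requires alpha > 2
    (the text suggests alpha in (2, 4] as a practical range)."""
    if alpha <= 2.0:
        raise ValueError("alpha must exceed 2 so that delta > 2 * eps_bar")
    return alpha * eps_bar

def switching_bound(expected_gaps, delta, eps_bar):
    """Right-hand side of the Theorem 2 bound:
    sum_t E[Phi_t(s_t) - min_s Phi_t(s)] / (delta - 2 * eps_bar)."""
    slack = delta - 2.0 * eps_bar
    if slack <= 0.0:
        raise ValueError("delta must exceed 2 * eps_bar")
    return sum(expected_gaps) / slack

eps_bar = 0.5
delta = margin_from_alpha(2.5, eps_bar)          # 1.25
gaps = [0.25] * 8                                # hypothetical expected gaps
bound = switching_bound(gaps, delta, eps_bar)    # 2.0 / 0.25
print(bound)
```

Raising \(\alpha\) widens the slack \(\delta-2\bar{\varepsilon}\) and so lowers the bound for the same gap sequence, at the cost of slower reaction to genuine structural change.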

Numerical Experiments

Three experiments validate Theorems 1–3 on the canonical nonlinear stochastic growth model [Gordon 1993; Arulampalam 2002]: \[ z_{t+1}=\tfrac{1}{2}z_t+\tfrac{25z_t}{1+z_t^2} +8\cos(1.2t)+w_t, \quad y_t=\tfrac{1}{20}z_t^2+v_t, \] with \(w_t\sim\mathcal{N}(0,10)\), \(v_t\sim\mathcal{N}(0,1)\), \(z_0\sim\mathcal{N}(0,5)\), \(N_p=500\) particles, \(M=100\) runs, \(T=200\). Candidate structures: \(\mathcal{S}=\{s_\mathrm{nl}, s_\mathrm{lin}\}\). Score perturbations \(\epsilon_t(s)\sim\mathrm{Uniform}(-\bar{\varepsilon}, \bar{\varepsilon})\). Three methods: Exact CF (oracle), CF without margin (\(\delta=0\)), and Robust CF (\(\delta=2.5\bar{\varepsilon}\)).
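For readers who want to reproduce the data-generating process, the following Python sketch simulates one trajectory of the growth model using only the standard library. It is not taken from the authors' Julia repository. The function name `simulate_growth_model` is hypothetical, and the second argument of each \(\mathcal{N}(\cdot,\cdot)\) is assumed to be a variance, as is conventional for this benchmark.

```python
import math
import random

def simulate_growth_model(T=200, seed=0):
    """Simulate states z_t and observations y_t for t = 0, ..., T-1."""
    rng = random.Random(seed)
    z = rng.gauss(0.0, math.sqrt(5.0))            # z_0 ~ N(0, 5), variance assumed
    states, observations = [], []
    for t in range(T):
        y = z * z / 20.0 + rng.gauss(0.0, 1.0)    # y_t = z_t^2 / 20 + v_t
        states.append(z)
        observations.append(y)
        w = rng.gauss(0.0, math.sqrt(10.0))       # w_t ~ N(0, 10), variance assumed
        z = 0.5 * z + 25.0 * z / (1.0 + z * z) + 8.0 * math.cos(1.2 * t) + w
    return states, observations

states, observations = simulate_growth_model()
```

The quadratic observation map hides the sign of \(z_t\), which is one reason this model is a common stress test for particle filters.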

Table I: Switching and Score Results

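Method | \(\mathbb{E}[N_T]\) (\(\bar{\varepsilon}=0.5\)) | \(\bar{\Phi}_T\) (\(\bar{\varepsilon}=0.5\)) | \(\mathbb{E}[N_T]\) (\(\bar{\varepsilon}=1.5\)) | \(\bar{\Phi}_T\) (\(\bar{\varepsilon}=1.5\)) | \(\mathbb{E}[N_T]\) (\(\bar{\varepsilon}=3.0\)) | \(\bar{\Phi}_T\) (\(\bar{\varepsilon}=3.0\))
--- | --- | --- | --- | --- | --- | ---
Exact CF (oracle) | 0.3 | 2.71 | 0.3 | 2.71 | 0.3 | 2.71
CF w/o margin (\(\delta=0\)) | 83.7 | 9.60 | 81.1 | 9.58 | 79.2 | 9.55
Robust CF (proposed) | 0.3 | 2.83 | 1.3 | 2.95 | 7.9 | 3.48
Thm. 2 bound | 8.2 | - | 11.4 | - | 17.6 | -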

Lower \(\mathbb{E}[N_T]\) and \(\bar{\Phi}_T\) indicate better performance. Robust CF stays well below the Theorem 2 bound ✓

Experimental Results

  • Exp. I (Descent, Fig. 2): Running-average \(\bar{\Phi}_t\) of Robust CF tracks the oracle throughout; CF without margin remains elevated by \(\approx3\times\), confirming Theorem 1.
  • Exp. II (Finite switching, Table I): Robust CF empirical \(\mathbb{E}[N_T]\) lies well below the Theorem 2 bound at all noise levels, confirming Theorem 2.
  • Exp. III (Non-chattering, Fig. 3): Robust CF maintains \(\hat{p}_t<0.05\); CF without margin exhibits \(\hat{p}_t\approx0.4\) persistently, confirming Theorem 3.
  • Bound tightness (Fig. 4): Empirical \(\mathbb{E}[N_T]\) decays monotonically with \(\alpha=\delta/\bar{\varepsilon}\) and lies strictly below the Theorem 2 bound for all \(\alpha>2.5\).

Impact and Applications

  • Adaptive state estimation under sensor noise and model uncertainty
  • Particle filter-based systems where score approximation errors are unavoidable
  • Neural-network-aided filtering (KalmanNet, Split-KalmanNet, Latent-KalmanNet) where learned scores carry structured errors
  • Supervisory control with noisy performance metrics
  • Fault detection in drifting systems

Reproducibility

All Julia source code is publicly available:

📦 Full repository