Causality- and Passivity-Constrained Nonnegative Attention for Interpretable Structure-Borne Road-Noise Prediction in Battery Electric Vehicles
Abstract
In battery electric vehicles, structure-borne road noise in the 20–300 Hz band becomes more audible because masking by engine noise is largely absent, and conventional transfer-path formulations can be sensitive to suspension nonlinearity and to ill-conditioned inversions. This paper presents a physics-informed non-negative multi-modal fusion network (NN-MMFNet) that predicts in-cabin sound pressure from multi-point chassis excitations while keeping the mapping physically plausible and interpretable. The model combines a dual-stream encoder, which separates transient impact signatures from steady resonance content, with a strictly causal fusion/decoding pathway. A passivity-motivated spectral gain cap prevents non-physical amplification while preserving phase. To enable additive path attribution, the cross-modal attention weights are constrained to be non-negative. Training follows a sim-to-real workflow: pretraining on a virtual fleet, followed by short fine-tuning on measured data. On a production BEV, NN-MMFNet reproduces the 20–300 Hz spectrum with a global RMSE of 1.12 dB(A) at 60 km/h and an error of 0.14 dB at the 128 Hz boom, outperforming the TPA, FTM, and ARMA baselines. Impulse-response checks show a negligible passivity-violation rate (<0.01%). The learned attention consistently points to a rear subframe-to-body mounting path near 128 Hz, and a targeted stiffness adjustment at this location reduces the measured cabin noise by 4.2 dB(A).
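The two constraints named above, a phase-preserving gain cap and non-negative attention weights, can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names, the cap value g_max, and the softplus-plus-normalization used to obtain non-negative weights are all assumptions made for illustration, and the inputs are synthetic.

```python
import numpy as np


def cap_spectral_gain(H, g_max=1.0):
    """Limit |H| to g_max in every frequency bin without changing the phase.

    H     : complex array of per-bin gains (hypothetical transfer spectrum)
    g_max : assumed upper bound on the gain magnitude
    """
    H = np.asarray(H, dtype=complex)
    mag = np.abs(H)
    # Real, positive scale factor: 1 where the gain is within the cap,
    # g_max / |H| where it exceeds the cap. A positive real factor
    # leaves the complex argument (phase) untouched.
    scale = np.minimum(1.0, g_max / np.maximum(mag, 1e-12))
    return H * scale


def nonnegative_attention(scores):
    """Turn raw scores into strictly positive weights summing to 1 per row.

    Softplus followed by normalization is one assumed way to enforce
    non-negativity; the source does not state which mapping is used.
    """
    scores = np.asarray(scores, dtype=float)
    w = np.logaddexp(0.0, scores)  # softplus(x) = log(1 + exp(x)) > 0
    return w / w.sum(axis=-1, keepdims=True)


def path_contributions(weights, path_features):
    """Split the fused output into one additive term per path.

    weights       : (n_paths,) non-negative attention weights
    path_features : (n_paths, n_bins) per-path features
    Returns an (n_paths, n_bins) array whose sum over axis 0 is the fused output.
    """
    return weights[:, None] * np.asarray(path_features, dtype=float)


# Synthetic example: two frequency bins, three hypothetical paths.
H = np.array([2.0 * np.exp(1j * 0.5), 0.5 * np.exp(-1j * 1.0)])
H_capped = cap_spectral_gain(H, g_max=1.0)

weights = nonnegative_attention(np.array([-1.0, 0.0, 2.0]))
features = np.array([[1.0, 2.0], [0.5, 0.5], [3.0, 1.0]])
contrib = path_contributions(weights, features)
fused = contrib.sum(axis=0)
```

Because the weights cannot be negative, no path's term is flipped in sign by the fusion step, so each row of contrib can be read as that path's share of the fused output. With signed weights, large terms of opposite sign could offset one another and the per-path split would be harder to interpret.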

