Authors: Tao Zheng, Liejun Wang, Yinfeng Yu
Paper: https://arxiv.org/abs/2408.06911

Introduction

Speech communication is a fundamental mode of human interaction, but environmental noise often degrades the quality and clarity of speech data. Speech enhancement (SE) technology aims to mitigate the impact of noise while preserving the integrity of the original signal. This paper introduces a novel speech enhancement framework, HFSDA, which integrates heterogeneous spatial features and incorporates a dual-dimension attention mechanism to significantly enhance speech clarity and quality in noisy environments.

Related Work

Self-Supervised Learning Models

Self-supervised learning (SSL) models have shown significant progress in speech tasks. Early methods like Contrastive Predictive Coding…