  1. [1706.03762] Attention Is All You Need - arXiv.org

    Jun 12, 2017 · The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also …

  2. A General Survey on Attention Mechanisms in Deep Learning - Gianni Brauwers and Flavius Frasincar

    Mar 27, 2022 · Attention is an important mechanism that can be employed for a variety of deep learning models across many different domains and tasks. This survey provides an overview of the most …

  3. Attention Mechanisms in Computer Vision: A Survey

    Nov 15, 2021 · Humans can naturally and effectively find salient regions in complex scenes. Motivated by this observation, attention mechanisms were introduced into computer vision with the aim of …

  4. Attention Mechanism in Neural Networks: Where it Comes and Where …

    Apr 27, 2022 · A long time ago in the machine learning literature, the idea of incorporating a mechanism inspired by the human visual system into neural networks was introduced. This idea is named the …

  5. The situation parallels the challenges that recurrent neural networks (RNNs) faced over the sequence dimension before the attention mechanism provided an alternative. We observe a formal duality between …

  6. [2502.11089] Native Sparse Attention: Hardware-Aligned and Natively ...

    Feb 16, 2025 · Sparse attention offers a promising direction for improving efficiency while maintaining model capabilities. We present NSA, a Natively trainable Sparse Attention mechanism that integrates …

  7. ATTENTION2D: Communication Efficient Distributed Self-Attention …

    Mar 20, 2025 · However, the self-attention mechanism also incurs significant computational and memory costs, particularly for long sequences. In this paper, we introduce ATTENTION2D, a novel approach …

  8. Towards understanding how attention mechanism works in deep …

    Dec 24, 2024 · Attention mechanism has been extensively integrated within mainstream neural network architectures, such as Transformers and graph attention networks. Yet, its underlying working …