
Transformer with Memory Replay

arXiv.org, 2022-05

2022. This work is published under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License (http://creativecommons.org/licenses/by-nc-nd/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Full text available

  • Title:
    Transformer with Memory Replay
  • Author: Liu, Rui ; Mozafari, Barzan
  • Subjects: Computer Science - Learning ; Natural language processing ; Transformers
  • Is Part Of: arXiv.org, 2022-05
  • Description: Transformers achieve state-of-the-art performance on natural language processing tasks by pre-training on large-scale text corpora, but they are extremely compute-intensive and have very high sample complexity. Memory replay is a mechanism that remembers and reuses past examples by saving them to, and replaying them from, a memory buffer. It has been used successfully in reinforcement learning and GANs because of its better sample efficiency. In this paper, we propose Transformer with Memory Replay (TMR), which integrates memory replay with the transformer, making the transformer more sample-efficient. Experiments on the GLUE and SQuAD benchmark datasets show that Transformer with Memory Replay achieves at least a 1-percentage-point improvement over the baseline transformer model when pre-trained with the same number of examples. Further, by adopting a careful design that reduces the wall-clock time overhead of memory replay, we also empirically achieve better runtime efficiency. (An illustrative sketch of the replay-buffer mechanism appears after this record.)
  • Publisher: Ithaca: Cornell University Library, arXiv.org
  • Language: English
  • Identifier: EISSN: 2331-8422
    DOI: 10.48550/arxiv.2205.09869
  • Source: arXiv.org
    Free E Journals
    ROAD: Directory of Open Access Scholarly Resources
    ProQuest Central
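
The abstract describes memory replay only at a high level. The following is a minimal, illustrative Python sketch of the general idea: a fixed-capacity buffer of past training examples that is interleaved with fresh batches during pre-training. The names (ReplayBuffer, pretraining_loop, train_step, replay_every) are hypothetical and not taken from the paper, which may use a different replay or sampling policy.

# Minimal, illustrative sketch of memory replay during pre-training.
# All names here are hypothetical; the paper's actual design may differ.
import random
from collections import deque
from typing import Any, Callable, Iterable, List

class ReplayBuffer:
    """Fixed-capacity buffer that stores past examples for later reuse."""

    def __init__(self, capacity: int = 10_000):
        self.buffer: deque = deque(maxlen=capacity)  # oldest examples evicted first

    def add(self, examples: List[Any]) -> None:
        """Save a batch of examples so they can be replayed later."""
        self.buffer.extend(examples)

    def sample(self, batch_size: int) -> List[Any]:
        """Draw a uniform random batch of previously seen examples."""
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))

def pretraining_loop(stream: Iterable[List[Any]],
                     train_step: Callable[[List[Any]], None],
                     batch_size: int = 32,
                     replay_every: int = 4) -> None:
    """Interleave fresh batches from the data stream with replayed batches.

    `stream` yields batches of examples; `train_step` runs one optimizer update.
    Replaying only every few steps keeps the extra wall-clock cost small,
    echoing the runtime-efficiency goal mentioned in the abstract.
    """
    buffer = ReplayBuffer()
    for step, batch in enumerate(stream):
        train_step(batch)            # usual update on fresh data
        buffer.add(batch)            # remember the examples just seen
        if step % replay_every == 0 and len(buffer.buffer) >= batch_size:
            train_step(buffer.sample(batch_size))  # extra update on replayed data

if __name__ == "__main__":
    # Toy usage: "examples" are integers and the train step just counts updates.
    updates: List[int] = []
    fake_stream = ([i] * 32 for i in range(20))
    pretraining_loop(fake_stream, train_step=lambda b: updates.append(len(b)))
    print(f"ran {len(updates)} updates (fresh + replayed)")

In this toy run, 20 fresh batches produce 25 updates in total: one per fresh batch plus a replayed batch every fourth step once the buffer holds enough examples.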
