Showing 1–1 of 1 results for author: Cu, C

  1. arXiv:2305.12281  [pdf, other]

    cs.CL cs.LG

    Lifelong Language Pretraining with Distribution-Specialized Experts

    Authors: Wuyang Chen, Yanqi Zhou, Nan Du, Yanping Huang, James Laudon, Zhifeng Chen, Claire Cu

    Abstract: Pretraining on a large-scale corpus has become a standard method to build general language models (LMs). Adapting a model to new data distributions targeting different downstream tasks poses significant challenges. Naive fine-tuning may incur catastrophic forgetting when the over-parameterized LMs overfit the new data but fail to preserve the pretrained features. Lifelong learning (LLL) aims to en…

    Submitted 20 May, 2023; originally announced May 2023.

    Comments: ICML 2023