Research/NLP_Paper
-
Prefix-Tuning: Optimizing Continuous Prompts for Generation (Research/NLP_Paper, 2024. 7. 27. 11:11)
https://arxiv.org/pdf/2101.00190
Abstract: Fine-tuning is the de facto way to leverage large pretrained language models to perform downstream tasks. However, it modifies all the language model parameters and therefore necessitates storing a full copy for each task. In this paper, we propose prefix-tuning, a lightweight alternative to fine-tuning for natural language generation tasks, which keeps la..
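Since the excerpt only sketches the idea, here is a minimal PyTorch sketch of the mechanism the paper describes: a small set of trainable continuous prefix vectors, reparametrized through an MLP as in the paper, expanded into per-layer key/value states that get prepended to the frozen LM's attention cache. The class name, dimensions, and tensor layout below are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class PrefixEncoder(nn.Module):
    """Sketch: trainable continuous prefix for a frozen language model.

    prefix_len learned vectors are reparametrized through a small MLP
    into per-layer (key, value) states; only these parameters train."""
    def __init__(self, prefix_len=10, n_layers=12, n_heads=12, head_dim=64, hidden=512):
        super().__init__()
        self.prefix_len, self.n_layers = prefix_len, n_layers
        self.n_heads, self.head_dim = n_heads, head_dim
        self.prefix = nn.Parameter(torch.randn(prefix_len, hidden))
        # MLP reparametrization -> 2 (key/value) * n_layers * n_heads * head_dim
        self.mlp = nn.Sequential(
            nn.Linear(hidden, hidden),
            nn.Tanh(),
            nn.Linear(hidden, 2 * n_layers * n_heads * head_dim),
        )

    def forward(self, batch_size):
        out = self.mlp(self.prefix)                          # (P, 2*L*H*D)
        out = out.view(self.prefix_len, 2, self.n_layers,
                       self.n_heads, self.head_dim)
        out = out.permute(2, 1, 3, 0, 4)                     # (L, 2, H, P, D)
        # One (key, value) pair per layer, broadcast over the batch;
        # these get concatenated in front of the real keys/values.
        return [(k.unsqueeze(0).expand(batch_size, -1, -1, -1),
                 v.unsqueeze(0).expand(batch_size, -1, -1, -1))
                for k, v in out]
```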
-
[LoRA] Low-rank Adaptation of Large Language Models (Research/NLP_Paper, 2024. 7. 27. 00:44)
https://arxiv.org/pdf/2106.09685
Abstract: An important paradigm of natural language processing consists of large-scale pretraining on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes less feasible. Using GPT-3 175B as an example – deploying independent instances of fine-tuned models, eac..
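As a concrete illustration of the low-rank adaptation the abstract describes, a minimal PyTorch sketch of a LoRA-wrapped linear layer follows; the class name, rank, and scaling values are illustrative, not the reference implementation.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Sketch of a LoRA-augmented linear layer: the pretrained weight is
    frozen and the update is the low-rank product B @ A (rank r),
    scaled by alpha / r.  Only A and B are trained."""
    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                       # freeze pretrained weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at start
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# Usage: wrap, e.g., an attention projection and train only A/B.
layer = LoRALinear(nn.Linear(768, 768))
trainable = [p for p in layer.parameters() if p.requires_grad]   # just A and B
```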
-
[AdapterFusion] Non-Destructive Task Composition for Transfer Learning (Research/NLP_Paper, 2024. 7. 26. 17:40)
https://arxiv.org/pdf/2005.00247
Abstract: Sequential fine-tuning and multi-task learning are methods aiming to incorporate knowledge from multiple tasks; however, they suffer from catastrophic forgetting and difficulties in dataset balancing. To address these shortcomings, we propose AdapterFusion, a new two stage learning algorithm that leverages knowledge from multiple tasks. First, in the knowl..
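The abstract mentions a two-stage procedure: train one adapter per task, then learn to compose them. Below is a minimal sketch of what the composition stage could look like, an attention layer that mixes the outputs of N pre-trained task adapters; shapes and projection layers are assumptions for illustration, not the authors' code.

```python
import torch
import torch.nn as nn

class AdapterFusion(nn.Module):
    """Sketch of the composition stage: the transformer layer output acts
    as the query, each task adapter's output provides a key and a value,
    and softmax weights decide how much each adapter contributes."""
    def __init__(self, hidden=768):
        super().__init__()
        self.query = nn.Linear(hidden, hidden)
        self.key = nn.Linear(hidden, hidden)
        self.value = nn.Linear(hidden, hidden)

    def forward(self, layer_out, adapter_outs):
        # layer_out: (B, S, H); adapter_outs: (N, B, S, H) stacked adapter outputs
        q = self.query(layer_out).unsqueeze(0)                # (1, B, S, H)
        k = self.key(adapter_outs)                            # (N, B, S, H)
        v = self.value(adapter_outs)
        scores = (q * k).sum(-1)                              # (N, B, S)
        weights = torch.softmax(scores, dim=0).unsqueeze(-1)  # attend over adapters
        return (weights * v).sum(0)                           # (B, S, H)
```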
-
[Adapter] Parameter-Efficient Transfer Learning for NLP (Research/NLP_Paper, 2024. 7. 26. 11:48)
https://arxiv.org/pdf/1902.00751
Abstract: Fine-tuning large pre-trained models is an effective transfer mechanism in NLP. However, in the presence of many downstream tasks, fine-tuning is parameter inefficient: an entire new model is required for every task. As an alternative, we propose transfer with adapter modules. Adapter modules yield a compact and extensible model; they add only a few traina..
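For reference, a minimal PyTorch sketch of the bottleneck adapter module the abstract alludes to: a down-projection, nonlinearity, up-projection, and residual connection inserted after a transformer sub-layer, with the pretrained model kept frozen. The sizes and activation choice are illustrative.

```python
import torch.nn as nn

class Adapter(nn.Module):
    """Sketch of a bottleneck adapter: project down to a small bottleneck,
    apply a nonlinearity, project back up, and add a residual connection.
    Only these few parameters are trained per task."""
    def __init__(self, hidden=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(hidden, bottleneck)
        self.up = nn.Linear(bottleneck, hidden)
        self.act = nn.GELU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))   # near-identity at initialization
```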
-
Large Language Models are Zero-Shot Reasoners (Research/NLP_Paper, 2024. 7. 26. 00:22)
https://arxiv.org/pdf/2205.11916
Abstract: Pretrained large language models (LLMs) are widely used in many sub-fields of natural language processing (NLP) and generally known as excellent few-shot learners with task-specific exemplars. Notably, chain of thought (CoT) prompting, a recent technique for eliciting complex multi-step reasoning through step-by-step answer examples, achieved the state-of..
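The paper's central trick is a two-stage prompt built around the trigger phrase "Let's think step by step."; a minimal sketch follows, where `generate` is a stand-in for any text-completion call, not a specific API.

```python
def zero_shot_cot(question: str, generate) -> str:
    """Sketch of two-stage zero-shot chain-of-thought prompting.

    `generate` is a placeholder for an LLM completion function."""
    # Stage 1: reasoning extraction with the trigger phrase.
    reasoning_prompt = f"Q: {question}\nA: Let's think step by step."
    reasoning = generate(reasoning_prompt)

    # Stage 2: answer extraction conditioned on the generated reasoning.
    answer_prompt = f"{reasoning_prompt} {reasoning}\nTherefore, the answer is"
    return generate(answer_prompt)
```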
-
[CoT] Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (Research/NLP_Paper, 2024. 7. 25. 15:24)
https://arxiv.org/pdf/2201.11903
Abstract: We explore how generating a chain of thought—a series of intermediate reasoning steps—significantly improves the ability of large language models to perform complex reasoning. In particular, we show how such reasoning abilities emerge naturally in sufficiently large language models via a simple method called chain-of-thought prompting, where a few chain of..
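To make the prompting format concrete, a small sketch of how a chain-of-thought few-shot prompt is assembled: each exemplar carries an intermediate rationale before its answer. The helper name is ours, and the exemplar follows the style of the paper's arithmetic examples.

```python
def build_cot_prompt(exemplars, question: str) -> str:
    """Sketch: each exemplar is a (question, step-by-step rationale, answer)
    triple, so the model is prompted to produce intermediate reasoning
    steps before the final answer."""
    parts = []
    for q, rationale, answer in exemplars:
        parts.append(f"Q: {q}\nA: {rationale} The answer is {answer}.")
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

# Illustrative exemplar in the style of the paper's arithmetic examples.
exemplars = [(
    "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?",
    "Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 balls. 5 + 6 = 11.",
    "11",
)]
prompt = build_cot_prompt(exemplars, "A new question goes here.")
```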
-
[FLAN] Finetuned Language Models Are Zero-shot Learners (Research/NLP_Paper, 2024. 7. 25. 10:49)
https://arxiv.org/pdf/2109.01652
Abstract: This paper explores a simple method for improving the zero-shot learning abilities of language models. We show that instruction tuning—finetuning language models on a collection of datasets described via instructions—substantially improves zero-shot performance on unseen tasks. We take a 137B parameter pretrained language model and instruction tune it on o..
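A minimal sketch of the instruction-tuning data format the abstract refers to: supervised examples are rendered with natural-language instruction templates before finetuning on the resulting mixture. The templates and helper below are illustrative assumptions, not FLAN's exact wording.

```python
import random

# Illustrative NLI instruction templates (not the paper's exact templates).
NLI_TEMPLATES = [
    "Premise: {premise}\nHypothesis: {hypothesis}\n"
    "Does the premise entail the hypothesis? OPTIONS: yes, no, maybe",
    "Read the text: {premise}\nCan we conclude that \"{hypothesis}\"? "
    "OPTIONS: yes, no, maybe",
]

def to_instruction_example(premise, hypothesis, label):
    """Render one supervised example as an instruction/target pair."""
    template = random.choice(NLI_TEMPLATES)
    return {
        "input": template.format(premise=premise, hypothesis=hypothesis),
        "target": label,   # e.g. "yes" / "no" / "maybe"
    }

# Finetuning on many such instruction-formatted tasks is what improves
# zero-shot performance on held-out (unseen) task clusters.
```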
-
(2/2) Pre-train, Prompt, and Predict: Prompting Methods in Natural Language Processing (Research/NLP_Paper, 2024. 7. 25. 00:14)
https://arxiv.org/pdf/2107.13586
7. Training Strategies for Prompting Methods
With the methods in the above sections, it is now clear how to obtain an appropriate prompt (or prompts) and corresponding answers. Now we discuss methods that explicitly train models in concert with prompting methods, as outlined in the "Training Strategies" section of Fig. 1.
7.1. Training Settings
In many cases, pr..