All Posts
-
Prompt Engineering Guide (1/2) | Research/NLP_reference | 2024. 7. 24. 10:34
https://www.promptingguide.ai/ https://github.com/dair-ai/Prompt-Engineering-Guide https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-the-openai-api Prompt engineering is a relatively new discipline for developing and optimizing prompts to efficiently use language models (LMs) for a wide variety of applications and research topics. Prompt engineering skills help..
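Since the excerpt only gestures at what prompt engineering involves, here is a minimal sketch of one pattern the linked guides describe: a clear instruction, delimiters around the input, and a few labeled examples. The openai Python package, the model name, and the sentiment task are my own assumptions for illustration, not part of the guide's text.

```python
# Minimal sketch of the "clear instruction + delimiters + few-shot examples" pattern.
# Assumes the openai package (v1+) and OPENAI_API_KEY in the environment;
# the model name below is a placeholder.
from openai import OpenAI

client = OpenAI()

prompt = """Classify the sentiment of the text between ### as Positive, Negative, or Neutral.

Text: ###I loved the battery life, but the screen is dim.###
Sentiment: Neutral

Text: ###The update broke everything.###
Sentiment: Negative

Text: ###Setup took two minutes and it just works.###
Sentiment:"""

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
    temperature=0,        # deterministic output for a classification task
)
print(response.choices[0].message.content)
```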
-
[T5] (3/3) Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | Research/NLP_Paper | 2024. 7. 22. 18:02
https://arxiv.org/pdf/1910.10683 3.7. Putting It All Together We now leverage the insights from our systematic study to determine how far we can push performance on popular NLP benchmarks. We are also interested in exploring the current limits of transfer learning for NLP by training larger models on large amounts of data. We start with our baseline training approach and make the following changes..
-
[T5] (2/3) Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | Research/NLP_Paper | 2024. 7. 22. 12:09
https://arxiv.org/pdf/1910.10683 Okay, deep breath, let's get started!! Haha, the road ahead is incredibly long lol. The Google researchers poured enormous effort into these experiments to complete T5; I might collapse while reading this. It's been over a year since I decided I should study T5 in detail, and this is my last chance. If I let this moment pass, I don't think I'll ever set aside this much time for it again. Whew. Deep breath!! I can do this!! I can finish it!! Haha. 3. Experiments Recent advances in transfer learning for NLP have come from a wide variety of developments, such as new pre-training..
-
[T5] (1/3) Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | Research/NLP_Paper | 2024. 7. 22. 09:00
https://arxiv.org/pdf/1910.10683 Abstract Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer le..
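To make the abstract's "unified text-to-text" framing concrete: T5 casts every task as text in, text out, with a short task prefix on the input. The sketch below uses the Hugging Face transformers library and the public t5-small checkpoint, which are my assumptions; the post itself only discusses the paper.

```python
# Sketch of T5's text-to-text framing: translation and summarization are both
# expressed as "prefix: input text" -> output text with the same model.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

prompts = [
    "translate English to German: The house is wonderful.",
    "summarize: Transfer learning, where a model is pre-trained on a data-rich "
    "task and then fine-tuned on a downstream task, has become a powerful "
    "technique in NLP.",
]

for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```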
-
[GPT-3] (3/3) Language Models are Few-Shot Learners | Research/NLP_Paper | 2024. 7. 21. 15:58
https://arxiv.org/pdf/2005.14165 4. Measuring and Preventing Memorization Of Benchmarks Since our training dataset is sourced from the internet, it is possible that our model was trained on some of our benchmark test sets. Accurately detecting test contamination from internet-scale datasets is a new area of research without established best practices. While it is common practice to train large mo..
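The contamination analysis in this section is based on n-gram overlap between benchmark examples and the training data. Below is a toy sketch of that idea: the 13-gram length follows the paper, but the normalization, data structures, and the "any shared n-gram" criterion are simplified assumptions, not the paper's exact procedure.

```python
# Toy sketch of n-gram overlap contamination checking: flag a benchmark example
# as potentially "dirty" if it shares any long n-gram with the training data.
import re

N = 13  # n-gram length used for the overlap check

def normalize(text: str) -> list[str]:
    """Lowercase, strip punctuation, and split into tokens."""
    return re.sub(r"[^a-z0-9 ]", " ", text.lower()).split()

def ngrams(tokens: list[str], n: int = N) -> set[tuple[str, ...]]:
    """All contiguous n-grams of the token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def build_index(training_docs: list[str]) -> set[tuple[str, ...]]:
    """Collect every training n-gram into one set (a Bloom filter at real scale)."""
    index: set[tuple[str, ...]] = set()
    for doc in training_docs:
        index |= ngrams(normalize(doc))
    return index

def is_contaminated(example: str, index: set[tuple[str, ...]]) -> bool:
    """True if the benchmark example shares at least one n-gram with the training data."""
    return not ngrams(normalize(example)).isdisjoint(index)
```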
-
[GPT-3] (2/3) Language Models are Few-Shot Learners | Research/NLP_Paper | 2024. 7. 21. 12:35
https://arxiv.org/pdf/2005.14165 3. Results In Figure 3.1 we display training curves for the 8 models described in Section 2. For this graph we also include 6 additional extra-small models with as few as 100,000 parameters. As observed in [KMH+20], language modeling performance follows a power-law when making efficient use of training compute. After extending this trend by two more orders of magni..
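The power-law relationship referenced here has the form L(C) = (C_c / C)^alpha: loss falls by a fixed fraction for every multiplicative increase in training compute. The constants in the sketch below are illustrative placeholders, not the fitted values from [KMH+20].

```python
# Sketch of the compute scaling law referenced in the excerpt:
# validation loss L as a function of training compute C, L(C) = (C_c / C) ** alpha.
def loss_from_compute(compute_pf_days: float,
                      c_c: float = 3.1e8,   # placeholder scale constant (PF-days)
                      alpha: float = 0.05   # placeholder exponent
                      ) -> float:
    """Predicted cross-entropy loss for a compute budget on the efficient frontier."""
    return (c_c / compute_pf_days) ** alpha

# Each 10x increase in compute multiplies the loss by 10 ** (-alpha),
# i.e. roughly an 11% reduction with the placeholder exponent above.
for c in (1e2, 1e3, 1e4):
    print(f"C = {c:.0e} PF-days -> L ~= {loss_from_compute(c):.3f}")
```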
-
[GPT-3] (1/3) Language Models are Few-Shot Learners | Research/NLP_Paper | 2024. 7. 21. 11:16
https://arxiv.org/pdf/2005.14165 Abstract Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally pe..
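The abstract's alternative to fine-tuning is the few-shot, in-context setting: a task description and K labeled demonstrations are placed in the model's context along with a new query, and the model completes the text with no gradient updates. The helper below is a sketch of that prompt construction; the translation demonstrations are for illustration only.

```python
# Sketch of few-shot in-context prompting: K input/output demonstrations plus an
# unanswered query are concatenated into one context for the language model.
def build_few_shot_prompt(task_description: str,
                          demonstrations: list[tuple[str, str]],
                          query: str) -> str:
    """Concatenate a task description, K input/output pairs, and the unanswered query."""
    lines = [task_description, ""]
    for source, target in demonstrations:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("sea otter", "loutre de mer"), ("cheese", "fromage")],
    "plush giraffe",
)
print(prompt)  # feed this string to the language model as its context
```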
-
[GPT-2] Language Models are Unsupervised Multitask Learners | Research/NLP_Paper | 2024. 7. 20. 22:06
https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf Abstract Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets. We demonstrate that language models begin to learn these tasks without any explicit s..