Research/NLP_reference
-
[Attention Rollout] Explainability for Vision Transformers (2024. 11. 22. 09:16)
https://jacobgil.github.io/deeplearning/vision-transformer-explainability
https://github.com/jacobgil/vit-explain
Background: In the last few months before writing this post, there seems to have been something of a breakthrough in bringing Transformers into the world of Computer Vision. To list a few notable works about this: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, Training d..
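The attention rollout technique this post covers can be sketched in a few lines: residual connections are modeled by mixing the identity into each layer's attention map, and the maps are then multiplied through the layers. A minimal pure-Python sketch (toy 2-token matrices, assuming attention already averaged over heads):

```python
# Minimal sketch of attention rollout, assuming per-layer attention
# matrices that are already head-averaged and row-stochastic.

def add_identity_and_normalize(attn):
    # Model the residual connection: 0.5 * A + 0.5 * I, then renormalize rows.
    n = len(attn)
    out = [[0.5 * attn[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    for row in out:
        s = sum(row)
        for j in range(n):
            row[j] /= s
    return out

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def attention_rollout(per_layer_attn):
    # Multiply the adjusted attention maps from the first layer upward.
    result = add_identity_and_normalize(per_layer_attn[0])
    for attn in per_layer_attn[1:]:
        result = matmul(add_identity_and_normalize(attn), result)
    return result

# Two tokens, two layers; rows of each attention matrix sum to 1.
layers = [[[0.9, 0.1], [0.2, 0.8]],
          [[0.6, 0.4], [0.5, 0.5]]]
roll = attention_rollout(layers)
print(all(abs(sum(row) - 1.0) < 1e-9 for row in roll))  # True: rows stay normalized
```

Row-stochastic matrices stay row-stochastic under multiplication, so each row of the rollout is still a distribution over input tokens, which is what gets visualized as the explanation map.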
-
KV Cache (2024. 9. 29. 11:19)
https://medium.com/@joaolages/kv-caching-explained-276520203249
Transformers KV Caching Explained - How caching Key and Value states makes transformers faster
Caching the Key (K) and Value (V) states of generative transformers has been around for a while, but maybe you need to understand what it is exactly, and the great inference speedups that it provides. The Key and Value states are used for cal..
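The speedup the post describes comes from not recomputing K and V for tokens that were already generated. A toy counting sketch (not a real transformer; it only tallies how many K/V projections each strategy performs during autoregressive decoding):

```python
# Toy illustration: count key/value projections computed during generation,
# with and without a KV cache.

def generate_without_cache(n_tokens):
    projections = 0
    for step in range(1, n_tokens + 1):
        # Every step re-projects K and V for ALL tokens seen so far.
        projections += 2 * step
    return projections

def generate_with_cache(n_tokens):
    projections = 0
    cache = []  # stores each token's (K, V) once, reused at later steps
    for _ in range(n_tokens):
        projections += 2  # only the newest token's K and V
        cache.append(("K", "V"))
    return projections

print(generate_without_cache(100))  # 10100 -> quadratic in sequence length
print(generate_with_cache(100))     # 200   -> linear in sequence length
```

The trade-off, as the post goes on to explain, is memory: the cache grows linearly with sequence length, layers, and heads.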
-
Gemma 2 (2024. 9. 8. 23:41)
(June 27, 2024) https://huggingface.co/blog/gemma2#knowledge-distillation
Welcome Gemma 2 - Google's new open LLM
Google released Gemma 2, the latest addition to its family of state-of-the-art open LLMs, and we are excited to collaborate with Google to ensure the best integration in the Hugging Face ecosystem. You can find the 4 open-weight models (2 base models & 2 fine-tuned ones) on the Hub. Am..
-
Llama 3.1 (2024. 9. 8. 17:38)
https://huggingface.co/blog/llama31 (July 23, 2024)
Llama 3.1 - 405B, 70B & 8B with multilinguality and long context
Llama 3.1 is out! Today we welcome the next iteration of the Llama family to Hugging Face. We are excited to collaborate with Meta to ensure the best integration in the Hugging Face ecosystem. Eight open-weight models (3 base models and 5 fine-tuned ones) are available on the Hub. L..
-
Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA (2024. 9. 8. 03:05)
https://huggingface.co/blog/4bit-transformers-bitsandbytes
LLMs are known to be large, and running or training them on consumer hardware is a huge challenge for users and accessibility. Our LLM.int8 blogpost showed how the techniques in the LLM.int8 paper were integrated into transformers using the bitsandbytes library. As we strive to make models even more accessible to anyone, we decided to colla..
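For context, loading a model in 4-bit via bitsandbytes through transformers typically looks like the sketch below. This is a configuration sketch, not a runnable demo: it needs a CUDA GPU plus the `bitsandbytes` package, and the model id is illustrative (any causal LM on the Hub works).

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 quantization with double quantization and bf16 compute,
# the combination described in the QLoRA paper.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",  # illustrative model id
    quantization_config=bnb_config,
    device_map="auto",
)
```

Weights are stored in 4-bit NF4 but dequantized to the compute dtype per layer, which is what makes QLoRA fine-tuning of large models feasible on a single consumer GPU.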
-
How to Successfully Run a LLM Fine-Tuning Project (2024. 9. 7. 11:36)
https://levelup.gitconnected.com/how-to-successfully-run-a-llm-fine-tuning-project-my-personal-insights-on-choosing-the-right-c3640d00665d
https://levelup.gitconnected.com/a-step-by-step-guide-to-runing-mistral-7b-ai-on-a-single-gpu-with-google-colab-274a20eb9e40
https://levelup.gitconnected.com/unleash-mistral-7b-power-how-to-efficiently-fine-tune-a-llm-on-your-own-data-4e4386a6bbdc
When should yo..
-
RoPE (2024. 8. 27. 12:16)
https://www.slideshare.net/slideshow/roformer-enhanced-transformer-with-rotary-position-embedding/250482951
https://medium.com/ai-insights-cobet/rotary-positional-embeddings-a-detailed-look-and-comprehensive-understanding-4ff66a874d83
https://www.youtube.com/watch?v=o29P0Kpobz0
https://medium.com/@ngiengkianyew/understanding-rotary-positional-encoding-40635a4d078e
The Need for Positional Embeddings ..
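The core property these references walk through: RoPE rotates each (even, odd) feature pair of queries and keys by an angle proportional to position, so the dot product between a query and a key depends only on their relative offset. A minimal sketch with a single 2-D feature pair and toy values:

```python
import math

def rotate_pair(vec, pos, theta=1.0):
    # Rotate a 2-D feature pair by angle pos * theta (one RoPE frequency).
    x, y = vec
    a = pos * theta
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

q, k = (0.3, -1.2), (0.7, 0.5)
# Same relative offset (4), different absolute positions:
s1 = dot(rotate_pair(q, 3), rotate_pair(k, 7))
s2 = dot(rotate_pair(q, 13), rotate_pair(k, 17))
print(abs(s1 - s2) < 1e-9)  # True: the score depends only on the offset
```

A real implementation applies this rotation to every feature pair with a different frequency theta per pair, which is what lets the model distinguish short- and long-range offsets.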
-
Prompt Engineering Guide (2/2) (2024. 7. 24. 13:45)
https://www.promptingguide.ai/
https://github.com/dair-ai/Prompt-Engineering-Guide
https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-the-openai-api
Techniques: Zero-shot Prompting
Large language models (LLMs) today, such as GPT-3.5 Turbo, GPT-4, and Claude 3, are tuned to follow instructions and are trained on large amounts of data. Large-scale training makes these..
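Zero-shot prompting means giving the model only an instruction and the input, with no solved demonstrations. A sketch of the guide's running sentiment example (prompt text mirrors the guide's style):

```python
# Zero-shot: instruction + input only, no worked examples in the prompt.
zero_shot_prompt = (
    "Classify the text into neutral, negative or positive.\n"
    "Text: I think the vacation was okay.\n"
    "Sentiment:"
)
# A few-shot prompt would instead prepend solved examples such as
# "Text: The movie was great!\nSentiment: positive" before the query.
print(zero_shot_prompt)
```

Because instruction-tuned models have seen many tasks phrased this way during training, the bare instruction is often enough; few-shot examples are the fallback when zero-shot fails.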