All Posts in Category
-
Ensembling & Mixture of Experts | Research/NLP_CMU | 2024. 7. 8. 21:58
※ Summaries after taking the 「Advanced NLP - Carnegie Mellon University」 course
https://www.youtube.com/watch?v=MueCRSZ3RQ0&list=PL8PYTP1V4I8DZprnWryM4nR8IZl1ZXDjg&index=14
https://phontron.com/class/anlp2024/assets/slides/anlp-14-multimodel.pdf
Combining multiple models is really important and useful if you want to get an extra few points of accuracy, because it's a pretty reliable way to get i..
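Not from the post itself, but a minimal sketch of what output-level ensembling looks like: interpolating the next-token distributions of several models. The models, vocabulary size, and weights below are hypothetical.

```python
import numpy as np

def ensemble_next_token_probs(prob_dists, weights=None):
    """Average several models' next-token distributions (linear interpolation).

    prob_dists: list of 1-D arrays, one per model, each summing to 1.
    weights:    optional mixing weights; uniform if omitted.
    """
    probs = np.stack(prob_dists)              # (num_models, vocab_size)
    if weights is None:
        weights = np.full(len(prob_dists), 1.0 / len(prob_dists))
    return weights @ probs                    # weighted average, still sums to 1

# Hypothetical example: two models over a 5-token vocabulary that disagree.
p_a = np.array([0.70, 0.10, 0.10, 0.05, 0.05])
p_b = np.array([0.20, 0.50, 0.10, 0.10, 0.10])
print(ensemble_next_token_probs([p_a, p_b]))  # [0.45 0.3 0.1 0.075 0.075]
```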
-
Reinforcement Learning from Human Feedback | Research/NLP_CMU | 2024. 7. 8. 07:29
※ Summaries after taking the 「Advanced NLP - Carnegie Mellon University」 course
https://www.youtube.com/watch?v=s9yyH3RPhdM&list=PL8PYTP1V4I8DZprnWryM4nR8IZl1ZXDjg&index=11
https://phontron.com/class/anlp2024/assets/slides/anlp-11-distillation.pdf
Reinforcement learning is oddly reminiscent of Bayesian inference..! That said, what a shame about the microphone quality, which doesn't live up to this incredible lecture quality ㅠㅠㅠㅠ What we want to do is we want to maximize the likelihoo..
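The excerpt cuts off at the objective, but in the RLHF framing we maximize the expected reward of sampled outputs. A toy REINFORCE sketch under that assumption (a hypothetical 3-response "policy", not the lecture's code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy policy over 3 possible "responses"; logits are the learnable parameters.
logits = np.zeros(3)
reward = np.array([0.1, 0.9, 0.3])    # stand-in for a human-feedback reward model

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# REINFORCE: raise the log-probability of sampled responses in proportion
# to the reward they receive.
for step in range(2000):
    p = softmax(logits)
    a = rng.choice(3, p=p)
    grad_logp = -p
    grad_logp[a] += 1.0               # d log p(a) / d logits = onehot(a) - p
    logits += 0.1 * reward[a] * grad_logp

print(softmax(logits))                # mass concentrates on the reward=0.9 response
```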
-
Quantization, Pruning, Distillation | Research/NLP_CMU | 2024. 7. 7. 13:25
※ Summaries after taking the 「Advanced NLP - Carnegie Mellon University」 course
https://www.youtube.com/watch?v=s9yyH3RPhdM&list=PL8PYTP1V4I8DZprnWryM4nR8IZl1ZXDjg&index=11
https://phontron.com/class/anlp2024/assets/slides/anlp-11-distillation.pdf
NLP models now are really deployed at a large scale, and training big models is expensive. But something that is overlooked is that inference, so once you have..
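As a companion to the post's topic, a minimal sketch of one of the three techniques, post-training symmetric int8 quantization; the matrix and the per-tensor scheme are my own toy example, not the course's code.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: w ≈ scale * q."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=(4, 4)).astype(np.float32)
q, s = quantize_int8(w)
print("max abs error:", np.abs(w - dequantize(q, s)).max())  # small vs. w's range
```

The payoff is that the int8 weights take 4x less memory than float32 and admit faster integer matmuls at inference time, at the cost of the small rounding error printed above.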
-
Long-context Transformers | Research/NLP_CMU | 2024. 7. 7. 11:34
※ Summaries after taking the 「Advanced NLP - Carnegie Mellon University」 course
https://www.youtube.com/watch?v=WQYi-1mvGDM&list=PL8PYTP1V4I8DZprnWryM4nR8IZl1ZXDjg&index=10
https://phontron.com/class/anlp2024/assets/slides/anlp-10-rag.pdf
These are models that are explicitly trained in a way that allows you to attend to longer contexts in an efficient manner. One way that we can train over longer con..
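One standard trick for efficient long-context attention is a sliding window (local attention); a small sketch of the mask it implies, with a hypothetical window size, as my own illustration rather than the lecture's code:

```python
import numpy as np

def sliding_window_mask(seq_len, window):
    """Causal sliding-window attention mask: position i may attend to
    positions j with i - window < j <= i, giving O(seq_len * window)
    cost instead of O(seq_len^2) for full causal attention."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j > i - window)

print(sliding_window_mask(6, 3).astype(int))  # each row attends to <= 3 positions
```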
-
Retrieval & RAG | Research/NLP_CMU | 2024. 7. 6. 16:02
※ Summaries after taking the 「Advanced NLP - Carnegie Mellon University」 course
https://www.youtube.com/watch?v=WQYi-1mvGDM&list=PL8PYTP1V4I8DZprnWryM4nR8IZl1ZXDjg&index=10
https://phontron.com/class/anlp2024/assets/slides/anlp-10-rag.pdf
If we look at our standard prompting template with an input, we could get a response from a language model, but there are several problems with this. The first is acc..
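A minimal sketch of the retrieve-then-prompt loop that RAG adds on top of standard prompting; the embeddings and documents below are made up, and a real system would use a trained dense retriever instead:

```python
import numpy as np

def retrieve(query_vec, doc_vecs, docs, k=2):
    """Return the k documents whose embeddings are most cosine-similar to the query."""
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec))
    top = np.argsort(-sims)[:k]
    return [docs[i] for i in top]

# Hypothetical 2-D embeddings for three toy documents.
docs = ["Doc about CMU.", "Doc about RAG.", "Doc about cooking."]
doc_vecs = np.array([[1.0, 0.1], [0.9, 0.2], [0.0, 1.0]])
query_vec = np.array([1.0, 0.0])

context = "\n".join(retrieve(query_vec, doc_vecs, docs))
prompt = f"Context:\n{context}\n\nQuestion: What is RAG?\nAnswer:"
print(prompt)  # this augmented prompt is what gets sent to the language model
```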
-
Fine-tuning & Instruction Tuning | Research/NLP_CMU | 2024. 7. 6. 07:50
※ Summaries after taking the 「Advanced NLP - Carnegie Mellon University」 course
https://www.youtube.com/watch?v=KLJ3EEo8aPU&list=PL8PYTP1V4I8DZprnWryM4nR8IZl1ZXDjg&index=8
https://phontron.com/class/anlp2024/assets/slides/anlp-08-instructiontuning.pdf
You have some shared parameters between the models that are trained on all tasks. If you're just training a big language model then you'll probably be sh..
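A minimal sketch of the data side of instruction tuning: many tasks are verbalized into one shared (instruction, input, output) text format, so a single model's parameters are trained on all of them. The template below is a common convention, not necessarily the one used in the course.

```python
def format_example(instruction, input_text, output):
    """Render one supervised example in a shared instruction format;
    the model is then fine-tuned to continue the prompt with `output`."""
    prompt = (f"### Instruction:\n{instruction}\n\n"
              f"### Input:\n{input_text}\n\n### Response:\n")
    return prompt, prompt + output

# Two different tasks rendered into the same format.
tasks = [
    ("Translate to French.", "Good morning.", "Bonjour."),
    ("Classify the sentiment.", "I loved this movie!", "positive"),
]
for t in tasks:
    prompt, full_text = format_example(*t)
    print(full_text)
    print("---")
```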
-
Prompting | Research/NLP_CMU | 2024. 7. 5. 16:05
※ Summaries after taking the 「Advanced NLP - Carnegie Mellon University」 course
https://www.youtube.com/watch?v=T1YrTbTkUb4&list=PL8PYTP1V4I8DZprnWryM4nR8IZl1ZXDjg&index=7
https://phontron.com/class/anlp2024/assets/slides/anlp-07-prompting.pdf
Prompting is a new paradigm, as of a few years ago, for interacting with models. It's now kind of the standard in doing so, and basically what we do is we encoura..
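A minimal sketch of few-shot prompting, where demonstrations encourage the model to continue the pattern on a new input; the sentiment task and examples are hypothetical:

```python
def few_shot_prompt(examples, query):
    """Build a few-shot prompt: demonstrations followed by the new input.
    The LM is encouraged to continue the pattern with the query's label."""
    demos = "\n".join(f"Review: {x}\nSentiment: {y}" for x, y in examples)
    return f"{demos}\nReview: {query}\nSentiment:"

examples = [("Great acting and plot.", "positive"),
            ("Two hours I will never get back.", "negative")]
print(few_shot_prompt(examples, "A charming, funny film."))
```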
-
Generation Algorithms | Research/NLP_CMU | 2024. 7. 5. 11:43
※ Summaries after taking the 「Advanced NLP - Carnegie Mellon University」 course
https://www.youtube.com/watch?v=96MMXDA7F74&list=PL8PYTP1V4I8DZprnWryM4nR8IZl1ZXDjg&index=6
https://phontron.com/class/anlp2024/assets/slides/anlp-06-generation.pdf
A model M gives you a probability distribution over all tokens in its vocabulary to predict what token you would output next. Given some input X and everything ..
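The excerpt describes the model's next-token distribution given the input X and the tokens generated so far; a toy sketch of temperature sampling from such a distribution (hypothetical logits, not the lecture's code):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_next_token(logits, temperature=1.0):
    """Sample one token id from the model's next-token distribution.
    temperature < 1 sharpens the distribution; > 1 flattens it."""
    z = np.asarray(logits) / temperature
    p = np.exp(z - z.max())
    p /= p.sum()
    return rng.choice(len(p), p=p)

logits = [2.0, 1.0, 0.2, -1.0]        # hypothetical scores over a 4-token vocab
print([sample_next_token(logits, t) for t in (0.5, 1.0, 2.0)])
```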