Research/NLP_CMU
-
Complex Reasoning (Research/NLP_CMU, 2024. 7. 11. 07:20)
※ Summaries after taking the 「Advanced NLP - Carnegie Mellon University」 course
https://www.youtube.com/watch?v=mPd2hFmzjWE&list=PL8PYTP1V4I8DZprnWryM4nR8IZl1ZXDjg&index=19
https://phontron.com/class/anlp2024/assets/slides/anlp-21-reasoning.pdf
What is reasoning? The basic idea is using evidence and logic to arrive at conclusions and make judgements. From the philosophical standpoint, there are two var..
-
Code Generation (Research/NLP_CMU, 2024. 7. 10. 13:04)
※ Summaries after taking the 「Advanced NLP - Carnegie Mellon University」 course
https://www.youtube.com/watch?v=bN2ZZieBXsE&list=PL8PYTP1V4I8DZprnWryM4nR8IZl1ZXDjg&index=16
https://phontron.com/class/anlp2024/assets/slides/anlp-17-codegen.pdf
I'm going to be talking about code generation. This is a research topic that I've worked on for a long time and like a lot, and it's become very useful nowadays ..
-
Large Language Models (Research/NLP_CMU, 2024. 7. 9. 15:20)
※ Summaries after taking the 「Advanced NLP - Carnegie Mellon University」 course
https://www.youtube.com/watch?v=2rOSrDtg7HQ
https://phontron.com/class/anlp2024/assets/slides/anlp-15-tourofllms.pdf
I'll be talking about a tour of modern LLMs. The idea here is that there are many, many large language models available nowadays, but I wanted to go through some of the ones that are particularly interesting..
-
Ensembling & Mixture of Experts (Research/NLP_CMU, 2024. 7. 8. 21:58)
※ Summaries after taking the 「Advanced NLP - Carnegie Mellon University」 course
https://www.youtube.com/watch?v=MueCRSZ3RQ0&list=PL8PYTP1V4I8DZprnWryM4nR8IZl1ZXDjg&index=14
https://phontron.com/class/anlp2024/assets/slides/anlp-14-multimodel.pdf
Combining multiple models is really important and useful if you want to get an extra few points of accuracy, because it's a pretty reliable way to get i..
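The excerpt above is about combining multiple models for a reliable accuracy gain. A minimal sketch of the simplest variant, linearly interpolating the next-token probability distributions of several models (the three toy distributions below are made up for illustration):

```python
import numpy as np

def ensemble_probs(prob_dists):
    """Average the probability distributions of several models.

    prob_dists: list of 1-D arrays, one per model, each summing to 1.
    Returns the linearly interpolated distribution (still sums to 1).
    """
    stacked = np.stack(prob_dists)   # shape: (n_models, vocab_size)
    return stacked.mean(axis=0)

# Toy example: three "models" over a 4-token vocabulary.
p1 = np.array([0.7, 0.1, 0.1, 0.1])
p2 = np.array([0.6, 0.2, 0.1, 0.1])
p3 = np.array([0.5, 0.3, 0.1, 0.1])
avg = ensemble_probs([p1, p2, p3])
print(avg)
```

In practice the interpolation weights need not be uniform; a weighted average (e.g. tuned on a validation set) is a common refinement.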
-
Reinforcement Learning from Human Feedback (Research/NLP_CMU, 2024. 7. 8. 07:29)
※ Summaries after taking the 「Advanced NLP - Carnegie Mellon University」 course
https://www.youtube.com/watch?v=s9yyH3RPhdM&list=PL8PYTP1V4I8DZprnWryM4nR8IZl1ZXDjg&index=11
https://phontron.com/class/anlp2024/assets/slides/anlp-11-distillation.pdf
Reinforcement learning is curiously similar to Bayesian inference..! That said, what a mismatch between this amazing lecture quality and the microphone quality ㅠㅠㅠㅠ What we want to do is we want to maximize the likelihoo..
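The excerpt cuts off at "maximize the likelihood"; the core idea it points at is increasing the likelihood of outputs in proportion to their reward. A toy REINFORCE sketch of that idea, where a softmax "policy" over three canned responses and a hand-coded reward vector stand in for a real language model and a learned reward model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for RLHF's core objective: reward-weighted likelihood.
logits = np.zeros(3)
rewards = np.array([0.0, 0.2, 1.0])    # response 2 is the "preferred" one

for _ in range(500):
    probs = np.exp(logits) / np.exp(logits).sum()
    a = rng.choice(3, p=probs)         # sample a response from the policy
    grad = -probs                      # grad of log pi(a) w.r.t. logits ...
    grad[a] += 1.0                     # ... is onehot(a) - probs
    logits += 0.1 * rewards[a] * grad  # REINFORCE: scale by the reward

probs = np.exp(logits) / np.exp(logits).sum()
print(probs)  # probability mass shifts toward the high-reward response
```

Real RLHF pipelines use PPO-style updates with a KL penalty against the initial model rather than this bare policy gradient; the sketch only shows the reward-weighted-likelihood core.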
-
Quantization, Pruning, Distillation (Research/NLP_CMU, 2024. 7. 7. 13:25)
※ Summaries after taking the 「Advanced NLP - Carnegie Mellon University」 course
https://www.youtube.com/watch?v=s9yyH3RPhdM&list=PL8PYTP1V4I8DZprnWryM4nR8IZl1ZXDjg&index=11
https://phontron.com/class/anlp2024/assets/slides/anlp-11-distillation.pdf
NLP models are now deployed at a really large scale, and training big models is expensive. But something that is often overlooked is inference: once you have..
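To illustrate the inference-cost point above, here is a minimal sketch of one of the named techniques, post-training symmetric int8 quantization: weights are stored as 8-bit integers plus a single float scale, cutting memory roughly 4x versus float32 at the cost of a small rounding error (the weight values below are made up):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric post-training quantization of a float tensor to int8."""
    scale = np.abs(w).max() / 127.0   # map the largest magnitude to +/-127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from int8 weights + scale."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.27, 0.03, 0.9], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
print(np.abs(w - w_hat).max())  # small reconstruction error
```

Production schemes refine this with per-channel scales, zero points for asymmetric ranges, or outlier handling, but the store-small / compute-approximately trade-off is the same.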
-
Long-context Transformers (Research/NLP_CMU, 2024. 7. 7. 11:34)
※ Summaries after taking the 「Advanced NLP - Carnegie Mellon University」 course
https://www.youtube.com/watch?v=WQYi-1mvGDM&list=PL8PYTP1V4I8DZprnWryM4nR8IZl1ZXDjg&index=10
https://phontron.com/class/anlp2024/assets/slides/anlp-10-rag.pdf
These are models that are explicitly trained in a way that allows you to attend to longer contexts in an efficient manner. One way that we can train over longer con..
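One common way to make attention over long contexts efficient (a sketch of one such method — the lecture may cover others) is a sliding-window attention mask, where each token attends only to a fixed number of preceding tokens so cost grows linearly rather than quadratically with sequence length:

```python
import numpy as np

def sliding_window_mask(seq_len, window):
    """Boolean causal mask: position i may attend to positions
    max(0, i - window + 1)..i, i.e. itself plus the previous
    window - 1 tokens. Attention cost becomes O(seq_len * window)
    instead of O(seq_len ** 2)."""
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for i in range(seq_len):
        mask[i, max(0, i - window + 1): i + 1] = True
    return mask

m = sliding_window_mask(6, 3)
print(m.astype(int))  # banded lower-triangular pattern
```

Stacking several such layers lets information propagate beyond the window, which is why local attention alone can still model long-range dependencies.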
-
Retrieval & RAG (Research/NLP_CMU, 2024. 7. 6. 16:02)
※ Summaries after taking the 「Advanced NLP - Carnegie Mellon University」 course
https://www.youtube.com/watch?v=WQYi-1mvGDM&list=PL8PYTP1V4I8DZprnWryM4nR8IZl1ZXDjg&index=10
https://phontron.com/class/anlp2024/assets/slides/anlp-10-rag.pdf
If we look at our standard prompting template with an input, we could get a response from a language model, but there are several problems with this. The first is acc..
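The fix RAG proposes for these prompting problems is to retrieve evidence and prepend it to the prompt. A minimal retrieve-then-prompt sketch — the two-document corpus and its 2-D "embeddings" are made up; a real system would embed with a trained dense retriever and send the prompt to an actual LM:

```python
import numpy as np

# Toy corpus with hypothetical embeddings.
docs = ["CMU is in Pittsburgh.", "RAG augments prompts with retrieved text."]
doc_vecs = np.array([[1.0, 0.0], [0.0, 1.0]])

def retrieve(query_vec, k=1):
    """Return the top-k documents by cosine similarity to the query."""
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec))
    top = np.argsort(-sims)[:k]
    return [docs[i] for i in top]

def build_prompt(question, query_vec):
    """Prepend retrieved evidence to the question before calling the LM."""
    context = "\n".join(retrieve(query_vec))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

print(build_prompt("Where is CMU?", np.array([0.9, 0.1])))
```

Because the model now answers from retrieved text rather than parametric memory alone, the knowledge can be updated by re-indexing documents instead of retraining.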