NLP
-
[MaPLe] Multi-modal Prompt Learning
NLP/NLP_Paper 2024. 12. 5. 21:27
https://arxiv.org/pdf/2210.03117
https://github.com/muzairkhattak/multimodal-prompt-learning
(CVPR 2023)
Abstract: Pre-trained vision-language (V-L) models such as CLIP have shown excellent generalization ability to downstream tasks. However, they are sensitive to the choice of input text prompts and require careful selection of prompt templates to perform well. Inspired by the Natural Language Proc..
-
[DPLCLIP] Domain Prompt Learning for Efficiently Adapting CLIP to Unseen Domains
NLP/NLP_Paper 2024. 12. 5. 15:28
https://arxiv.org/pdf/2111.12853v3
https://github.com/shogi880/DPLCLIP?tab=readme-ov-file
Abstract: Domain generalization (DG) is a difficult transfer learning problem aiming to learn a generalizable model for unseen domains. Recent foundation models (FMs) are robust to many distribution shifts and, therefore, should substantially improve the performance of DG. In this work, we study generic ways t..
-
[DomainBed] In Search of Lost Domain Generalization
NLP/NLP_Paper 2024. 12. 5. 10:53
https://arxiv.org/pdf/2007.01434
https://github.com/facebookresearch/DomainBed?tab=readme-ov-file
How should experiments be set up to demonstrate domain generalization ability? This touches on something that had always nagged at me, a point I had overlooked: it was frustrating that I could not clearly tell whether the performance gap between models really stems from the models' generalization capability, or from hyperparameter search and other experimental factors. Is this truly a fair comparison? Against what, and how, should we compare in order to claim improved performance over prior work? ..
-
On Reusing Layers
NLP/NLP_Paper 2024. 12. 3. 23:48
The models presented in these three papers each serve different purposes and show distinctive characteristics, but it is interesting that a common concept runs through all of them: "Reusing early layers." By leveraging the feature representations of early layers, they pursue both efficiency and performance improvements. This is the ideal form of paper I aspire to: "simple but effective!"
1. Efficient Transfer Learning driven by Layer-wise Features Aggregation
https://openreview.net/pdf?id=Q0tfRYadhc
https://github.com/MLAI-Yonsei/LFA
* Motivation
Transfe..
-
A High-level Overview of Large Language Models
NLP/NLP_Paper 2024. 12. 1. 08:55
https://rbcborealis.com/research-blogs/a-high-level-overview-of-large-language-models/
Jul 12, 2023
Since 2022, a series of AI systems have been introduced that enable machines to read, analyze, interpret, and derive meaning from human language. One such system is ChatGPT, which gained over a hundred million users within a mere two months of its launch in November 2022. Its successor, GPT-4, was..
-
[Attention Rollout] Explainability for Vision Transformers
NLP/NLP_reference 2024. 11. 22. 09:16
https://jacobgil.github.io/deeplearning/vision-transformer-explainability
https://github.com/jacobgil/vit-explain
Background
In the last few months before writing this post, there seems to have been a sort of breakthrough in bringing Transformers into the world of Computer Vision. To list a few notable works about this: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, Training d..
-
[Project Proposal] Improving the performance of machine-generated text (MGT) detection by identifying the significance of individual tokens
NLP/NLP_Paper 2024. 11. 11. 14:49
※ Under revision..!!
※ This is the project proposal for Team 5 in the 2024 NLP class.
※ The main idea for this project was provided by D.H. Lee.
※ The content of this proposal is based on discussions with our team members: S.J. Kim, D.H. Lee, S.J. Lee, S.Y. Park.
※ The final proposal PPT will be created in collaboration with S.J. Lee.
※ The paper review presentation will be given by S.J. Kim.
※ The proposal & project..
-
Causal Interpretation of Self-Attention in Pre-Trained Transformers
NLP/NLP_Paper 2024. 11. 11. 10:38
https://arxiv.org/pdf/2310.20307
(Oct 2023, NeurIPS)
※ 2024 NLP class team project's subject
Abstract: We propose a causal interpretation of self-attention in the Transformer neural network architecture. We interpret self-attention as a mechanism that estimates a structural equation model for a given input sequence of symbols (tokens). The structural equation model can be interpreted, in turn, as a ..