Paper Writing 1
-
[GPT4MTS] Prompt-Based Large Language Model for Multimodal Time-Series Forecasting
Paper Writing 1/Related_Work | 2024. 11. 3. 17:01
https://doi.org/10.1609/aaai.v38i21.30383 (March 2024)
Abstract: Time series forecasting is an essential area of machine learning with a wide range of real-world applications. Most of the previous forecasting models aim to capture dynamic characteristics from uni-modal numerical historical data. Although extra knowledge can boost the time series forecasting performance, it is hard to collect such i..
-
[Time-MoE] Billion-Scale Time Series Foundation Models with Mixture of Experts
Paper Writing 1/Related_Work | 2024. 11. 1. 10:52
https://arxiv.org/pdf/2409.16040
https://github.com/Time-MoE/Time-MoE (Sep 2024)
Abstract: Deep learning for time series forecasting has seen significant advancements over the past decades. However, despite the success of large-scale pre-training in language and vision domains, pre-trained time series models remain limited in scale and operate at a high cost, hindering the development of larger capab..
-
[Chronos] Learning the Language of Time Series
Paper Writing 1/Related_Work | 2024. 10. 30. 23:38
https://arxiv.org/pdf/2403.07815
https://github.com/amazon-science/chronos-forecasting
Abstract: We introduce Chronos, a simple yet effective framework for pretrained probabilistic time series models. Chronos tokenizes time series values using scaling and quantization into a fixed vocabulary and trains existing transformer-based language model architectures on these tokenized time series via the cro..
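The scaling-and-quantization step the abstract describes is simple enough to sketch. Below is a minimal NumPy illustration, assuming mean scaling and uniform bins; the bin count, value range, and function names are my assumptions, not Chronos's exact configuration:

```python
import numpy as np

def tokenize(series, num_bins=1024, low=-15.0, high=15.0):
    # Mean scaling: normalize by the mean absolute value of the context.
    scale = float(np.mean(np.abs(series))) or 1.0
    scaled = series / scale
    # Uniform quantization into a fixed vocabulary of num_bins tokens.
    edges = np.linspace(low, high, num_bins + 1)
    tokens = np.clip(np.digitize(scaled, edges) - 1, 0, num_bins - 1)
    return tokens, scale

def detokenize(tokens, scale, num_bins=1024, low=-15.0, high=15.0):
    # Map each token back to its bin center, then undo the scaling.
    edges = np.linspace(low, high, num_bins + 1)
    centers = (edges[:-1] + edges[1:]) / 2
    return centers[tokens] * scale

context = np.array([10.0, 12.0, 9.5, 11.0, 13.0])
tokens, scale = tokenize(context)
approx = detokenize(tokens, scale)  # recovers the series up to bin width
```

Once a series is tokens, an ordinary decoder-only LM can be trained on it with the usual cross-entropy objective, which is the point of the framework.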
-
[MOMENT] A Family of Open Time-series Foundation Models
Paper Writing 1/Related_Work | 2024. 10. 30. 17:16
(Feb 2024, ICML 2024)
https://arxiv.org/pdf/2402.03885
https://github.com/moment-timeseries-foundation-model/moment
Abstract: We introduce MOMENT, a family of open-source foundation models for general-purpose time series analysis. Pre-training large models on time series data is challenging due to (1) the absence of a large and cohesive public time series repository, and (2) diverse time series charac..
-
strict=False
Paper Writing 1/Experiments | 2024. 10. 30. 02:53
A mystery.. shouldn't strict=False just make this work..? The vision checkpoint was originally loaded inside the model, but when I call load_state_dict to fine-tune from the base-model checkpoint, it demands the vision checkpoint even with strict=False.. I replicated exactly what the in-model loading did, and then glob.glob couldn't even find the file.. so I hard-coded the path, and then it acted as if strict=False weren't set and raised a mismatch anyway.. so what did I end up doing.. lol "vision_tower.vision_model.embeddings.patch_embedding.weight", "vision_tower..
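For the record, the workaround amounts to filtering the vision-tower keys out of the state dict before loading, rather than hoping strict=False absorbs them. A minimal sketch under my setup's assumptions (the path and the stand-in model are placeholders, not the real code):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 4)  # placeholder for the actual multimodal model

# Hard-coded checkpoint path (glob.glob failed to resolve it in my run).
ckpt = torch.load("/path/to/base_model.bin", map_location="cpu")

# Drop every vision-tower entry so the loader never sees the keys from
# the error message ("vision_tower.vision_model.embeddings....").
filtered = {k: v for k, v in ckpt.items()
            if not k.startswith("vision_tower.")}

# strict=False then tolerates whatever else is missing or unexpected.
missing, unexpected = model.load_state_dict(filtered, strict=False)
print("missing:", missing, "unexpected:", unexpected)
```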
-
Mental breakdown
Paper Writing 1/Experiments | 2024. 10. 28. 22:26
Ah, this is genuinely depressing.. frustration itself. I finally got the code written and everything runs fine, but the moment SigLIP is attached the model becomes far too heavy.. training time balloons and memory keeps getting eaten, so training can't proceed. Even when I take a vision_prompt=False argument and switch the vision-prompting feature off entirely, so the dataloader batch skips image processing and the model doesn't build the vision tower (i.e., only the base model runs), merely having SigLIP attached slows everything down enormously. After loading a checkpoint trained on the base model only, the vision_prompting feature..
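What the flag is supposed to do, as a toy sketch (class and attribute names are hypothetical, not the project's actual code): build the vision tower only when vision_prompt is on, so its parameters never exist otherwise.

```python
import torch.nn as nn

class Forecaster(nn.Module):
    # Hypothetical stand-in showing the vision_prompt switch.
    def __init__(self, vision_prompt: bool = True):
        super().__init__()
        self.backbone = nn.Linear(32, 32)  # stand-in for the base model
        self.vision_tower = None
        if vision_prompt:
            # Only instantiate the vision encoder (e.g., SigLIP) when
            # prompting is on; it is skipped entirely otherwise.
            self.vision_tower = nn.Linear(64, 32)  # stand-in for SigLIP

    def forward(self, x, image_feats=None):
        h = self.backbone(x)
        if self.vision_tower is not None and image_feats is not None:
            h = h + self.vision_tower(image_feats)
        return h

model = Forecaster(vision_prompt=False)  # base model only, no vision params
```

If the slowdown survives even this, the tower is most likely still being constructed or loaded somewhere else in the pipeline.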
-
[breaktime] LLM's pattern recognition
Paper Writing 1/Experiments | 2024. 10. 28. 15:00
Just for fun! I'm just curious about how the model can recognize patterns in sequences. Later on, I'm planning to research how a model with a mixture of attention and S6 layers can recognize various patterns in sequences. I think it will be interesting. I trained my base model* (gpt2 frozen, w/o additional information injection, only alignment (cross-attn) & head parameters updated) on..
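The setup in that parenthetical can be reconstructed roughly as follows; a minimal, hypothetical sketch (module names, sizes, and the regression head are my assumptions, not the actual code):

```python
import torch
import torch.nn as nn
from transformers import GPT2Model

class PatternProbe(nn.Module):
    # GPT-2 frozen; only the cross-attn alignment and the head train.
    def __init__(self, d_model=768):
        super().__init__()
        self.gpt2 = GPT2Model.from_pretrained("gpt2")
        for p in self.gpt2.parameters():
            p.requires_grad = False  # freeze the language model
        self.align = nn.MultiheadAttention(d_model, num_heads=8,
                                           batch_first=True)
        self.head = nn.Linear(d_model, 1)  # e.g., next-value prediction

    def forward(self, input_ids, context):
        h = self.gpt2(input_ids=input_ids).last_hidden_state
        h, _ = self.align(query=h, key=context, value=context)  # alignment
        return self.head(h)

model = PatternProbe()
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)  # align + head only
```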