  • Sanity check
    Paper Writing 1/Experiments 2024. 10. 27. 16:17

    < supervised long-term forecasting results of my base model* >

     

    * base model: GPT-2 without injecting any additional information

     

    The backbone can be any LLM, but I used GPT-2 with 6 layers as the default for simplicity.

     

    I may run an ablation study over different LLM variants and sizes. Several previous studies have shown that scaling laws also hold for time-series forecasting, with respect to both the number of model parameters and the size of the training corpus.
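    As a configuration sketch of the backbone setup: the post only fixes "GPT-2 with 6 layers", but the run names below (dm32, nh8, df128) read like d_model 32, 8 heads, and FFN width 128, so I map those onto Hugging Face `transformers` fields here. That mapping is my own reading, not stated in the post.

```python
from transformers import GPT2Config, GPT2Model

# Assumed backbone: a 6-layer GPT-2, shrunk to match the run-name fields
# (dm32 -> n_embd=32, nh8 -> n_head=8, df128 -> n_inner=128).
# Randomly initialized; the post does not say whether pretrained weights
# are loaded for this sanity check.
config = GPT2Config(n_layer=6, n_embd=32, n_head=8, n_inner=128)
backbone = GPT2Model(config)
```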

     

    context length 512 / forecasting horizon 96 
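    Concretely, with context length 512 and horizon 96, supervised (input, target) pairs can be sliced from a series like this. A minimal numpy sketch; `make_windows` is my own helper, not code from the post.

```python
import numpy as np

def make_windows(series, context_len=512, horizon=96, stride=1):
    """Slice a 1-D series into (context, target) pairs for supervised forecasting."""
    last_start = len(series) - context_len - horizon
    xs, ys = [], []
    for s in range(0, last_start + 1, stride):
        xs.append(series[s : s + context_len])                          # model input
        ys.append(series[s + context_len : s + context_len + horizon])  # forecast target
    return np.stack(xs), np.stack(ys)

series = np.sin(np.linspace(0, 50, 2000))
xs, ys = make_windows(series)
print(xs.shape, ys.shape)  # (1393, 512) (1393, 96)
```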


    1) ETTh1: training epochs 10

    512_96_MyModel_ETTh1_sl512_pl96_dm32_nh8_df128_0 
    test on the ETTh1 dataset: mse: 0.3996824, mae: 0.4219979

     

    2) ETTm1: training epochs 10

    512_96_MyModel_ETTm1_sl512_pl96_dm32_nh8_df128_0 
    test on the ETTm1 dataset: mse: 0.3175505, mae: 0.3626745

     

    3) Weather: training epochs 1

    512_96_MyModel_Weather_sl512_pl96_dm32_nh8_df32_0 
    test on the Weather dataset: mse: 0.1589350, mae: 0.2111652

     

    4) Electricity: training epochs 1

    512_96_MyModel_ECL_sl512_pl96_dm32_nh8_df32_0 
    test on the Electricity dataset: mse: 0.1420454, mae: 0.2483649
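    For reference, the MSE/MAE numbers above are the standard metrics for these long-term forecasting benchmarks, averaged over all predicted points. A minimal sketch of how they are computed (`mse_mae` is my own helper name):

```python
import numpy as np

def mse_mae(pred, true):
    """Mean squared error and mean absolute error over all forecast points."""
    err = np.asarray(pred, dtype=float) - np.asarray(true, dtype=float)
    return float(np.mean(err ** 2)), float(np.mean(np.abs(err)))

mse, mae = mse_mae([0.5, 1.0, 1.5], [1.0, 1.0, 1.0])
```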


    Some visualizations (cherry-picked)

    1) ETTh1

    2) ETTm1

    3) Weather
