strict=False
Paper Writing 1/Experiments · 2024. 10. 30. 02:53
A mystery..
Isn't strict=False supposed to take care of this..?
Originally the vision checkpoint was loaded inside the model,
but when I call load_state_dict to fine-tune from the base-model checkpoint, even with strict=False,
it still demands the vision checkpoint..
I did it exactly the way the in-model loading code does, but glob.glob can't even find the files..
So I hard-coded the path in,
and then, without strict=False, I get a mismatch..
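This is a common point of confusion with load_state_dict, and a toy repro (hypothetical two-module model, not the real one) makes the behavior concrete: strict=False silently tolerates keys that are *missing* from the checkpoint and just reports them, but it does NOT tolerate a key that exists with the wrong shape; a size mismatch raises a RuntimeError either way.

```python
import torch
import torch.nn as nn

class Base(nn.Module):
    """Stands in for the base model: backbone only."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(4, 4)

class WithVision(nn.Module):
    """Stands in for the full model: backbone + a vision module absent from the base checkpoint."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(4, 4)
        self.vision_tower = nn.Linear(4, 4)

base_sd = Base().state_dict()
model = WithVision()

# strict=False tolerates MISSING keys: no error, they are just reported back.
result = model.load_state_dict(base_sd, strict=False)
print(result.missing_keys)  # contains 'vision_tower.weight' and 'vision_tower.bias'

# But strict=False does NOT tolerate SIZE mismatches on keys that do exist:
bad_sd = {"backbone.weight": torch.zeros(8, 8), "backbone.bias": torch.zeros(8)}
try:
    model.load_state_dict(bad_sd, strict=False)
except RuntimeError as e:
    print("size mismatch still raises:", type(e).__name__)
```

So if the checkpoint's vision weights exist under a *different key prefix* or with different shapes than the model expects, strict=False alone will not save you.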
So what did I end up doing.. lol
"vision_tower.vision_model.embeddings.patch_embedding.weight", "vision_tower.vision_model.embeddings.patch_embedding.bias", "vision_tower.vision_model.embeddings.position_embedding.weight", "vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight", "vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias", "vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight", "vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias", "vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight", "vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias", "vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight", "vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias", "vision_tower.vision_model.encoder.layers.0.layer_norm1.weight", "vision_tower.vision_model.encoder.layers.0.layer_norm1.bias", "vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight", "vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias", "vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight", "vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias", "vision_tower.vision_model.encoder.layers.0.layer_norm2.weight", "vision_tower.vision_model.encoder.layers.0.layer_norm2.bias", "vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight", "vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias", "vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight", "vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias", "vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight", "vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias", "vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight", "vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias", "vision_tower.vision_model.encoder.layers.1.layer_norm1.weight", "vision_tower.vision_model.encoder.layers.1.layer_norm1.bias", "vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight", 
"vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias", "vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight", "vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias", "vision_tower.vision_model.encoder.layers.1.layer_norm2.weight", "vision_tower.vision_model.encoder.layers.1.layer_norm2.bias", "vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight", "vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias", "vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight", "vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias", "vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight", "vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias", "vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight", "vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias", "vision_tower.vision_model.encoder.layers.2.layer_norm1.weight", "vision_tower.vision_model.encoder.layers.2.layer_norm1.bias", "vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight", "vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias", "vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight", "vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias", "vision_tower.vision_model.encoder.layers.2.layer_norm2.weight", "vision_tower.vision_model.encoder.layers.2.layer_norm2.bias", "vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight", "vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias", "vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight", "vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias", "vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight", "vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias", "vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight", "vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias", "vision_tower.vision_model.encoder.layers.3.layer_norm1.weight", 
"vision_tower.vision_model.encoder.layers.3.layer_norm1.bias", "vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight", "vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias", "vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight", "vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias", "vision_tower.vision_model.encoder.layers.3.layer_norm2.weight", "vision_tower.vision_model.encoder.layers.3.layer_norm2.bias", "vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight", "vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias", "vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight", "vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias", "vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight", "vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias", "vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight", "vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias", "vision_tower.vision_model.encoder.layers.4.layer_norm1.weight", "vision_tower.vision_model.encoder.layers.4.layer_norm1.bias", "vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight", "vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias", "vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight", "vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias", "vision_tower.vision_model.encoder.layers.4.layer_norm2.weight", "vision_tower.vision_model.encoder.layers.4.layer_norm2.bias", "vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight", "vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias", "vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight", "vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias", "vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight", "vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias", "vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight", 
"vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias", "vision_tower.vision_model.encoder.layers.5.layer_norm1.weight", "vision_tower.vision_model.encoder.layers.5.layer_norm1.bias", "vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight", "vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias", "vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight", "vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias", "vision_tower.vision_model.encoder.layers.5.layer_norm2.weight", "vision_tower.vision_model.encoder.layers.5.layer_norm2.bias", "vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight", "vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias", "vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight", "vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias", "vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight", "vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias", "vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight", "vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias", "vision_tower.vision_model.encoder.layers.6.layer_norm1.weight", "vision_tower.vision_model.encoder.layers.6.layer_norm1.bias", "vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight", "vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias", "vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight", "vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias", "vision_tower.vision_model.encoder.layers.6.layer_norm2.weight", "vision_tower.vision_model.encoder.layers.6.layer_norm2.bias", "vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight", "vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias", "vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight", "vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias", "vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight", 
"vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias", "vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight", "vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias", "vision_tower.vision_model.encoder.layers.7.layer_norm1.weight", "vision_tower.vision_model.encoder.layers.7.layer_norm1.bias", "vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight", "vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias", "vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight", "vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias", "vision_tower.vision_model.encoder.layers.7.layer_norm2.weight", "vision_tower.vision_model.encoder.layers.7.layer_norm2.bias", "vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight", "vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias", "vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight", "vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias", "vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight", "vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias", "vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight", "vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias", "vision_tower.vision_model.encoder.layers.8.layer_norm1.weight", "vision_tower.vision_model.encoder.layers.8.layer_norm1.bias", "vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight", "vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias", "vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight", "vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias", "vision_tower.vision_model.encoder.layers.8.layer_norm2.weight", "vision_tower.vision_model.encoder.layers.8.layer_norm2.bias", "vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight", "vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias", "vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight", 
"vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias", "vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight", "vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias", "vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight", "vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias", "vision_tower.vision_model.encoder.layers.9.layer_norm1.weight", "vision_tower.vision_model.encoder.layers.9.layer_norm1.bias", "vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight", "vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias", "vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight", "vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias", "vision_tower.vision_model.encoder.layers.9.layer_norm2.weight", "vision_tower.vision_model.encoder.layers.9.layer_norm2.bias", "vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight", "vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias", "vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight", "vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias", "vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight", "vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias", "vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight", "vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias", "vision_tower.vision_model.encoder.layers.10.layer_norm1.weight", "vision_tower.vision_model.encoder.layers.10.layer_norm1.bias", "vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight", "vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias", "vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight", "vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias", "vision_tower.vision_model.encoder.layers.10.layer_norm2.weight", "vision_tower.vision_model.encoder.layers.10.layer_norm2.bias", "vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight", 
"vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias", "vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight", "vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias", "vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight", "vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias", "vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight", "vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias", "vision_tower.vision_model.encoder.layers.11.layer_norm1.weight", "vision_tower.vision_model.encoder.layers.11.layer_norm1.bias", "vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight", "vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias", "vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight", "vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias", "vision_tower.vision_model.encoder.layers.11.layer_norm2.weight", "vision_tower.vision_model.encoder.layers.11.layer_norm2.bias", "vision_tower.vision_model.post_layernorm.weight", "vision_tower.vision_model.post_layernorm.bias", "projector.linear.weight", "projector.linear.bias", "vision_reprogramming_layer.query_projection.weight", "vision_reprogramming_layer.query_projection.bias", "vision_reprogramming_layer.key_projection.weight", "vision_reprogramming_layer.key_projection.bias", "vision_reprogramming_layer.value_projection.weight", "vision_reprogramming_layer.value_projection.bias", "vision_reprogramming_layer.out_projection.weight", "vision_reprogramming_layer.out_projection.bias"
lol Am I really supposed to plant this many mismatched keys in by hand, one by one lolololol
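Instead of patching keys by hand, one common pattern is to filter the checkpoint down to entries that actually exist in the model with matching shapes, and load only those with strict=False. A minimal sketch (the helper name and toy model are mine, not from the codebase):

```python
import torch
import torch.nn as nn

def load_compatible(model, ckpt_sd):
    """Load only checkpoint entries whose key exists in the model with a matching shape."""
    model_sd = model.state_dict()
    compatible = {k: v for k, v in ckpt_sd.items()
                  if k in model_sd and v.shape == model_sd[k].shape}
    skipped = sorted(set(ckpt_sd) - set(compatible))  # unknown or wrong-shape keys
    missing = model.load_state_dict(compatible, strict=False).missing_keys
    return skipped, missing

# Toy demo: checkpoint has one matching key and one wrong-shape key.
model = nn.Sequential(nn.Linear(4, 4))
ckpt = {"0.weight": torch.ones(4, 4), "0.bias": torch.zeros(8)}  # bias shape is wrong
skipped, missing = load_compatible(model, ckpt)
print(skipped)   # ['0.bias'] -- dropped instead of crashing
print(missing)   # ['0.bias'] -- keeps its initialized value
```

Everything the checkpoint can't supply (e.g. the whole vision_tower.* family) is then simply reported as missing and keeps its in-model initialization, which is what strict=False was for in the first place.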
I thought about it for a bit..
and just ran 1 epoch to produce a checkpoint..
then updated it with the base-model checkpoint...................
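In dict terms, the workaround above amounts to: take the 1-epoch checkpoint (which already contains every vision key), overwrite its language-model entries with the base-model checkpoint, and load the merged dict strictly. A toy sketch with stand-in modules (names hypothetical):

```python
import torch
import torch.nn as nn

class Full(nn.Module):
    """Stands in for the full model: LM part + vision part."""
    def __init__(self):
        super().__init__()
        self.lm = nn.Linear(4, 4)
        self.vision_tower = nn.Linear(4, 4)

full_sd = Full().state_dict()              # plays the role of the 1-epoch checkpoint
base_sd = {"lm.weight": torch.ones(4, 4),  # plays the role of the base-model checkpoint
           "lm.bias": torch.zeros(4)}

# The merge trick: base-model keys overwrite the 1-epoch values,
# while the vision keys (absent from base_sd) survive untouched.
full_sd.update(base_sd)

model = Full()
model.load_state_dict(full_sd)  # strict load succeeds: every key is present
```

Done this way you don't even need the 1-epoch training run; any checkpoint that has the full key set (even freshly initialized) would serve as the template.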
Hah................
Seriously....
The things I do.. what a whole circus ㅠㅠㅠㅠ ughhh ㅠㅠㅠ
So here's how far things have progressed..
After training with the base model, I can now take that checkpoint and fine-tune with vision prompting added..
But this is a solution I came up with naively.. will it actually produce meaningful results..?
And to train with vision prompting at all, I can only run at a very small scale.. the memory fills right up..
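If the vision tower doesn't need to be trained, two standard memory levers are freezing it (so no gradients or optimizer state are kept for its parameters) and gradient accumulation (simulating a larger batch with small micro-batches). A sketch under those assumptions, with toy modules:

```python
import torch
import torch.nn as nn

model = nn.ModuleDict({"vision_tower": nn.Linear(4, 4), "lm": nn.Linear(4, 4)})

# Freeze the vision tower: no gradients stored for it.
for p in model["vision_tower"].parameters():
    p.requires_grad_(False)

# Pass only trainable params to the optimizer (saves optimizer-state memory too).
opt = torch.optim.AdamW((p for p in model.parameters() if p.requires_grad), lr=1e-4)

# Gradient accumulation: an effective batch of 8 via 4 micro-batches of 2.
accum = 4
for step in range(accum):
    x = torch.randn(2, 4)
    loss = model["lm"](model["vision_tower"](x)).pow(2).mean() / accum
    loss.backward()
opt.step()
opt.zero_grad()
```

Whether freezing the vision tower is acceptable depends on the experiment, of course; if the vision prompting itself must stay trainable, only the accumulation part applies.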
I keep dragging this along, fumbling through like this, and I don't know if it's the right way ㅠㅠ
Am I even doing this right..? ㅠㅠㅠㅠ