Loading pretrained weight from any model #514
SeriousJ55
started this conversation in
General
-
Thanks for the kind words! And yes, you can! Actually, I helped do that for many models in the context of LitGPT (https://github.com/Lightning-AI/litgpt).
-
Wow! LitGPT is impressive and very useful!!!
Just out of curiosity, can you briefly explain how you can fine-tune a model without knowing the full architecture?
From chapters 5 and 7 of your book, it seems that you need to know the architecture.
On Tuesday, February 4, 2025, 8:35:43 PM, Sebastian Raschka wrote:
Thanks for the kind words! And yes, you can! Actually, I helped do that for many models in the context of LitGPT (https://github.com/Lightning-AI/litgpt).
-
I understand. But the question remains: how do you find out the architecture? Is it always published in research papers? Do you reverse engineer the weights?
On Wednesday, February 5, 2025, 1:38:51 PM, Sebastian Raschka wrote:
Yes, we had to know the architectures to implement them in LitGPT and also to be able to load the pretrained weights. This is all done in these two files:
* https://github.com/Lightning-AI/litgpt/blob/main/litgpt/model.py
* https://github.com/Lightning-AI/litgpt/blob/main/litgpt/config.py
Often, it was a lot of work to find out the implementation details. I remember spending anywhere from several days to a week on some of them.
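To illustrate why the architecture has to be known before the weights can be loaded, here is a minimal sketch in plain Python. It is not LitGPT's code: the parameter names, the shapes, and the helpers `infer_config`, `expected_shapes`, and `find_mismatches` are all made up for illustration. It shows how a few hyperparameters can be guessed from the names and shapes stored in a checkpoint, and how a re-implementation can be checked against them.

```python
import re

# Hypothetical checkpoint contents: parameter name -> tensor shape.
# The names and sizes are invented for this example.
checkpoint_shapes = {
    "tok_emb.weight": (50257, 768),
    "blocks.0.attn.qkv.weight": (2304, 768),
    "blocks.0.mlp.fc.weight": (3072, 768),
    "blocks.1.attn.qkv.weight": (2304, 768),
    "blocks.1.mlp.fc.weight": (3072, 768),
    "out_head.weight": (50257, 768),
}


def infer_config(shapes):
    """Guess a few hyperparameters from parameter names and shapes alone."""
    vocab_size, emb_dim = shapes["tok_emb.weight"]
    # Count distinct block indices that appear in the parameter names.
    layer_ids = {
        int(m.group(1))
        for name in shapes
        if (m := re.match(r"blocks\.(\d+)\.", name))
    }
    return {
        "vocab_size": vocab_size,
        "emb_dim": emb_dim,
        "n_layers": len(layer_ids),
        "hidden_dim": shapes["blocks.0.mlp.fc.weight"][0],
    }


def expected_shapes(cfg):
    """Shapes that our own (hypothetical) re-implementation would allocate."""
    shapes = {
        "tok_emb.weight": (cfg["vocab_size"], cfg["emb_dim"]),
        "out_head.weight": (cfg["vocab_size"], cfg["emb_dim"]),
    }
    for i in range(cfg["n_layers"]):
        shapes[f"blocks.{i}.attn.qkv.weight"] = (3 * cfg["emb_dim"], cfg["emb_dim"])
        shapes[f"blocks.{i}.mlp.fc.weight"] = (cfg["hidden_dim"], cfg["emb_dim"])
    return shapes


def find_mismatches(model_shapes, ckpt_shapes):
    """List parameters that are missing, unexpected, or differently shaped."""
    problems = []
    for name in sorted(set(model_shapes) | set(ckpt_shapes)):
        if model_shapes.get(name) != ckpt_shapes.get(name):
            problems.append((name, model_shapes.get(name), ckpt_shapes.get(name)))
    return problems


config = infer_config(checkpoint_shapes)
mismatches = find_mismatches(expected_shapes(config), checkpoint_shapes)
print(config)
print(mismatches)
```

Shapes only go so far, though. Details such as the number of attention heads, the normalization and activation functions, or the type of positional encoding do not show up in the tensor shapes, so they still have to come from papers, configuration files, or reference code.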
-
Thank you again @rasbt for this amazing book!!! I'm still reading and re-reading it.
In order to use weights from a model like GPT-2 or Llama, you need to:
I'm amazed that you could do that! It's not an easy task.
My questions are: