Can I have your WeChat for further discussion? I have some exciting concepts in mind for this project. I'm also willing to make a small contribution.
My WeChat ID is the lowercase of my GitHub nickname.
Scales have been statically applied to the weights (offline). The differences between W8A8BFP32OFP32LinearWithQuantScale and W8A8BFP32OFP32Linear are engineering-related, not algorithmic.
For example, here:
https://github.com/AniZpZ/AutoSmoothQuant/blob/main/autosmoothquant/models/llama.py#L89
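The flow described above can be sketched in a few lines. This is a minimal NumPy sketch under the assumption of per-tensor static scales: the weight is quantized offline with its own scale, while the activation is quantized at runtime using a statically calibrated `quant_scale`. The function and argument names here are illustrative, not the repo's actual API:

```python
import numpy as np

def w8a8_linear(x_fp32, w_int8, quant_scale, dequant_scale):
    """Illustrative W8A8 linear: fp32 in / fp32 out, int8 compute,
    with statically determined per-tensor scales (assumed layout)."""
    # quant_scale quantizes the fp32 activation x to int8 at runtime
    x_int8 = np.clip(np.round(x_fp32 / quant_scale), -128, 127).astype(np.int8)
    # int8 x int8 matmul, accumulated in int32
    acc = x_int8.astype(np.int32) @ w_int8.astype(np.int32).T
    # the product of the activation scale and the weight's offline
    # (dequant) scale maps the int32 accumulator back to fp32
    return acc.astype(np.float32) * (quant_scale * dequant_scale)

# Weights are quantized offline, as described above
rng = np.random.default_rng(0)
w_fp32 = rng.standard_normal((3, 4)).astype(np.float32)
dequant_scale = np.abs(w_fp32).max() / 127.0
w_int8 = np.round(w_fp32 / dequant_scale).astype(np.int8)

# Activation scale is calibrated statically; here we just derive one
x = rng.standard_normal((2, 4)).astype(np.float32)
quant_scale = np.abs(x).max() / 127.0

y = w8a8_linear(x, w_int8, quant_scale, dequant_scale)
print(np.abs(y - x @ w_fp32.T).max())  # small quantization error
```

A layer that takes an already-quantized int8 activation would simply skip the first step and apply only the combined dequantization at the end, which is the kind of engineering (not algorithmic) difference mentioned above.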
Is the difference whether it involves `quant_scale` or not? And `quant_scale` is for the activation `x` while `dequant_scale` is for the weight, right?