C:\git\llama2.c>export.py stories-llama2-50k.bin --version 2 --hf stories-llama2-50k
Traceback (most recent call last):
  File "C:\git\llama2.c\export.py", line 561, in <module>
    model = load_hf_model(args.hf)
            ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\git\llama2.c\export.py", line 475, in load_hf_model
    layer.attention.wk.weight = nn.Parameter(permute_reverse(hf_dict[f'model.layers.{i}.self_attn.k_proj.weight']))
                                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\git\llama2.c\export.py", line 469, in permute_reverse
    return w.view(n_heads, 2, dim1 // n_heads // 2, dim2).transpose(1, 2).reshape(dim1, dim2)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: shape '[3, 2, 1, 6]' is invalid for input of size 12
Here's an example. It's this 50k TinyStories model from HF.
https://huggingface.co/delphi-suite/stories-llama2-50k
I've also converted some other models, apparently successfully, but they either don't work properly (like outputting garbage tokens) in llama2.c, or even outright crash it.
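For what it's worth, the numbers in the RuntimeError above can be checked by hand. The sketch below uses plain Python with no torch and only reproduces the shape arithmetic of the `w.view(...)` call shown in the traceback. It assumes, without having checked the model's config, that `k_proj` has fewer rows than `dim` because the model uses fewer key/value heads than query heads. The helper `view_shape` and the derived `n_kv_heads` are hypothetical names for illustration, not part of export.py.

```python
from math import prod


def view_shape(n_heads, dim1, dim2):
    """Shape requested by the w.view(...) call shown in the traceback."""
    return (n_heads, 2, dim1 // n_heads // 2, dim2)


# Values read off the error message: the view asked for [3, 2, 1, 6],
# and the k_proj weight actually holds 12 elements.
n_heads, dim = 3, 6
k_proj_numel = 12

# What the traceback shows: dim1 and dim2 are both taken to be dim.
requested = view_shape(n_heads, dim, dim)
print(requested, prod(requested), "elements requested vs", k_proj_numel)

# Assumption (not verified against the model config): k_proj has shape
# (n_kv_heads * head_dim, dim) with n_kv_heads smaller than n_heads.
head_dim = dim // n_heads          # per-head width
k_rows = k_proj_numel // dim       # actual row count of k_proj
n_kv_heads = k_rows // head_dim    # implied number of key/value heads

# Using the KV head count and the real row count makes the sizes agree.
adjusted = view_shape(n_kv_heads, k_rows, dim)
print(adjusted, prod(adjusted), "elements with n_kv_heads =", n_kv_heads)
```

If that assumption holds, the `k_proj` weights would need to be un-permuted with the key/value head count and their real row count rather than `n_heads` and `dim`. Whether that alone would give correct output in llama2.c is something I have not tested.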