bugfix: unexpected behavior when hidden_dim % group_size != 0 #532

EmreAdabag · 2024-07-10T21:14:03Z

This fixes two bugs that cause unexpected behavior when the hidden dim isn't evenly divisible by the quantization group size, as in Stories42M, which has a hidden dim of 1376 and a group size of 64 (1376 % 64 = 32).

1. matmul uses the wrong scaling factors when called as matmul(_, _, _, hidden_dim, _).
2. When quantizing vectors of length hidden_dim, the trailing hidden_dim % group_size elements aren't quantized.
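
To illustrate the second bug, here is a minimal sketch of group-wise int8 quantization that also covers a short final group. The helper name `quantize_groups` and its signature are hypothetical and are not taken from this PR's diff. It assumes symmetric quantization with one scale per group (max absolute value divided by 127).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the code from this PR: quantize n floats into
   int8 values q with one float scale per group of gs elements. The last
   group is shorter than gs when n % gs != 0, and it is still quantized.
   Returns the number of groups (and scales) written. */
static int quantize_groups(const float *x, int8_t *q, float *s, int n, int gs) {
    assert(n > 0 && gs > 0);
    int num_groups = (n + gs - 1) / gs; /* ceiling division keeps the tail */
    for (int g = 0; g < num_groups; g++) {
        int start = g * gs;
        int len = (n - start < gs) ? (n - start) : gs;

        /* largest absolute value in this (possibly short) group */
        float wmax = 0.0f;
        for (int i = 0; i < len; i++) {
            float v = x[start + i] < 0.0f ? -x[start + i] : x[start + i];
            if (v > wmax) wmax = v;
        }

        float scale = wmax / 127.0f;
        s[g] = scale;
        for (int i = 0; i < len; i++) {
            if (scale > 0.0f) {
                float r = x[start + i] / scale;
                /* round half away from zero without needing libm */
                q[start + i] = (int8_t)(r >= 0.0f ? r + 0.5f : r - 0.5f);
            } else {
                q[start + i] = 0; /* all-zero group */
            }
        }
    }
    return num_groups;
}
```

With n = 1376 and gs = 64 this yields 22 groups: 21 full groups plus one tail group of 32 elements. A loop that only visits n / gs groups would stop after 21 and leave those 32 elements unquantized.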

This fix enables inference with quantized models exported by export.py regardless of hidden_dim % group_size. It has been tested with Stories42M and validated against a Python implementation of quantized inference. The smaller groups used during the matrix multiplication cause a negligible performance hit; otherwise, the performance of quantized inference should remain unchanged.
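
The "smaller group sizes" mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: the name `matmul_q8` and its signature are hypothetical. It assumes the weight matrix was quantized over the flattened tensor in groups of gs (so a weight group can straddle two rows when n % gs != 0) and the activation vector was quantized in groups of gs with a shorter tail group. Each integer partial sum is cut at whichever group boundary comes first, so it is always multiplied by the matching pair of scales.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: xout = W @ x on int8 data with per-group scales.
   W has d rows and n columns, stored row-major in wq with scales ws
   indexed by flat_index / gs. x has n elements in xq with scales xs
   indexed by j / gs. */
static void matmul_q8(float *xout, const int8_t *xq, const float *xs,
                      const int8_t *wq, const float *ws,
                      int n, int d, int gs) {
    assert(n > 0 && d > 0 && gs > 0);
    for (int i = 0; i < d; i++) {
        int in = i * n; /* flat offset of row i */
        float val = 0.0f;
        int j = 0;
        while (j < n) {
            /* elements left before the next activation / weight group boundary */
            int x_left = gs - (j % gs);
            int w_left = gs - ((in + j) % gs);
            int len = x_left < w_left ? x_left : w_left;
            if (len > n - j) len = n - j; /* do not run past the row */

            int32_t ival = 0;
            for (int k = 0; k < len; k++) {
                ival += (int32_t)xq[j + k] * (int32_t)wq[in + j + k];
            }
            /* one weight scale and one activation scale cover this segment */
            val += (float)ival * ws[(in + j) / gs] * xs[j / gs];
            j += len;
        }
        xout[i] = val;
    }
}
```

When n is a multiple of gs, every segment has length gs and the loop does the same work as a plain fixed-stride version. When it is not, some segments are shorter, which is where the small extra cost comes from.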

Alternatively/additionally, just ensure that hidden_dim % group_size == 0 in export.py.

EmreAdabag · 2024-07-10T21:16:56Z

Fixing this in export.py: #533

Fixed two bugs in inference of quantized models

0b7d1ac

Both bugs cause unexpected behavior when the model's hidden dim isn't evenly divisible by the group size, like Stories42M which has hidden dim 1376 and group size 64.

EmreAdabag mentioned this pull request Jul 10, 2024

bugfix: export.py allows hidden_dim % group_size != 0 #533

Open
