Update modeling_plamo.py

#7
Preferred Networks, Inc. org
edited May 22 by yhirokawa

When seq_len equals the attention window size, the forward pass should work without constructing an attention mask. However, the current require_attn_mask condition is too strict, so this PR relaxes it to match the negation (not) of the condition at https://huggingface.co/pfnet/plamo-2-1b/blob/main/modeling_plamo.py#L1120
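
For illustration only, here is a minimal sketch of the kind of relaxation described above. The function and parameter names (`require_attn_mask`, `seq_len`, `window_size`) are assumptions for this example and do not reproduce the actual code at the linked line of modeling_plamo.py:

```python
from typing import Optional


def require_attn_mask_strict(seq_len: int, window_size: Optional[int]) -> bool:
    # Overly strict check: any configured sliding window forces mask creation,
    # even when seq_len == window_size and plain causal attention suffices.
    return window_size is not None


def require_attn_mask_relaxed(seq_len: int, window_size: Optional[int]) -> bool:
    # Relaxed check: skip the mask when the sequence fits exactly in the
    # attention window, since the default causal pattern already covers it.
    return window_size is not None and seq_len != window_size


if __name__ == "__main__":
    print(require_attn_mask_strict(2048, 2048))   # True  (mask built unnecessarily)
    print(require_attn_mask_relaxed(2048, 2048))  # False (mask skipped)
    print(require_attn_mask_relaxed(4096, 2048))  # True  (mask still needed)
```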

Preferred Networks, Inc. org

LGTM

yhirokawa changed pull request status to merged
