Update modeling_plamo.py
#7 opened by sokada
When seq_len equals the attention window size, the forward pass should work without creating an attention mask. However, the current require_attn_mask condition is too strict, so this PR changes it to match the negation (not) of the condition at https://huggingface.co/pfnet/plamo-2-1b/blob/main/modeling_plamo.py#L1120
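For illustration only, here is a minimal sketch of the kind of relaxation the PR describes. The function name, signature, and parameters (require_attn_mask, window_size, has_padding) are hypothetical stand-ins, not the actual modeling_plamo.py code:

```python
# Hypothetical sketch -- the identifiers below are illustrative and do not
# match the real modeling_plamo.py implementation.

def require_attn_mask(seq_len: int, window_size: int, has_padding: bool) -> bool:
    # The PR's point: when the sequence exactly fills the attention window
    # and there is no padding to hide, no mask needs to be built. Every
    # other case keeps the previous behavior of constructing a mask.
    if seq_len == window_size and not has_padding:
        return False
    return True
```

Skipping mask construction in this case avoids allocating a (seq_len, seq_len) mask tensor on a path where it has no effect.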
LGTM
yhirokawa changed pull request status to merged