Update modeling_plamo.py
#7 opened by sokada
When seq_len equals the attention window size, the forward pass should work without creating an attention mask. However, the current require_attn_mask condition is too strict, so this PR changes it to match the negation (not) of the condition at https://huggingface.co/pfnet/plamo-2-1b/blob/main/modeling_plamo.py#L1120
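For illustration only, here is a minimal sketch of the kind of relaxation the PR describes. The function name, signature, and parameters (require_attn_mask, window_size, has_padding) are hypothetical stand-ins, not the actual modeling_plamo.py code:

```python
# Hypothetical sketch -- the identifiers below are illustrative and do not
# match the real modeling_plamo.py implementation.

def require_attn_mask(seq_len: int, window_size: int, has_padding: bool) -> bool:
    # The PR's point: when the sequence exactly fills the attention window
    # and there is no padding to hide, no mask needs to be built. Every
    # other case keeps the previous behavior of constructing a mask.
    if seq_len == window_size and not has_padding:
        return False
    return True
```

Skipping mask construction in this case avoids allocating a (seq_len, seq_len) mask tensor on a path where it has no effect.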
LGTM
yhirokawa changed pull request status to merged