
The value of each transformer block in the inference phase is particularly large #573

lgs00 opened this issue Dec 3, 2024 · 1 comment


lgs00 commented Dec 3, 2024

When running inference with CogVideoX-1.5-t2v, I noticed something interesting: after `self.norm1 = CogVideoXLayerNormZero` is applied, the values of `norm_hidden_states` are particularly large. What is the reason? The relevant line in diffusers: https://github.com/huggingface/diffusers/blob/30f2e9bd202c89bb3862c8ada470d0d1ac8ee0e5/src/diffusers/models/transformers/cogvideox_transformer_3d.py#L127
norm_hidden_states, norm_encoder_hidden_states, gate_msa, enc_gate_msa = self.norm1(hidden_states, encoder_hidden_states, temb)
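For context, `CogVideoXLayerNormZero` applies an AdaLN-zero style modulation: the hidden states are layer-normalized and then rescaled and shifted by per-channel parameters projected from the timestep embedding (`temb`). A minimal NumPy sketch (simplified; the actual projection code is in the linked diffusers file) shows why the output can be far larger than a plain LayerNorm's:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Plain LayerNorm over the last axis: zero mean, unit variance."""
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def ada_layer_norm_zero(x, scale, shift):
    """AdaLN-zero style modulation: normalize, then rescale and shift.

    In CogVideoX, `scale` and `shift` are produced per channel by a
    Linear layer applied to the timestep embedding; here they are
    passed in directly for illustration.
    """
    return layer_norm(x) * (1.0 + scale) + shift

x = np.random.randn(4, 8)            # toy (token, channel) activations
print(np.abs(layer_norm(x)).max())   # O(1): the LayerNorm itself bounds the values

# If the conditioning network emits a large per-channel scale, the
# modulated output is large even though the normalized input is O(1).
big = ada_layer_norm_zero(x, scale=np.full(8, 50.0), shift=0.0)
print(np.abs(big).max())             # exceeds 50 with this scale
```

So large `norm_hidden_states` values point at the learned `scale`/`shift` coming out of the conditioning projection, not at the normalization itself.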
[screenshot: norm_hidden_states values for CogVideoX-1.5]

@lgs00 lgs00 closed this as completed Dec 3, 2024
@lgs00 lgs00 reopened this Dec 3, 2024
lgs00 commented Dec 3, 2024

But when I run inference with CogVideoX-1.0-5b, `norm_hidden_states` looks normal. I wonder why the magnitude changes so much in 1.5, while the 1.0 result matches expectations. I'm curious what causes this change; `norm_hidden_states` normally shouldn't exceed 100.
The result for 1.0 is as follows:
[screenshot: norm_hidden_states values for CogVideoX-1.0-5b]
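To compare the two checkpoints more precisely than eyeballing tensor printouts, a small framework-agnostic helper (a sketch, not part of diffusers) can log magnitude statistics for any intermediate activation, e.g. called on `norm_hidden_states` right after the `self.norm1` line:

```python
import numpy as np

def summarize(name, arr):
    """Print simple magnitude statistics for an intermediate activation.

    `arr` is any array-like; for a PyTorch tensor, pass
    `tensor.detach().float().cpu().numpy()`. Returns the stats so they
    can also be collected per block and compared across checkpoints.
    """
    a = np.asarray(arr, dtype=np.float64)
    stats = {"min": float(a.min()),
             "max": float(a.max()),
             "mean_abs": float(np.abs(a).mean())}
    print(f"{name}: " + ", ".join(f"{k}={v:.3f}" for k, v in stats.items()))
    return stats
```

Logging these per transformer block for both 1.0 and 1.5 would show whether the blow-up appears in every block or grows with depth.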
